Comparing bae1a70307..202ef9b8eb - mesa

fran/mesa

Author	SHA1	Message	Date
Chad Versace	202ef9b8eb	egl/android: Mark surface as lost when dequeueBuffer fails This ensures that future calls to eglSwapBuffers and eglMakeCurrent emit an error. This patch is part of a series for fixing android.hardware.camera2.cts.RobustnessTest#testAbandonRepeatingRequestSurface on Chrome OS x86 devices. Cc: Tomasz Figa <tfiga@chromium.org> Cc: Nicolas Boichat <drinkcat@chromium.org> Cc: Tapani Pälli <tapani.palli@intel.com>	2017-05-03 16:46:03 -07:00
Chad Versace	58bfeb4ef2	egl/android: Cancel any outstanding ANativeBuffer in surface destructor That is, call ANativeWindow::cancelBuffer in droid_destroy_surface(). This should prevent application deadlock when the app destroys the EGLSurface after EGL has acquired a buffer from SurfaceFlinger (ANativeWindow::dequeueBuffer) but before EGL has released it (ANativeWindow::enqueueBuffer). This patch is part of a series for fixing android.hardware.camera2.cts.RobustnessTest#testAbandonRepeatingRequestSurface on Chrome OS x86 devices. Cc: Tomasz Figa <tfiga@chromium.org> Cc: Nicolas Boichat <drinkcat@chromium.org> Cc: Tapani Pälli <tapani.palli@intel.com>	2017-05-03 16:46:01 -07:00
Chad Versace	9c2b74ba2a	egl: Emit error when EGLSurface is lost Add a new bool, _EGLSurface::Lost, and check it in eglMakeCurrent and eglSwapBuffers. The EGL 1.5 spec says that those functions emit errors when the native surface is no longer valid. This patch just updates core EGL. No driver sets _EGLSurface::Lost yet. I discovered that Mesa failed to detect lost surfaces while debugging an Android CTS camera test, android.hardware.camera2.cts.RobustnessTest#testAbandonRepeatingRequestSurface. This patch doesn't fix the test, though, because the test expects EGL_BAD_SURFACE when the surface becomes lost, and this patch actually complies with the EGL spec. If I interpreted the EGL spec correctly, EGL_BAD_NATIVE_WINDOW or EGL_BAD_CURRENT_SURFACE is the correct error. Cc: Tomasz Figa <tfiga@chromium.org> Cc: Nicolas Boichat <drinkcat@chromium.org> Cc: Tapani Pälli <tapani.palli@intel.com>	2017-05-03 16:45:48 -07:00
Jason Ekstrand	4201cc2dd3	anv: Implement VK_KHX_external_semaphore_fd This implementation allocates a 4k BO for each semaphore that can be exported using OPAQUE_FD and uses the kernel's already-existing synchronization mechanism on BOs. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-05-03 15:09:46 -07:00
Jason Ekstrand	ef2e427d78	anv: Pull the guts of cmd_buffer_execbuf into a helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-05-03 15:09:46 -07:00
Jason Ekstrand	975c0f339f	anv: Implement VK_KHX_external_semaphore Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-05-03 15:09:46 -07:00
Jason Ekstrand	298e054d0c	anv: Implement VK_KHX_external_semaphore_capabilities This just stubs things out. Real external semaphore support will come with VK_KHX_external_semaphore_fd. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-05-03 15:09:46 -07:00
Jason Ekstrand	65aa89e75f	anv: Add a real semaphore struct It's just a dummy for now, but we'll flesh it out as needed for external semaphores. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-05-03 15:09:46 -07:00
Marek Olšák	f466683cb0	radeonsi/gfx9: fix gl_ViewportIndex v2: remove unnecessary LLVMBuildAnd calls Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 22:58:27 +02:00
Marek Olšák	ec34632859	radeonsi/gfx9: set VGT_REUSE_OFF = 0 same as Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 22:58:27 +02:00
Christian Gmeiner	a8007ed687	etnaviv: add L8A8_UNORM texture format No piglit regressions. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2017-05-03 22:43:10 +02:00
Andres Gomez	e4ae4d2789	glsl: Corrected some typos and error messages v2: left code style/formatting corrections out. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-05-03 23:18:00 +03:00
Grazvydas Ignotas	8aab792e92	radv: don't leak DRM devices After successful drmGetDevices2() call, drmFreeDevices() needs to be called. Fixes: `743315f2` "radv: do not open random render node(s)" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-05-03 22:04:52 +03:00
Grazvydas Ignotas	898cbb491b	radv: fix possible stack corruption drmGetDevices2 takes count and not size. Probably hasn't caused problems yet in practice and was missed as setups with more than 8 DRM devices are not very common. Fixes: `743315f2` "radv: do not open random render node(s)" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-05-03 22:02:45 +03:00
Marek Olšák	b08715499e	ac: eliminate duplicated VS exports Only very few shaders have them (from 48486 shaders): shaders/private/left_4_dead_2/765.shader_test - ac: 1 matches 2 shaders/private/left_4_dead_2/877.shader_test - ac: 1 matches 6 shaders/private/left_4_dead_2/2141.shader_test - ac: 1 matches 6 shaders/private/ue4_effects_cave/11.shader_test - ac: 4 matches 5 shaders/private/ue4_effects_cave/14.shader_test - ac: 5 matches 6 shaders/private/ue4_effects_cave/46.shader_test - ac: 5 matches 6 shaders/private/ue4_effects_cave/42.shader_test - ac: 4 matches 5 shaders/private/ue4_effects_cave/104.shader_test - ac: 4 matches 5 shaders/private/f1-2015/336.shader_test - ac: 3 matches 4 shaders/private/f1-2015/948.shader_test - ac: 6 matches 7 shaders/private/f1-2015/602.shader_test - ac: 0 matches 3 shaders/private/f1-2015/600.shader_test - ac: 0 matches 3 shaders/private/f1-2015/1214.shader_test - ac: 0 matches 1 shaders/private/f1-2015/988.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/149.shader_test - ac: 3 matches 4 shaders/private/ue4_elemental/346.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/178.shader_test - ac: 3 matches 4 shaders/private/ue4_elemental/136.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/168.shader_test - ac: 4 matches 5 shaders/private/ue4_elemental/690.shader_test - ac: 3 matches 4 shaders/private/ue4_elemental/19.shader_test - ac: 5 matches 6 shaders/private/dota2/1901.shader_test - ac: 0 matches 5 shaders/private/dota2/1357.shader_test - ac: 0 matches 5 shaders/private/dota2/1375.shader_test - ac: 0 matches 5 shaders/private/dota2/1369.shader_test - ac: 0 matches 5 shaders/private/dota2/1583.shader_test - ac: 0 matches 5 shaders/private/dota2/1811.shader_test - ac: 0 matches 5 shaders/private/dota2/1893.shader_test - ac: 0 matches 5 shaders/private/dota2/1533.shader_test - ac: 0 matches 5 shaders/private/dota2/1951.shader_test - ac: 0 matches 5 shaders/private/dota2/1361.shader_test - ac: 0 matches 5 shaders/private/mad_max/2792.shader_test - ac: 0 matches 1 shaders/private/mad_max/2794.shader_test - ac: 0 matches 1 shaders/private/mad_max/2780.shader_test - ac: 0 matches 1 shaders/private/mad_max/2902.shader_test - ac: 0 matches 1 shaders/private/bioshock-infinite/3050.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/2544.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3062.shader_test - ac: 3 matches 8 shaders/private/bioshock-infinite/2012.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3058.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3270.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/732.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3026.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3258.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3198.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3046.shader_test - ac: 3 matches 7 shaders/private/bioshock-infinite/3168.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/2550.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3210.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/3032.shader_test - ac: 3 matches 6 shaders/private/bioshock-infinite/668.shader_test - ac: 3 matches 7 Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 20:55:00 +02:00
Marek Olšák	7647e90b15	ac: rename ac_eliminate_const_vs_outputs -> ac_optimize_vs_outputs Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 20:55:00 +02:00
Marek Olšák	faa37475e9	ac: first parse VS exports before eliminating constant ones A later commit will make use of this. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 20:55:00 +02:00
Jason Ekstrand	f8d7c23e1f	anv: Trivially implement multiDrawIndirect Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	272b7e7d25	anv: Enable VK_KHX_multiview and SPV_KHR_multiview Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	3dbd7737d4	anv/cmd_buffer: Emit instanced draws for multiple views Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	32abb0e13c	anv/cmd_buffer: Pull indirect draw parameter loading into a helper Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	0db7070330	anv/pipeline: Add shader lowering for multiview v2 (Jason Ekstrand): - Take a view_mask rather than a whole subpass - Build the view mask into the VS shader key Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	ca5bdfdfc6	anv/pipeline: Add a subpass field to anv_pipeline This simplifies the code a variety of places. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	c4549e05aa	anv/pipeline: Call nir_gather_info later We want to insert more lowering code that may insert system values and we need to gather info after that lowering. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	dcb6a68bb4	anv: Move shader hashing to anv_pipeline Shader hashing is very closely related to shader compilation. Putting them right next to each other in anv_pipeline makes it easier to verify that we're actually hashing everything we need to be hashing. The only real change (other than the order of hashing) is that we now hash in the shader stage. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	d6b8106eea	anv/pass: Store the per-subpass view mask Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	e997f548de	anv: Add the KHX_multiview boilerplate Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	0bed97006f	anv/nir: Delete the apply_dynamic_offsets prototype That pass hasn't existed since `dd4db84640` but the prototype stuck around for no reason. Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	f903f78b72	spirv: Add support for SPV_KHR_multiview Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	99d0709553	spirv: Bump the SPIR-V header to the latest public version Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-05-03 11:25:46 -07:00
Jason Ekstrand	bb41d9a1d3	compiler: Add a system value and varying for ViewIndex Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-05-03 11:25:46 -07:00
Bartosz Tomczyk	fcf941068e	mesa/vbo: reduce prim array size We always use only single element. v2: Change single element arrays to variables Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-03 18:22:58 +02:00
Brian Paul	a30313abf6	mesa: add const qualifier on _mesa_valid_to_render() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-05-03 08:48:46 -06:00
Samuel Iglesias Gonsálvez	f57e234fdd	i965/vec4: don't modify regioning parameters to the sources of DF align1 instructions The regioning parameters are now properly set by convert_to_hw_regs() and we don't need to fix them in the generator. That latter fix previously done in the generator was strictly speaking wrong for any non-identity regions. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-05-03 15:32:39 +02:00
Samuel Iglesias Gonsálvez	aaeb1c99be	i965/vec4: fix register width for DF VGRF and UNIFORM On gen7, the swizzles used in DF align16 instructions work for an element size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that in the rest of the code and prepare the instructions for this (scalarize_df()), we need to set it to two again. However, for DF align1 instructions, a width of 2 is wrong as we are not reading the data we want. For example, a uniform would have a region of <0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access the first 4. This patch sets the default one to 4 and then modifies the width of align16 instruction's DF sources when we translate the logical swizzle to the physical one. v2: - Remove conditional (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-05-03 15:32:39 +02:00
Samuel Iglesias Gonsálvez	7f728bce81	i965/vec4: fix vertical stride to avoid breaking region parameter rule From IVB PRM, vol4, part3, "General Restrictions on Regioning Parameters": "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set to Width * HorzStride." In next patch, we are going to modify the region parameter for uniforms and vgrf. For uniforms that are the source of DF align1 instructions, they will have <0, 4, 1> regioning and the execsize for those instructions will be 4, so they will break the regioning rule. This will be the same for VGRF sources where we use the vstride == 0 exploit. As we know we are not going to cross the GRF boundary with that execsize and parameters (not even with the exploit), we just fix the vstride here. v2: - Move is_align1_df() (Curro) - Refactor exec_size == width calculation (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-05-03 15:32:39 +02:00
Dave Airlie	3bf3f9866c	radv/ac: canonicalize the output for 32-bit float min/max. This fixes: dEQP-VK.glsl.builtin.precision.min.* dEQP-VK.glsl.builtin.precision.max.* dEQP-VK.glsl.builtin.precision.clamp.* The problem is the hw doesn't compare denorms properly, so we have to flush them, even though the spec says flushing is optional, if you don't flush the results should be correct. The -pro driver changes the shader float mode, it would be nice if llvm could grow that perhaps. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 12:55:34 +10:00
Dave Airlie	83e58b036e	radv: flush f32->f16 conversion denormals to zero. (v2) SPIR-V defines the f32->f16 operation as flushing denormals to 0, this compares the class using amd class opcode. Thanks to Matt Arsenault for figuring it out. This fix is VI+ only, add a TODO for SI/CIK. This fixes: dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 12:55:34 +10:00
Bas Nieuwenhuizen	eeff7e1154	radv: Add userspace fence buffer per context. Having it in the winsys didn't work when multiple devices use the same winsys, as we then have multiple contexts per queue, and each context counts separately. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Fixes: `7b9963a28f` "radv: Enable userspace fence checking."	2017-05-03 03:10:12 +02:00
Dave Airlie	2a2a21450b	radv: enable lower_sub to fix loop unrolling. Loop unroll asserts if it hits a sub, we don't really want to lower subs as llvm handles these things, but do this for now, until we can fix loop unroll to work with subs. Fixes: `14ae0bfa5` (radv: Add NIR loop unrolling) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 09:03:43 +10:00
Bas Nieuwenhuizen	9e847eedd5	radv: Don't set dynamic state for pipelines with rasterizer discard. All of the dynamic states apply to rasterization & fragment processing, so we don't need to set them if we don't rasterize. We don't clear the dirty flags for them though, so we don't miss any updates for the next pipeline with rasterization. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Fixes: `76603aa90b` "radv: Drop the default viewport when 0 viewports are given."	2017-05-03 00:12:56 +02:00
Dave Airlie	a524704025	radv: flush more stages when semaphores are waiting. This still doesn't give us complete pWaitDstStageMask support, but it should provide enough to be correct if not as efficient as possible. If we have wait semaphores we must flush between submits and flush the shaders as well. This fixes the remaining fails in: dEQP-VK.synchronization.op.single_queue.semaphore.ssbo Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 07:21:31 +10:00
Samuel Pitoiset	e0e01895b0	glsl: set vector_elements to 1 for samplers I don't see any reasons why vector_elements is 1 for images and 0 for samplers. This increases consistency and allows to clean up some code a bit. This will also help for ARB_bindless_texture. No piglit regressions with RadeonSI. This time the Intel CI system doesn't report any failures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-05-02 22:40:45 +02:00
Eric Anholt	ece06defe7	vc4: Use runtime CPU detection for whether NEON is available. This will allow Raspbian's ARMv6 builds to take advantage of the new NEON code, and could prevent problems if vc4 ends up getting used on a v7 CPU without NEON. v2: Drop dead NEON_SUFFIX (noted by Erik Faye-Lund)	2017-05-02 13:35:23 -07:00
Eric Anholt	a373f77662	vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS. Android.mk was setting the flag across the entire driver, so we didn't have non-NEON versions getting built. This was going to be a problem with the next commit, when I start auto-detecting NEON support and use the non-NEON version when appropriate. Reviewed-by: Rob Herring <robh@kernel.org>	2017-05-02 13:35:23 -07:00
Eric Anholt	463b7d0332	gallium: Enable ARM NEON CPU detection. I wrote this code with reference to pixman, though I've only decided to cover Linux (what I'm testing) and Android (seems obvious enough). Linux has getauxval() as a cleaner interface to the /proc entry, but it's more glibc-specific and I didn't want to add detection for that. This will be used to enable NEON at runtime on ARMv6 builds of vc4. v2: Actually initialize the temp vars in the Android path (noticed by daniels) v3: Actually pull in the cpufeatures library (change by robher). Use O_CLOEXEC. Break out of the loop when we find our feature. v4: Drop VFP code, which was confused about what it was detecting and not actually used yet. Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-05-02 13:35:23 -07:00
Dave Airlie	3c73063974	radv: fix stencil only clears. If we are clearing stencil only, we still need to provide a valid Z output from the vertex shader, we can't rely on the depth clear value having any meaning, as we use this for the position output, and it could get clipped, so we don't end up clearing anything. Fixes: dEQP-VK.renderpass.simple.stencil since I added S8 support. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:31:20 +10:00
Philipp Zabel	b539335e50	renderonly: use drmIoctl To restart interrupted system calls, use drmIoctl. Fixes: `848b49b288` ("gallium: add renderonly library") CC: <mesa-stable@lists.freedesktop.org> Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-05-02 22:22:53 +02:00
Philipp Zabel	cd8ee259c8	renderonly: drop resources on destroy The renderonly_scanout holds a reference on its prime pipe resource, which should be released when it is destroyed. If it was created by renderonly_create_kms_dumb_buffer_for_resource, the dumb BO also has to be destroyed. Fixes: `848b49b288` ("gallium: add renderonly library") CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-05-02 22:19:23 +02:00
Philipp Zabel	ab51cd2f26	renderonly: close transfer prime_fd prime_fd is only used to transfer the scanout buffer to the GPU inside renderonly_create_kms_dumb_buffer_for_resource. It should be closed immediately to avoid leaking the DMA-BUF file handle. Fixes: `848b49b288` ("gallium: add renderonly library") CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-05-02 22:19:19 +02:00
Dave Airlie	09034aab64	radv/wsi: report presentation error per image request This ports `0fcb92c17d` anv: wsi: report presentation error per image request This fixes: dEQP-VK.wsi.xlib.incremental_present.scale_none.* Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:11:19 +10:00
Dave Airlie	ce0f692528	radv: minor pahole related improvements. This just reduces the structs by 4-8 bytes each. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:03:07 +10:00
Dave Airlie	9399870ef0	radv/image: resize some surface members. Oops meant to be part of previous series. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:03:02 +10:00
Dave Airlie	fe6d9c0825	radv: drop unused surface level members. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:00:42 +10:00
Dave Airlie	5d0f792f06	radv/image: drop blk_d This was pretty much unused. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:00:38 +10:00
Dave Airlie	052487be4c	radv: remove some members of radeon surface. We would be storing this info twice per image, no need to, remove it from the surface struct. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:00:35 +10:00
Dave Airlie	7e8d0a402b	radv: move some image info into a separate struct. This is to rework the surface code like radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 06:00:17 +10:00
Dave Airlie	d5400a5ec2	radv: provide a helper for comparing an image extents. This just makes it easier to do the follow in cleanups of the surface. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-03 05:59:52 +10:00
Daniel Stone	80ac89a952	gbm/dri: Fix sign-extension in modifier query When we were assembling the unsigned 64-bit query return from its two signed 32-bit component parts, the lower half was getting sign-extended into the top half. Be more explicit about what we want to do. Fixes gbm_bo_get_modifier() returning ((1 << 64) - 1) rather than ((1 << 56) - 1), i.e. DRM_FORMAT_MOD_INVALID. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2017-05-02 19:55:13 +01:00
Eric Anholt	fba6559a1e	nir: Pick just the channels we want for bitmap and drawpixels lowering. NIR now validates that SSA references use the same number of channels as are in the SSA value. v2: Reword commit message, since the commit didn't land before the validation change did. Fixes: `370d68babc` ("nir/validate: Validate that bit sizes and components always match") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Cc: <mesa-stable@lists.freedesktop.org>	2017-05-02 10:24:40 -07:00
Jason Ekstrand	6ef1bd4fa5	anv/tests: Create a dummy instance as well as device This fixes crashes caused by `35e626bd0e` which made us start referencing the instance in the allocators. With this commit, the tests now happily pass again. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100877 Tested-by: Vinson Lee <vlee@freedesktop.org>	2017-05-01 17:06:40 -07:00
Bas Nieuwenhuizen	6681ab1f97	radv: Use correct stage for ready bit. Set the bit in the same stage as the timestamp, instead always at top of pipe. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com>	2017-05-02 00:54:44 +02:00
Bas Nieuwenhuizen	568aec29d9	radv: Add top of pipe timestamp queries. Does not fix brokenness with the ready bit. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-05-02 00:54:18 +02:00
Bas Nieuwenhuizen	14ae0bfa54	radv: Add NIR loop unrolling. Not much effect on dota2/talos, but positive on deferred. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@itsqueeze.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-05-02 00:09:42 +02:00
Randy Xu	6f21b5601c	i965: Solve Android native fence fd double close The Android native fence in i965 has two fds: _EGLSync::SyncFd and brw_fence::sync_fd. The semantics of __DRI2fenceExtensionRec::create_fence_fd are unclear on whether the DRI driver takes ownership of the incoming fd (which is the same incoming fd from eglCreateSync). i965 did take ownership, but all other Mesa drivers do not; instead, they dup the incoming fd. As a result, _EGLSync::SyncFd and brw_fence::sync_fd were the same fd, and both egl_dri2 and i965 believed they owned it. On eglDestroySync, that led to a double-close. Fix the double-close by making brw_dri_create_fence_fd dup the incoming fd, just like the other drivers do. Signed-off-by: Randy Xu <randy.xu@intel.com> Test: Run Vulkan and GLES stress test and no crash. Fixes: `6403e37651` ("i965/sync: Implement fences based on Linux sync_file") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org> [chadv: Polish the commit message] Cc: mesa-stable@lists.freedesktop.org	2017-05-01 14:46:50 -07:00
Eric Anholt	d884d1a654	vc4: Only build the NEON code on arm32. NEON is sufficiently different on arm64 that we can't just reuse this code. Disable it on arm64 for now. v2: Use PIPE_ARCH_ARM instead, as __ARM_ARCH may be 8 for a 32-bit build for a v8 CPU. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: <mesa-stable@lists.freedesktop.org>	2017-05-01 13:27:39 -07:00
Samuel Pitoiset	dec5b27b1b	gm107/ir: add a missing assertion in emitISCADD() For consistency, similar to the other emitters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-05-01 11:56:49 +02:00
Timothy Arceri	de8e01698f	i965: Don't allocate uniform space for samplers Samplers are encoded into the instruction word, so there's no need to make space in the uniform file. Previously matrix_columns and vector_elements were set to 0, making this else case a no-op. Commit `75a31a20af` changed that, causing malloc corruption in thousands of tests on i965. Fixes: `75a31a20af` ("glsl: set vector_elements to 1 for samplers") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100871	2017-05-01 07:54:18 +10:00
Emil Velikov	a5c6ca9602	egl: initialise dummy_thread via _eglInitThreadInfo Considering we cannot make dummy_thread a constant we might as well, initialise by the same function that handles the actual thread info. This way we don't need to worry about mismatch between the initialiser and initialising function. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-04-29 14:40:53 +01:00
Emil Velikov	e5efaeb85c	egl: polish dri2_to_egl_attribute_map[] Annotate the array as static const and use C99 initialiser to populate it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-29 14:40:09 +01:00
Ilia Mirkin	6af14778a3	gallium/targets: fix bool setting on BE architectures val_bool and val_int are in a union. val_bool gets the first byte, which happens to work on LE when setting via the int, but breaks on BE. By setting the value properly, we are able to use DRI3 on BE architectures. Tested by running glxgears with a NV34 in a G5 PPC. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org [Emil Velikov: squash the vmwgfx hunk] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-04-29 14:32:20 +01:00
Emil Velikov	e5c24adc22	docs: add release calendar page and references to it Add a page that has information which release is expected when and associated information. Reference to it from the "Releasing process" and "Release notes" pages. v2: - Add Andres for 17.0.5 - Rework table format to include the branch (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-04-29 13:43:06 +01:00
Emil Velikov	b1d45c3366	travis: bump MAKEFLAGS to -j4 The instance should have 2 cores, yet bumping the jobs to 4 should give us a minor speed improvement. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:39:40 +01:00
Emil Velikov	27a0b383b9	travis: enable wayland support Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:39:40 +01:00
Emil Velikov	0e6a36cd3f	travis: add Gallium state-tracker targets Split into OpenCL and others, since the former is quite time consuming. v2: - explicitly enable/disable components - build libvdpau 1.1 requirement - enable st/vdpau - build libva 1.6.2 (API 0.38) requirement v3: Drop ubuntu-toolchain-r-test from sources (Andres) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:39:40 +01:00
Emil Velikov	b3f2076549	travis: model scons check target like the make one Should make things a bit more consistent across the board. Cc: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:39:40 +01:00
Emil Velikov	7e2af37474	travis: split the make target to three separate ones Split the target to allow faster builds for each run. The overall build time will be more, yet Travis runs multiple builds in parallel so we're limited by the slowest one. Things are split roughly as: - DRI loaders, classic DRI drivers, classic OSMesa, make check - All Gallium drivers (minus the SWR) alongside st/dri (mesa) - The Vulkan drivers - ANV and RADV, make check (anv) v2: - rework RUN_CHECK to MAKE_CHECK_COMMAND - explicitly disable DRI loaders - generate linux/memfd.h locally and enable ANV - add libedit-dev v3: Use printf to create the header (Andres). v4: Really add the libedit + printf hunks. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:38:11 +01:00
Emil Velikov	8479fd8a10	travis: add "make swr" to the build matrix v2: Quote OVERRIDE variables. v3: Add misplaced libedit-dev hunk (Andres). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	f55d98ac85	travis: add "scons swr" to the build matrix Requires GCC 5.0 (due to the C++14 requirement) and LLVM 3.9. v2: Enable the target, add libedit-dev, rework check target. v3: Comment the current check target, add -j4 SCONSFLAGS, quote OVERRIDE variables. v4: Keep check target as-is (Andres) Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: George Kyriazis <george.kyriazis@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	85ee2c6cfc	travis: add separate "scons" and "scons llvm" targets The former does not require any LLVM, while the latter uses LLVM 3.3. This way we'll quickly catch any LLVM 3.3+ functionality that gets introduced where it shouldn't. Add the full list of addons for each build permutation. v2: Keep libedit-dev, rework check target. v3: Comment the current check target, add -j4 SCONSFLAGS v4: - Remove llvm-toolchain-trusty-3.3 source (Andres) - Keep check target as-is (Andres) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	56ba252e23	travis: split out matrix from env With next commits we'll add a couple of more options. v2: Rework check target. v3: Comment the current check target, add -j4 SCONSFLAGS v4: Keep check target as-is, will rework with later patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	abcfea23ad	travis: rework "if test" blocks in the script section Split the "if test" blocks so that we get more sensible output in case of a failure. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	ae713a7b79	travis: remove unused -dev packages We effectively override libdrm-dev and libxcb-dri2-0-dev since we build and install the package locally. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	6431b98c54	travis: automatically manage ccache caching According to the manual "If you are using ccache, use: language: c # or other C/C++ variants cache: ccache to cache $HOME/.ccache and automatically add /usr/lib/ccache to your $PATH." Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:35:17 +01:00
Emil Velikov	486f28ba88	travis: enable apt cache Provides a small, but consistent improvement. Example numbers of the jobs added later in the series. "make loaders/classic DRI" - 1s "scons SWR" - 6s Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:34:55 +01:00
Andres Gomez	29322daef2	travis: add the possibility of using the txc-dxtn library The txc-dxtn library implements the patented S3 Texture Compression algorithm. By default it won't be used but we add the possibility of setting the USE_TXC_DXTN variable to yes in the travis web UI so it will be installed and used for the scons tests. Cc: Eric Anholt <eric@anholt.net> Cc: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Andres Gomez <agomez@igalia.com> [Emil Velikov: keep the LIB prefix, drop the LD_LIBRARY_PATH, fold URL] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-29 13:34:53 +01:00
Andres Gomez	7819d265c7	travis: replace Trusty-based LLVM toolchain apt-get with apt addon Trusty's LLVM toolchain repository was whitelisted some time ago. See: `479067c5e7` Signed-off-by: Andres Gomez <agomez@igalia.com> [Emil Velikov] - set sudo to false - reference the Trusty change (Rhys) - keep libedit-dev Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-29 13:34:53 +01:00
Emil Velikov	cb820daa3f	travis: explicitly LD_LIBRARY_PATH the local libraries Some of the libraries may be dlopened, which may not always work due to the non-standard prefix that we're using. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-04-29 13:34:53 +01:00
Brian Paul	52d69c2e8d	st/wgl: whitespace, formatting fixes in stw_pixelformat.c Trivial.	2017-04-28 22:01:34 -06:00
Charmaine Lee	ba8e2ea19a	st/wgl: allow WGL_BIND_TO_TEXTURE_RGB_ARB for RGBA visuals We do not need to restrict WGL_BIND_TO_TEXTURE_RGB_ARB to RGB visuals only. It can be supported with RGBA visuals as well. This fixes the early exit of cinebench-r15-test trace. Tested with cinebench-r15, piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-28 22:01:24 -06:00
Brian Paul	d06045dfdd	st/wgl: use ARRAY_SIZE() macro in wglChoosePixelFormatARB() Trivial.	2017-04-28 21:37:07 -06:00
Brian Paul	394f8dacbc	st/wgl: whitespace/formatting fixes in stw_ext_pixelformat.c Trivial.	2017-04-28 21:37:06 -06:00
Neha Bhende	197907c926	svga: implement sRGB rendering for imported surfaces If texture is imported and templ format is sRGB, use compatible sRGB format to the imported texture format while creating surface view. tested with MTT piglit, glretrace, viewperf and conform Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-28 21:03:06 -06:00
Neha Bhende	1b415a5b28	svga: add function svga_linear_to_srgb() This function will return compatible svga srgb format for corresponding linear format Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-28 21:03:06 -06:00
Neha Bhende	6e06e281c6	glx: add missing sRGB attribute check in fbconfigs_compatible() This patch will allow driver to choose srgb capable FBconfig if GLX_FRAMEBUFFER_SRGB_CAPABLE_ARB attribute is 1 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-28 21:03:06 -06:00
Thomas Hellstrom	ca59fd1706	svga: Add a more elaborate format compatibility determination v2 dri3 is a bit sloppy about its format compatibility requirements, so add a possibility to import xrgb surfaces as argb textures and vice versa. At the same time, make the svga_texture_from_handle() function a bit more readable and fix the error path where we leaked a winsys surface. v2: Addressed review comments by Brian. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-28 21:03:06 -06:00
Tim Rowley	18d5c452d0	swr/rast: add memory api to SwrGetInterface() Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:57:09 -05:00
Tim Rowley	a46539af11	swr/rast: use gather instruction for odd format fetch Small fetch performance optimization - use gather instruction for odd format fetch instead of slow emulated code. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:57:02 -05:00
Tim Rowley	eff909de7d	swr/rast: enable SIMD16 8x2 tile backend Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:56 -05:00
Tim Rowley	5fde2ae533	swr/rast: add SwrInit() to init backend/memory tables Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:50 -05:00
Tim Rowley	e8d58049f6	swr/rast: increment depth/stencil tile pointer in SIMD16 BE Misplaced #endif preventing depth and stencil hot tile pointers from incrementing in SIMD16 8x2 configuration of BackendPixelRate. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:42 -05:00
Tim Rowley	d4c1486737	swr/rast: add SwrGetInterface() function to return api Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:34 -05:00
Tim Rowley	dabd0499a6	swr/rast: enable per-warp scratch space for CS Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:28 -05:00
Tim Rowley	0424e6249a	swr/rast: reduce simd{16}vertex stack for VS output Frontend - reduce simdvertex/simd16vertex stack usage for VS output in ProcessDraw, fixes stack overflow in some of the deeper call stacks under SIMD16. 1. Move the vertex store out of PA_FACTORY, and off the stack 2. Allocate the vertex store out of the aligned heap (pointer is temporarily stored in TLS, but will be migrated to thread pool along with other frontend temporary buffers). 3. Grow the vertex store as necessary for the number of verts per primitive, in chunks of 8/4 simdvertex/simd16vertex Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:17 -05:00
Tim Rowley	536baf507e	swr/rast: remove default argument from SwrSync() Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:56:11 -05:00
Tim Rowley	145bf5aa5b	swr/rast: remove unused variables in the SIMD16 FE Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:57 -05:00
Tim Rowley	20f3a30219	swr/rast: move construction of const above goto Fixes gcc error for SIMD16 FE. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:50 -05:00
Tim Rowley	feefd3ef4e	swr/rast: name threads to aid debugging Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:40 -05:00
Tim Rowley	9b907599b6	swr/rast: disable buffer overrun warning for Assemble() Disabling buffer overrun warning for Assemble(uint32_t slot, simdvector *verts) due to what looks like a MSVC compiler bug when compiling the SIMD16 FE. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:33 -05:00
Tim Rowley	d523b82498	swr/rast: clean up clipper comments Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:26 -05:00
Tim Rowley	8c0e0bf141	swr/rast: add SIMDAPI decorators in binner/clipper Fixes MSVC errors with SIMD16 FE. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:20 -05:00
Tim Rowley	42d804b2a3	swr/rast: add additional jit utility functions Not used yet. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:55:02 -05:00
Tim Rowley	a373f1f27a	swr/rast: more flexible max attribute slots Ability to allocate space for an arbitrary number (at compile time) of positions in the vertex layout. Removes KNOB_NUM_ATTRIBUTES from knobs.h, replaces the VTX slot number #defines with the SWR_VTX_SLOTS enum (which contains replacement for NUM_ATTRIBUTES: SWR_VTX_NUM_SLOTS) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-28 19:53:39 -05:00
Kenneth Graunke	54d42cd976	i965: Drop BRW_NEW_CONTEXT from 3DSTATE_DS/GS on Gen7-7.5. We already have BRW_NEW_BATCH, which completely covers all the cases that BRW_NEW_CONTEXT would handle. Drop it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-28 17:03:33 -07:00
Kenneth Graunke	1d0e974406	i965: Drop _NEW_TRANSFORM from 3DSTATE_DS/GS on Gen7-7.5. There's no reason for this as far as I can tell. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-28 17:03:33 -07:00
Kenneth Graunke	a1f12574b0	i965: Set point rasterization rule to UPPER_RIGHT on Gen6-7.5. Gen4-5 and Gen8+ already set this, but Gen6-7.5 did not. We ought to be consistent - the answer depends on the API, not the hardware generation. The Sandybridge PRM says about RASTRULE_UPPER_RIGHT: "To match OpenGL point rasterization rules (round to +infinity, where this is the upper right direction wrt OpenGL screen origin of lower left)." So this is likely the one we should use. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-04-28 17:03:33 -07:00
Kenneth Graunke	4878ab9bd4	i965: Always set AALINEDISTANCE_TRUE on Sandybridge. We set this unconditionally on every other platform. Zero (Manhattan) isn't even listed as an option in the Sandybridge docs - only "true". Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-28 17:03:33 -07:00
Kenneth Graunke	b625bcc601	i965: Use true AA line distance on G45/Ironlake. The original Broadwater and Crestline platforms computed antialiased line distances using "manhattan" distance, aka a + b = c. Eaglelake and Cantiga added "true" distance, which apparently does something like max(a, b) + min(a, b) / 4. Not exactly "true", but at least more accurate. The G45 documentation indicates that the old manhattan distance setting is "only for debug purposes" and should never be used. The Ironlake documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it does still contain the narrative about the feature. At any rate, we should use the more accurate mode. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-28 17:03:33 -07:00
Andres Gomez	81149c8f52	docs: add news item and link release notes for 17.0.5 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-04-29 01:21:17 +03:00
Andres Gomez	e06aec99f2	docs: add sha256 checksums for 17.0.5 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `6cb65ce2d3`)	2017-04-29 01:20:51 +03:00
Andres Gomez	0ad8c4f375	docs: add release notes for 17.0.5 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `61b134a862`)	2017-04-29 01:19:51 +03:00
Marek Olšák	7a515a607c	radeonsi: don't load unused compute shader input SGPRs and VGPRs Basically, don't load GRID_SIZE or BLOCK_SIZE if they are unused, determine whether to load BLOCK_ID for each component separately, and set the number of THREAD_ID VGPRs to load. Now we should get the maximum CS launch wave rate in most cases. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:57:44 +02:00
Marek Olšák	46e48d4044	tgsi/scan: record compute shader system value usage v2: just do indexing with swizzle[i] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	fa15436e63	radeonsi: add a HUD query for draw calls with primitive restart Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	55445ff189	radeonsi: tell LLVM not to remove s_barrier instructions LLVM 5.0 removes s_barrier instructions if the max-work-group-size attribute is not set. What a surprise. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	0490074cab	radeonsi: fix tess offchip offset for per-patch attributes We need 4 more bits there. I don't know what is fixed by this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	4e50062028	radeonsi: pass tessellation ring addresses via user SGPRs This removes s_load_dword latency for tess rings. We need just 1 SGPR for the address if we use 64K alignment. The final asm for recreating the descriptor is: // s2 is (address >> 16) s_mov_b32 s3, 0 s_lshl_b64 s[4:5], s[2:3], 16 s_mov_b32 s6, -1 s_mov_b32 s7, 0x27fac v2: bitcast the descriptor type from v2i64 to v4i32 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	2823e15f60	radeonsi: use si_insert_input_ret in si_llvm_emit_tcs_epilogue Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	9fd9a7d0ba	radeonsi: remove VS epilog code, compile VS with PrimID export on demand The use of PrimID in the pixel shader is too rare to deserve such a sizable support code. The initial idea of the VS epilog was to move the clipping code there and remove it based on states, but optimized variants are now used to do that and are easier to support, so the VS epilog has turned out to be not so useful. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	3b2e93e472	radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3 VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	678d568c7b	radeonsi: don't load PrimID in TES if it's not used Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	808c33f6f0	radeonsi: explain (non-)monolithic shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	fc478248f3	radeonsi/gfx9: enable OpenGL 4.5 Tentatively enable it, expecting the scratch buffer support to be done before the next Mesa release. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	ed9a51cd3b	radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	ef40937854	radeonsi: add reference counting for shader selectors The 2nd shader of merged shaders should take a reference of the 1st shader. The next commit will do that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	6c15e15af4	radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	887ef1de34	radeonsi/gfx9: set TES registers for merged ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	49cd0cbfd5	radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS not implemented yet Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	2857b14bba	radeonsi/gfx9: always compile monolithic ES-GS (asynchronously) In addition to the non-monolithic variant. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	a82398a8f5	radeonsi/gfx9: add support for monolithic ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	6a9c20fdd5	radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	7df682c291	radeonsi/gfx9: select shader parts for non-monolithic ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	cd99c442c4	radeonsi/gfx9: add GS prolog support for merged ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	e0570bc283	radeonsi/gfx9: add VS prolog support for merged ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	6b93452b24	radeonsi/gfx9: pass GS input SGPRs and VGPRs from the ES part to GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	37e22ab65e	radeonsi/gfx9: store ES outputs to LDS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	d616c57342	radeonsi/gfx9: load GS inputs from LDS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	fc781fa0ab	radeonsi/gfx9: get GS wave ID from the correct input Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	bcaf905129	radeonsi/gfx9: add the function signature of merged ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	8b220877ad	radeonsi/gfx9: set registers and shader key for merged ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	ab197ad8d1	radeonsi/gfx9: add GS user SGPRs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	b2f5d03152	radeonsi: rename declare_tess_lds -> declare_lds_as_pointer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	e3caa1cd36	radeonsi: simplify some shader type conditions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	021e65640e	radeonsi: rename the swizzle parameter of lds_store Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	dcea7e5d19	radeonsi: add si_shader::prolog2 For a GS prolog in merged ES-GS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	eb35238ffe	radeonsi/gfx9: move RW_BUFFERS to s[0:1] for merged shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	0af00f179e	radeonsi/gfx9: add support for monolithic merged LS-HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	0d6d25475d	radeonsi/gfx9: set EXEC for non-mono merged shaders, add a barrier between them Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	a84a6feac9	radeonsi/gfx9: don't store the HS control word GFX9 doesn't have it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	1d90ecd3a5	radeonsi/gfx9: pass inputs from LS to TCS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	cbd1bc2e3e	radeonsi/gfx9: add TCS epilog support for merged LS-HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	f11ced475e	radeonsi/gfx9: add VS prolog support for merged LS-HS HS input VGPRs must be reserved. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	82a0e4f658	radeonsi/gfx9: merged shaders have scratch offset at the beginning also, screen wasn't initialized for compute shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	0c253557b2	radeonsi/gfx9: define LS-HS main shader function prototype Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	852ea69a2d	radeonsi: assign VS/TCS/TES/GS shader input parameter locations dynamically They will vary with merged stages. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	067dacd1b1	radeonsi/gfx9: define and set LS-HS user SGPRs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	0588146cb0	radeonsi/gfx9: set up shader registers for merged LS-HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	62abdb17bb	radeonsi/gfx9: add initial code generation for non-monolithic merged LS-HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	c73d9bd643	radeonsi: separate out code for selecting the VS prolog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	a98c9ba580	radeonsi/gfx9: add si_shader::previous_stage for merged shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	cfb0798bb3	radeonsi/gfx9: enlarge num_input_sgprs in shader keys due to higher hw limit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	4ab36e0ebc	radeonsi/gfx9: update the summary of shader stage configs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	9d6ed572d9	radeonsi: adjust the signature of si_get_vs_prolog_key Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	b1ed3ffc56	radeonsi: separate out VS prolog key generation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	e4542f00ce	radeonsi: separate out VS prolog key printing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	983d7e743e	radeonsi: code shuffling in si_emit_derived_tess_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	130e198c49	radeonsi: separate out TGSI initialization of si_shader_context so that we can put multiple different TGSI shaders into one module. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	c3f37e9b50	st/mesa: use min_index and max_index directly from vbo also remove the incorrect comment about primitive restart. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:46:44 +02:00
Marek Olšák	53cd67859d	vbo: set min_index = 0 so gallium can use the value directly We could also remove index_bounds_valid and use max_index != ~0 instead. Opinions on that are welcome. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:46:44 +02:00
Matt Turner	ee70937d15	Revert "glsl: reject image qualifiers with non-image types inside uniform blocks" This reverts commit `24011ead71`. This causes lots of ES 3.1 CTS tests to fail to compile a bit of code like: layout(binding = 0) buffer InOut { highp uint inputValues[384]; highp uint outputValues[384]; coherent highp uint groupValues[64]; <----- } sb_inout; error: memory qualifiers may only be applied to images	2017-04-28 12:31:20 -07:00
Brian Paul	27469aa72e	st/mesa: add more fallback gallium formats for GL integer formats The VMware driver has a limited set of integer texture formats. We often have to fall back to 4-component formats when 1- or 2-component formats are missing. This fixes about 8 integer texture Piglit tests with the VMware driver on Linux. We've had this code in-house for a long time but I guess it was never up-streamed to Mesa master. This shouldn't regress any other drivers since we're either choosing an earlier format in the list, or failing anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-28 13:12:31 -06:00
Brian Paul	6b60153f04	mesa: optimize color_buffer_writes_enabled() Return as soon as we find an existing color channel that's enabled for writing. Typically, this allows us to return true on the first loop iteration instead of doing four iterations. No piglit regressions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-28 13:12:31 -06:00
Brian Paul	054fb129e1	st/mesa: whitespace clean-ups in st_manager.c Trivial.	2017-04-28 13:12:31 -06:00
Matt Turner	b64da3d14e	Revert "glsl: set vector_elements to 1 for samplers" This reverts commit `75a31a20af`. This breaks thousands of tests on i965 with malloc corruption.	2017-04-28 11:48:57 -07:00
Chad Versace	85ca563b58	anv: Drop 'x11' prefix from non-X11 WSI funcs Drop it from x11_anv_wsi_image_create and x11_anv_wsi_image_free. The functions are used by Wayland WSI too. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2017-04-28 08:54:45 -07:00
Jason Ekstrand	ebd1bd6998	anv: Alphabetize KHR extensions Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-04-28 07:41:03 -07:00
Emil Velikov	c0139955fa	ac: automake: sort sources list alphabetically Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-28 14:13:01 +01:00
Emil Velikov	ecc39b6650	ac: include all sources in the tarball Fixes: `e2659176ce` ("radeonsi/ac: move vertex export remove to common code.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-28 14:13:00 +01:00
Nicolai Hähnle	9d346af322	st/mesa: remove redundant stfb->iface checks stfb->iface is always non-NULL for an st_framebuffer. These checks were incorrect, relying on out-of-bounds memory access in the surface-less case of EGL_KHR_surfaceless_context. v2: remove redundant stread check (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2017-04-28 11:34:00 +02:00
Nicolai Hähnle	19b61799e3	st/mesa: don't cast the incomplete framebuffer to st_framebuffer The incomplete framebuffer is set for a surfaceless context. This leads to the following error in piglit spec@egl_khr_surfaceless_context@viewport: ==26703==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f6886e43240 at pc 0x7f68854db0fd bp 0x7ffca404b3b0 sp 0x7ffca404b3a0 READ of size 8 at 0x7f6886e43240 thread T0 #0 0x7f68854db0fc in st_viewport ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57 #1 0x556840176cdb in main tests/egl/spec/egl_khr_surfaceless_context/viewport.c:101 #2 0x7f688edcf3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0) #3 0x556840176e19 in _start (/home/nha/amd/piglit/bin/egl-surfaceless-context-viewport+0xe19) 0x7f6886e43240 is located 32 bytes to the left of global variable 'DummyRenderbuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:69:31' (0x7f6886e43260) of size 112 0x7f6886e43240 is located 8 bytes to the right of global variable 'IncompleteFramebuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:73:30' (0x7f6886e42de0) of size 1112 SUMMARY: AddressSanitizer: global-buffer-overflow ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57 in st_viewport Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-28 11:34:00 +02:00
Nicolai Hähnle	28ec0fc7b8	st/glsl_to_tgsi: make undef_src and undef_dst const	2017-04-28 11:34:00 +02:00
Nicolai Hähnle	6cbb8f99d2	st/glsl_to_tgsi: cleanup using visit_generic_intrinsic It turns out that explicitly setting the writemask isn't actually needed; emit_asm does the right thing based on looking at the types.	2017-04-28 11:34:00 +02:00
Nicolai Hähnle	ce55afc4d6	glsl: remove the shader_group_vote and shader_ballot expression ops They are now no longer used.	2017-04-28 11:33:59 +02:00
Nicolai Hähnle	0aef96e00c	glsl: implement arb_shader_ballot builtins using intrinsics	2017-04-28 11:33:59 +02:00
Nicolai Hähnle	2c30ea3fcd	glsl: implement arb_shader_group_vote builtins via intrinsics Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-28 11:33:59 +02:00
Nicolai Hähnle	944455217b	st/glsl_to_tgsi: implement shader_group_vote and shader_ballot intrinsics Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-28 11:33:59 +02:00
Nicolai Hähnle	99941a9724	glsl: add intrinsics for ARB_shader_group_vote and ARB_shader_ballot These operations are currently implemented as IR expressions. However, they cannot be transformed and moved in the way that other IR expressions can because they have non-trivial interactions with control-flow. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-28 11:33:58 +02:00
Samuel Pitoiset	24011ead71	glsl: reject image qualifiers with non-image types inside uniform blocks Fixes the following ARB_shader_image_load_store tests: format-layout-with-non-image-type.frag memory-qualifier-with-non-image-type.frag Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-28 10:43:53 +02:00
Samuel Pitoiset	edb4a1ab2d	glsl: introduce validate_image_qualifier_for_type() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-28 10:43:13 +02:00
Samuel Pitoiset	80738425e4	glsl: fix error when using format qualifiers with non-image types Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-28 10:43:04 +02:00
Timothy Arceri	22fa3d90a9	util/disk_cache: remove percentage based max cache limit The more I think about it the more this seems like a bad idea. When we were deleting old cache dirs this wasn't so bad as it was unlikely we would ever hit the actual limit before things were cleaned up. Now that we only start cleaning up old cache items once the limit is reached, a percentage-based max cache limit is more risky. For the initial release of shader cache I think it's better to stick to a more conservative cache limit, at least until we have some way of cleaning up the cache more aggressively. Cc: "17.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-04-28 14:35:27 +10:00
Jason Ekstrand	032861693e	anv: Move queues, events, and semaphores to their own file Things are about to get more complicated, especially as far as semaphores are concerned. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	9bd1f03487	anv: Implement VK_KHX_external_memory_fd This commit just exposes the memory handle type. There's nothing interesting we need to do here for images. So long as the user doesn't set any crazy environment variables such as INTEL_DEBUG=nohiz, all of the compression formats etc. should "just work" at least for opaque handle types. v2 (chadv): - Rebase. - Fix vkGetPhysicalDeviceImageFormatProperties2KHR when handleType == 0. - Move handleType-independency comments out of handleType-switch, in vkGetPhysicalDeviceExternalBufferPropertiesKHX. Reduces diff in future dma_buf patches. Co-authored-with: Chad Versace <chadversary@chromium.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	818b857914	anv: Use the BO cache for DeviceMemory allocations Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	494d6f65a7	anv/allocator: Add a BO cache This cache allows us to easily ensure that we have a unique anv_bo for each gem handle. We'll need this in order to support multiple-import of memory objects and semaphores. v2 (Jason Ekstrand): - Reject BO imports if the size doesn't match the prime fd size as reported by lseek(). Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	5d25ac6a4b	anv: Implement VK_KHX_external_memory This is the trivial implementation that just exposes the extension string but exposes zero external handle types. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Chad Versace	354ca7a1d4	anv: Implement VK_KHX_external_memory_capabilities This is a complete but trivial implementation. It's trivial because we support no external memory capabilities yet. Most of the real work in this commit is in reworking the UUIDs advertised by the driver. v2 (chadv): - Fix chain traversal in vkGetPhysicalDeviceImageFormatProperties2KHR. Extract VkPhysicalDeviceExternalImageFormatInfoKHX from the chain of input structs, not the chain of output structs. - In vkGetPhysicalDeviceImageFormatProperties2KHR, iterate over the input chain and the output chain separately. Reduces diff in future dma_buf patches. Co-authored-with: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	d4d9258b61	anv/physical_device: Rename uuid to pipeline_cache_uuid We're about to have more UUIDs for different things so this one really needs to be properly labeled. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	02767cb4ff	anv: Refactor device_get_cache_uuid into physical_device_init_uuids Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	35e626bd0e	anv: Set EXEC_OBJECT_ASYNC when available Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-27 20:08:46 -07:00
Jason Ekstrand	bd3a9813b9	anv/cmd_buffer: Use the device allocator for QueueSubmit The command is really operating on a Queue not a command buffer and the nearest object to that with an allocator is VkDevice. Reviewed-by: Chad Versace <chadversary@chromium.org> Cc: "17.0 17.1" <mesa-dev@lists.freedesktop.org>	2017-04-27 20:08:46 -07:00
Timothy Arceri	2bc06767e1	mesa: remove wip framebuffer code This was added in `34b3b40af9` back in 2006. Seems it wasn't needed. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-28 10:19:59 +10:00
Samuel Pitoiset	75a31a20af	glsl: set vector_elements to 1 for samplers I don't see any reasons why vector_elements is 1 for images and 0 for samplers. This increases consistency and allows to clean up some code a bit. This will also help for ARB_bindless_texture. No piglit regressions with RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-27 22:52:21 +02:00
Jan Vesely	b295a52836	clover: Fix build since clang r301442 v2: rename default_ik -> ik_opencl Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-27 12:52:25 -04:00
Timothy Arceri	4e1f3afea9	disk_cache: use block size rather than file size The majority of cache files are less than 1kb; this resulted in us greatly miscalculating the amount of disk space used by the cache. Using the number of blocks allocated to the file is more conservative and less likely to cause issues. This change will result in cache sizes being miscalculated further until old items added with the previous calculation have all been removed. However I don't see any way around that; the previous patch should help limit that problem. Cc: "17.1" <mesa-stable@lists.freedesktop.org> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-04-27 20:44:00 +10:00
Timothy Arceri	ce41237151	disk_cache: reduce default cache size to 5% of filesystem Modern disks are extremely large and are only going to get bigger. Usage has shown frequent Mesa upgrades can result in the cache growing very fast i.e. wasting a lot of disk space unnecessarily. 5% seems like a more reasonable default. Cc: "17.1" <mesa-stable@lists.freedesktop.org> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2017-04-27 20:43:50 +10:00
Dave Airlie	f4743763ce	radeon/ac: remove assert causing regression This assert wasn't in the original radeonsi code but I added it without totally understanding the original code, it caused some regressions in variable-indexing tessellation shaders. Fixes: `e2659176` radeonsi/ac: move vertex export remove to common code. Reported-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-27 11:38:54 +01:00
Dave Airlie	550281f934	radeon/ac: fix build on llvm 3.8.1 Add missing include to fix build. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-27 11:22:12 +01:00
Boyan Ding	63df869f08	nvc0: Enable compute support for Pascal Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-27 11:11:15 +02:00
Boyan Ding	d03bfb078b	nvc0: Add new launch descriptor format for GP100 v2: Also handle the new format in indirect dispatch Use compute class check instead of chipset check Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-27 11:11:12 +02:00
Boyan Ding	2e35bd964e	nvc0: Fix index of unk fields in nve4_cp_launch_desc Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-27 11:11:10 +02:00
Boyan Ding	4a9f7bfe90	nouveau: Fix indentation of maxwell compute class definitions Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-27 11:11:07 +02:00
Jason Ekstrand	c43b4bc85e	anv: Don't place scratch buffers above the 32-bit boundary This fixes rendering corruptions in DOOM. Hopefully, it will also make Jenkins a bit more stable as we've been seeing some random failures and GPU hangs ever since turning on 48bit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620 Fixes: `651ec926fc` "anv: Add support for 48-bit addresses" Tested-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.1" <mesa-stable@lists.freedesktop.org>	2017-04-27 02:04:57 -07:00
Dave Airlie	f205e19e4f	radv/ac: eliminate unused vertex shader outputs. (v2) This is ported from radeonsi, and I can see at least one Talos shader drops an export due to this, and saves some VGPR usage. v2: use shared code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-27 05:18:52 +01:00
Dave Airlie	e2659176ce	radeonsi/ac: move vertex export remove to common code. This code can be shared by radv, we bump the max to VARYING_SLOT_MAX here, but that shouldn't have too much fallout. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-27 05:17:47 +01:00
Dave Airlie	9da1045933	radv: fix regression in descriptor set freeing. Since the host pool changes, Fixes: dEQP-VK.api.descriptor_pool.out_of_pool_memory Fixes: `126d5ad` "radv: Use host memory pool for non-freeable descriptors." Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-27 10:50:46 +10:00
Timothy Arceri	f8a2d00046	glsl: remove duplicate validation Varying types have already been validated in apply_type_qualifier_to_variable() by this point. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-27 08:21:28 +10:00
Timothy Arceri	52c76dbad3	glsl: use without_array() rather than get_scalar_type() Here get_scalar_type() was just being used to remove the array; after that we converted it back to base_type anyway, so just use the without_array() helper. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-04-27 08:21:21 +10:00
Brian Paul	28feb63580	svga: fix vertex buffer binding issue When we ran Viewperf11's Maya-03 test 3 we saw warnings about flushing the command buffer with mapped buffers. This happened when transitioning from hardware rendering to a 'draw' fallback path. The problem is the util_set_vertex_buffers_count() function doesn't do exactly what we want in svga_hwtnl_vertex_buffers(). In a case such as dst_count=2, dst={bufA, bufB}, count=1 and src={bufC}, when the function returns we'll have dst_count=2 and dst={bufC, bufB}. What we really want is dst_count=1 and dst={bufC, NULL}. As it was, we were telling the svga device that there were two vertex buffers when in fact we really only needed one for the subsequent drawing command. In this particular case, we first did hardware drawing with {bufA, bufB} then we transitioned to the 'draw' module, consuming vertex data from bufA and bufB and writing the new vertex data to bufC. bufA and bufB are mapped for reading when we flush the command buffer but should not be referenced by the command buffer. The above change fixes that. No Piglit regressions. Also tested with Viewperf, Google Earth, Heaven, etc. VMware bug 1842059 Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:38:00 -06:00
Brian Paul	a36a1ea80a	gallium/util: reduce util_snprintf() calls in debug_flush_might_flush_cb() We only need to construct the debug message if the mapped_sync flag is set. This should make the function faster since the flag is usually false. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:38:00 -06:00
Brian Paul	495840658e	gallium/util: add some comments in u_debug_flush.c Trivial. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	fbda9b905a	svga: Removed the unused label 'done' in svga_validate_surface_view() Trivial fix	2017-04-26 11:37:59 -06:00
Charmaine Lee	019d5d5346	svga: use the winsys interface to invalidate surface Instead of directly sending the InvalidateGBSurface command, this patch uses the invalidate_surface interface. Fixes Linux VM piglit failures including ext_texture_array-gen-mipmap, fbo-generatemipmap-array S3TC_DXT1 Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	5bd5ec6a0f	svga: fix format for screen target This patch revises the fix in commit 606f13afa31c9f041a68eb22cc32112ce813f944 to properly translate the surface format for screen target. Instead of changing the svga format for PIPE_FORMAT_B5G6R5_UNORM to SVGA3D_R5G6B5 for all texture surfaces, this patch only restricts SVGA3D_R5G6B5 for screen target surfaces. This avoids rendering failures when specifying a non-vgpu10 format in a vgpu10 context with software renderer. Fixes piglit failures spec@!opengl 1.1@draw-pixels, spec@!opengl 1.1@teximage-colors gl_r3_g3_b2 spec@!opengl 1.1@texwrap formats Tested Xorg with 16bits depth. Also tested with MTT piglit, MTT glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	3626112214	svga: cache the backing surface handle in the texture object CinebenchR15 not only binds the same texture for rendering and sampling, it actually changes the framebuffer buffer attachment very often, causing a lot of backed surface views to be created and a lot of surface copies to be done. This patch caches the backed surface handle in the texture resource and allows the backed surface view to reuse the backed surface handle. With this patch, the number of backed surface views drops from 1312 to 3. Unfortunately, this does not eliminate all the surface copies. There are still surface copies involved when we switch from original to backed surface handle for rendering. Tested with CinebenchR15, NobelClinicianViewer, Turbine, Lightsmark2008, MTT glretrace, MTT piglit. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	7f2f695d4d	svga: Update the backing resource only if needed This patch adds a timestamp in svga_surface structure to keep track of when the backing surface was last synced with the original resource. This helps to avoid unnecessary surface copies from the original resource to the backing surface if the original resource has not been modified since. This reduces the number of surface copies with CinebenchR15. Tested with CinebenchR15, mtt glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	c6576461f5	svga: Set the surface dirty bit for the right surface view For VGPU10, we will render to a backed surface view when the same resource is used for rendering and sampling. In this case, we will mark the dirty bit for the backed surface view. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	dc30ac5c24	svga: Move rendertarget view related fields to hw_clear state This patch moves the rendertarget view related fields from svga_hw_draw_state to svga_hw_clear_state where all the hw framebuffer related state resides. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Charmaine Lee	f482493dcf	svga: Move setting the rendered_to flags to framebuffer emit time Instead of setting the rendered_to flags at set time, this patch moves the setting of the flags to framebuffer emit time. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-26 11:37:59 -06:00
Brian Paul	1ee181b354	svga: add const qualifiers on svga_check_sampler_view_resource_collision() We don't change any of the argument objects. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:37:59 -06:00
Brian Paul	0f236ea785	svga: improve surface view debug messages The old ones were somewhat cryptic. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:37:59 -06:00
Brian Paul	943f4f47e0	svga: add DEBUG_SAMPLERS The debug output in svga_create_sampler_state() was controlled by DEBUG_VIEWS but that's not consistent with the other debug output for sampler views. Create/use a new debug flag just for this. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:37:59 -06:00
Brian Paul	577e114e46	svga: fail screen creation if HW version is too old Tested by verifying 3D acceleration works with HWv8 but not earlier. For HWv7 and older we get the GDI Generic renderer. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-26 11:37:59 -06:00
Deepak Rawat	8de0452ec4	winsys/svga: fix error path when kernel is not able to create surface If for some reason kernel is not able to create surface, when no buffer was provided the function vmw_svga_winsys_surface_create should return NULL. This patch fixes the issue where the code was not following the clean up path in case of error, which used to cause SIGSEGV. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2017-04-26 11:37:59 -06:00
Brian Paul	75be43ed33	draw: whitespace fixes in draw_pipe_vbuf.c Remove trailing whitespace, fix formatting, etc. Trivial.	2017-04-26 11:37:59 -06:00
Brian Paul	4bb19a1514	st/mesa: minor clean-ups in st_update_renderbuffer_surface() Remove unneeded parens. Add const qualifiers. Move var decls closer to where they're used. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2017-04-26 11:37:59 -06:00
Samuel Pitoiset	00b5044740	nv50,nvc0: disable the TGSI merge registers pass shader-db results on GK106 (Thanks Karol): total instructions in shared programs : 3931608 -> 3929463 (-0.05%) total gprs used in shared programs : 481255 -> 479014 (-0.47%) total local used in shared programs : 27481 -> 27381 (-0.36%) total bytes used in shared programs : 36031256 -> 36011120 (-0.06%) local gpr inst bytes helped 14 1471 1309 1309 hurt 1 88 384 384 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-26 19:15:54 +02:00
Samuel Pitoiset	0bceefc295	radeonsi: disable the TGSI merge registers pass 47109 shaders in 29632 tests Totals: SGPRS: 1917364 -> 1916620 (-0.04 %) VGPRS: 1165802 -> 1165202 (-0.05 %) Spilled SGPRs: 1880 -> 1843 (-1.97 %) Spilled VGPRs: 70 -> 65 (-7.14 %) Private memory VGPRs: 1184 -> 1184 (0.00 %) Scratch size: 1312 -> 1308 (-0.30 %) dwords per thread Code Size: 60211356 -> 60192268 (-0.03 %) bytes LDS: 1077 -> 1077 (0.00 %) blocks Max Waves: 428597 -> 428674 (0.02 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 238173 -> 237429 (-0.31 %) VGPRS: 149556 -> 148956 (-0.40 %) Spilled SGPRs: 1263 -> 1226 (-2.93 %) Spilled VGPRs: 25 -> 20 (-20.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 20 -> 16 (-20.00 %) dwords per thread Code Size: 10457904 -> 10438816 (-0.18 %) bytes LDS: 50 -> 50 (0.00 %) blocks Max Waves: 41283 -> 41360 (0.19 %) Wait states: 0 -> 0 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-26 19:15:40 +02:00
Samuel Pitoiset	066a572955	st/glsl_to_tgsi: disable the merge registers pass conditionally The main goal of this pass is to merge temporary registers in order to reduce the total number of registers and also to produce optimal TGSI code. In fact, compilers seem to be confused when temporary variables are already merged, maybe because it's done too early in the process. Skipping the pass reduces both the register pressure and the code size, at least for Nouveau and RadeonSI because they have a real backend compiler. Found by luck while fixing an issue in the TGSI dead code elimination pass which affects tex instructions with bindless samplers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-26 19:15:37 +02:00
Samuel Pitoiset	3a927e0aa3	gallium: add PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-26 19:15:34 +02:00
Samuel Pitoiset	ec301497b8	radeonsi: use unsynchronized transfers for shader binary uploads Because the buffer is new, it can't be referenced by any CS. This can save a few CPU cycles by skipping the whole PIPE_TRANSFER_UNSYNCHRONIZED if in amdgpu_bo_map(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-26 19:15:22 +02:00
Marek Olšák	96b0cfc82e	radeonsi: turn si_shader_key::mono into a non-union A merged LS-HS shader needs both fix_fetch and inputs_to_copy for compilation. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	3f2a0649ab	radeonsi: adjust ESGS ring buffer size computation on VI Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	80814819c2	radeonsi/gfx9: don't set deprecated field PARTIAL_ES_WAVE_ON Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	60a20e6879	radeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	8e8570a9e8	radeonsi/gfx9: add a workaround for viewing a slice of 3D as a 2D image Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	482e6b07cc	radeonsi/gfx9: fix 1D array shader images Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	5c94779585	radeonsi/gfx9: fix most things wrong with shader images There are 2 major hw changes: - The address must always point to the address of level 0. GFX9 tiling modes don't allow binding to a non-0 level. - 3D must always be bound as 3D, because 2D and 3D use entirely different tiling modes, and the texture target determines which set of modes is used. Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	65e0c3fba7	radeonsi/gfx9: fix texture buffer objects and image buffers with IDXEN==0 Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Eric Engestrom	9d1dbf2aa1	configure: print LDFLAGS alongside CFLAGS & co. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 10:27:17 +01:00
Timothy Arceri	2895d96a05	mesa: tidy up left over APPLE_vertex_array_object semantics Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 10:03:06 +10:00
Timothy Arceri	f38845b9cb	mesa: inline bind_vertex_array() helper The previous commit removed the only other user of this function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 10:03:06 +10:00
Timothy Arceri	7927d0378f	mesa: drop APPLE_vertex_array_object support Shared context support for VAOs was dropped in `0b2750620b`. From the ARB_vertex_array_object spec: "This extension differs from GL_APPLE_vertex_array_object in that client memory cannot be accessed through a non-zero vertex array object. It also differs in that vertex array objects are explicitly not sharable between contexts." Nobody should be using this extension over ARB_vertex_array_object anymore so just drop it rather than adding locking back just for VAOs created from these functions. For reference the Nvidia blob doesn't expose this extension. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 10:03:06 +10:00
Bas Nieuwenhuizen	7b9963a28f	radv: Enable userspace fence checking. v2: - Added some error handling. - memset the buffer to 0. v3: Added assert for buffer size. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-26 01:32:41 +02:00
Matt Turner	ee5f96581a	i965: Remove unused variable 'options' Should have been removed in commit `ad55b1a770`	2017-04-25 15:28:33 -07:00
Matt Turner	71d11f3998	glsl: Initialize current_var CID: 1324644 (Uninitialized pointer field)	2017-04-25 15:28:33 -07:00
Dave Airlie	7f77554b5b	radv/ac: setup mrt exports then export them in one go. (v2) Noticed while looking at Sascha Willems deferred shaders. This is a bit of an llvm workaround, llvm was producing this: v_cvt_pkrtz_f16_f32_e64 v4, v7, v8 ; D2960004 00021107 v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0 ; D2960006 0001E509 s_waitcnt vmcnt(0) ; BF8C0F70 exp mrt0 v4, v4, v6, v6 compr ; C400040F 00000604 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e64 v4, v12, v5 ; D2960004 00020B0C v_cvt_pkrtz_f16_f32_e64 v5, v14, 1.0 ; D2960005 0001E50E exp mrt1 v4, v4, v5, v5 compr ; C400041F 00000504 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e64 v0, v0, v1 ; D2960000 00020300 v_cvt_pkrtz_f16_f32_e64 v1, v2, v3 ; D2960001 00020702 exp mrt2 v0, v0, v1, v1 done compr vm ; C4001C2F 00000100 After this change: v_cvt_pkrtz_f16_f32_e64 v4, v7, v8 ; D2960004 00021107 s_waitcnt vmcnt(0) ; BF8C0F70 v_cvt_pkrtz_f16_f32_e64 v0, v0, v1 ; D2960000 00020300 v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0 ; D2960006 0001E509 v_cvt_pkrtz_f16_f32_e64 v5, v12, v5 ; D2960005 00020B0C v_cvt_pkrtz_f16_f32_e64 v7, v14, 1.0 ; D2960007 0001E50E exp mrt0 v4, v4, v6, v6 compr ; C400040F 00000604 v_cvt_pkrtz_f16_f32_e64 v1, v2, v3 ; D2960001 00020702 exp mrt1 v5, v5, v7, v7 compr ; C400041F 00000705 exp mrt2 v0, v0, v1, v1 done compr vm ; C4001C2F 00000100 No waitcnt for exports are emitted. v2: fixup index->mrt mapping (Bas). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-25 23:26:11 +01:00
Dave Airlie	b2cedb3ea9	radv/ac: overhaul vs output/ps input routing In order to cleanly eliminate exports rewrite the code first to mirror how radeonsi works for now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-25 23:24:39 +01:00
Dave Airlie	b858cb4df8	radv/ac: move point coord after layer/viewport. These need to be ordered as per shader enum ordering, I'll rewrite this soon, but this is a bug fix. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-25 23:24:21 +01:00
Samuel Pitoiset	1c66522ecc	gallium: remove u_caps.c/h interface No longer used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-25 23:26:44 +02:00
Marek Olšák	04d7978b8c	ddebug: implement get_query_result_resource Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-25 22:39:31 +02:00
Marek Olšák	231dfa5a02	trace: don't trace resource_destroy due to the lack of pipe_resource wrapping, we can get this call from inside of driver calls, which would try to lock an already-locked mutex. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-25 22:39:31 +02:00
Marek Olšák	2c1ec23a06	gallium/util: add debugging helpers printing pipeline statistics typically useful for hw bring-up Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-25 22:39:31 +02:00
Rob Herring	26a36c1af7	Android: fix r300g only build If r300g is the only radeon driver built, the Android build fails to build: ninja: error: 'out/target/product/linaro_x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_radeon_intermediates/export_includes', needed by 'out/target/product/linaro_x86_64/obj/SHARED_LIBRARIES/gallium_dri_intermediates/import_includes', missing and no known rule to make it This is because the path to build libmesa_pipe_radeon was only getting added for r600g and radeonsi, but the library dependency was added for all radeon drivers. As libmesa_pipe_radeon is not needed for r300g, drop the library dependency. Cc: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-25 17:08:06 +01:00
Timothy Arceri	347fe24f82	mesa: use locked version of HashWalk for xfb objects From Chapter 5 'Shared Objects and Multiple Contexts' of the OpenGL 4.5 spec: "Objects which contain references to other objects include framebuffer, program pipeline, query, transform feedback, and vertex array objects. Such objects are called container objects and are not shared" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-25 09:58:47 +10:00
Timothy Arceri	a82d6a307d	mesa: create locked version of HashWalk Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-25 09:58:39 +10:00
Rafael Antognolli	6a40ccec4b	genxml: Fix gen_pack_header.py crash when field type is invalid. Just return earlier in that case. Also set prefix to an empty string, so we don't get to use it undefined. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 15:14:12 -07:00
Rafael Antognolli	9670124e31	genxml: Make BLEND_STATE command support variable length array. We need to emit BLEND_STATE, whose size is 1 + 2 * nr_draw_buffers dwords (on gen8+), but the BLEND_STATE struct length is always 17. By marking it size 1, which is actually the size of the struct minus the BLEND_STATE_ENTRY's, we can emit a BLEND_STATE with a variable number of entries. For gen6 and gen7 we set length to 0, since it only contains BLEND_STATE_ENTRY's, and no other data. With this change, we also change the code for blorp and anv to emit only the needed BLEND_STATE_ENTRY's, instead of always emitting 16 dwords on gen6-7 and 17 dwords on gen8+. v2: - Use designated initializers on blorp and remove 0 from initialization (Jason) - Default entries to disabled on Vulkan (Jason) - Rebase code. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 15:14:10 -07:00
Rafael Antognolli	4ace73b1f6	genxml: Fix python crash when no dwords are found. If the 'dwords' dict is empty, max(dwords.keys()) throws an exception. This case could happen when we have an instruction that is only an array of other structs, with variable length. v2: - Add another clause for empty dwords and make it work with python 3 (Dylan) - Set the length to 0 if dwords is empty, and do not declare dw Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 15:14:08 -07:00
Rafael Antognolli	19720405d5	genxml: Remove unused parameter. 'start' parameter from Group.emit_pack_function() is useless. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 15:14:05 -07:00
Rafael Antognolli	1ea41163eb	intel/aubinator: Correctly read variable length structs. Before this commit, when a group with count="0" is found, only one field is added to the struct representing the instruction. This causes only one entry to be printed by aubinator, for variable length groups. With this commit we "detect" that there's a variable length group (count="0") and store the offset of the last entry added to the struct when reading the xml. When finally reading the aubdump file, we check the size of the group and whether we have variable number of elements, and in that case, reuse the last field to add the remaining elements. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 15:13:51 -07:00
Nanley Chery	50134cede1	isl/format: Update the R16G16B16X16_FLOAT entry The section of the PRM mentioned in the code comment above this table says that this format supports the render target write message. Internal documentation says that this format also supports alpha blending. As a side effect, this allows CCS_D buffers to be created for images with this format. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-24 13:30:50 -07:00
Nanley Chery	b1066f7365	anv/pass: Delete anv_pass::subpass_attachments This field has no users. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-24 13:30:50 -07:00
Francisco Jerez	58324389be	intel/fs: Take into account amount of data read in spilling cost heuristic. Until now the spilling cost calculation was neglecting the amount of data read from the register during the spilling cost calculation. This caused it to make suboptimal decisions in some cases leading to higher memory bandwidth usage than necessary. Improves Unigine Heaven performance by ~4% on BDW, reversing an unintended FPS regression from my previous commit `147e71242c` with n=12 and statistical significance 5%. In addition SynMark2 OglCSDof performance is improved by an additional ~5% on SKL, and a Kerbal Space Program apitrace around the Moho planet I can provide on request improves by ~20%. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-24 11:01:40 -07:00
Francisco Jerez	ecc19e12dc	intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy. This is what we use later on to compute the number of registers that will actually get spilled to memory, so it's more likely to match reality than the current open-coded approximation. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-24 10:59:56 -07:00
Kenneth Graunke	6b10c37b9c	i965/vec4: Use reads_accumulator_implicitly(), not MACH checks. Curro pointed out that I should not just check for MACH, but use the reads_accumulator_implicitly() helper, which would also prevent the same bug with MAC and SADA2 (if we ever decide to use them). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-24 10:53:49 -07:00
Mauro Rossi	11db3d10bb	android: radv/ac: Fix nir.h include Fixes following building errors due to missing include paths: external/mesa/src/amd/common/ac_shader_info.c:23:10: fatal error: 'nir/nir.h' file not found ^ external/mesa/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found ^ Fixes: `224cf29` "radv/ac: add initial pre-pass for shader info gathering" Acked-by: Dave Airlie <Airlied@redhat.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-24 18:01:03 +01:00
Vinson Lee	b81d85f175	configure.ac: Fix typos. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: <mesa-stable@lists.freedesktop.org>	2017-04-23 22:23:22 -07:00
Dave Airlie	fed740eafe	radv/ac: copy llvm machine feature flags from radeonsi. This just updates this to use the same flags as radeonsi for consistency. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-24 05:55:44 +01:00
Timothy Arceri	794ae44095	i965: remove now unused GLSL IR optimisations These are no longer used since the previous commit. Acked-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	ad55b1a770	i965: remove GLSL IR optimisation loop IVB is running into some spilling issues in piglit with the loop removed. However those tests are not really reflective of a real world use case, also fp64 is brand new to IVB so we leave the spilling issues to be resolved at a later time. Run time for shader-db on my machine goes from ~795 seconds to ~665 seconds. shader-db results BDW: total instructions in shared programs: 12969459 -> 12968891 (-0.00%) instructions in affected programs: 1463154 -> 1462586 (-0.04%) helped: 3622 HURT: 3326 total cycles in shared programs: 246453572 -> 246504318 (0.02%) cycles in affected programs: 208842622 -> 208893368 (0.02%) helped: 24029 HURT: 35407 total loops in shared programs: 2931 -> 2931 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 14560 -> 14498 (-0.43%) spills in affected programs: 2270 -> 2208 (-2.73%) helped: 17 HURT: 2 total fills in shared programs: 19671 -> 19632 (-0.20%) fills in affected programs: 2060 -> 2021 (-1.89%) helped: 17 HURT: 2 LOST: 17 GAINED: 40 Most of the hurt shaders are 1-2 instructions, with what looks like a max of 7. I've looked at the worst cycles regressions and as far as I can tell it's just a scheduling difference. Acked-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	21173194db	glsl: use ARB_enhanced_layouts for packing where possible If packing doesn't cross locations we can easily make use of ARB_enhanced_layouts to do packing rather than using the GLSL IR lowering pass lower_packed_varyings(). Shader-db Broadwell results: total instructions in shared programs: 12977822 -> 12977819 (-0.00%) instructions in affected programs: 1871 -> 1868 (-0.16%) helped: 4 HURT: 3 total cycles in shared programs: 246567288 -> 246567668 (0.00%) cycles in affected programs: 1370386 -> 1370766 (0.03%) helped: 592 HURT: 733 Acked-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	eb8aa93c03	glsl: disable varying packing for varying used by interpolateAt* Currently the NIR backends depend on GLSL IR copy propagation to fix up the interpolateAt* function params after varying packing changes the shader input to a global. It's possible copy propagation might not always do what we need it to, and we also shouldn't depend on optimisations to do this type of thing for us. I'm not sure if the same is true for TGSI, but the following commit should re-enable packing for most cases in a safer way, so we just disable it everywhere. No change in shader-db for i965 (BDW) Acked-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	aa021d50c0	glsl_to_nir: skip ir_var_shader_shared variables These should be lowered away in GLSL IR but if we don't get dead code to clean them up it causes issues in glsl_to_nir. We want to drop as many GLSL IR opts in future as we can so this makes glsl_to_nir just ignore the vars if it sees them. In future we will want to just use the nir lowering pass that Vulkan currently uses. Acked-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	7a7ee40c2d	nir/i965: add before ffma algebraic opts This shuffles constants down in the reverse of what the previous patch does and applies some simplifications that may be made possible from doing so. Shader-db results BDW: total instructions in shared programs: 12980814 -> 12977822 (-0.02%) instructions in affected programs: 281889 -> 278897 (-1.06%) helped: 1231 HURT: 128 total cycles in shared programs: 246562852 -> 246567288 (0.00%) cycles in affected programs: 11271524 -> 11275960 (0.04%) helped: 1630 HURT: 1378 V2: mark float opts as inexact Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	fb2269fed1	nir: shuffle constants to the top V2: mark float opts as inexact If one of the inputs to an mul/add is the result of another mul/add there is a chance that we can reuse the result of that mul/add in other calls if we do the multiplication in the right order. Also by attempting to move all constants to the top we increase the chance of constant folding. For example it is a fairly common pattern for shaders to do something similar to this: const float a = 0.5; in vec4 b; in float c; ... b.x = b.x * c; b.y = b.y * c; ... b.x = b.x * a + a; b.y = b.y * a + a; So by simply detecting that constant a is part of the multiplication in ffma and switching it with previous fmul that updates b we end up with: ... c = a * c; ... b.x = b.x * c + a; b.y = b.y * c + a; Shader-db results BDW: total instructions in shared programs: 13011050 -> 12967888 (-0.33%) instructions in affected programs: 4118366 -> 4075204 (-1.05%) helped: 17739 HURT: 1343 total cycles in shared programs: 246717952 -> 246410716 (-0.12%) cycles in affected programs: 166870802 -> 166563566 (-0.18%) helped: 18493 HURT: 7965 total spills in shared programs: 14937 -> 14560 (-2.52%) spills in affected programs: 9331 -> 8954 (-4.04%) helped: 284 HURT: 33 total fills in shared programs: 20211 -> 19671 (-2.67%) fills in affected programs: 12586 -> 12046 (-4.29%) helped: 286 HURT: 33 LOST: 39 GAINED: 33 Some of the hurt will go away when we shuffle things back down to the bottom in the following patch. It's also noteworthy that almost all of the spill changes are in Deus Ex both hurt and helped. Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Timothy Arceri	83f7fdf83a	nir: add flt comparison simplification Didn't turn out as useful as I'd hoped, but it will help a lot more on i965 by reducing regressions when we drop brw_do_channel_expressions() and brw_do_vector_splitting(). I'm not sure how much sense 'is_not_used_by_conditional' makes on platforms other than i965 but since this is a new opt it at least won't do any harm. shader-db BDW: total instructions in shared programs: 13029581 -> 13029415 (-0.00%) instructions in affected programs: 15268 -> 15102 (-1.09%) helped: 86 HURT: 0 total cycles in shared programs: 247038346 -> 247036198 (-0.00%) cycles in affected programs: 692634 -> 690486 (-0.31%) helped: 183 HURT: 27 Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-24 12:08:14 +10:00
Bas Nieuwenhuizen	18947fde7a	radv: Enable lowering fdiv in nir. Results in faster code than the lowering by LLVM. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-23 20:38:06 +02:00
Rob Clark	0012a98c0e	freedreno/a5xx: hack for r8g8b8a8_snorm Blob won't render to this format, and when sampling from it, it uses the same fmt value for r8g8b8_snorm and r8g8b8a8_snorm. But this is what blocks us from jumping from gl30/gles20 to gl31/gles30. So a hack it is! Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-23 13:03:25 -04:00
Rob Clark	c21fc881ed	freedreno/a5xx: rgtc formats Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-23 13:03:25 -04:00
Marek Olšák	070072ad43	mesa: replace _mesa_index_buffer::type with index_size This avoids repeated translations of the enum. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-22 22:51:15 +02:00
Bas Nieuwenhuizen	e137b9eed9	radv: Use the correct pipeline for dispatches. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `ec15e0d30` "radv: optimise compute shader grid size emission." Tested-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-22 20:26:59 +01:00
Wladimir J. van der Laan	9da0cd56c3	etnaviv: Supertiled texture support on gc3000 Support supertiled textures on hardware that has the appropriate feature flag SUPERTILED_TEXTURE. Most of the scaffolding was already in place in etna_layout_multiple: case ETNA_LAYOUT_SUPER_TILED: paddingX = 64; paddingY = 64; *halign = TEXTURE_HALIGN_SUPER_TILED; So this is just a matter of allowing it. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-22 17:49:29 +02:00
Fabio Estevam	53e39f6df4	etnaviv: etnaviv_fence: Simplify the return code logic The return code can be simplified by using the logical not operator. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-22 17:48:35 +02:00
Rob Clark	e769349fc6	freedreno/a5xx: occlusion query Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-22 10:03:02 -04:00
Rob Clark	52d2fa37f5	freedreno: drop ring arg from _set_stage() It is always the draw ring. Except for a5xx queries like time-elapsed, where we will eventually want to emit cmds into both binning and draw rings. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-22 10:03:02 -04:00
Rob Clark	5923780b2a	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-22 10:03:02 -04:00
Rob Clark	d310ea0f32	freedreno: add support for hw accumulating queries Some queries on a4xx and all queries on a5xx can do result accumulation on CP so we don't need to track per-tile samples. We do still need to handle pausing/resuming while switching batches (in case the query is active over multiple draws which are executed out of order). So introduce new accumulated-query helpers for these sorts of queries, since it doesn't really fit in cleanly with the original query infrastructure. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-22 10:03:02 -04:00
Rob Clark	935623af14	freedreno: a bit of query refactor Move a bit more of the logic shared by all query types (active tracking, etc) into common code. This avoids introducing a 3rd copy of that logic for a5xx. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-22 10:03:02 -04:00
Rob Clark	df63ff4d82	freedreno: make hw-query a helper For a5xx (and actually some queries on a4xx) we can accumulate results in the cmdstream, so we don't need this elaborate mechanism of tracking per-tile query results. So make it into vfuncs so generation specific backend can use it when it makes sense. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-22 10:03:01 -04:00
Kenneth Graunke	2faf227ec2	i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce(). opt_register_coalesce() was optimizing sequences such as: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D mov(8) m4.zw:F, vgrf5.xxxy:F into: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D This doesn't work - if we're going to reswizzle MACH, we'd need to reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw components with attr18.yy * attr19.yy. But the MACH instruction expects .z to contain attr18.x * attr19.x. Bogus results ensue. No change in shader-db on Haswell. Prevents regressions in Timothy's patches to use enhanced layouts for varying packing (which rearrange code just enough to trigger this pre-existing bug, but were fine themselves). Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-22 00:01:16 -07:00
Timothy Arceri	d682f8aa8e	mesa: validate sampler type across the whole program Currently we were only making sure types were the same within a single stage. This looks to have regressed with `953a0af8e3`. Fixes: `953a0af8e3` ("mesa: validate sampler uniforms during gluniform calls") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> https://bugs.freedesktop.org/show_bug.cgi?id=97524	2017-04-22 10:01:15 +10:00
Timothy Arceri	918cec8cbe	mesa: don't lock hashtables that are not shared across contexts From Chapter 5 'Shared Objects and Multiple Contexts' of the OpenGL 4.5 spec: "Objects which contain references to other objects include framebuffer, program pipeline, query, transform feedback, and vertex array objects. Such objects are called container objects and are not shared" For now we leave locking in place for framebuffer objects because the EXT fbo extension allowed sharing. We could maybe just replace the hash with an ordinary hash table but for now this should remove most of the unnecessary locking. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-22 10:01:15 +10:00
Matt Turner	ef6af0d5f7	mesa: Remove deleteFlag pattern from container objects. This pattern was only useful when we used mutex locks, which the previous commit removed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-22 10:01:15 +10:00
Matt Turner	0b2750620b	mesa: Remove unnecessary locking from container objects. From Chapter 5 'Shared Objects and Multiple Contexts' of the OpenGL 4.5 spec: "Objects which contain references to other objects include framebuffer, program pipeline, query, transform feedback, and vertex array objects. Such objects are called container objects and are not shared" For now we leave locking in place for framebuffer objects because the EXT fbo extension allowed sharing. V2: (Timothy Arceri) - rebased and dropped changes to framebuffer objects Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-22 10:01:15 +10:00
Timothy Arceri	622a68ed3e	mesa: remove fallback RefCount == 0 pattern We should never get here if this is 0 unless there is a bug. Replace the check with an assert. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-22 10:01:15 +10:00
Elie TOURNIER	0cc8c81902	egl: add gitignore Since commit `ce562f9e3f`, two new files are generated. We don't want to track them. Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-22 00:42:38 +01:00
Samuel Pitoiset	a7bc51aef8	glsl: make use of glsl_type::is_float() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:34:15 +02:00
Samuel Pitoiset	cacc823c39	glsl: make use of glsl_type::is_double() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:34:12 +02:00
Samuel Pitoiset	100721959b	glsl: make use of glsl_type::is_integer_64() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:57 +02:00
Samuel Pitoiset	362d9de29c	glsl: simplify glsl_type::is_integer_32_64() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:42 +02:00
Samuel Pitoiset	87be9faa78	glsl: add glsl_type::is_integer_64() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:40 +02:00
Samuel Pitoiset	60caca3019	glsl: make use of glsl_type::is_boolean() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:38 +02:00
Samuel Pitoiset	64db02b5fa	glsl: make use of glsl_type::is_record() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:36 +02:00
Samuel Pitoiset	cd78ab55d0	glsl: make use of glsl_type::is_interface() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:34 +02:00
Samuel Pitoiset	0c8898dc34	glsl: make use of glsl_type::is_array() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:32 +02:00
Samuel Pitoiset	053912382e	glsl: make use of glsl_type::is_atomic_uint() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:29 +02:00
Samuel Pitoiset	993a05f0eb	glsl: add glsl_type::is_atomic_uint() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-21 19:33:27 +02:00
Emil Velikov	52df318d61	mesa/glthread: correctly compare thread handles As mentioned in the manual - comparing pthread_t handles via the C comparison operator is incorrect and pthread_equal() should be used instead. Cc: Timothy Arceri <tarceri@itsqueeze.com> Fixes: `d8d81fbc31` ("mesa: Add infrastructure for a worker thread to process GL commands.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-04-21 13:39:57 +01:00
Emil Velikov	dd6ec78b4f	st/clover: add space between < and :: As pointed out by compiler ./llvm/codegen.hpp:52:22: error: ‘<::’ cannot begin a template-argument list [-fpermissive] ./llvm/codegen.hpp:52:22: note: ‘<:’ is an alternate spelling for ‘[’. Insert whitespace between ‘<’ and ‘::’ Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2017-04-21 13:39:57 +01:00
Samuel Pitoiset	862361c4f5	glsl: get rid of values_for_type() This function is actually a wrapper for component_slots() and it always returns 1 (or N) for samplers. Since component_slots() now returns 1 for samplers, it can go. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-21 10:08:32 +02:00
Samuel Pitoiset	4a0aa0b3b3	glsl: make component_slots() return 1 for sampler types It looks inconsistent to return 1 for image types and 0 for sampler types. Especially because component_slots() is mostly used by values_for_type() which always returns 1 for samplers. For bindless, this value will be bumped to 2 because the ARB_bindless_texture spec states that bindless samplers/images should consume two components. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-21 10:08:04 +02:00
Kai Wasserbäch	29582dd20c	docs/features: mark KHR_no_error as started The OpenGL extension KHR_no_error is exposed since commit `d42d150ad2` by Timothy Arceri. Therefore it should be marked as "started" in the features.txt Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-21 09:39:38 +02:00
Tapani Pälli	ae6cbdede0	Revert "android: fix segfault within swap_buffers" This reverts commit `4d4558411d`. This was a wrong call, while it fixed issue with 3DMark it actually introduced regression elsewhere. Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2017-04-21 10:03:58 +03:00
Ilia Mirkin	da0a80804c	nvc0: Add support for setting viewport index/layer from VS/TES This enables support on GM200+ for: - GL_AMD_vertex_shader_layer - GL_AMD_vertex_shader_layer_viewport_index - GL_ARB_shader_viewport_layer_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [lyude: add relnotes/TES cap] Signed-off-by: Lyude <lyude@redhat.com> [imirkin: move relnotes to right place, add features.txt] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-20 23:24:06 -04:00
Lyude	214f96c1e7	nvc0/ir: Only store viewport in scratch register for GP EMIT only applies to geometry shaders. For everything else, we want to export the viewport normally. Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-20 23:24:06 -04:00
Bas Nieuwenhuizen	0e91d8f38c	radv: Prefetch compute shader too. For consistency, doesn't really impact performance. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-21 00:59:02 +02:00
Jason Ekstrand	1e21d4227e	anv/query: Use genxml for MI_MATH Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-04-20 15:24:06 -07:00
Jason Ekstrand	e23129ac0c	genxml: Add better support for MI_MATH This breaks the guts of MI_MATH (the instruction part) out into its own structure with proper named values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-04-20 15:24:06 -07:00
Jason Ekstrand	b7a2af8e38	genxml/pack: Allow hex values in the XML Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-04-20 15:24:06 -07:00
Dave Airlie	35ea0c07a1	radv/ac: use tex_lz if we can. Looking at some Talos shaders vs radeonsi, I noticed they use tex_lz in a few places, so we should be able to as well. Reviewed-by: Bas Nieuwenhuizen <basni@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-20 22:00:13 +01:00
Marek Olšák	d1608d6982	st/mesa: use one big translation table in st_pipe_vertex_format for lower overhead. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Marek Olšák	86f99c1e4c	st/mesa: check in advance in st_draw_vbo whether the bitmap cache is empty Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Marek Olšák	1fb5bc83f1	st/mesa: put the bitmap_cache structure inside st_context This is nicer on caches, and the next commit will need to access the structure from a different place. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Marek Olšák	69423dcf23	st/mesa: inline and optimize st_invalidate_readpix_cache Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Marek Olšák	7cd6e2df65	st/mesa: invalidate the readpix cache in st_indirect_draw_vbo Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Marek Olšák	4219e09343	gallium/util: remove util_draw_range_elements helper min/max_index are typically hints for the u_vbuf module, not the driver. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Marek Olšák	707d2e8b3e	gallium: fold u_trim_pipe_prim call from st/mesa to drivers Most drivers don't need it and shouldn't need it because it can't be used in some cases (indirect draws, primitive restart, count from streamout). Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-20 20:11:35 +02:00
Samuel Iglesias Gonsálvez	2beff74314	docs/envvars: sort INTEL_DEBUG envvar options by name It helps to find the envvar option you are looking for. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-04-20 16:27:31 +02:00
Christoph Haag	a9d27c8a33	ac: fix build after LLVM 5.0 SVN r300718 v2: previously getWithDereferenceableBytes() exists, but addAttr() doesn't take that type Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-and-reviewed-by: Mike Lothian <mike@fireburn.co.uk>	2017-04-20 10:58:19 +02:00
Juan A. Suarez Romero	3af7f8275b	bin/get-{extra,fixes}-pick-list.sh: improve output Show the commit hash and the title in a way that it is easier to copy and paste in the bin/.cherry-ignore-extra file if we want to ignore those commits for the future. v2: - Use printf instead of echo (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-20 10:28:54 +02:00
Juan A. Suarez Romero	99b41631bb	bin/get-{extra,fixes}-pick-list.sh: add support for ignore list Neither script uses a file with the commits to ignore. So if we have handled one of the suggested commits and decided we won't pick it, the scripts will continue suggesting it. v2: - Mark the candidates in bin/get-extra-pick-list.sh (Juan A. Suarez) - Use bin/.cherry-ignore to store rejected patches (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-20 10:28:21 +02:00
Brian Paul	8a7e3693c8	mesa: print target string in glBindTexture() error message Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-19 19:57:32 -06:00
Brian Paul	9bfecb03c5	mesa: fix Windows build error related to getuid() getuid() and geteuid() are not present on Windows. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-19 19:55:29 -06:00
Tim Rowley	dd4488ea6c	swr: simd16 vs work Build VS with alternating output for the current simd16 fe double-pump of a simd8 shader. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-19 19:01:48 -05:00
Bas Nieuwenhuizen	6bb1ed6bcc	radv: Set variant code_size when created from the cache. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-20 01:01:49 +02:00
Bas Nieuwenhuizen	1e1165389c	radv: Add shader prefetch. Gives me approximately a 2% perf increase in both dota2 & talos. Having descriptors (both sets and vertex buffers) prefetched didn't help so I didn't include that. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-19 23:47:27 +02:00
Bas Nieuwenhuizen	74d92e547c	radv: Remove binding buffer count. In cases where it is used it is always 1. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Bas Nieuwenhuizen <basni@google.com>	2017-04-19 20:37:57 +02:00
Bas Nieuwenhuizen	f7b14ff4be	radv: Don't try to find gaps for non-freeable descriptors. With this we don't have any operations on a pool with non-freeable descriptors left that have O(#descriptors) complexity. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Bas Nieuwenhuizen <basni@google.com>	2017-04-19 20:37:57 +02:00
Bas Nieuwenhuizen	126d5adb11	radv: Use host memory pool for non-freeable descriptors. v2: Handle out of pool memory error. v3: Actually use VK_ERROR_OUT_OF_POOL_MEMORY_KHR for the error condition. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Bas Nieuwenhuizen <basni@google.com>	2017-04-19 20:37:57 +02:00
Bas Nieuwenhuizen	39644fa40a	radv: Don't allocate dynamic descriptors separately. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Bas Nieuwenhuizen <basni@google.com>	2017-04-19 20:37:57 +02:00
Emil Velikov	51c0c213b7	st/mesa: automake: honour the vdpau header install location If VDPAU is installed in the non-default location, we'll fail to find the headers and error at build time. ../../src/gallium/include/state_tracker/vdpau_dmabuf.h:37:25: fatal error: vdpau/vdpau.h: No such file or directory #include <vdpau/vdpau.h> ^ Fixes: `faba96bc60` ("st/vdpau: add new interop interface") Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 12:19:46 +01:00
Emil Velikov	309f4067a7	winsys/sw/dri: don't use GNU void pointer arithmetic Resolves build issues like the following: src/gallium/winsys/sw/dri/dri_sw_winsys.c:203:31: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith] data = dri_sw_dt->data + (dri_sw_dt->stride * box->y) + box->x * blsize; ^ src/gallium/winsys/sw/dri/dri_sw_winsys.c:203:62: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith] data = dri_sw_dt->data + (dri_sw_dt->stride * box->y) + box->x * blsize; ^ Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 12:19:38 +01:00
Emil Velikov	4516bfbd30	configure.ac: check require_basic_egl only if egl enabled Fixes: `1ac40173c2` ("configure.ac: simplify EGL requirements for drivers dependent on EGL") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 12:19:24 +01:00
Emil Velikov	179e21a720	configure.ac: manually expand PKG_CHECK_VAR The macro is introduced with pkgconfig v0.28 which isn't universally available. Thus it will error at configure stage. Reported-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Fixes: `ce562f9e3f` ("EGL: Implement the libglvnd interface for EGL (v3)") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-19 12:18:29 +01:00
Timothy Arceri	1787a3163f	mesa: add KHR_no_error support to glVertexAttribDivisor() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	f27f699672	mesa/vbo: add KHR_no_error support to DrawElements*() functions V2: move MESA_VERBOSE checks back into the common code path. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	3d08e18731	mesa/vbo: add KHR_no_error support to vbo_exec_DrawArrays*() V2: add missing FLUSH_CURRENT() to no_error path Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	4df2931a87	mesa/vbo: move some Draw checks out of validation These checks do not generate any errors. Move them so we can add KHR_no_error support and still make sure we do these checks. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	63a14e9e14	mesa/varray: add KHR_no_error support to *Pointer() functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	d86dd5963e	mesa/varray: add KHR_no_error support to some callers of validate_array_format() The only caller we don't update is update_arrays(), we leave that to the following commit. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	c495c2398c	mesa/varray: rename update_array_format() -> validate_array_format() We also move _mesa_update_array_format() into the caller. This gets these functions ready for KHR_no_error support. V2: Updated function comment as suggested by Brian. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	9e60742ddc	mesa/varray: create get_array_format() helper This will help us split array validation from array update. V2: add const to ctx param Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	d0608c43c5	mesa/varray: split update_array() into validate_array() and update_array() This will be used for adding KHR_no_error support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	bd2662bfa1	mesa: add KHR_no_error support to glUniform*() functions V2: restore lost comment, add static to validate_uniform(), simplify array offset logic. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	2c9ac0bc63	mesa: always return GL_OUT_OF_MEMORY or GL_NO_ERROR when KHR_no_error enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	3ff1fce6c9	mesa: add _mesa_is_no_error_enabled() helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:25 +10:00
Timothy Arceri	a0ed0eb342	mesa: add env var to force enable the KHR_no_error ctx flag V2: typo know -> known V3: add security check (Suggested by Nicolai) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:24 +10:00
Timothy Arceri	d42d150ad2	mesa: expose KHR_no_error Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 16:53:24 +10:00
Constantine Kharlamov	2a8a569276	r600g: update dirty_level_mask after the 1-st draw after FB change Ported from radeonsi. Testing with Kane&Lynch2 shows ≈1k skipped updates per frame on average. No piglit changes with tests/gpu.py, gbm mode. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-19 08:15:22 +02:00
Nicolai Hähnle	51deba0eb3	vbo: fix gl_DrawID handling in glMultiDrawArrays Fixes a bug in KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-19 08:11:07 +02:00
Nicolai Hähnle	42d5465b9b	mesa: move glMultiDrawArrays to vbo and fix error handling When any count[i] is negative, we must skip all draws. Moving to vbo makes the subsequent change easier. v2: - provide the function in all contexts, including GLES - adjust validation accordingly to include the xfb check v3: - fix mix-up of pre- and post-xfb prim count (Nils Wallménius) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-19 08:10:19 +02:00
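
The rule stated in the commit above ("when any count[i] is negative, we must skip all draws") implies validating every entry before issuing the first draw. The following is only a sketch of that two-pass shape: `multi_draw_arrays()`, `draw_fn` and `count_draw()` are hypothetical names, and skipping zero-sized draws is an assumption of this example, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback that issues one draw. */
typedef void (*draw_fn)(int first, int count, void *user);

/* Example callback: counts how many draws were issued. */
static void
count_draw(int first, int count, void *user)
{
    (void)first;
    (void)count;
    ++*(int *)user;
}

/* Validate all counts first, then draw. Returns false, having issued
 * no draws at all, if primcount or any count[i] is negative. */
static bool
multi_draw_arrays(const int *first, const int *count, int primcount,
                  draw_fn draw, void *user)
{
    if (primcount < 0)
        return false;

    for (int i = 0; i < primcount; i++) {
        if (count[i] < 0)
            return false; /* reject the whole call, not just draw i */
    }

    for (int i = 0; i < primcount; i++) {
        if (count[i] > 0) /* assumption: empty draws are skipped */
            draw(first[i], count[i], user);
    }
    return true;
}
```

Validating inside the draw loop instead would let the draws before the bad entry go through, which is the behaviour the commit describes as wrong.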
Nicolai Hähnle	756e9ebbdd	mesa: extract need_xfb_remaining_prims_check The same logic needs to be applied to glMultiDrawArrays. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-19 08:09:57 +02:00
Nicolai Hähnle	ea9a8940ca	mesa: fix remaining xfb prims check for GLES with multiple instances Found by inspection. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-19 08:09:53 +02:00
Mike Lothian	2284d6bf7a	radv/meta: Fix nir_builder.h include This fixes the build after: commit `399ebd2a84` Author: Dave Airlie <airlied@redhat.com> Date: Wed Apr 19 06:18:23 2017 +1000 radv/meta: add common shader vertex generation function Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 12:25:18 +10:00
Mike Lothian	709ed1fa9f	radv/ac: Fix nir.h include This fixes the build after: commit `224cf2906a` Author: Dave Airlie <airlied@redhat.com> Date: Mon Apr 17 13:01:52 2017 +1000 radv/ac: add initial pre-pass for shader info gathering Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 12:25:18 +10:00
Dave Airlie	03a2ca6356	radv/meta: refactor out some common shaders. The vs vertex generate and fs noop shaders are used in a few places, so refactor them out. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:03:05 +10:00
Dave Airlie	bdd98d950f	radv/meta: generate position for blit shaders. This generates the position info using the vertex shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:03:01 +10:00
Dave Airlie	922f44d1ab	radv/meta: reduce vertex buffer in blit2d. Generate the position vertices. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:58 +10:00
Dave Airlie	dd17e4ceb4	radv/meta: reduce vertex buffer usage in clear shaders For depth clears we have to pass the depth in the 2nd component, we can use push constants for some of this later to drop the vertex buffer completely Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:53 +10:00
Dave Airlie	84b9e3a831	radv/meta: avoid using vertex buffer for resolve shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:50 +10:00
Dave Airlie	3a7fd0c4db	radv/meta: move depth decompress to using inline vertex data This removes the vertex buffer, and just generates the values in the shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:47 +10:00
Dave Airlie	90ed2872bc	radv/meta: move fast clear to generate vertices in shader. Avoids having to setup vertex buffers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:43 +10:00
Dave Airlie	399ebd2a84	radv/meta: add common shader vertex generation function Instead of passing in the same 1.0, -1.0 combinations via vertex buffers, we can just use vertex id to have the vertex shader build them. This function introduces the generator code needed, later patches will use this. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:39 +10:00
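
The idea in the commit above, building the fixed 1.0 / -1.0 corner positions from the vertex id instead of reading them from a vertex buffer, can be sketched on the CPU side as below. The bit-to-axis mapping and the 4-vertex strip are assumptions chosen for illustration; the real radv generator may order or select the corners differently.

```c
#include <assert.h>

struct vec2 { float x, y; };

/* Illustrative mapping for a 4-vertex triangle strip covering clip space:
 * bit 0 of the vertex id selects x, bit 1 selects y. */
static struct vec2
position_from_vertex_id(unsigned vertex_id)
{
    assert(vertex_id < 4);
    struct vec2 pos;
    pos.x = (vertex_id & 1u) ? 1.0f : -1.0f;
    pos.y = (vertex_id & 2u) ? 1.0f : -1.0f;
    return pos;
}
```

Because every corner is derived from the id alone, no vertex buffer has to be allocated or bound for these meta operations.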
Dave Airlie	0e6d532d32	radv/meta: add support for save/restore meta without vertex data. Some of the shaders could just generate the vertex data in the shader, so add helpers to allow us to move to doing that. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 10:02:23 +10:00
Dave Airlie	60a93e11ba	radv: drop debugging leftovers code in descriptor set patches. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:31:14 +10:00
Dave Airlie	fd420a7417	radv: add support for 32 descriptor sets. This bumps the limit to the number of sets to 32, now that we have proper support for it. It also uses 1u in a few places to make things a bit safer. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:43 +10:00
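
The remark about using `1u` in the commit above matters once set indices reach 31: `1 << 31` overflows a signed 32-bit `int`, while the unsigned form is well defined. A minimal sketch, assuming a 32-bit mask of bound sets; `MAX_SETS_EXAMPLE` and `descriptor_set_bit()` are hypothetical names, not radv identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SETS_EXAMPLE 32u /* hypothetical limit for this sketch */

/* Return the mask bit for one descriptor set index. Shifting an
 * unsigned 1 keeps index 31 well defined, unlike (1 << 31). */
static uint32_t
descriptor_set_bit(unsigned set_index)
{
    assert(set_index < MAX_SETS_EXAMPLE);
    return UINT32_C(1) << set_index;
}
```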
Dave Airlie	25a5ee391d	radv/ac: add support for indirect access of descriptor sets. We want to expose more descriptor sets to the applications, but currently we have a 1:1 mapping between shader descriptor sets and 2 user sgprs, limiting us to 4 per stage. This commit checks whether we have enough user sgprs for the number of bound sets for this shader; if not, we can ask for them to be indirected. Two sgprs are then used to point to a buffer of 64-bit pointers to the allocated descriptor sets. All shaders point to the same buffer. We can use some user sgprs to inline one or two descriptor sets in future, but until we have a workload that needs this I don't think we should spend too much time on it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:43 +10:00
Dave Airlie	d0991b135b	radv: start allocating user sgprs This adds an initial implementation to allocate the user sgprs and make sure we don't run out if we try to bind a bunch of descriptor sets. This can be enhanced further in the future if we add support for inlining push constants. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:43 +10:00
Dave Airlie	4087eaecd0	radv/ac: mark used descriptor sets in shader info. This pre calculates the used descriptor sets. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:43 +10:00
Dave Airlie	0b62669c8d	radv/ac: frag shader only needs ring offsets if sample positions enabled Mostly documenting things, since with modern llvm we always have the spill enabled. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:42 +10:00
Dave Airlie	ec4785afb7	radv/ac: move needs_push_constants to shader info. First step to optimising push constants. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:42 +10:00
Dave Airlie	ec15e0d301	radv: optimise compute shader grid size emission. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:42 +10:00
Dave Airlie	31174069d2	radv: start conditionalising vertex inputs. (v2) In practice this will probably just drop draw id in a few places. v2: just do draw_id for now. (Bas) it might be possible to do something more if we need it in the future. (nha) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:42 +10:00
Dave Airlie	224cf2906a	radv/ac: add initial pre-pass for shader info gathering There is some radv specific info we need to gather from shaders before we get into converting nir->llvm, so we can make better decisions especially around user sgpr allocation. This is just an initial placeholder to gather if sample positions are required in the frag shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-19 09:00:42 +10:00
Rob Clark	4299849ec7	freedreno: refactor dirty state handling In particular, move per-shader-stage info out to a separate array of enums indexed by shader stage. This will make it easier to add more shader stages as well as new per-stage state (like SSBOs). Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	d7fa7f5e7e	freedreno: move clear path dirty state hack to a2xx backend a3xx/a4xx use the generic u_blitter path, which will make state dirty bits be set appropriately thanks to the automagic of generic code setting generic state in the driver. And a5xx has a blit/dma engine (actually, two) so it doesn't need these extra dirty bits set. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	b662f71d9c	freedreno/ir3: split out per-stage emit_consts fxns This makes it easier to deal with adding additional stages which have their own driver-params. The duplicated code this introduces can be refactored out after a later patch moves to per-shader-stage dirty flags. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	df37902e34	freedreno: add helper to mark all state clean Note that this involves juggling around a bit when we emit and clear texture state. So split out from the patch that adds the helper to set all state dirty. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	71f9e03d21	freedreno: add helper to mark all state dirty This will simplify things when we break out per-shader-stage dirty bits. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	248a508f24	freedreno: move a2xx specific hack out of core Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	0cc23ae779	freedreno: make texture state an array Make this an array indexed by shader stage, as is done elsewhere for other per-shader-stage state. This will simplify things as more shader stages are eventually added. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	5845b20455	freedreno/ir3: refactor out helpers for comparing shader keys Each of the ir3 users has basically the same logic for comparing the previous and current shader key, to see which, if any, shader state needs to be marked dirty due to shader variant change. The difference between gen's was just that some lowering flags never get set on certain generations. But it doesn't really hurt to include the extra checks (because both keys would have false). Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-18 16:32:00 -04:00
Rob Clark	6fb7935ded	util/queue: don't hang at exit So atexit() is horrible and `4aea8fe7` is probably not a good idea. But add an extra layer of duct-tape to the problem. Otherwise we hit a situation where app using an atexit() handler that runs later than ours doesn't hang when trying to tear down a context. (gdb) bt #0 util_queue_killall_and_wait (queue=queue@entry=0x52bc80) at ../../../src/util/u_queue.c:264 #1 0x0000007fb6c380c0 in atexit_handler () at ../../../src/util/u_queue.c:51 #2 0x0000007fb7730e2c in __run_exit_handlers () from /lib64/libc.so.6 #3 0x0000007fb7730e5c in exit () from /lib64/libc.so.6 #4 0x0000007fb7ce17dc in piglit_report_result (result=PIGLIT_PASS) at /home/robclark/src/piglit/tests/util/piglit-util.c:267 #5 0x0000007fb7ef99f8 in process_next_event (x11_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:139 #6 0x0000007fb7ef9a90 in enter_event_loop (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153 #7 0x0000007fb7ef8e50 in run_test (gl_fw=0x432c20, argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88 #8 0x0000007fb7edb890 in piglit_gl_test_run (argc=1, argv=0x7ffffff588, config=0x7ffffff400) at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:203 #9 0x0000000000401224 in main (argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/bugs/drawbuffer-modes.c:46 (gdb) c Continuing. [Thread 0x7fb67580c0 (LWP 3471) exited] ^C Thread 1 "drawbuffer-mode" received signal SIGINT, Interrupt. 
0x0000007fb72dda34 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0 (gdb) bt #0 0x0000007fb72dda34 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0 #1 0x0000007fb6c38304 in cnd_wait (mtx=0x5bdc90, cond=0x5bdcc0) at ../../../include/c11/threads_posix.h:159 #2 util_queue_fence_wait (fence=0x5bdc90) at ../../../src/util/u_queue.c:106 #3 0x0000007fb6daac70 in fd_batch_sync (batch=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:233 #4 batch_reset (batch=batch@entry=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:183 #5 0x0000007fb6daa5e0 in batch_flush (batch=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:290 #6 fd_batch_flush (batch=0x5bdc70, sync=<optimized out>) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:308 #7 0x0000007fb6daba2c in fd_bc_flush (cache=0x461220, ctx=0x52b920) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch_cache.c:141 #8 0x0000007fb6dac954 in fd_context_flush (pctx=0x52b920, fence=0x0, flags=<optimized out>) at ../../../../../src/gallium/drivers/freedreno/freedreno_context.c:54 #9 0x0000007fb6b43294 in st_glFlush (ctx=<optimized out>) at ../../../src/mesa/state_tracker/st_cb_flush.c:121 #10 0x0000007fb69a84e8 in _mesa_make_current (newCtx=newCtx@entry=0x0, drawBuffer=drawBuffer@entry=0x0, readBuffer=readBuffer@entry=0x0) at ../../../src/mesa/main/context.c:1654 #11 0x0000007fb6b7ca58 in st_api_make_current (stapi=<optimized out>, stctxi=0x0, stdrawi=0x0, streadi=0x0) at ../../../src/mesa/state_tracker/st_manager.c:827 #12 0x0000007fb6cc87e8 in dri_unbind_context (cPriv=<optimized out>) at ../../../../../src/gallium/state_trackers/dri/dri_context.c:217 #13 0x0000007fb6cc80b0 in driUnbindContext (pcp=0x5271e0) at ../../../../../../src/mesa/drivers/dri/common/dri_util.c:591 #14 0x0000007fb7d1da08 in MakeContextCurrent (dpy=0x433380, draw=0, read=0, gc_user=0x0) at ../../../src/glx/glxcurrent.c:214 #15 
0x0000007fb7a8d5e0 in glx_platform_make_current () from /lib64/libwaffle-1.so.0 #16 0x0000007fb7a894e4 in waffle_make_current () from /lib64/libwaffle-1.so.0 #17 0x0000007fb7ef8c60 in piglit_wfl_framework_teardown (wfl_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_wfl_framework.c:628 #18 0x0000007fb7ef939c in piglit_winsys_framework_teardown (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:238 #19 0x0000007fb7ef9c30 in destroy (gl_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:212 #20 0x0000007fb7edb7c4 in destroy () at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:184 #21 0x0000007fb7730e2c in __run_exit_handlers () from /lib64/libc.so.6 #22 0x0000007fb7730e5c in exit () from /lib64/libc.so.6 #23 0x0000007fb7ce17dc in piglit_report_result (result=PIGLIT_PASS) at /home/robclark/src/piglit/tests/util/piglit-util.c:267 #24 0x0000007fb7ef99f8 in process_next_event (x11_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:139 #25 0x0000007fb7ef9a90 in enter_event_loop (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153 #26 0x0000007fb7ef8e50 in run_test (gl_fw=0x432c20, argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88 #27 0x0000007fb7edb890 in piglit_gl_test_run (argc=1, argv=0x7ffffff588, config=0x7ffffff400) at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:203 #28 0x0000000000401224 in main (argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/bugs/drawbuffer-modes.c:46 (gdb) r Fixes: `4aea8fe7` ("gallium/u_queue: fix random crashes when the app calls exit()") Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-18 16:32:00 -04:00
Eric Anholt	c1362e78ad	vc4: Enable V3D 2.6. This version of the chip is present on the Cygnus-based 911360 enterprise phone platform. It appears to be completely backwards compatible.	2017-04-18 13:21:40 -07:00
Samuel Pitoiset	a18ff34452	st/mesa: add st_convert_sampler() Similar to st_convert_image(), will be useful for bindless. While we are at it, rename convert_sampler() to convert_sampler_from_unit() and make 'st' a const argument. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-18 21:42:01 +02:00
Bartosz Tomczyk	ca41ecf838	mesa/glthread: add async support to ARB_viewport_array functions v2: fix attribute name, it is count_scale not scale_count Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-18 12:19:12 +02:00
Timothy Arceri	a63919f848	mesa: rename _mesa_add_renderbuffer* functions These names make it easier to understand what is going on in regards to references. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-18 10:01:55 +10:00
Nanley Chery	d9d793696b	anv/cmd_buffer: Disable CCS on BDW input attachments The description under RENDER_SURFACE_STATE::RedClearColor says, For Sampling Engine Multisampled Surfaces and Render Targets: Specifies the clear value for the red channel. For Other Surfaces: This field is ignored. This means that the sampler on BDW doesn't support CCS. Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-17 16:47:38 -07:00
Lionel Landwerlin	d71efbe5f2	anv: blorp: flush memory after copy Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-17 14:45:57 -07:00
Grazvydas Ignotas	ba6c451390	radv: enable timestampComputeAndGraphics Commit `bfee9866` "radv: Use RELEASE_MEM packet for MEC timestamp query." added WriteTimestamp handling for compute queues but forgot to flip the flag. Tested with DOOM (by me) and CTS (by Bas), but without verification that these tests actually use timestamps on compute queues. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-17 21:21:35 +03:00
Rob Clark	d4601b0efc	freedreno: fix crash if ctx torn down with no rendering In this case, ctx->flush_queue would not have been initialized. Fixes: `0b613c20` ("freedreno: enable draw/batch reordering by default") Cc: "17.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-17 14:00:05 -04:00
Rob Clark	15fe9b2347	freedreno/ir3: add 'high' register class For compute shaders, we need to be able to allocate some "high" registers (r48.x to r55.w). (Possibly these are global to all threads in a warp?) Add a new register class to handle this. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-17 14:00:05 -04:00
Rob Clark	3c5d309477	freedreno: extract helper for stage->sb for a4xx+ Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-17 14:00:05 -04:00
Rob Clark	9567beab36	freedreno/{a4xx,a5xx}: switch to CP_LOAD_STATE4 The layout of CP_LOAD_STATE packet is slightly different on a4xx+. Switch to the a4xx+ specific CP_LOAD_STATE4 to get the correct encoding. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-17 14:00:05 -04:00
Rob Clark	dfdb1fed78	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-17 14:00:05 -04:00
Emil Velikov	9915753e63	configure.ac: print deprecation warning as needed The warning should be printed only when one explicitly uses the deprecated configure toggle. Fixes: `7748c3f5eb` ("configure.ac: deprecate --with-egl-platforms over --with-platforms") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-17 15:07:44 +01:00
Emil Velikov	19aec22c75	docs: add news item and link release notes for 17.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 14:44:35 +01:00
Emil Velikov	89ef8750f0	docs: add sha256 checksums for 17.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `12434966eb`)	2017-04-17 14:43:27 +01:00
Emil Velikov	d271401d61	docs: add release notes for 17.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `367bafc7c1`)	2017-04-17 14:43:26 +01:00
Emil Velikov	36aea77cd7	docs: add 17.2.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 14:31:41 +01:00
Emil Velikov	7748c3f5eb	configure.ac: deprecate --with-egl-platforms over --with-platforms Currently the former controls more than just EGL. With follow-up commits we'll unwind and fix things so that one can build the different drivers with said platform support. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 13:37:41 +01:00
Emil Velikov	de128c19ee	configure: remove egl platforms check The configure option is used by more than just EGL and with next commit we'll rename it accordingly. Thus having the check will be (and is atm) incorrect. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 13:13:09 +01:00
Emil Velikov	618a7b984b	travis: remove unneeded dri3/present proto requirement Signed-off-by: Emil Velikov <emil.lvelikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-04-17 13:12:03 +01:00
Emil Velikov	291a9405a5	configure: remove unneeded dri3/present proto requirements We are not using either of these. The respective xcb packages are used instead. v2: Rebase, reword commit message. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-04-17 13:10:37 +01:00
Kyle Brenneman	ce562f9e3f	EGL: Implement the libglvnd interface for EGL (v3) The new interface mostly just sits on top of the existing library. The only change to the existing EGL code is to split the client extension string into platform extensions and everything else. On non-glvnd builds, eglQueryString will just concatenate the two strings. The EGL dispatch stubs are all generated. The script is based on the one used to generate entrypoints in libglvnd itself. v2: [Kyle] - Rebased against master. - Reworked the EGL makefile to use separate libraries - Made the EGL code generation scripts work with Python 2 and 3. - Change gen_egl_dispatch.py to use argparse for the command line arguments. - Assorted formatting and style cleanup in the Python scripts. v3: [Emil Velikov] - Rebase - Remove separate glvnd glx/egl configure toggles Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 13:03:58 +01:00
Tapani Pälli	370df207ca	android: add marshal_generated c and h files to generated sources Fixes: `efd63e2` ("mesa: Connect the generated GL command marshalling code to the build.") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 12:20:06 +01:00
Emil Velikov	3bcef6aa24	configure.ac: honour --disable-libunwind if the .pc file is present We should check the presence in order to determine if we should [implicitly] set the CFLAGS/LIBS v2: Drop spurious OMX hunk (Eric) Cc: Eric Anholt <eric@anholt.net> Reported-by: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-17 12:05:10 +01:00
Emil Velikov	39c3482205	docs: document the C++14 SWR requirement Earlier commit bumped the requirement for the SWR driver. v2: Fold the note with the LLVM 3.9 one (Tim) Fixes: `3c52a7316a` ("swr: [configure.ac/scons] require c++14") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-04-17 12:04:22 +01:00
Samuel Pitoiset	84ed2e1192	winsys/amdgpu: init buffer_indices_hashlist with memset() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-17 11:59:17 +02:00
Samuel Pitoiset	af612816bc	winsys/amdgpu: simplify amdgpu_cs_add_buffer() a bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-17 11:59:17 +02:00
Kenneth Graunke	7c3b8ed878	i965/drm: Delete NULL check in brw_bo_unmap(). I accidentally moved the bo->bufmgr dereference above the NULL check when cleaning up this code. While passing NULL to free() is a common pattern...passing NULL to unmap seems pretty bad. You really ought to know whether you have a buffer or not. We don't want to paper over bugs like that. So, just drop the NULL check altogether. CID: 1405006 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-16 22:58:23 -07:00
Kenneth Graunke	9b71709cb8	intel/decoder: Fix is_header_field starting condition. Starting positions >= 32 are not part of the header, rather than >. Caught by Coverity, which found that "bits <<= field->start" may shift by 32, which has undefined behavior. CID: 1404968 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-16 22:58:23 -07:00
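
The Coverity finding in the commit above is that shifting a 32-bit value by 32 is undefined behaviour in C, so a field starting at bit 32 has to be rejected before the shift (`>= 32`, not `> 32`). A minimal sketch of that guard follows; `struct field` and `field_in_header()` are hypothetical simplifications, not the decoder's actual types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field descriptor: inclusive bit range [start, end]. */
struct field {
    unsigned start;
    unsigned end;
};

/* Return true if the field overlaps any bit set in header_mask.
 * Fields starting at bit 32 or later are outside the first dword. */
static bool
field_in_header(const struct field *f, uint32_t header_mask)
{
    if (f->start >= 32)
        return false; /* a shift by 32 or more would be undefined */

    unsigned end = f->end > 31 ? 31 : f->end; /* clip to the first dword */
    assert(end >= f->start);

    unsigned width = end - f->start + 1; /* 1..32 */
    uint32_t bits = (width == 32) ? UINT32_MAX
                                  : (UINT32_C(1) << width) - 1;
    bits <<= f->start; /* start < 32, so this shift is defined */

    return (bits & header_mask) != 0;
}
```

The full-width case is handled separately because `1 << 32` would reintroduce the same undefined shift when building the mask.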
Kenneth Graunke	6142c3e298	i965/drm: Remove dead return in brw_bo_busy() If ret is 0, we return. If ret is not 0, we return. This is dead. CID: 1405013 (Structurally dead code (UNREACHABLE)) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-16 22:58:22 -07:00
Mauro Rossi	8c79dbe94e	android: amd/addrlib: trivial fix for gfx9 support Fixes the following build error: external/mesa/src/amd/addrlib/gfx9/gfx9addrlib.cpp:36:10: fatal error: 'gfx9_gb_reg.h' file not found ^ 1 error generated. Fixes: `7f160ef` "amd/addrlib: import gfx9 support" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-17 14:04:21 +10:00
Jason Ekstrand	4cf079f7f2	nir: Add GLSL_TYPE_[U]INT64 to some switch statements Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-16 20:14:42 -07:00
Marek Olšák	2769dadb0f	gallium/radeon: always flush asynchronously and wait after begin_new_cs This hides the overhead of everything in the driver after the CS flush and before returning from pipe_context::flush. Only microbenchmarks will benefit. +2% FPS for glxgears. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	f05f0bb5cb	radeonsi: remove local variable 'mod' from si_compile_tgsi_shader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	bd2cde0c25	radeonsi: add si_shader_selector::vs_needs_prolog cleanup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	777f305840	radeonsi: don't set VGT_GS_MODE as part of the GS state The VS state sets it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	5438e39fae	radeonsi: don't allow user indices with indirect draws Not possible with GL and it will make future gallium rework easier. (also it's something I wouldn't like to support) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	1c94d29984	radeonsi: merge two if (indirect) statements Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	bdd6449769	radeonsi: don't mark non-dirty textures with CMASK as compressed because the compression is skipped with non-dirty textures. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Bas Nieuwenhuizen	566f2ed571	docs: Document interaction Fixes tag and stable branches. For the next time I forget. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-15 11:37:46 +02:00
Timothy Arceri	9f0dd85aa6	glsl: don't run the GLSL pre-processor when we are skipping compilation This moves the hashing of shader source for the cache lookup to before the preprocessor. In our experience, shaders are unlikely to hash the same after preprocessing if they didn't hash the same before, so we can skip preprocessing for cache hits. Improves Deus Ex start-up times with a warm cache from ~30 seconds to ~22 seconds. Also fixes the leaking of state. V2: fix indentation v3: add the value of MESA_EXTENSION_OVERRIDE to the hash of the shader. Tested-by (v2): Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-04-15 11:36:52 +10:00
Timothy Arceri	c2bc0aa7b1	glsl: delay optimisations on individual shaders when cache is available Due to a max limit of 65,536 entries on the index table that we use to decide if we can skip compiling individual shaders, it is very likely we will have collisions. To avoid doing too much work when the linked program may be in the cache this patch delays calling the optimisations until link time. Improves cold cache start-up times on Deus Ex by ~20 seconds. When deleting the cache index to simulate a worst case scenario of collisions in the index, warm cache start-up time improves by ~45 seconds. V2: fix indentation, make sure to call optimisations on cache fallback, make sure optimisations get called for XFB. Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-15 11:36:44 +10:00
Jason Ekstrand	d2d6cf6c83	anv: Add the pci_id into the shader cache UUID This prevents a user from using a cache created on one hardware generation on a different one. Of course, with Intel hardware, this requires moving their drive from one machine to another but it's still possible and we should prevent it. Reviewed-by: Chad Versace <chadversary@chromium.org> Cc: mesa-stable@lists.freedesktop.org	2017-04-14 17:41:07 -07:00
Philipp Zabel	36f2101723	etnaviv: native fence fd support This adds native fence fd support to etnaviv, similarly to commit `0b98e84e9b` ("freedreno: native fence fd"), enabled for kernel driver version 1.1 or later. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-15 01:47:18 +02:00
Francisco Jerez	96dfc014fd	docs: mark GL_ARB_vertex_attrib_64bit and OpenGL 4.2 as supported by i965/gen7+ v2 (Andreas Boll): - Mark GL 4.1 as supported by i965/gen7+ - Mark GL_ARB_shader_precision as supported by i965/gen7+ - Update release notes Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 16:13:21 -07:00
Juan A. Suarez Romero	1877982aca	i965: enable OpenGL 4.2 in Ivybridge Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 16:13:21 -07:00
Samuel Iglesias Gonsálvez	92d4dc76ea	i965: enable ARB_shader_precision in gen7+ Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 16:13:21 -07:00
Juan A. Suarez Romero	0aed1212ae	i965: enable ARB_vertex_attrib_64bit for gen7+ Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 16:13:21 -07:00
George Kyriazis	b9d4256e11	swr: Fix swr osmesa build Use GALLIUM_SWR to standardize Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-14 18:03:40 -05:00
Wladimir J. van der Laan	6a8d5ab932	etnaviv: SINGLE_BUFFER support on GC3000 This patch adds support for the SINGLE_BUFFER feature on GC3000 GPUs, which allows rendering to a single buffer using multiple pixel pipes. This feature is always used when it is available, which means that multi-tiled formats are no longer being used in that case, and all buffers will be normal (super)tiled. This mimics the behavior of the blob on GC3000. - Because the same format can be used to render to and texture from, this avoids an extra resolve pass when rendering to texture. - i.MX6qp includes a PRE which can scan-out directly from tiled formats, avoiding untiling overhead. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-15 00:34:13 +02:00
Wladimir J. van der Laan	1dcb1d4925	etnaviv: Update includes from rnndb Update to etna_viv commit 8486a97. austriancoder: changed patch to include isa redefinition fix. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-15 00:34:08 +02:00
Wladimir J. van der Laan	9e4d049f40	etnaviv: Add chipMinorFeatures4 and 5 Request chipMinorFeatures bitfields 4 and 5 from the drm driver. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-15 00:34:03 +02:00
Philipp Zabel	dda956340c	etnaviv: resolve tile status when flushing resource When passing render buffers from EGL clients to a wayland compositor, the resource tile status must be resolved because otherwise the tile status is lost in the transfer and cleared parts of the buffer will contain old contents. The same applies when sampling directly from a renderable resource. lst: Add seqno tracking, to skip flush when not needed. Fixes: aadcb5e94b35 ("etnaviv: enable TS, but disable autodisable") Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-15 00:15:30 +02:00
Philipp Zabel	f30aab7696	etnaviv: stop repeatedly resolving an unchanged resource into its scanout prime buffer Before resolving a resource into its scanout prime buffer, check that the prime resource is actually older. If it is not, the resolve is an expensive no-op, and we better skip it. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-15 00:15:27 +02:00
George Kyriazis	d7a1f01db3	swr: Add polygon stipple support Add polygon stipple functionality to the fragment shader. Explicitly turn off polygon stipple for lines and points, since we do them using tris. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-14 17:08:12 -05:00
Samuel Iglesias Gonsálvez	8973ae3162	docs/relnotes: add GL_ARB_gpu_shader_fp64 support on i965/ivybridge Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:10 -07:00
Samuel Iglesias Gonsálvez	ef49dda2df	docs: mark GL_ARB_gpu_shader_fp64 and OpenGL 4.0 as supported by i965/gen7+ Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:10 -07:00
Samuel Iglesias Gonsálvez	a494afdb8e	i965: enable OpenGL 4.0 to Ivybridge/Baytrail Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:10 -07:00
Samuel Iglesias Gonsálvez	cd0a6b2fc2	i965: enable ARB_gpu_shader_fp64 for Ivybridge/Baytrail Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:09 -07:00
Matt Turner	2eeb1b0ad9	i965: Use correct VertStride on align16 instructions. In commit `c35fa7a`, we changed the "width" of DF source registers to 2, which is conceptually fine. Unfortunately a VertStride of 2 is not allowed by align16 instructions on IVB/BYT, and the regular VertStride of 4 works fine in any case. See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test for example: cmp.ge.f0(8) g18<1>DF g1<0>.xyxyDF -g8<2>DF { align16 1Q }; ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed cmp.ge.f0(8) g19<1>DF g1<0>.xyxyDF -g9<2>DF { align16 2N }; ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed v2: - Add spec quote (Curro). - Change the condition to only BRW_VERTICAL_STRIDE_2 (Curro) Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:09 -07:00
Samuel Iglesias Gonsálvez	d8441e2276	i965/vec4/dce: improve track of partial flag register writes This is required for correctness in presence of multiple 4-wide flag writes (e.g. 4-wide instructions with a conditional mod set) which update a different portion of the same 8-bit flag subregister. Right now we keep track of flag dataflow with 8-bit granularity and consider flag writes to have killed any previous definition of the same subregister even if the write was less than 8 channels wide, which can cause live flag register updates to be dead code-eliminated incorrectly. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:09 -07:00
Samuel Iglesias Gonsálvez	c1fc8fad47	i965/vec4: don't do horizontal stride on some register file types horiz_offset() shouldn't be doing anything for scalar registers, because all channels of any SIMD instructions will end up reading or writing the same component of the register, so shifting the register offset would be wrong. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Re-implement in terms of is_uniform() for simplicity. Pass argument by const reference. Clarify commit message. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:09 -07:00
Matt Turner	21e8e3a848	i965/vec4: Fix exec size for MOVs {SET,PICK}_{HIGH,LOW}_32BIT. Otherwise for a pack_double_2x32_split opcode, we emit: vec1 64 ssa_135 = pack_double_2x32_split ssa_133, ssa_134 mov(8) g5<1>UD g5<4>.xUD { align16 1Q compacted }; mov(8) g7<2>UD g5<4,4,1>UD { align1 1Q }; ERROR: When the destination spans two registers, the source must span two registers (exceptions for scalar source and packed-word to packed-dword expansion) mov(8) g8<2>UD g5.4<4,4,1>UD { align1 2N }; ERROR: The offset from the two source registers must be the same mov(8) g5<1>UD g6<4>.xUD { align16 1Q compacted }; mov(8) g7.1<2>UD g5<4,4,1>UD { align1 1Q }; ERROR: When the destination spans two registers, the source must span two registers (exceptions for scalar source and packed-word to packed-dword expansion) mov(8) g8.1<2>UD g5.4<4,4,1>UD { align1 2N }; ERROR: The offset from the two source registers must be the same The intention was to emit mov(4)s for the instructions that have ERROR annotations. See tests/spec/arb_gpu_shader_fp64/execution/vs-isinf-dvec.shader_test for example. v2 (Samuel): - Instead of setting the exec size to a fixed value, don't double it (Curro). - Add PICK_{HIGH,LOW}_32BIT to the condition. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Trivial rebase changes. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:09 -07:00
Samuel Iglesias Gonsálvez	f030aaf2fb	i965/vec4: use vec4_builder to emit instructions in setup_imm_df() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Drop useless vec4_visitor dependencies. Demote to static stand-alone function. Don't write unused components in the result. Use vec4_builder interface for register allocation. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:09 -07:00
Juan A. Suarez Romero	a907c91e93	i965/vec4: consider subregister offset in live variables Take into account offset values less than a full register (32 bytes) when getting the var from register. This is required when dealing with an operation that writes half of the register (like one d2x in IVB/BYT, which uses exec_size == 4). v2: - Take into account this offset < 32 in liveness analysis too (Curro) v3: - Change formula in var_from_reg() (Curro) - Remove useless changes (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Francisco Jerez	92649a3e67	i965/vec4: fix assert to detect SIMD lowered DF instructions in IVB On IVB, DF instructions have lowered the SIMD width to 4 but the exec_size will be later doubled. Fix the assert to avoid crashing in this case. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Simplify assert. Except for the 'inst->group % 4 == 0' part the assertion was redundant with the previous assertion. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez	6e3265eae5	i965/vec4: split VEC4_OPCODE_FROM_DOUBLE into one opcode per destination's type This way we can set the destination type as double to all these new opcodes, avoiding any optimizer's confusion that was happening before. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Drop no_spill workaround originally needed due to the bogus destination type of VEC4_OPCODE_FROM_DOUBLE. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez	50a5217637	i965/vec4: split d2x conversion and data gathering from one opcode to two explicit ones When doing a 64-bit to a smaller data type size conversion, the destination should be aligned to 64-bits. Because of that, we need to gather the data after the actual conversion. Until now, these two operations were done by VEC4_OPCODE_FROM_DOUBLE but now we split them explicitly into two different instructions: VEC4_OPCODE_FROM_DOUBLE just does the conversion and VEC4_OPCODE_PICK_LOW_32BIT will gather the data. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Juan A. Suarez Romero	cfaf14a126	i965/vec4: fix VEC4_OPCODE_FROM_DOUBLE for IVB/BYT In the generator we must generate slightly different code for Ivybridge/Baytrail, because of the way the stride works in this hardware. v2: - Use stride and don't need to fix dst (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Juan A. Suarez Romero	be445d3ea3	i965/vec4: keep original type when dealing with null registers Keep the original type when dealing with null registers. Especially because we do not want to introduce an implicit conversion between types that could affect the conditional flags. This affects especially when the original type is DF, and we are working on Ivybridge/Baytrail. v2 (Curro) - Fix typo. - Use retype() instead of applying the type directly. - Remove unneeded retype. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez	a21dc2b500	i965/vec4: split DF instructions and later double its execsize in IVB/BYT We need to split DF instructions in two on IVB/BYT as it needs an execsize 8 to process 4 DF values (one GRF in total). v2: - Rename helper and make it static inline function (Matt). - Fix indentation and add braces (Matt). v3: - Don't edit IR instruction when doubling exec_size (Curro) - Add comment into the code (Curro). - Manage ARF registers like the others (Curro) v4: - Add get_exec_type() function and use it to calculate the execution size. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Fix bogus 'type != BAD_FILE' check. Take destination type as execution type where there is no valid source. Assert-fail if the deduced execution type is byte. Clarify comment in get_lowered_simd_width(). Move SIMD width workaround outside of 'if (...inst->size_written > REG_SIZE)' conditional block, since the problem should be independent of whether the amount of data written by the instruction is greater or lower than a GRF. Drop redundant is_ivb_df definition. Drop bogus inst->exec_size < 8 check. Simplify channel group assertion. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez	a5399e8b1c	i965/fs: lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT The hardware applies the same channel enable signals to both halves of the compressed instruction which will be just wrong under non-uniform control flow. Fix this by splitting those instructions to SIMD4. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:08 -07:00
Francisco Jerez	ebfb703d44	i965/fs: Get 64-bit indirect moves working on IVB. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-04-14 14:56:08 -07:00
Matt Turner	630b84cdc8	i965: Use source region <1,2,0> when converting to DF. Doing so allows us to use a single MOV in VEC4_OPCODE_TO_DOUBLE instead of two. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-04-14 14:56:08 -07:00
Juan A. Suarez Romero	3198ce3f96	i965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT According to the IVB and HSW PRMs: "2.When the destination requires two registers and the sources are indirect, the sources must use 1x1 regioning mode." So for DF instructions the execution size is not limited by the number of address registers that are available, but by the EU decompression logic not handling VxH indirect addressing correctly. This patch limits the SIMD width to 4 in this case. v2: - Fix typo (Matt). - Fix condition (Curro) v3: - Add spec quote (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Juan A. Suarez Romero	571cbd05eb	i965/fs: fix dst stride in IVB/BYT type conversions When converting a DF to a 32-bit type, we set dst stride to 2, to fulfill alignment restrictions because the upper Dword of every Qword will be written with undefined value. But in IVB/BYT, this is not necessary, as each DF conversion already writes 2, the first one the real value, and the second one a 0. That is, IVB/BYT already set stride = 2 implicitly, so we must set it to 1 explicitly to avoid ending up with stride = 4. v2: - Fix typo (Matt) v3: - Fix stride in the destination's brw_reg, don't modify IR (Curro) v4: - Remove 'is_dst' argument of brw_reg_from_fs_reg() (Curro) - Fix comment (Curro). - Relax hstride assert (Curro) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Minor spelling fixes. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez	af6fc3a8ea	i965/fs: rename lower_d2x to lower_conversions v2: - Change the name to lower_conversions. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez	dee31311eb	Revert "i965/fs: Don't emit SEL instructions for type-converting MOVs." This reverts commit `7dccd38b40`. d2x pass fixes SEL instructions when there is a type conversion by doing a SEL without type conversion and then convert the result. This pass also takes into account the non-uniform control flow. Then, `7dccd38b40` is not needed anymore. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez	aeecc82d05	i965/fs: generalize the legalization d2x pass Generalize it to lower any unsupported narrower conversion. v2 (Curro): - Add supports_type_conversion() - Reuse existing instruction instead of cloning it. - Generalize d2x to narrower and equal size conversions. v3 (Curro): - Make supports_type_conversion() const and improve it. - Use foreach_block_and_inst to process added instructions. - Simplify code. - Add assert and improve comments. - Remove redundant mov. - Remove useless comment. - Remove saturate == false assert and add support for saturation when fixing the conversion. - Add get_exec_type() function. v4 (Curro): - Use get_exec_type() function to get sources' type. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Matt Turner	94ffeb7fa2	i965: Use <0,2,1> region for scalar DF sources on IVB/BYT. On HSW+, scalar DF sources can be accessed using the normal <0,1,0> region, but on IVB and BYT DF regions must be programmed in terms of floats. A <0,2,1> region accomplishes this. v2: - Apply region <0,2,1> in brw_reg_from_fs_reg() (Curro). v3: - Added comment explaining the reason (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez	82d17615f4	i965/fs: clamp exec_size when an instruction has a scalar DF source Then the SIMD lowering pass will get rid of any compressed instructions with scalar source (whether force_writemask_all or not) and we avoid hitting the Gen7 region decompression bug. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Suggested-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Juan A. Suarez Romero	0f1316d4db	i965/fs: double regioning parameters and execsize for DF in IVB/BYT In IVB and BYT, both regioning parameters and execution sizes are measured as 32-bits element size. So when we have something like: mov(8) g2<1>DF g3<4,4,1>DF We are not actually moving 8 doubles (our intention), but 4 doubles. We need to double the parameters to cope with this issue. However, horizontal strides don't behave as they're supposed to on IVB for DF regions, they will cause each 32-bit half of DF sources to be strided individually, and doubling the value won't make any difference. v2: - Use devinfo directly (Matt). - Use Baytrail instead of Valleview (Matt). - Use IvyBridge instead of Ivy (Matt) - Double the exec_size in code emission (Curro) v3: - Change hstride doubling by an assert and fix commit log (Curro). - Substitute remaining compiler->devinfo by devinfo (Curro). v4: - Fix comment (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Juan A. Suarez Romero	79af256388	i965/fs: add helper to retrieve instruction execution type The execution data size is the biggest type size of any instruction operand. We will use it to know if the instruction deals with DF, because in Ivy we need to double the execution size and regioning parameters. v2: - Fix typo in commit log (Matt) - Use static inline function instead of fs_inst's method (Curro). - Define the result as a constant (Curro). - Fix indentation (Matt). - Add braces to nested control flow (Matt). v3 (Curro): - Add get_exec_type() and other auxiliary functions and use them to calculate its size. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [ Francisco Jerez: Fix bogus 'type != BAD_FILE' check. Fix deduced execution type for integer vector types. Take destination type as execution type where there is no valid source. Assert-fail if the deduced execution type is byte. Move into brw_ir_fs.h header for consistency with the VEC4 back-end. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:07 -07:00
Matt Turner	fd349d29e4	i965: Handle IVB DF differences in the validator. On IVB/BYT, region parameters and execution size for DF are in terms of 32-bit elements, so they are doubled. For evaluating the validity of an instruction, we halve them. v2 (Sam): - Add comments. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-04-14 14:56:07 -07:00
Iago Toral Quiroga	fbac8b1f94	i965/disasm: also print nibctrl in IVB for execsize=8 4-wide DF operations where NibCtrl applies require an execsize of 8 in IvyBridge/BayTrail. v2: - Refactor NibCtrl printing (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-14 14:56:06 -07:00
Boyan Ding	ff29f488d4	nir: Destination component count of shader_clock intrinsic is 2 This fixes the following error when using ARB_shader_clock on i965: vec1 32 ssa_0 = intrinsic shader_clock () () () intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */ error: src->ssa->num_components == num_components (nir/nir_validate.c:204) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2017-04-14 14:54:06 -07:00
Nicolai Hähnle	39f51b5db9	radeonsi: add missing initialization for userptr buffers Fix the accounting for memory usage of userptr buffers, which has been wrong forever (or at least for a long time). Also initialize flags. Without this initialization, the sparse buffer flag might end up being set, which leads to staging buffers being used unnecessarily (and incorrectly) in transfers to or from userptr buffers. This works around VM faults that occur with the radeon kernel module when running piglit ./bin/amd_pinned_memory decrement-offset map-buffer -auto Fixes: `e077c5fe65` ("gallium/radeon: transfers and invalidation for sparse buffers") Reported-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-14 23:23:04 +02:00
Fredrik Höglund	c1dd5d0b01	radv: remove the temp descriptor set infrastructure It is no longer used. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-14 23:21:24 +02:00
Fredrik Höglund	5ab5d1bee4	radv: use push descriptors in meta Use push descriptors instead of temp descriptor sets. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-14 23:21:24 +02:00
Fredrik Höglund	f95caae504	radv: add private push descriptors for meta This allows meta to use push descriptors without disturbing user push descriptors. radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR in that partial updates are not supported; all descriptors used in subsequent draw commands must be pushed at the same time. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-14 23:21:24 +02:00
Jason Ekstrand	220974b38d	anv/blorp: Properly handle VK_ATTACHMENT_UNUSED The Vulkan driver was originally written under the assumption that VK_ATTACHMENT_UNUSED was basically just for depth-stencil attachments. However, the way things fell together, VK_ATTACHMENT_UNUSED can be used anywhere in the subpass description. The blorp-based clear and resolve code has a bunch of places where we walk lists of attachments and we weren't handling VK_ATTACHMENT_UNUSED everywhere. This commit should fix all of them. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2017-04-14 14:20:42 -07:00
Jason Ekstrand	21d2ca72d8	anv/cmd_buffer: Use the null surface state for ATTACHMENT_UNUSED Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2017-04-14 14:20:42 -07:00
Jason Ekstrand	02eca8b6f8	anv/cmd_buffer: Always set up a null surface state We're about to start requiring it in yet another case and calculating exactly when one is needed is starting to get prohibitively expensive. A single surface state doesn't take up that much space so we may as well create one all the time. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2017-04-14 14:20:42 -07:00
Nicolai Hähnle	d6588d9962	radeonsi: cope with missing disassembly For robustness and testing purposes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-14 22:51:07 +02:00
Nicolai Hähnle	d15b1f6e2d	gallium/ddebug: dump missing members of pipe_draw_info Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-14 22:50:54 +02:00
Nicolai Hähnle	2ac03e90fb	radeonsi: enable ARB_shader_viewport_layer_array Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-14 22:50:17 +02:00
Nicolai Hähnle	d5e53f348e	radeonsi: handle ignored LAYER and VIEWPORT_INDEX writes Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-14 22:50:13 +02:00
Nicolai Hähnle	4127f38bae	st/mesa: enable ARB_shader_viewport_layer_array Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-14 22:50:09 +02:00
Nicolai Hähnle	f3d2cf6c1f	tgsi: clarify TGSI_SEMANTIC_{LAYER,VIEWPORT_INDEX} Depending on pipe caps they can be writable in all vertex processing stages, but only the output of the last stage counts. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-14 22:50:06 +02:00
Nicolai Hähnle	17f24a9b75	gallium: add PIPE_CAP_TGSI_TES_LAYER_VIEWPORT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-14 22:49:44 +02:00
Nicolai Hähnle	8b5d477aa8	configure.ac: add --enable-sanitize option Enable code sanitizers by adding -fsanitize=$foo flags for the compiler and linker. In addition, this also disables checking for undefined symbols: running the address sanitizer requires additional symbols which should be provided by a preloaded libasan.so (preloaded for hooking into malloc & friends globally), and the undefined symbols check gets tripped up by that. Running the tests works normally via `make check`, but shows additional failures with the address sanitizer due to memory leaks that seem to be mostly leaks in the tests themselves. I believe those failures should really be fixed. In the meantime, you can set export ASAN_OPTIONS=detect_leaks=0 to only check for more serious error types. v2: - fail reasonably when an unsupported sanitize flag is given (Eric Engestrom) Reviewed-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-14 22:44:30 +02:00
Jason Ekstrand	e1f6fb8021	anv/cmd_buffer: Flush the VF cache at the top of all primaries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-14 13:35:02 -07:00
Jason Ekstrand	939337e49f	anv/blorp: Flush the texture cache in UpdateBuffer Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-14 13:35:02 -07:00
Jason Ekstrand	475bab0330	anv: Limit VkDeviceMemory objects to 2GB Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-04-14 13:35:02 -07:00
Jason Ekstrand	4495b917e2	intel/blorp: Add a blorp_emit_dynamic macro This makes it much easier to throw together a bit of dynamic state. It also automatically handles flushing so you don't accidentally forget. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-04-14 13:35:02 -07:00
Bruce Cherniak	1832ef6cd9	swr: Enable MSAA in OpenSWR software renderer This patch enables multisample antialiasing in the OpenSWR software renderer. MSAA is a proof-of-concept/work-in-progress with bug fixes and performance on the way. We wanted to get the changes out now to allow several customers to begin experimenting with MSAA in a software renderer. So as not to impact current customers, MSAA is turned off by default - previous functionality and performance remain intact. It is easily enabled via environment variables, as described below. It has only been tested with the glx-lib winsys. The intention is to enable other state-trackers, both Windows and Linux and more fully support FBOs. There are 2 environment variables that affect behavior: * SWR_MSAA_FORCE_ENABLE - force MSAA on, for apps that are not designed for MSAA... Beware, results will vary. This is mainly for testing. * SWR_MSAA_MAX_SAMPLE_COUNT - sets maximum supported number of samples (1,2,4,8,16), or 0 to disable MSAA altogether. (The default is currently 0.) Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-04-14 15:22:45 -05:00
Bruce Cherniak	91a7f0b3af	swr: Removed unnecessary PIPE_BIND flags from swr_is_format_supported Removed unnecessary and probably wrong PIPE_BIND_SCANOUT and PIPE_BIND_SHARED flags in favor of check on single PIPE_BIND_DISPLAY_TARGET flag. Reference llvmpipe change <bee4c7718a3bd57e3d99f0913d9081cd13fe5fd> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-04-14 15:22:44 -05:00
Bruce Cherniak	97bbb7b6a3	swr: Align swr_context allocation to SIMD alignment. The context now contains SIMD vectors which must be aligned (specifically samplePositions in the rastState in the derived state). Failure to align can result in segv crash on unaligned memory access in vector instructions. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-04-14 15:22:44 -05:00
Tim Rowley	4dcfa83114	swr: update gallium driver docs v2: add back scons section, mention additional built swr libraries Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-14 15:21:31 -05:00
Grazvydas Ignotas	bffdb434b7	radv: remove irrelevant comment A leftover from anv. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-14 23:16:03 +03:00
Grazvydas Ignotas	1b2fe7ce45	radv: report timestampPeriod correctly The kernel returns frequency in kHz, so to convert to nanosecond interval that Vulkan uses the dividend should be 1000000.0 and not 100000.0. This fixes the GPU graph in DOOM and matches the amdgpu-pro blob. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-14 23:15:55 +03:00
Rob Clark	9fc3e7137a	nir/print: add compute shader info Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-04-14 12:46:12 -04:00
Rob Clark	16d493f1e7	gallium/docs: small correction about register files for atomics These can operate on MEMORY[], in addition to BUFFER[] and IMAGE[] Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-14 12:46:12 -04:00
Rob Clark	0b613c20aa	freedreno: enable draw/batch reordering by default Probably should have flipped the switch a long time ago, since it doesn't seem to cause any problems and is a nice perf boost in a number of cases. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-14 12:46:12 -04:00
Rob Clark	b5cc88af5e	freedreno/ir3: small re-order Small re-order of switch statement to handle op-code categories in order. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-14 12:46:12 -04:00
Rob Clark	75afd2586f	freedreno/ir3: move 'keeps' to block level For things like SSBOs and atomics we'll want to track this at a block level. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-14 12:46:12 -04:00
Rob Clark	331bd3b5e1	freedreno/ir3: convert dynamic arrays to ralloc Want to move one of these under ir3_block, so that gives a reason to migrate the remaining malloc/realloc to ralloc. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-14 12:46:12 -04:00
George Kyriazis	870760e02e	swr: add linux to scons build Make swr compile for both linux and windows. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-04-14 10:59:46 -05:00
Bas Nieuwenhuizen	e20eb91e2b	radv: make sizes & offsets 32 bit in radv_descriptor_update_template_entry. v2: Also convert the calculations. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-04-14 14:14:07 +02:00
Kenneth Graunke	7c83d44d54	docs: Update MESA_shader_integer_functions spec to version 3. When publishing this spec on the OpenGL ES registry, Jon Leech noticed that it didn't actually mention what the ES dependencies and interactions were. I looked at extensions_table.h and noted that we expose it in ES 3.0 contexts, and he added the obvious spec texts. The updated copy also contains our official extension number. https://github.com/KhronosGroup/OpenGL-Registry/issues/3 Acked-by: Matt Turner <mattst88@gmail.com>	2017-04-13 23:01:27 -07:00
Bas Nieuwenhuizen	17a75b4da4	radv: Set descriptor set limits. Properly and with comments this time. Signed-off-by: Bas Nieuwenhuizen <bansi@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-13 22:55:11 +02:00
Bas Nieuwenhuizen	24ccf1a8b6	radv: Increase integer sizes in descriptor sets. Needed if we want to allow them taking more than 64 KiB. The calculations of these already used 32 bits. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-13 22:55:11 +02:00
Dave Airlie	58dd57cb94	radv: support S8_UINT as a depth/stencil format. This enables a bunch of NotSupported CTS tests. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-14 05:49:25 +10:00
Dave Airlie	16b2dc0ca1	radv: bump maxGeometryShaderInvocations. This bumps it to the same level as amdgpu-pro, it also moves a bunch of dEQP-VK.geometry.instanced.* from NotSupported to Pass. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-14 05:49:14 +10:00
Axel Davy	442780ea37	st/nine: Fix support for ps 1.4 dw and dz modifiers RCP was used incorrectly to support NINED3DSPSM_DW and NINED3DSPSM_DZ. src.x was used as input instead of src.w or src.z. Fixes: https://github.com/iXit/Mesa-3D/issues/271 Signed-off-by: Axel Davy <axel.davy@ens.fr> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-04-13 20:05:03 +02:00
Jan Vesely	d8ffe4d0ce	clover: Add missing include to compat header Fixes build failure with LLVM 4 Fixes: `a981e68c26` (clover: Fix build against clang SVN >= r299965) Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-13 13:34:21 -04:00
Nicolai Hähnle	b52721e3b6	gallium/radeon: never use staging buffers with AMD_pinned_memory Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-04-13 17:36:26 +02:00
Nicolai Hähnle	4f7e3fbb50	radeonsi: fix gl_BaseVertex in non-indexed draws gl_BaseVertex is supposed to be 0 in non-indexed draws. Unfortunately, the way they're implemented, the VGT always generates indices starting at 0, and the VS prolog adds the start index. There's a VGT_INDX_OFFSET register which causes the VGT to start at a driver-defined index. However, this register cannot be written from indirect draws. So fix this unlikely case by setting a bit to tell the VS whether the draw is indexed or not, so that gl_BaseVertex can be adjusted accordingly when used. Fixes a bug in KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters.* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:31:11 +02:00
Nicolai Hähnle	472c84d1ad	radeonsi: provide VS_STATE input to all VS variants v2: fix incorrect change in get_tcs_out_patch_stride Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:20 +02:00
Nicolai Hähnle	3b9fbcb3b6	radeonsi: change the bit-packing of LS out/TCS in data Avoid conflicts when merging various VS state bits. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:19 +02:00
Nicolai Hähnle	ff39f0d59c	radeonsi: emit VS_STATE register explicitly from si_draw_vbo We will merge other derived state information into this register. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:18 +02:00
Nicolai Hähnle	8c224d3d9f	radeonsi: extract derived tess state emit to higher level Especially with subsequent changes, this makes it easier to see the sequence of state emits at the higher level. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:17 +02:00
Nicolai Hähnle	215ceb37b9	radeonsi: drop support for TGSI_SEMANTIC_VERTEXID_NOBASE It is unused. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:11 +02:00
Bas Nieuwenhuizen	4f7fb25d4e	radv: Add more trace points. Most trace points happen after an operation, so add a trace point at the start of the command buffer. Furthermore, add one after a CmdUpdateBuffer using CP_DMA as that didn't emit one yet. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-13 16:06:47 +02:00
Bas Nieuwenhuizen	8a535a8bc0	radv: Ignore CmdUpdateBuffer with size 0. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-13 16:06:34 +02:00
Bas Nieuwenhuizen	04c7452d0c	radv: Enable query inheritance. timestamp and pipeline_statistics only do something on begin & end, so they don't need any action. Occlusion queries only do something to enable/disable and that register is set nowhere else so that doesn't need extra support either. (We technically should fix it to update the reg with the number of samples, but that hasn't happened yet, so we only change it to enable/disable counting) Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-13 16:04:27 +02:00
Bas Nieuwenhuizen	c3f38c8968	radv: enable variableMultisampleRate. This is only relevant with 0 attachments. In that case we do nothing on subpass switch already, and the pipeline is the authoritative source of the number of samples, so this shouldn't change anything. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-13 15:48:14 +02:00
Edmondo Tommasina	5589fd89e1	gallium/hud: set the dump file streams to line buffered Flush the HUD value streams to the dump files after every newline. v2: check that fopen succeeded (Julien) Reviewed-and-Tested-by: Julien Isorce <jisorce@oblong.com>	2017-04-13 12:38:49 +01:00
Dave Airlie	01d0c5a922	radv: fix stencil regression since new addrlib import The addrlib import meant we'd return after we attempted to setup the no stencil bits for an S8_UINT, now we break and use the stencil level info when creating stencil DB info. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-13 20:32:03 +10:00
Dave Airlie	4bcebe10ca	radv: allocate thin textures as linear. This is ported from radeonsi, and avoids the bug in the addrlib code. This should probably be something addrlib does for us, but for now this fixes the regression without changing addrlib and aligns us with radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-13 20:31:38 +10:00
Samuel Pitoiset	9ced105a52	i965: add missing ir_unop_/ir_binop_ in visit_leave() Fixes the following Clang warnings. brw_fs_channel_expressions.cpp:219:12: warning: enumeration values 'ir_unop_ballot', 'ir_unop_read_first_invocation', and 'ir_binop_read_invocation' not handled in switch [-Wswitch] switch (expr->operation) { ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-13 10:06:07 +02:00
Samuel Pitoiset	b6b566b48e	st/mesa: fix wrong comparison in update_framebuffer_state() state_tracker/st_atom_framebuffer.c:208:27: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare] if (framebuffer->width == UINT_MAX) ~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~ state_tracker/st_atom_framebuffer.c:210:28: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare] if (framebuffer->height == UINT_MAX) ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~ 2 warnings generated. Fixes: `eb0fd0e5f8` ("gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 10:06:06 +02:00
Samuel Pitoiset	a18bd1373b	radeon: fix duplicate 'const' specifier Fixes the following Clang warning. In file included from radeon_debug.c:32: ./radeon_common_context.h:500:19: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier] extern const char const *radeonVendorString; v2: - do not remove the duplicate 'const' qualifier, fix it Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-13 10:06:06 +02:00
Samuel Pitoiset	ede273458c	svga: remove unused vmw_dri1_intersect_src_bbox() Fixes the following Clang warning. vmw_screen_dri.c:130:1: warning: unused function 'vmw_dri1_intersect_src_bbox' [-Wunused-function] vmw_dri1_intersect_src_bbox(struct drm_clip_rect *dst, ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 10:06:05 +02:00
Samuel Pitoiset	fbe2ff7740	llvmpipe: remove unused subpixel_snap() and fixed_to_float() Fixes the following Clang warnings. lp_setup_tri.c:55:1: warning: unused function 'subpixel_snap' [-Wunused-function] subpixel_snap(float a) ^ lp_setup_tri.c:61:1: warning: unused function 'fixed_to_float' [-Wunused-function] fixed_to_float(int a) ^ v2: - do not remove subpixel_snap() (use !PIPE_ARCH_SSE instead) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-04-13 10:06:05 +02:00
Samuel Pitoiset	12647533fa	softpipe: remove unused sp_exec_fragment_shader() Fixes the following Clang warning. sp_fs_exec.c:56:1: warning: unused function 'sp_exec_fragment_shader' [-Wunused-function] sp_exec_fragment_shader(const struct sp_fragment_shader_variant *var) ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 10:06:04 +02:00
Samuel Pitoiset	5fbe99ce9f	softpipe: remove unused quad_shade_stage() Fixes the following Clang warning. sp_quad_fs.c:60:1: warning: unused function 'quad_shade_stage' [-Wunused-function] quad_shade_stage(struct quad_stage *qs) ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 10:06:04 +02:00
Samuel Pitoiset	b885488c22	softpipe: remove unused get_texel_quad_2d() Fixes the following Clang warning. sp_tex_sample.c:802:1: warning: unused function 'get_texel_quad_2d' [-Wunused-function] get_texel_quad_2d(const struct sp_sampler_view *sp_sview, ^ CC sp_tile_cache.lo 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 10:06:04 +02:00
Samuel Pitoiset	81ba57f463	trace: remove some unused trace_dump_tag() functions Fixes the following Clang warnings. tr_dump.c:137:1: warning: unused function 'trace_dump_tag' [-Wunused-function] trace_dump_tag(const char *name) ^ tr_dump.c:168:1: warning: unused function 'trace_dump_tag_begin2' [-Wunused-function] trace_dump_tag_begin2(const char *name, ^ tr_dump.c:187:1: warning: unused function 'trace_dump_tag_begin3' [-Wunused-function] trace_dump_tag_begin3(const char *name, ^ CC tr_texture.lo 3 warnings generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 10:06:04 +02:00
Samuel Pitoiset	c53a120a46	draw: remove unused wideline_stage() Fixes the following Clang warning. draw/draw_pipe_wide_line.c:48:38: warning: unused function 'wideline_stage' [-Wunused-function] static inline struct wideline_stage *wideline_stage( struct draw_stage *stage ) ^ 1 warning generated. v2: - remove commented code (Roland Scheidegger) v3: - remove half_line_width in the struct Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-04-13 10:05:59 +02:00
Samuel Pitoiset	4dfe38aa9c	draw: remove unused overflow() Fixes the following Clang warning. draw/draw_pipe_vbuf.c:102:1: warning: unused function 'overflow' [-Wunused-function] overflow( void *map, void *ptr, unsigned bytes, unsigned bufsz ) ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 09:58:52 +02:00
Samuel Pitoiset	18844005ec	mesa: remove some unused functions in the perf monitor area Fixes the following Clang warnings. main/performance_monitor.c:157:1: warning: unused function 'index_to_queryid' [-Wunused-function] index_to_queryid(GLuint index) ^ main/performance_monitor.c:163:1: warning: unused function 'queryid_valid' [-Wunused-function] queryid_valid(const struct gl_context *ctx, GLuint queryid) ^ main/performance_monitor.c:169:1: warning: unused function 'counterid_to_index' [-Wunused-function] counterid_to_index(GLuint counterid) ^ 3 warnings generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 09:58:24 +02:00
Samuel Pitoiset	df2dba558c	mesa: remove unused clamp_float_to_uint() and clamp_half_to_uint() Fixes the following Clang warnings. main/pack.c:470:1: warning: unused function 'clamp_float_to_uint' [-Wunused-function] clamp_float_to_uint(GLfloat f) ^ main/pack.c:477:1: warning: unused function 'clamp_half_to_uint' [-Wunused-function] clamp_half_to_uint(GLhalfARB h) ^ 2 warnings generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 09:58:24 +02:00
Samuel Pitoiset	bdb53e240b	mesa: remove unused _mesa_unmarshal_BindBufferBase() Fixes the following Clang warning. main/marshal.c:209:1: warning: unused function '_mesa_unmarshal_BindBufferBase' [-Wunused-function] _mesa_unmarshal_BindBufferBase(struct gl_context *ctx, const struct marshal_cmd_BindBufferBase *cmd) ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-13 09:58:19 +02:00
Samuel Pitoiset	b3375800d7	virgl: add missing PIPE_CAP_DOUBLES Fixes the following Clang warning. virgl_screen.c:60:12: warning: enumeration value 'PIPE_CAP_DOUBLES' not handled in switch [-Wswitch] switch (param) { ^ 1 warning generated. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-13 09:58:05 +02:00
Samuel Pitoiset	d5cd4990cd	glsl: simplify apply_image_qualifier_to_variable() This removes one level of indentation and will improve readability for bindless images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-13 09:52:55 +02:00
Samuel Pitoiset	6bb0f75bb6	glsl: add validate_fragment_flat_interpolation_input() Requested by Timothy Arceri. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-13 09:52:48 +02:00
Boyan Ding	d02829c94e	nvc0: Enable ARB_shader_ballot on Kepler+ readInvocationARB() and readFirstInvocationARB() need SHFL.IDX instruction which is introduced in Kepler. Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:25:17 -04:00
Boyan Ding	59f6aa8096	nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_* v2: Check if each channel is masked in TGSI_OPCODE_BALLOT (Ilia Mirkin) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:25:14 -04:00
Boyan Ding	48d00779d0	nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_* Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:25:08 -04:00
Boyan Ding	f7787f224f	nvc0/ir: Add SV_LANEMASK_* system values. v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:25:04 -04:00
Boyan Ding	2a3c4c6bc3	nvc0/ir: Allow 0/1 immediate value as source of OP_VOTE Implementation of readFirstInvocationARB() on nvidia hardware needs a ballotARB(true) used to decide the first active thread. This is expressed in gm107 asm as (supposing output is $r0): vote any $r0 0x1 0x1 To model the always true input, which corresponds to the second 0x1 above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and "not 0x1" in the src field respectively. v2: Make sure that asImm() is not NULL (Samuel Pitoiset) v3: (Ilia Mirkin) Make the handling more symmetric with predicate version in gm107 Use i->getSrc(s) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:24:59 -04:00
Boyan Ding	f1252996f5	gk110/ir: Emit OP_SHFL v2: Make sure that asImm() is not NULL (Samuel Pitoiset) v3: Check the range of immediate in OP_SHFL (Ilia Mirkin) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:24:55 -04:00
Boyan Ding	c32e150008	nvc0/ir: Emit OP_SHFL v2: (Samuel Pitoiset) Add an assertion to check if the target is Kepler Make sure that asImm() is not NULL v3: (Ilia Mirkin) Check the range of immediate value of OP_SHFL Use the new setPDSTL API Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:24:52 -04:00
Boyan Ding	d941ef3829	nvc0/ir: Properly handle a "split form" of predicate destination GF100's ISA encoding has a weird form of predicate destination where its 3 bits are split across the whole instruction. Use a dedicated setPDSTL function instead of original defId which is incorrect in this case. v2: (Ilia Mirkin) Change API of setPDSTL() to handle cases of no output Fix setting of the highest bit in setPDSTL() Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:24:47 -04:00
Boyan Ding	854554c314	gm107/ir: Emit third src 'bound' and optional predicate output of SHFL v2: Emit the original hard-coded 0x1c03 when OP_SHFL is used in gm107's lowering (Samuel Pitoiset) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-13 02:24:30 -04:00
Michel Dänzer	a981e68c26	clover: Fix build against clang SVN >= r299965 clang::LangAS::Offset is gone, the behaviour is as if it was 0. v2: Introduce and use clover::llvm::compat::lang_as_offset (Francisco Jerez) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-04-13 12:51:24 +09:00
Brian Paul	46f49d6fdc	st/mesa: add some _mesa_is_winsys_fbo() assertions A few functions related to FBOs/renderbuffers should only be used with window-system buffers, not user-created FBOs. Assert for that. Add additional comments. No piglit regressions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-12 21:13:23 -06:00
Brian Paul	c36d224921	st/mesa: minor optimization in st_DrawBuffers() We only do on-demand renderbuffer allocation for window-system FBOs, not user-created FBOs. So put the loop inside a conditional. Plus, add some comments. No piglit regressions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-12 21:13:23 -06:00
Timothy Arceri	fbcd709a34	mesa/st: only update samplers for stages that have changed Might help reduce cpu for some apps that use sso. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 12:08:31 +10:00
Vinson Lee	f30f575e7b	st/mesa: Fix missing-braces warning. CXX state_tracker/st_glsl_to_nir.lo state_tracker/st_glsl_to_nir.cpp:250:57: warning: suggest braces around initialization of subobject [-Wmissing-braces] nir_lower_wpos_ytransform_options wpos_options = {0}; ^ {} Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-12 15:43:30 -07:00
Alex Smith	4603bea1aa	radv: Disable primitive restart for non-indexed draws According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's primitiveRestartEnable flag should only apply to indexed draws, however it was being enabled regardless of the type of draw. This could cause problems for non-indexed draws with >=65535 vertices if the previous indexed draw used 16-bit indices. Fixes corruption of the credits text in Mad Max. v2: Reset primitive restart state after executing a secondary command buffer. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-12 20:58:41 +02:00
Matt Turner	ab18578b03	anv: Only define wsi_cbs when VK_USE_PLATFORM_WAYLAND_KHR defined	2017-04-12 11:00:39 -07:00
Marek Olšák	f7b1371d2d	Revert "r600g: get rid of dummy pixel shader" This reverts commit `61e47d92c5`. It causes a hang on RS780. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100663	2017-04-12 17:46:21 +02:00
Bartosz Tomczyk	bb847e78cf	mesa: fix memory leak in arb_fragment_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-12 17:50:36 +10:00
Bas Nieuwenhuizen	c4d43388c0	radv: Hash the immutable samplers. Since the shader code can include them. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-12 07:43:38 +02:00
Bas Nieuwenhuizen	bd91caf863	radv: Use an offset instead of pointers for immutable samplers. Makes more sense when we hash the layout for the pipeline cache. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-12 07:43:25 +02:00
Bas Nieuwenhuizen	b35b5951fc	radv: Stop shadowing the result in radv_GetQueryPoolResults. The outer result was referred to, which meant bugs. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen	0763453291	radv: Return VK_NOT_READY if the query results are not available. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `8475a14302` ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen	2dacb727c2	radv: Set query availability bit even if we don't wait. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `8475a14302` ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-04-12 07:38:58 +02:00
Gregory Hainaut	03d1de387e	mesa: avoid NULL ptr in prog parameter name Context: _mesa_add_parameter is sometimes[0] called with a NULL name as a means of indicating an unnamed parameter. Allowing a NULL pointer as a name means that it must be NULL checked on each access. So far it isn't always[1] true. Parameter name is only used for debug purposes (printf) and to look up the index/location of the program by the application. Conclusion, there is no valid reason to use a NULL pointer instead of an empty string. So it was decided to use an empty string, which avoids all issues related to NULL pointers [0]: texture gather offsets glsl opcode and st_init_atifs_prog [1]: at least shader cache, st_nir_lookup_parameter_index and some printfs Issue found by piglit 'texturegatheroffsets' tests on Nouveau v4: new patch based on Nicolai/Timothy/ilia discussion Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-12 14:30:28 +10:00
Kenneth Graunke	754b961f38	i965/drm: Use bools for a few flags. These one bit values are booleans. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	44ecbbebe2	i965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit. unsigned long is a terrible type for a bitfield - if you need fewer than 32 bits, it wastes 4 bytes. If you need more, things break on 32-bit builds. Just use unsigned. Even that's a bit ridiculous as we only have one flag today. Still, it's at least somewhat better. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	f374b9449e	i965/drm: Make BO size a uint64_t rather than unsigned long. The drm_i915_gem_create ioctl structure uses a __u64 for the size, so we should probably use uint64_t to match. In theory, we could probably have a BO larger than 4GB, using a 48-bit PPGTT - it just wouldn't be mappable in the CPU's 32-bit address space. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	c85d6832fd	i965/drm: Make alignment parameter a uint64_t. Theoretically, with a 48-bit address space, we could have buffers with an alignment of >= 4GB. It's a bit silly, but the exec_object structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may as well use the same type as the kernel API. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	444ab8126d	i965/drm: Make stride/pitch a uint32_t. struct drm_i915_gem_set_tiling's stride field is a __u32. intel_mipmap_tree::stride is a uint32_t. Using unsigned long just doesn't make sense. Switching also lets us drop many pointless locals that only existed to deal with the type mismatch. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	14fc188460	i965/drm: Fix types for pwrite/pread fields. The ioctl structs contain __u64 offset and size fields, so make them uint64_t rather than unsigned long. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	193601311c	i965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer. For some reason we passed tiling by pointer, through several layers, even though the functions only read the initial value, and never actually change it. We even had a do-while loop that executed until the tiling mode matched - except it always did, so it only ran once. We then had bogus error handling in case it changed the tiling mode to something nonsensical...which it never did. Drop all this nonsense. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Timothy Arceri	9bd7184078	mesa/st: remove _mesa_get_fallback_texture() calls These calls look like leftover from fallback texture support first being added to the st in `8f6d9e12be` and then later being added to core mesa in `00e203fe17`. The piglit test fp-incomplete-tex continues to work with this change. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-12 12:00:35 +10:00
Timothy Arceri	c72170fb1f	mesa: use pre_hashed version of search for the mesa hash table The key is just an unsigned int so there is never any real hashing done. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-04-12 12:00:35 +10:00
Tim Rowley	d0f381f865	swr: [rasterizer core] Disable 8x2 tile backend Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	31a23a9d9d	swr: [rasterizer common] Add _simd_testz_si alias Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	7abd1f9b24	swr: [rasterizer archrast] Fix archrast for MSVC 2017 compiler Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	54d11b3c95	swr: [rasterizer jitter] Remove unused function Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	af909c0200	swr: [rasterizer jitter] Remove HAVE_LLVM tests supporting llvm < 3.8 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	973d38801d	swr: [rasterizer common/core] Fix 32-bit windows build Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	217b791a44	swr: [rasterizer core] Fix unused variable warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	da7aa39f93	swr: [rasterizer core] Code formating change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	c8cc07ca25	swr: [rasterizer core] SIMD16 Frontend WIP - PA Fix PA NextPrim for SIMD8 on SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	08a7136848	swr: [rasterizer core] SIMD16 Frontend WIP - Clipper Implement widened clipper for SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	0033e86b2c	swr: [rasterizer core] Multisample sample position setup change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	4c093869db	swr: [rasterizer core] Reduce templates to speed compile Quick patch to remove some unused template params to cut down rasterizer compile time. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Francisco Jerez	147e71242c	i965/fs: Take into account lower frequency of conditional blocks in spilling cost heuristic. The individual branches of an if/else/endif construct will be executed some unknown number of times between 0 and 1 relative to the parent block. Use some factor in between as weight while approximating the cost of spill/fill instructions within a conditional if-else branch. This favors spilling registers used within conditional branches which are likely to be executed less frequently than registers used at the top level. Improves the framerate of the SynMark2 OglCSDof benchmark by ~1.9x on my SKL GT4e. Should have a comparable effect on other platforms. No significant regressions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-11 15:28:54 -07:00
Tim Rowley	9a7b257450	swr: return true for PIPE_CAP_DOUBLES Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-11 13:16:43 -05:00
Kenneth Graunke	02ccd8f52c	i965: Set kernel features before computing max GL version. We check these bitfields when computing the Haswell max GL version. We need to set them ahead of time, or they won't exist, and all our checks will fail. That sets the max core profile GL version to 4.2. This introduces the bizarre situation where asking for a GL context with version 4.3+ fails, but asking for a GL core profile context with version <= 4.2 actually promotes you to a 4.5 context. GLX_MESA_query_renderer also reported the bogus 4.2 value. Now it shows 4.5. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>	2017-04-11 08:58:16 -07:00
Juan A. Suarez Romero	8d7a82ae32	anv: remove needless VALGRIND_MAKE_MEM_DEFINED This is already invoked in the following VG_NOACCESS_READ() call. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-11 17:21:57 +02:00
Lucas Stach	4ee7c2c284	etnaviv: enable TS, but disable autodisable Autodisable seems to cause missed rendering in some cases, but otherwise TS seems to work properly. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:31 +02:00
Lucas Stach	797890bbbd	etnaviv: enable TS also on sampler resources Fixes a performance issue with imported winsys buffers as those are marked with binding sampler view. This might require a TS flush on single pipe chips that directly sample from the rendered buffer, but otherwise seems to work fine. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:27 +02:00
Lucas Stach	52f6c8cc31	etnaviv: align TS surface size to number of pixel pipes The TS surface gets cleared by a tiled RS fill. If the chip has more than 1 pixel pipe the size of the TS surface needs to be aligned so that each pipe address matches a tile start, otherwise the RS will hang. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:22 +02:00
Lucas Stach	37622ecc79	etnaviv: avoid using invalid TS The TS is only valid after it has been initialized by a fast clear, so it should not be taken into account when blitting resources that haven't been cleared. Also the blit itself invalidates the destination TS, as it's not updated and will retain data from the previous rendering after the blit. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:01 +02:00
Samuel Pitoiset	768f81b62b	glsl: use the BA1 macro for textureQueryLevels() For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-11 10:24:57 +02:00
Samuel Pitoiset	981ba1c89b	glsl: use the BA1 macro for textureSamples() For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-11 10:24:54 +02:00
Samuel Pitoiset	29082b0b22	glsl: use the BA1 macro for textureCubeArrayShadow() For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-11 10:24:51 +02:00
Bas Nieuwenhuizen	8475a14302	radv: Implement pipeline statistics queries. The devil is in the shader again, otherwise this is fairly straightforward. The CTS contains no pipeline statistics copy to buffer testcases, so I did a basic smoketest. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	d2906bc72d	radv: Let count be dynamic in radv_break_on_count. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	8473193760	radv: Rename query pipeline/set layout. For using them with both occlusion and pipeline statistics queries. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	95743d5b88	radv: Use VK_WHOLE_SIZE for the query buffer bindings. The buffer sizes are specified just a few lines earlier, so don't repeat ourselves. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	8911dd6d12	radv: Use a shader for occlusion CmdCopyQueryPoolResults. Use the new occlusion query copy shader. We don't use the shader for the waiting as a polling loop interacts badly with having caching enabled. I noticed on my GPU (Tonga) that the values are written out in order, so I just use a WAIT_REG_MEM on the last value. If it turns out other chips don't do that we may need to look a bit more into this. Having 8 WAIT_REG_MEM packets per query doesn't sound ideal. This also restricts the availability word in the pool to timestamp queries only, as occlusion queries don't use it, and pipeline statistic queries likely won't either. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	ce0c8cf941	radv: Add occlusion query shader. Adds a shader for writing occlusion query results to a buffer, as the CP packet isn't supported on SI or secondary buffers, and doesn't handle the availability bit (or partial results) nor truncation to 32-bit. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Kenneth Graunke	50b987c0f0	i965: Fix wonky indentation left by brw_bo_alloc_tiled rename.	2017-04-10 23:25:13 -07:00
Ilia Mirkin	d9cc58d6ec	nouveau: when mapping a persistent buffer, synchronize on former xfers If the buffer is being used, we should wait for those uses to be complete before returning the map. Fixes: GL45-CTS.direct_state_access.buffers_functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-04-11 00:13:55 -04:00
Ilia Mirkin	8036809799	nvc0: increase texture buffer object alignment to 256 for pre-GM107 We currently don't pass the low byte of the address via the surface info, so in order to work with images, these have to implicitly be aligned to 256. The proprietary driver also doesn't go out of its way to provide lower alignment. Fixes GL45-CTS.texture_buffer.texture_buffer_texture_buffer_range Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-11 00:13:55 -04:00
Timothy Arceri	8ffd54fef8	mesa: fix typo and add assert() to _mesa_attach_renderbuffer_without_ref() This function should only be used with a "freshly created" renderbuffer so assert RefCount is 1.	2017-04-11 09:57:45 +10:00
Kenneth Graunke	bd84252be6	i965/drm: Add stall warnings when mapping or waiting on BOs. This restores the performance warnings removed in: i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings. but adds them for nearly all BO mapping, and also for wait_rendering. Because we add this to the core bufmgr, we automatically get stall warnings in all callers, unlike before where only a few callsites used the wrappers that gave stall warnings. We also do it a bit differently: we simply measure how long set_domain takes (the part that stalls), and complain if it's more than 0.01 ms. We don't bother calling brw_bo_busy(), and we don't measure the mmap time (which doesn't stall). This should be more accurate. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-04-10 14:33:18 -07:00
Kenneth Graunke	f053ee78ed	i965/drm: Make a set_domain() helper function. Less boilerplate. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-04-10 14:33:18 -07:00
Daniel Vetter	a99a4979fd	i965/batch: Ensure we use a consistent offset in relocs In theory gcc is free to re-load them, and if a concurrent execbuf races and updates bo->offset64 then we have a problem: execbuffer api requires that the ->presumed_offset and the one we used for the reloc match. It does not require that the value is sensible, which means no locks needed, just a consistent load. Ken said his next series will nuke this, so just hand-roll the kernel's READ_ONCE idea inline. FIXME: Most callers of brw_emit_reloc recompute the relocation themselves, which means this doesn't really fix the race. But the long term plan is to move to per-context relocation handling, which will fix this all properly. So leave this for now as just a reminder. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:18 -07:00
Daniel Vetter	7f3c85c21e	i965/bufmgr: Garbage-collect vma cache/pruning This was done because the kernel has 1 global address space, shared with all render clients, for gtt mmap offsets, and that address space was only 32bit on 32bit kernels. This was fixed in commit 440fd5283a87345cdd4237bdf45fb01130ea0056 Author: Thierry Reding <treding@nvidia.com> Date: Fri Jan 23 09:05:06 2015 +0100 drm/mm: Support 4 GiB and larger ranges which shipped in 4.0. Of course you still want to limit the bo cache to a reasonable size on 32bit apps to avoid ENOMEM, but that's better solved by tuning the cache a bit. On 64bit, this was never an issue. On top, mesa never set this, so it's all dead code. Collect and trash it. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:18 -07:00
Daniel Vetter	1f965d3f7a	i965/bufmgr: Remove some reuse functions is_reusable was needed by uxa because it couldn't keep track of its scanout buffers and used this as a proxy. Disabling reuse is a silly idea, we set this once at start. Remove both. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:18 -07:00
Daniel Vetter	edd85c1f04	i965/bufmgr: remove start_gtt_access Iirc this was used by uxa for persistent mmaps of the frontbuffer. For mesa all the set_domain stuff needed before a synchronized mmap is handled within the bufmgr, so no reason ever to call this. Inline the implementation into its only internal user. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:17 -07:00
Daniel Vetter	439edaa4b5	i965/bufmgr: Delete set_tiling Entirely unused, and really shouldn't be used. The alloc functions already take care of this. And even in a future where we're not going to h/v-align tiled buffers in the bufmgr, but only in isl, I think we still want to adjust the tiling mode in the bufmgr, since that ties in closely to mmaps and stuff like that. get_tiling is still needed for the import paths (until we have modifiers everywhere). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:17 -07:00
Daniel Vetter	6308121475	i965/bufmgr: Delete alloc_for_render Entirely unused, mesa instead used the BO_ALLOC_FOR_RENDER flag. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:14 -07:00
Kenneth Graunke	538fa87f40	i965/drm: Use list_for_each_entry_safe in a couple of cases. Suggested by Chris Wilson. A tiny bit simpler. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-04-10 14:33:12 -07:00
Kenneth Graunke	10929da5fb	i965/drm: Rename intel_bufmgr_gem.c to brw_bufmgr.c. Matches the class name and the header file name. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:32 -07:00
Kenneth Graunke	7aa66e64fe	i965/drm: Reindent intel_bufmgr_gem.c and brw_bufmgr.h. indent -i3 -nut -br -brs -npcs -ce --no-tabs -Tuint32_t -Tuint64_t plus some manual fixes because those aren't quite the right settings. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:30 -07:00
Kenneth Graunke	d30a92738c	i965/drm: Rename drm_bacon_bo to brw_bo. The bacon is all gone. This renames both the class and the related functions. We're about to run indent on the bufmgr code, so no need to worry about fixing bad indentation. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:28 -07:00
Kenneth Graunke	e0d15e9769	i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings. The stupid reason for eliminating these functions is that I'm about to rename drm_bacon_bo_map() to brw_bo_map(), which makes the real function have the short name, rather than the wrapper. I'm also planning on reworking our mapping code soon, so we use WC mappings and proper unsynchronized mappings on non-LLC platforms. It will be easier to do that without thinking about the stall warnings and wrappers. My eventual hope is to put the performance warnings in the BO map function itself, so all callers gain the warning. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:25 -07:00
Kenneth Graunke	dfd81373b6	i965/drm: Rename drm_bacon_reg_read() to brw_reg_read(). Less bacon. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:24 -07:00
Kenneth Graunke	662a733dbc	i965/drm: Rename drm_bacon_bufmgr to struct brw_bufmgr. Also stop using typedefs, per Mesa coding style. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:21 -07:00
Kenneth Graunke	f5216b25e0	i965: Just use a uint32_t context handle rather than a malloc'd wrapper. drm_bacon_context is a malloc'd struct containing a uint32_t context ID and a pointer back to the bufmgr. The bufmgr pointer is pretty useless, as everybody already has brw->bufmgr. At that point...we may as well just use the ctx_id handle directly. A number of places already had to call drm_bacon_gem_context_get_id() to extract the ID anyway. Now they just have it. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:20 -07:00
Kenneth Graunke	4cb3e4429d	i965/drm: Fold drm_bacon_gem_reset_stats into the callers. We're going to get rid of drm_bacon_context shortly, so we'd have to change the interface slightly. It's basically just an ioctl wrapper that isn't terribly bufmgr-related, so we may as well just combine it with the code in brw_reset.c that actually uses it. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:19 -07:00
Kenneth Graunke	414c9343a2	i965/drm: Rename drm_bacon_gem_bo_bucket to bo_cache_bucket. No need for a prefix as this struct is local to the .c file. Less bacon. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:17 -07:00
Kenneth Graunke	e46b74d1b5	i965/drm: Drop drm_bacon_* from static functions. Mesa style is to not use lengthy prefixes for static functions. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:16 -07:00
Kenneth Graunke	13596ecb6b	i965/drm: Drop drm_bacon_gem_bo_madvise_internal(). The only difference is that it takes an explicit bufmgr rather than using bo->bufmgr, but there is only one bufmgr per screen so they should be identical anyway. Chris says this was added primarily to avoid bo/bo_gem casting, which was inconvenient. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:15 -07:00
Kenneth Graunke	9ee252865e	i965/drm: Merge drm_bacon_bo_gem into drm_bacon_bo. The separate class gives us a bit of extra encapsulation, but I don't know that it's really worth the boilerplate. I think we can reasonably expect the rest of the driver to be responsible. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:14 -07:00
Kenneth Graunke	59fdd94b85	i965/drm: Merge bo->handle and bo_gem->gem_handle. These fields are the same value. In the bad old days, bo->handle could have been an identifier from the pre-GEM fake bufmgr, but that's long gone. Keep the "gem_handle" name for clarity. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:08 -07:00
Kenneth Graunke	eb41aa82c4	i965/drm: Rewrite relocation handling. The execbuf2 kernel API requires us to construct two kinds of lists. First is a "validation list" (struct drm_i915_gem_exec_object2[]) containing each BO referenced by the batch. (The batch buffer itself must be the last entry in this list.) Each validation list entry contains a pointer to the second kind of list: a relocation list. The relocation list contains information about pointers to BOs that the kernel may need to patch up if it relocates objects within the VMA. This is a very general mechanism, allowing every BO to contain pointers to other BOs. libdrm_intel models this by giving each drm_intel_bo a list of relocations to other BOs. Together, these form "reloc trees". Processing relocations involves a depth-first-search of the relocation trees, starting from the batch buffer. Care has to be taken not to double-visit buffers. Creating the validation list has to be deferred until the last minute, after all relocations are emitted, so we have the full tree present. Calculating the amount of aperture space required to pin those BOs also involves tree walking, which is expensive, so libdrm has hacks to try and perform less expensive estimates. For some reason, it also stored the validation list in the global (per-screen) bufmgr structure, rather than as a local variable in the execbuffer function, requiring locking for no good reason. It also assumed that the batch would probably contain a relocation every 2 DWords - which is absurdly high - and simply aborted if there were more relocations than the max. This meant the first relocation from a BO would allocate 180kB of data structures! This is way too complicated for our needs. i965 only emits relocations from the batchbuffer - all GPU commands and state such as SURFACE_STATE live in the batch BO. No other buffer uses relocations. This means we can have a single relocation list for the batchbuffer.
We can add a BO to the validation list (set) the first time we emit a relocation to it. We can easily keep a running tally of the aperture space required for that list by adding the BO size when we add it to the validation list. This patch overhauls the relocation system to do exactly that. There are many nice benefits: - We have a flat relocation list instead of trees. - We can produce the validation list up front. - We can allocate smaller arrays and dynamically grow them. - Aperture space checks are now (a + b <= c) instead of a tree walk. - brw_batch_references() is a trivial validation list walk. It should be straightforward to make it O(1) in the future. - We don't need to bloat each drm_bacon_bo with 32B of reloc data. - We don't need to lock in execbuffer, as the data structures are context-local, and not per-screen. - Significantly less code and a better match for what we're doing. - The simpler system should make it easier to take advantage of I915_EXEC_NO_RELOC in a future patch. Improves performance in Synmark 7.0's OglBatch7: - Skylake GT4e: 12.1499% +/- 2.29531% (n=130) - Apollolake: 3.89245% +/- 0.598945% (n=35) Improves performance in GFXBench4's gl_driver2 test: - Skylake GT4e: 3.18616% +/- 0.867791% (n=229) - Apollolake: 4.1776% +/- 0.240847% (n=120) v2: Feedback from Chris Wilson: - Omit explicit zero initializers for garbage execbuf fields. - Use .rsvd1 = ctx_id rather than i915_execbuffer2_set_context_id - Drop unnecessary fencing assertions. - Only use _WR variant of execbuf ioctl when necessary. - Shrink the arrays to be smaller by default. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:32:00 -07:00
Kenneth Graunke	e7ab0ea5e7	i965/drm: Make register write check handle execbuffer directly. I'm about to rewrite how relocation handling works, at which point drm_bacon_bo_emit_reloc() and drm_bacon_bo_mrb_exec() won't exist anymore. This code is already largely not using the batchbuffer infrastructure, so just go all the way and handle relocations, the validation list, and execbuffer ourselves. That way, we don't have to think about the weird case where we only have a screen, and no context, when redesigning the relocation handling. v2: Write reloc.presumed_offset + reloc.delta into the batch, rather than duplicating the comment, so it's obvious that they match (suggested by Chris). Also add a comment about why we don't do any error checking. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:56 -07:00
Kenneth Graunke	6368284a34	i965: Make a screen::aperture_threshold field. This is the threshold after which drm_intel_bufmgr_check_aperture_space returns -ENOSPC, signalling that it thinks an execbuf is likely to fail and we need to roll back and flush the batch. We'll need this when we rewrite aperture space checking, shortly. In the meantime, we can also use it in GLX_MESA_query_renderer. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:55 -07:00
Kenneth Graunke	6079f4f16e	i965: Make/use a brw_batch_references() wrapper. We'll want to change the implementation of this shortly. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:54 -07:00
Kenneth Graunke	6537a3ca11	i965: Use brw_emit_reloc() instead of drm_bacon_bo_emit_reloc(). I'm about to make brw_emit_reloc do actual work, so everybody needs to start using it and not the raw drm_bacon function. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:52 -07:00
Kenneth Graunke	eadd5d1b51	i965: Change intel_batchbuffer_reloc() into brw_emit_reloc(). This renames intel_batchbuffer_reloc to brw_emit_reloc and changes the parameter naming and ordering to match drm_intel_bo_emit_reloc(). For now, it's a trivial wrapper that accesses batch->bo. When we rework relocations, it will start doing actual work. target_offset should be expanded to a uint64_t to match the kernel, but for now we leave it as its original 32-bit type. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:51 -07:00
Kenneth Graunke	fbb3297165	i965/drm: Drop GEM_SW_FINISH stuff. This is only useful when doing an incoherent CPU mapping of the current scanout buffer. That's a terrible plan, so we never do it. We always use an uncached GTT map. So, this is useless. Drop the code. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:49 -07:00
Kenneth Graunke	80761a42e0	i965/drm: Drop code to search for an existing bufmgr. This functionality was added by libdrm commit 743af59669386cb6e063fa4bd85f0a0b2da86295 (intel: make bufmgr_gem shareable from different API) in an attempt to solve libva/mesa buffer sharing problems. Specifically, this was working around an issue hit by Chromium, which used the same drm_fd for multiple APIs, and shared buffers between them. This code attempted to work around that issue by using the same bufmgr for both libva and Mesa. It worked because libdrm_intel was loaded by both libraries. However, now that Mesa has forked, we don't have a common library, and this code cannot work. The correct solution is to have each API open its own file descriptor (and get a corresponding buffer manager), and then use PRIME export and import to share BOs across those APIs. Then the kernel can manage those shared resources. According to Chris, the kernel will pass back the same handle for a prime FD if the lookup is from the same device FD. We believe Chromium has since moved to this model. In Mesa, there is already only one screen per FD, and so there will only be one bufmgr per FD. We don't need any of this code. v2: Add a big warning comment written by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:48 -07:00
Kenneth Graunke	b666654201	i965/drm: Unwrap the unnecessary drm_bacon_reloc_target_info struct. This used to have another field, but now it's just a BO pointer. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:46 -07:00
Kenneth Graunke	2662894baa	i965/drm: Switch from uthash to Mesa's hash table. No performance data has been gathered about this choice. I just don't want that many hash tables. Chris points out that this is not performance critical - we should not be recreating that many handles from scratch. In the past we used a linear list, which became unreasonable in stress tests that used hundreds of thousands of BOs. In real usage, it shouldn't matter that much. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:45 -07:00
Kenneth Graunke	ad1b1cce44	i965/drm: Drop bo_gem::kflags. It's always zero now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:43 -07:00
Kenneth Graunke	a972c903cb	i965/drm: Drop has_exec_async related API. Mesa doesn't use this yet. We'll almost certainly want to, but we can add the functionality back after we clean up the messy drm code. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:42 -07:00
Kenneth Graunke	d606f64e2d	i965/drm: Drop softpin support for now. We may want this eventually, but simplify for now. We can add it back later when we actually intend to use it. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:41 -07:00
Kenneth Graunke	0314eed3b1	i965/drm: Drop userptr support for now. We'll want userptr support for GL_AMD_pinned_memory support someday, and possibly some other upload optimizations. Chris says "not in this form" though. Drop it and simplify for now - we can add it back later when we're ready to hook it up fully. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:39 -07:00
Kenneth Graunke	a460e1eb51	i965/drm: Delete engine checks. This is basically handholding to prevent a bogus caller from trying to execbuffer on a bogus engine. i965 already does this correctly. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:37 -07:00
Kenneth Graunke	1dc02da6d7	i965/drm: Drop intel_chipset.h in favor of using gen_device_info. This moves the PCI ID detection to intel_screen.c and makes drm_bacon_bufmgr_gem_init() take a devinfo pointer. We also drop the HAS_LLC query stuff - devinfo has that info already, without kernel queries, and it makes no sense to have two has_llc flags set by different mechanisms. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:36 -07:00
Kenneth Graunke	55ee8f36a8	i965/drm: Drop deprecated drm_bacon_bo::offset. This field was the wrong size, so we replaced it with offset64. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:35 -07:00
Kenneth Graunke	a29fb9b2ee	i965/drm: Drop has_wait_timeout. The wait-ioctl was introduced in kernel v3.6 (20120930) and that is our current minimum requirement for screen creation. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:33 -07:00
Kenneth Graunke	b97bcf3b6b	i965/drm: Assume aperture size query will work. This query has been available since 2.6.28. We require 3.6. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:32 -07:00
Kenneth Graunke	c28691ab77	i965/drm: Combine drm_bacon_bufmgr_gem and drm_bacon_bufmgr classes. The distinction was required when the bufmgr was virtualised. Now that there is only one class, we no longer need the distraction of pretending it is a subclass. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:31 -07:00
Kenneth Graunke	3673b89bf3	i965/drm: Move _drm_bacon_context to intel_bufmgr_gem.c. This moves us one step closer to killing off intel_bufmgr_priv.h. We might want to nuke it altogether, since it's basically just a uint32_t handle, but for now, let's focus on removing files. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:29 -07:00
Kenneth Graunke	b0d1c5983b	i965/drm: Drop cliprects and dr4 from execbuf variants. Legacy DRI1 leftovers. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:28 -07:00
Kenneth Graunke	2c257ff226	i965/drm: Devirtualize the bufmgr. libdrm_bacon used to have a GEM-based bufmgr and a legacy fake bufmgr, but that's long since dead (and we never imported it to i965). So, drop the extra layer of function pointers. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:27 -07:00
Kenneth Graunke	dca224a9ef	i965/drm: Check INTEL_DEBUG & DEBUG_BUFMGR directly. Eliminates some API around this, and more importantly, the last field in one bufmgr class. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:25 -07:00
Kenneth Graunke	68cb0c6d92	i965/drm: Use Mesa's macros.h instead of duplicating them. Replace the duplicated macros imported from libdrm: ARRAY_SIZE, MAX2, ALIGN, STATIC_ASSERT and remove unused ROUND_UP_TO and ROUND_UP_TO_MB. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:24 -07:00
Kenneth Graunke	c5cdb0f405	i965/drm: Use ALIGN, not ROUND_UP_TO. ROUND_UP_TO handles a NPOT alignment, but all the alignments we use are power of two anyway, so there's no need. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:23 -07:00
Kenneth Graunke	1d476e64e5	i965/drm: Delete execbuf1 support. execbuf2 has been around since v2.6.33. We require v3.6. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:21 -07:00
Kenneth Graunke	ddf01d3f41	i965/drm: Remove Gen2-3 fence accounting. Since gen4, we do not use fence registers for any GPU access and so never have to account for the fence during batch construction. All the related fence functions are unused. Based on Kristian Høgsberg's patch; commit message by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:20 -07:00
Kenneth Graunke	4f698b0049	i965/drm: Remove some unused functions and macros. Mesa doesn't use these functions or macros, so we can delete them, and save work refactoring and cleaning them up. We'll delete a lot more later, too. Based on a patch by Kristian Høgsberg. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:18 -07:00
Kenneth Graunke	09b2f6124a	i965/drm: Switch to util/list.h instead of libdrm_lists.h. Both are kernel style lists, so this is trivial. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:16 -07:00
Kenneth Graunke	7c64096b2d	i965/drm: Port to Mesa's atomic header. Drop xf86atomic.h in favor of Mesa's util/u_atomic.h. We replace the atomic_t wrapper struct with a bare integer, switch to the 'p_atomic' naming conventions, and move over the one extra helper. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:13 -07:00
Kenneth Graunke	eed86b975e	i965/drm: Use our internal libdrm (drm_bacon) rather than the real one. Now we can actually test our changes. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:11 -07:00
Kenneth Graunke	91b973e3a3	i965/drm: s/drm_intel/drm_bacon/g Using drm_intel_* as a prefix is hazardous - we don't want to conflict with the actual libdrm_intel symbols. In particular, I think we could get into trouble during the final megadrivers linking. So, rename everything to a different yet arbitrary prefix. bacon and intel are the same number of characters, so we don't have to reindent the world. It's also an homage to Ian's "Bacon Trail" platform. I was going to use "drm_relic" to poke fun at libdrm being ancient, and so we could explain the name with a "historical reasons" pun, but it sounds too much like ralloc. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:09 -07:00
Kenneth Graunke	4ad0758f51	i965/drm: Drop libpciaccess dependencies. i965 doesn't use drm_intel_get_aperture_sizes(), so we can delete support for it. This avoids a build dependency on libpciaccess. Chris also notes: "There's a really old bug that hopefully has been closed already (although as far as I can tell, it has never been fixed) about how using libpciaccess from libdrm_intel breaks the world (since libpciaccess uses a singleton that is torn down at the first request rather than upon the last user)." This bug should go away in two commits when we switch over to our internal copy of libdrm_intel. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84325 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:05 -07:00
Kenneth Graunke	d614135e95	i965/drm: Make libdrm_lists.h compile by defining typeof. typeof doesn't seem to exist, so this won't compile (but we don't yet try). Define it to __typeof__. This code is going to die soon anyway. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:03 -07:00
Kenneth Graunke	b97c7ef4c8	i965/drm: remove legacy defines, aub functions, and decoder prototypes We never imported any of this code, so drop the prototypes, unused enums, and defines. Based on patches by Emil Velikov. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:31:00 -07:00
Kenneth Graunke	514db96c11	i965: Import libdrm_intel. This imports commit 19c4cfc54918d361f2535aec16650e9f0be667cd of libdrm/intel/*.[ch], minus a few files that we're never going to use (and would immediately delete), plus a few necessary dependencies. We rename intel_bufmgr.h to brw_bufmgr.h to avoid #include conflicts. We also fix UTF-8 symbol problems in intel_bufmgr_gem.c comments because vim keeps trying to fix that every time I edit the file, and we may as well fix it right away. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:30:53 -07:00
Kenneth Graunke	915820cc59	i965: Make sure we don't use CPU maps for the scanout buffer. Using an incoherent CPU map on the active scanout buffer is really sketchy - we may need extra flushing via GEM_SW_FINISH, or using drmModeDirtyFB() and kernel commit a6a7cc4b7db6d (4.10+). Chris suggests "never ever do that", which seems like a wise plan! intel_miptree_map_raw() uses CPU maps on linear buffers. Having a linear scanout buffer should be really rare, and mapping the front buffer should be similarly rare. Together, it should basically never happen. But, in case it does somehow...make sure that mapping the scanout buffer always goes through an uncached GTT map. v2: Add a giant comment written by Chris Wilson. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-10 14:30:49 -07:00
Kenneth Graunke	eb28ce2b0b	i965: Stop calling drm_intel_bufmgr_gem_enable_fenced_relocs(). This does nothing on Gen4+, which is the only hardware we support. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:30:44 -07:00
Kenneth Graunke	034b220dc4	i965: Fix GLX_MESA_query_renderer video memory on 32-bit. On modern systems with 4GB apertures, the size in bytes is 4294967296, or (1ull << 32). The kernel gives us the aperture size as a __u64, which works out great. Unfortunately, libdrm "helpfully" returns the data as a size_t, which on 32-bit systems means it truncates the aperture size to 0 bytes. We've happily reported this value as 0 MB of video memory via GLX_MESA_query_renderer since it was originally exposed. This patch bypasses libdrm and calls the ioctl ourselves so we can use a proper uint64_t, avoiding the 32-bit integer overflow. We now report a proper video memory size on 32-bit systems. Chris points out that the aperture size (CPU mappable size limit) isn't really the right thing to be checking. But libdrm_intel uses it to fail execbuffer, so it is an actual limit for now. Once that's fixed we can probably move to something else. In the meantime, fix the obvious typecasting bug. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-10 14:30:40 -07:00
Samuel Pitoiset	5bcfe90501	gallium/radeon: add HUD queries for GPU temperature and clocks Only the Radeon kernel driver exposed the GPU temperature and the shader/memory clocks; this implements the same functionality for the AMDGPU kernel driver. These queries will return 0 if the DRM version is less than 3.10. I don't explicitly check the version here because the query codepath is already a bit messy. v2: - rebase on top of master Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:06:19 +02:00
Samuel Pitoiset	0f39fb8500	configure.ac: require libdrm_amdgpu 2.4.79 The sensor info requires amdgpu_query_sensor_info(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:06:17 +02:00
Samuel Pitoiset	def02007cd	radeonsi: add new si_check_render_feedback_texture() helper For bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:41 +02:00
Samuel Pitoiset	fbcc8664fd	radeonsi: add new si_decompress_color_texture() helper For bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:38 +02:00
Samuel Pitoiset	6646212de0	radeonsi: add new depth_needs_decompression() helper v2: - rename to depth_needs_decompression() instead Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:32 +02:00
Samuel Pitoiset	9cc91ba6d5	radeonsi: add a 'break' in si_check_render_feedback_*() No need to check all color buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:29 +02:00
Samuel Pitoiset	51d6641700	radeonsi: re-use 'desc' in si_set_shader_image() No need to compute the offset in the descriptor twice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:27 +02:00
Samuel Pitoiset	a1c37ff9e4	ac: add unreachable() in ac_build_image_opcode() To silence the following compiler warning: common/ac_llvm_build.c: In function ‘ac_build_image_opcode’: common/ac_llvm_build.c:1080:3: warning: ‘name’ may be used uninitialized in this function [-Wmaybe-uninitialized] snprintf(intr_name, sizeof(intr_name), "%s%s%s%s.v4f32.%s.v8i32", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ name, ~~~~~ a->compare ? ".c" : "", ~~~~~~~~~~~~~~~~~~~~~~~ a->bias ? ".b" : ~~~~~~~~~~~~~~~~ a->lod ? ".l" : ~~~~~~~~~~~~~~~ a->deriv ? ".d" : ~~~~~~~~~~~~~~~~~ a->level_zero ? ".lz" : "", ~~~~~~~~~~~~~~~~~~~~~~~~~~~ a->offset ? ".o" : "", ~~~~~~~~~~~~~~~~~~~~~~ type); ~~~~~ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:02:12 +02:00
Constantine Kharlamov	61e47d92c5	r600g: get rid of dummy pixel shader The idea is taken from radeonsi. The code was mostly already checking for a null pixel shader, so few checks had to be added. Interestingly, according to testing with GTAⅣ, binding of a null shader happens a lot at the start (then just stops), but draw_vbo() never actually sees a null ps. v2: added a check I missed because of a macro using a prefix to choose a shader. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 22:45:22 +02:00
Constantine Kharlamov	544b40089b	r600g: add draw_vbo check for a NULL pixel shader Taken from radeonsi, required to remove dummy pixel shader in the next patch Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 22:45:22 +02:00
Constantine Kharlamov	22de96680c	r600g: skip repeating vs, gs, and tes shader binds The idea is taken from radeonsi. The code lacks some checks for null vs, and I'm unsure about some changes against that, so I left it in place. Some statistics for GTAⅣ: Average tessellation bind skip per frame: ≈350 Average geometry shader bind skip per frame: ≈260 Skipping of vertex shader binds occurs rarely enough to not get into the per-frame counter at all, so I'll just say: it happens. v2: I've occasionally removed an empty line, don't do this. v3: return a check for null tes and gs back, while I haven't figured out the way to move stride assignment to r600_update_derived_state() (as it is in radeonsi). Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 22:45:22 +02:00
Bartosz Tomczyk	a4019a81ab	mesa: use single memcpy when strides match in glReadPixels, texstore code v2: fix indentation Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-10 14:42:17 -06:00
Jason Ekstrand	da2ac19511	intel/blorp: Use ISL for emitting depth/stencil/hiz Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-10 07:57:21 -07:00
Jason Ekstrand	d3785dcb2f	intel/blorp: Emit 3DSTATE_STENCIL_BUFFER before HIER_DEPTH We're about to replace blorp's emit code with ISL and it emits them in the other order. This makes diffing the aubs easier. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-10 07:57:21 -07:00
Jason Ekstrand	f93dc5beee	anv: Use ISL for emitting depth/stencil/hiz Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-10 07:57:21 -07:00
Jason Ekstrand	bf95f7c209	intel/isl: Add support for emitting depth/stencil/hiz Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-10 07:57:21 -07:00
Thomas Hindoe Paaboel Andersen	957ccbe04a	amd/addrlib: use correct variable name in header Since the inclusion in `7f160efcde` the header used x_biased, while the implementation used y_biased. This changes the header to match the implementation since the uses of the function seem to expect y_biased. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 12:44:59 +10:00
Timothy Arceri	d0791ac2ed	mesa/st: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	d9fe82fe41	x11: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	a85b4e5719	osmesa: tidy up renderbuffer refCount initialisation `32141e53d1` changed _mesa_init_renderbuffer() to set it to 1 for us. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	e6d6266e6f	swrast: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	6c02387b2c	radeon: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	1b85009ec1	nouveau: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	3387f66cab	i965: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	c355675440	i915: take ownership rather than adding reference for new renderbuffers This avoids locking in the reference calls and fixes a leak after the RefCount initialisation was changed from 0 to 1. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-10 10:55:34 +10:00
Timothy Arceri	074a485d35	mesa: create _mesa_attach_renderbuffer_without_ref() helper This will be used to take ownership of freshly created renderbuffers, avoiding the need to call the reference function which requires locking. V2: dereference any existing fb attachments and actually attach the new rb. v3: split out validation and attachment type/complete setting into a shared static function. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>	2017-04-10 10:55:34 +10:00
Ilia Mirkin	89253d5c67	nv50/ir: remove unused swizzle field in ValueRef The nv50 ir is scalar. Perhaps this was from some early attempts to integrate the simd aspects of nv30. However at this point it's entirely unused. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-09 14:59:42 -04:00
Boyan Ding	b1b189a0ab	nouveau: enable ARB_shader_clock on nv50 and nvc0 v2: Also enable support on nv50 Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-09 13:03:13 -04:00
Boyan Ding	6c3dd8f0ed	nv50/ir: Handle TGSI_OPCODE_CLOCK Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> [imirkin: make zero mov non-fixed] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-09 13:03:13 -04:00
Boyan Ding	e2e2c69927	gm107/ir: Emit SV_CLOCK system value Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-09 13:03:13 -04:00
Ben Widawsky	6e907812f8	gbm: Assert modifiers and count are copacetic The API/entry point in mesa already checks the correct behavior, however, it's possible to be handled by another implementation and those implementations should not be able to abuse a weird combination of count and pointer. This fixes CID 1403193 Cc: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-09 09:29:57 -07:00
Gustaw Smolarczyk	a2eae66b8b	st/mesa: Use compressed fog mode for atifs. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	8a4b93b1d9	mesa/main/ff_frag: Use compressed TexEnv Combine state. Along the way, add missing GL_ONE source support and drop non-existing GL_ZERO and GL_ONE operand support. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	f7c9bf0c6b	mesa/main/ff_frag: Use compressed fog mode. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	837ad2dc38	mesa/main: Maintain compressed TexEnv Combine state. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	6fa34de830	mesa/main: Maintain compressed fog mode. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	c9b2938aec	mesa/main/ff_frag: Don't retrieve format if not necessary. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	885012aab2	mesa/main/ff_frag: Use gl_texture_object::TargetIndex. Instead of computing it once again using _mesa_tex_target_to_index. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	a86891a9a9	mesa/main/ff_frag: Store nr_enabled_units only once. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	0e89ab0d6e	mesa/main/ff_frag: Simplify get_fp_input_mask. Change it into filter_fp_input_mask transform function that instead of returning a mask, transforms input. Also, simplify the case of vertex program handling by assuming that fp_inputs is always a combination of VARYING_BIT_COL* and VARYING_BIT_TEX*. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	f5e685da06	mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC. It's not used. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	03b9b3c471	mesa/main/ff_frag: Remove unused struct. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	ceb5ba9d1d	mesa/main/ff_frag: Reduce the size of nr_enabled_units. Since it holds values from 0 to 8, 4 bits will suffice. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	439eca951f	mesa/main/ff_frag: Remove enabled_units. Its only usage is easily replaced by nr_enabled_units. As for cache key part, unit[i].enabled should be enough. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk	3cc91537fa	mesa/main/ff_frag: Use correct constant. Since fixed-function shaders are restricted to MAX_TEXTURE_COORD_UNITS texture units, use this constant instead of MAX_TEXTURE_UNITS. This reduces the array size from 32 to 8. Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 20:29:57 +02:00
Jason Ekstrand	098ca9949d	intel/isl: Use genx_bits.h instead of a hand-rolled table This gets rid of one piece of ugliness with the way ISL handles emitting surface states. I've never liked that hand-rolled table but it was the best we had at the time. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-07 22:34:04 -07:00
Jason Ekstrand	b85d75b3e8	intel/genxml/bits: Emit per-container _length helpers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-07 22:34:04 -07:00
Jason Ekstrand	f97e251ab2	intel/genxml/bits: Emit per-field _start helpers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-07 22:34:04 -07:00
Jason Ekstrand	430e697868	intel/genxml/bits: Pull the function emit code into a helper block The helper block is extremely general. It takes a string property name and an object that supports three methods: has_prop, iter_prop, and get_prop. This way we can easily generalize it to emit more different types of getter functions. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-07 22:34:04 -07:00
Jason Ekstrand	2d52e65d03	intel/genxml/bits: Refactor to add a container class Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-04-07 22:34:04 -07:00
Ilia Mirkin	57a744025a	nvc0/ir: fix overwriting of offset register with interpolateAtOffset Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-04-07 23:31:01 -04:00
Jason Ekstrand	bc68aa42bd	anv: Use subpass dependencies for flushes Instead of figuring it all out ourselves, just use the information given to us by the client. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Jason Ekstrand	e5bbf8be36	anv/pass: Record required pipe flushes Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Jason Ekstrand	0039d0cf27	anv/pass: Use anv_multialloc for allocating the anv_pass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Jason Ekstrand	415633a722	anv/descriptor_set: Use anv_multialloc for descriptor set layouts Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Jason Ekstrand	e5c29b8c27	anv: Add a helper for doing mass allocations We tend to try to reduce the number of allocation calls the Vulkan driver uses by doing a single allocation whenever possible for a data structure. While this has certain downsides (usually code complexity), it does mean error handling and cleanup is much easier. This commit adds a nice little helper struct for getting rid of some of that complexity. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Jason Ekstrand	82695c32b6	anv: Add helpers for converting access flags to pipe bits Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Timothy Arceri	9d69416a7e	mesa: simplify and optimise vertex bindings tracking We only need to update it if something changes. Also _mesa_bind_vertex_buffer() will update the mask when binding to a NULL or default buffer so no need to do that update here. Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-04-08 11:18:50 +10:00
Timothy Arceri	bfabef0e71	glsl: fix lower jumps for nested non-void returns Fixes the case where a loop contains a return and the loop is nested inside an if. Reviewed-by: Roland Scheidegger <sroland@vmware.com> https://bugs.freedesktop.org/show_bug.cgi?id=100303	2017-04-08 11:18:32 +10:00
Ilia Mirkin	5dd490f134	gallium: fix some math formulas to display better Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-07 20:20:17 -04:00
Ilia Mirkin	60f5766db4	nvc0/ir: fix LSB/BFE/BFI implementations Overwriting the src register is a very bad idea - it logically maps onto the TGSI registers, and so is effectively overwriting the source values. Reported-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-04-07 20:20:16 -04:00
Nicolai Hähnle	c05cf9cf1b	util: fix swizzle of INSTANCEID system value radeonsi added stricter checking for correct swizzles in debug builds. Reported-by: Michel Dänzer <michel.daenzer@amd.com> Fixes: `4cf2942777` ("radeonsi: support 64-bit system values") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-08 00:44:52 +02:00
Bruce Cherniak	07b5b5cfd4	st/glx: Add awareness for multisample pixel formats to st/glx-xlib. In preparation for enabling MSAA in OpenSWR, the state trackers need to be aware of multisample pixel formats for software renderers. This patch allows glx-xlib to query the renderer for support of pixel formats with multisample, and create multisample resources. This change is benign to softpipe and llvmpipe, as is_format_supported returns FALSE for any sample_count > 1. OpenSWR does the same at the moment, but that will change soon. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-04-07 16:50:58 -05:00
Tim Rowley	7bd5057fd1	swr: fix unused variable warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-07 16:50:41 -05:00
Brian Paul	8046c247de	glx: silence uninitialized var warning Signed-off-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:44 -06:00
Brian Paul	ee3f75f538	st/mesa: silence unused/uninitialized var warnings Signed-off-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:44 -06:00
Brian Paul	c77c381fae	gallivm: init vars to silence gcc warnings Silence warnings about using possibly uninitialized values. Signed-off-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:44 -06:00
Charmaine Lee	16bd2c6d04	svga: add context pointer to the invalidate surface interface With this patch, we will specify the current context when we invalidate the surface before the surface is put back to the recycled surface pool. This allows the winsys layer to use the specified context to do the invalidation rather than using the last context that referenced the surface. This prevents a race condition if the last referenced context is now made current in another thread. Tested with MTT glretrace, NobelClinicianViewer. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2017-04-07 13:46:44 -06:00
Brian Paul	e000b17f87	winsys/svga: use c11 thread types/functions Gallium no longer has wrappers for mutexes and condition variables. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-07 13:46:44 -06:00
Thomas Hellstrom	0864f9c77a	winsys/svga: Resolve command submission buffer contention v3 If two contexts wanted to access the same buffer at the same time, it would end up on two validation lists simultaneously, which might cause a PIPE_ERROR_RETRY when trying to validate it from one context while the other context already had it validated but not yet fenced. In that situation we could spin until the error goes away, or apply various more or less expensive locking schemes to save cpu. Here we use a scheme that briefly locks after fencing but avoids locking on validation in the non-contended case. v2: Make sure we broadcast not only on releasing buffers after fencing, but also after releasing buffers in the pb_validate_validate error path. v3: Don't broadcast on PIPE_ERROR_RETRY because that would increase the chance of starvation. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2017-04-07 13:46:44 -06:00
Brian Paul	0baa372b6f	svga: remove pre-SVGA3D_HWVERSION_WS8_B1 code 3D wasn't officially supported before virtual HW version 8 so we can remove this old code. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-04-07 13:46:44 -06:00
Brian Paul	690fe77835	st/wgl: sort strings in stw_extension_string[] array Trivial.	2017-04-07 13:46:44 -06:00
Charmaine Lee	b1c964447a	svga: remove redundant surface propagation Currently, surface propagation for colliding render target resource is done at framebuffer emit time for vgpu10. This patch adds the surface propagation for non-vgpu10 path to emit_fb_vgpu9() and removes the redundant surface copy at set time. Tested with MTT glretrace, piglit, NobelClinicianViewer, Turbine, Cinebench. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2017-04-07 13:46:44 -06:00
Charmaine Lee	35a748e79c	svga: Fix zslice index to svga_texture_copy_handle_resource() The zslice index to svga_texture_copy_handle_resource() is not adjusted and should be a signed integer. This patch fixes piglit tests for non-vgpu10 including spec@arb_framebuffer_object@fbo-generatemipmap-3d spec@glsl-1.20@execution@tex-miplevel-selection gl2:texture* 3d Tested with MTT piglit and glretrace	2017-04-07 13:46:44 -06:00
Brian Paul	5637a497a3	svga: specify include path for git_sha1.h for out-of-src builds If we're doing an out-of-src build, we need to specify the #include path to find git_sha1.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-04-07 13:46:44 -06:00
Brian Paul	c78fc70e8c	st/wgl: pseudo-implementation of WGL_EXT_swap_control This implementation is based on querying the time just before swap/present and doing a Sleep() if needed. There is no sync to vblank or actual coordination with the GPU. This isn't perfect, but basically works. We've had some requests for this functionality, and it sounds like there are some Windows GL apps that refuse to start if the driver doesn't advertise this extension. Note: NVIDIA's Windows OpenGL driver advertises the WGL_EXT_swap_control string both with wglGetExtensionsStringEXT() and with glGetString(GL_EXTENSIONS). We're only advertising it with the former at this time. Tested with assorted Mesa demos, Google Earth, Lightsmark, etc. VMware bug 1591534. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2017-04-07 13:46:43 -06:00
Charmaine Lee	ab96d1baf4	svga: Fix out-of-sync backing surface When a backing surface is reused, it is possible that the original surface has been changed. So before the backing surface is bound again, we need to sync up the surface. This patch creates a new helper function svga_texture_copy_handle_resource() to sync up the backing surface resource. This patch, together with the backing surface dirty bit fix, fixes the rendering corruption in NobelClinicianViewer when rotating the model. Also tested with MTT glretrace, piglit, Cinebench, Turbine. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:43 -06:00
Charmaine Lee	a08e3b88ab	svga: add a reset flag to svga_propagate_surface() The reset flag specifies if the dirty bit needs to be reset after the surface is propagated to the texture. This is used to make sure that the dirty bit is not reset and stay unset before the surface is unbound. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:43 -06:00
Charmaine Lee	02c9bf2d54	svga: add the has_backed_views flag The new has_backed_views flag specifies if any of the render target views or depth stencil view is a backing surface view. The flag is used in svga_propagate_rendertargets() so it can return early if there is no surface to propagate. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:43 -06:00
Charmaine Lee	a421d45e61	svga: only destroy render target view from a context that created it A texture can be destroyed from a different context from which it is created, but destroying the render target view from a different context will cause svga device errors. Similar to shader resource view, this patch skips destroying render target view or depth stencil view from a non-parent context. Fixes driver errors running NobelClinician Viewer application. Tested with NobelClinician Viewer, MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:43 -06:00
Charmaine Lee	b4c4ee0762	svga: disable rasterization if rasterizer_discard is set or FS undefined With this patch, rasterization will be disabled if the rasterizer_discard flag is set or the fragment shader is undefined due to missing position output from the vertex/geometry shader. Tested with piglit test glsl-1.50-geometry-primitive-id-restart. Also tested with full MTT glretrace and piglit. v2: As suggested by Roland, to properly disable rasterization, besides setting FS to NULL, we will also need to disable depth and stencil test. v3: As suggested by Brian, set SVGA_NEW_DEPTH_STENCIL_ALPHA dirty bit in svga_bind_rasterizer_state() if the rasterizer_discard flag is changed. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:43 -06:00
Charmaine Lee	fed72ff6cb	svga: do not emulate wide points in GS when doing transform feedback Emulating wide points in geometry shader when doing transform feedback is problematic. This patch disables the emulation. Tested with piglit test ext_transform_feedback-points. Also tested with MTT glretrace, mesa demos pointblast and spriteblast. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-07 13:46:43 -06:00
Jason Ekstrand	4e17b59f6c	anv/query: Use snooping on !LLC platforms Commit `b2c97bc789` which made us start using a busy-wait for individual query results also messed up cache flushing on !LLC platforms. For one thing, I forgot the mfence after the clflush so memory access wasn't properly getting fenced. More importantly, however, was that we were clflushing the whole query range and then waiting for individual queries and then trying to read the results without clflushing again. Getting the clflushing both correct and efficient is very subtle and painful. Instead, let's side-step the problem by just snooping. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-07 12:17:20 -07:00
Emil Velikov	5318d1ff94	anv: provide anv_gem_busy() stub for the tests Otherwise linking may fail. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100600 Fixes: `f195d40eca` ("anv/device: Add a helper for querying whether a BO is busy") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2017-04-07 19:45:58 +01:00
Rob Clark	3b32ec3ba6	gallium/util: tweak backtrace format with libunwind To work with addr2line.sh we also need the relative offset within the DSO. And addr2line.sh gets confused by the leading stackframe number. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-07 08:23:02 -04:00
Rob Clark	91dfa02125	gallium/util: cache symbol lookup with libunwind Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-07 08:23:02 -04:00
Rob Clark	7c69ea553b	gallium/util: fix missing limit check in libunwind backtrace Fixes: `70c272004f` ("gallium/util: libunwind support") Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-07 08:23:02 -04:00
Timothy Arceri	8046a944d0	mesa: fix renderbuffer leak We don't need to call _mesa_reference_renderbuffer() for the first assignment as refCount starts at 1. For swrast we work around the fact we will indirectly call _mesa_reference_renderbuffer() by resetting refCount to 0. Fixes: `32141e53d1` (mesa: tidy up renderbuffer RefCount initialisation) Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-04-07 19:48:10 +10:00
Samuel Iglesias Gonsálvez	1c934bc71b	anv/blorp: sample input attachments with resolves on BDW On Broadwell we still need to do a resolve between the subpass that writes and the subpass that reads when there is a self-dependency because HW could not see fast-clears and works on the render cache as if there was a regular non-fast-clear surface. Fixes 16 tests on BDW: dEQP-VK.renderpass.formats..input.clear.store.self_dep Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-07 07:49:43 +02:00
Fredrik Höglund	fd0f539e60	radv: don't call radeon_check_space in radv_BindDescriptorSets This appears to be a leftover from an earlier version of this function. Nothing is emitted into the CS. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-07 00:54:46 +02:00
Fredrik Höglund	c1f8c83cb6	radv: implement VK_KHR_descriptor_update_template All offsets and strides are precomputed by radv_CreateDescriptorUpdateTemplateKHR and stored in the template. v2: Move the new struct declarations from radv_descriptor_set.h to radv_private.h (Bas) Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-07 00:54:46 +02:00
Fredrik Höglund	c6487bc48b	radv: implement VK_KHR_push_descriptor Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-07 00:54:46 +02:00
Fredrik Höglund	3b33f03913	radv: replace an assertion with a conditional Replace the !binding_layout->immutable_samplers assertion in radv_update_descriptor_sets with a conditional. The Vulkan specification does not say that it is illegal to update a sampler descriptor when it is immutable; only that pImageInfo is ignored. This change is also needed for push descriptors, because valid descriptors must be pushed for all bindings accessed by shaders, including immutable sampler descriptors. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-07 00:54:46 +02:00
Fredrik Höglund	a6e94a87cb	radv: refactor radv_UpdateDescriptorSets Move the implementation into a separate function that takes a cmd_buffer and a dstSetOverride parameter. When cmd_buffer is not NULL, radv_update_descriptor_sets calls cs_add_buffer directly instead of updating the buffer list. This will be used to implement VK_KHR_push_descriptor. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-07 00:54:46 +02:00
Samuel Pitoiset	bedd89429f	gallium/radeon: fix typo in radeon_winsys.h Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-07 00:48:19 +02:00
Samuel Pitoiset	7839243085	mesa/main: simplify _mesa_IsRenderbuffer() _mesa_lookup_renderbuffer() already checks if 'id' is non-zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-04-07 00:48:01 +02:00
Timothy Arceri	93d7014c1d	mesa: stop abstracting texture object hashtable locking This doesn't do anything useful so just remove it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-07 08:03:02 +10:00
Timothy Arceri	31cb6fd0a3	mesa: stop abstracting buffer object hashtable locking This doesn't do anything useful so just remove it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-07 08:02:54 +10:00
Jason Ekstrand	c9c39812b9	i965/blorp: Bump the batch space estimate Commit `f938354362` recently increased the alignment on vertex buffer data from 32 to 64. This caused us to consume a bit more batch than we were before and we now go over the estimate by a small amount on certain blits on gen8+. This commit bumps the gen8 batch estimate by a bit to compensate. Haswell and older still seem to be well within the limit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100582 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-06 13:32:29 -07:00
Jordan Justen	0370350d11	intel/aubinator: Stop searching after a custom handler is found Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-06 13:26:08 -07:00
Jordan Justen	d5bd0e411e	intel/gen_decoder: return -1 for unknown command formats Decoding with aubinator encountered a command of 0xffffffff. With the previous code, it caused aubinator to jump 255 + 2 dwords to start decoding again. Instead we can attempt to detect the known instruction formats. If the format is not recognized, then we can advance just 1 dword. v2: * Update aubinator_error_decode * Actually convert the length variable returned into a signed integer in aubinator.c, intel_batchbuffer.c and aubinator_error_decode.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-06 13:26:08 -07:00
Jordan Justen	7c33372f82	intel/gen_decoder: Fix length for Media State/Object commands From BDW PRM, Volume 6: Command Stream Programming, 'Render Command Header Format'. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-06 13:26:08 -07:00
Jordan Justen	3c77a57222	intel/aubinator_error_decode: Fix structure decode data The call to gen_print_group should provide a pointer to the beginning of the structure data, not the start of the batch data. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-06 13:25:38 -07:00
Nicolai Hähnle	2357e7a202	st/pbo: select the right swizzle for instance IDs The system value only has an X component, and radeonsi started checking that in debug builds. Reported-by: Michel Dänzer <michel.daenzer@amd.com> Fixes: `4cf2942777` ("radeonsi: support 64-bit system values") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-06 20:26:27 +02:00
Jason Ekstrand	b2c97bc789	anv/query: Busy-wait for available query entries Before, we were just looking at whether or not the user wanted us to wait and waiting on the BO. Some clients, such as the Serious engine, use a single query pool for hundreds of individual query results where the writes for those queries may be split across several command buffers. In this scenario, the individual query we're looking for may become available long before the BO is idle so waiting on the query pool BO to be finished is wasteful. This commit makes us instead busy-loop on each query until it's available. This significantly reduces pipeline bubbles and improves performance of The Talos Principle on medium settings (where the GPU isn't overloaded with drawing) by around 20% on my SkyLake gt4. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com>	2017-04-05 21:17:11 -07:00
Jason Ekstrand	f195d40eca	anv/device: Add a helper for querying whether a BO is busy This is a bit more efficient than using GEM_WAIT with a timeout of 0. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-05 21:17:11 -07:00
Tim Rowley	d5157ddca4	swr: [rasterizer core] SIMD16 Frontend WIP Implement widened binner for SIMD16 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:20:45 -05:00
Tim Rowley	b8515d5c0f	swr: [rasterizer core] Enable 8x2 backend Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:20:45 -05:00
Tim Rowley	c1b7a5780d	swr: [rasterizer codegen] remove copy of mako mako is already a mesa build requirement, extra copy not needed. Tested building against mesa build baseline (mako-0.8.0). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:20:45 -05:00
Tim Rowley	97dab87a22	swr: [rasterizer core/memory] Move intrinsics to _simd functions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:20:19 -05:00
Tim Rowley	117fc582f8	swr: [rasterizer core] Programmable sample position support Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:19:25 -05:00
Tim Rowley	3c52a7316a	swr: [configure.ac/scons] require c++14 New C++ features used by upcoming swr changes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:19:16 -05:00
Tim Rowley	e5fdfcf836	swr: [rasterizer core] Fix center sample pattern Fix long hidden bug in rasterizer handling of center sample pattern. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:19:10 -05:00
Tim Rowley	c12b61d158	swr: [rasterizer core/memory] Fix missing avx512 storetile Fix pre-processor macro handling to eliminate silently missing implementation for AVX512. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:19:04 -05:00
Tim Rowley	cd6c200223	swr: [rasterizer core] SIMD16 Frontend WIP Implement widened VS output for SIMD16 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-05 18:18:36 -05:00
Timothy Arceri	1bfeb65397	mesa: use internal function when deleting buffers This avoids validation and looking up the buffer target for a second time. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-06 08:25:36 +10:00
Timothy Arceri	8feb5bb402	mesa: rework bind_buffer_object() This allows internal users to pass buffer objects directly and allows for KHR_no_error support to be more easily added. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-06 08:25:36 +10:00
Timothy Arceri	d1c1544a49	mesa: small texstate tidy up Possibly more efficient, either way it makes the code easier to follow. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-06 08:25:36 +10:00
Timothy Arceri	32141e53d1	mesa: tidy up renderbuffer RefCount initialisation `42aaa548` changed the renderbuffer initialisation of RefCount from 1 to 0. This is inconsistent with how we use RefCount elsewhere. Also every driver implementation of NewRenderbuffer() calls _mesa_init_renderbuffer() so it's safe to set it there. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-06 08:17:10 +10:00
Christian Gmeiner	e75001811e	Revert "etnaviv: Cannot render to rb-swapped formats" This reverts commit `658568941d`. With the help of shader variants we can render to rb-swapped formats now. Fixes about 60 piglits. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-05 19:58:25 +02:00
Christian Gmeiner	7f62ffb68a	etnaviv: add support for rb swap If we render to an rb-swapped format we will create a shader variant doing the involved swizzling in the pixel shader. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-05 19:58:22 +02:00
Christian Gmeiner	8d9a31ef97	etnaviv: adapt shader-db output for variant support Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-04-05 19:58:18 +02:00
Christian Gmeiner	20fa8f1989	etnaviv: bring back shader-db traces If shader-db run, create a standard variant immediately (as otherwise nothing will trigger the shader to be actually compiled). Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-04-05 19:58:13 +02:00
Christian Gmeiner	7d2a806266	etnaviv: add etna_shader_key and generate variants if needed Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-05 19:58:10 +02:00
Christian Gmeiner	9da54fdcb5	etnaviv: pass a preallocated variant to compiler In the long run the compiler needs to know the specific variant 'key' in order to compile appropriate assembly. With this commit the variant knows its shader and we are able to pass the preallocated variant into etna_compile_shader(..). This saves us from passing extra ptrs everywhere. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-04-05 19:58:07 +02:00
Christian Gmeiner	ffd4762310	etnaviv: make specs const Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-05 19:58:03 +02:00
Christian Gmeiner	ecc2474e59	etnaviv: add struct etna_shader_state Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-04-05 19:57:59 +02:00
Christian Gmeiner	65e9bd2703	etnaviv: add basic shader variant support This commit adds some basic infrastructure to handle shader variants. We are still creating exactly one shader variant for each shader. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-04-05 19:57:56 +02:00
Christian Gmeiner	59b459ac17	etnaviv: s/etna_shader/etna_shader_variant Prep work to add shader variant support. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-05 19:57:52 +02:00
Christian Gmeiner	54e367bf0e	etnaviv: remove not needed forward declarations Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-04-05 19:57:47 +02:00
Emil Velikov	13181abc6d	gallium/util: honour LIBUNWIND_CFLAGS Fixes: `70c272004f` ("gallium/util: libunwind support") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 18:42:56 +01:00
Rhys Kidd	115e684792	travis: Add radeonsi to continuous integration Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 18:19:51 +01:00
Rhys Kidd	787ab42716	travis: Add radv vulkan driver to continuous integration Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 18:19:28 +01:00
Emil Velikov	a6840efc09	anv: provide required gem stubs for the tests Introduce stubs to anv_gem_stub.c that match the anv_gem.c ones. Otherwise we may get link-time errors, when building the tests. v2: Introduce all the missing stubs at once. Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Vinson Lee <vlee@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100574 Fixes: `c964f0e485` ("anv: Query the kernel for reset status") Fixes: `651ec926fc` ("anv: Add support for 48-bit addresses") Fixes: `060a6434ec` ("anv: Advertise larger heap sizes") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> --- I've intentionally kept the order identical to anv_gem.c. This way we can easily grep & diff in the future ;-)	2017-04-05 17:54:38 +01:00
Emil Velikov	8307124829	configure.ac: pthread-stubs is not a thing on GNU/kFreeBSD As mentioned on the xcb mailing list, the platform uses the GLIBC forwarding mechanism. https://lists.freedesktop.org/archives/xcb/2016-November/010896.html Cc: Andreas Boll <andreas.boll.dev@gmail.com> Reported-by: Andreas Boll <andreas.boll.dev@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 17:47:41 +01:00
Aaron Watry	4d0399f175	st/clover: Fix build after shrink of pipe_box Fixes: `3dfe61e` ("gallium: decrease the size of pipe_box - 24 -> 16 bytes") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100569 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2017-04-05 09:19:48 -05:00
Alex Deucher	d921af62f5	radeonsi: add new polaris10 pci id Reviewed-by: Christian König <christian.koenig@amd.com> Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-04-05 10:13:08 -04:00
Nicolai Hähnle	9e1b2e4d97	radeonsi: enable ARB_shader_ballot Require LLVM 5.0 or later because LLVM 4.0 is easily fooled into putting the lane select of llvm.amdgcn.readlane into a VGPR and then fails to continue to compile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:44 +02:00
Nicolai Hähnle	8b13b11f11	radeonsi: optimization barriers to work around LLVM deficiencies Notably, llvm.amdgcn.readfirstlane and llvm.amdgcn.icmp may be hoisted out of loops or if/else branches in cases like if (cond) { v = readFirstInvocationARB(x); ... use v ... } else { v = readFirstInvocationARB(x); ... use v ... } ===> v = readFirstInvocationARB(x); if (cond) { ... use v ... } else { ... use v ... } The optimization barrier is a heavy hammer to stop that until LLVM is taught the semantics of the intrinsic properly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:44 +02:00
Nicolai Hähnle	24d4fbe226	radeonsi: strengthen emit_optimization_barrier LLVM will lift inline assembly out of if-else-blocks if both paths have the same inline assembly. Prevent this by adding an irrelevant unique text to the assembly. This requires the LLVM assembly parser to be initialized. Furthermore, allow forcing subsequent computations to happen after the optimization barrier by defining a data dependency. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:43 +02:00
Nicolai Hähnle	5c4602f4a2	radeonsi: emit TGSI_OPCODE_READ_* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:43 +02:00
Nicolai Hähnle	b46e3a30b7	radeonsi: emit TGSI_OPCODE_BALLOT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:43 +02:00
Nicolai Hähnle	a3075f4799	radeonsi: implement TGSI_SEMANTIC_SUBGROUP_* 64-bit system values are stored as v2i32 to simplify the fetch logic. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:43 +02:00
Nicolai Hähnle	4cf2942777	radeonsi: support 64-bit system values For simplicity, always store system values as 32-bit values or arrays of 32-bit values. 64-bit values are unpacked and packed accordingly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:43 +02:00
Nicolai Hähnle	1ee57b16be	radeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES ARB_shader_ballot introduces 7 new system values that can be used in all shader stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:42 +02:00
Nicolai Hähnle	ee2d93eb92	st/mesa: enable ARB_shader_ballot Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:42 +02:00
Nicolai Hähnle	84039cc1c3	st/glsl_to_tgsi: implement ARB_shader_ballot system variables Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:42 +02:00
Nicolai Hähnle	76e3dba289	st/glsl_to_tgsi: implement ARB_shader_ballot builtin functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:41 +02:00
Ilia Mirkin	08bd0aa507	tgsi: add SUBGROUP_* semantics v2: add documentation (Nicolai) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:41 +02:00
Ilia Mirkin	3650d7455f	tgsi: add BALLOT/READ_* opcodes v2 (Nicolai): - BALLOT isn't per-channel - expand the documentation (also for VOTE_) v3: - only BALLOT returns a 64-bit lanemask (Boyan) - relax the requirement on READ_INVOC: the invocation number to read from must be uniform within a sub-group. This matches the GL_ARB_shader_ballot spec (and the v_readlane instruction of AMD GCN) v4: - hopefully really fix the doc of VOTE_ returns (Ilia) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2017-04-05 15:29:34 +02:00
Nicolai Hähnle	d3e6f6d7f7	gallium: add PIPE_CAP_TGSI_BALLOT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:29:31 +02:00
Nicolai Hähnle	b5711d5e1a	glsl: add gl_SubGroup*ARB builtins Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:25:56 +02:00
Nicolai Hähnle	961b8e9afe	glsl: add ARB_shader_ballot builtin functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:25:54 +02:00
Nicolai Hähnle	d37b7b5232	glsl: add ARB_shader_ballot operations Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:25:51 +02:00
Nicolai Hähnle	b8440ec9fa	glsl: add ARB_shader_ballot enable Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:25:48 +02:00
Nicolai Hähnle	4fdb691f10	mesa: add GL_ARB_shader_ballot boilerplate Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 15:25:40 +02:00
Emil Velikov	2c4c47dcb7	swr: automake: add gen_common.py to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 13:16:28 +01:00
Emil Velikov	e664cfc5a7	intel: genxml: automake: include gen_bits_header.py in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 13:16:28 +01:00
Emil Velikov	e180680980	intel: genxml: automake: polish automake rules Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 13:16:28 +01:00
Emil Velikov	e2adec3a17	amd/addrlib: automake: add all headers to the tarball Fixes: `7f160efcde` ("amd/addrlib: import gfx9 support") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-05 13:16:28 +01:00
Nicolai Hähnle	570e50af4b	radeonsi: enable ARB_sparse_buffer v2: - fill in DRM version requirement - disable on SI due to CP DMA faults Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:44:32 +02:00
Nicolai Hähnle	aee473eb01	radeonsi: disable SDMA clears and copies for sparse buffers VM faults cannot be disabled for SDMA on <= VI. We could still use SDMA by asking the winsys about which parts of the buffers are committed. This is left as a potential future improvement. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:19 +02:00
Nicolai Hähnle	0a685ce9a7	gallium/radeon: implement pipe->resource_commit Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:19 +02:00
Nicolai Hähnle	e077c5fe65	gallium/radeon: transfers and invalidation for sparse buffers Sparse buffers can never be mapped by the CPU. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:19 +02:00
Nicolai Hähnle	5969a373a1	gallium/radeon: implement sparse buffer creation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:19 +02:00
Nicolai Hähnle	47e59a7e36	winsys/amdgpu: sparse buffer debugging helpers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:19 +02:00
Nicolai Hähnle	0baee15596	winsys/amdgpu: take fences when freeing a backing buffer We never add fences to backing buffers during submit. When we free a backing buffer, it must inherit the sparse buffer's fences, so that it doesn't get re-used prematurely via the cache. v2: - remove pipe_mutex_* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:18 +02:00
Nicolai Hähnle	79dae12b41	winsys/amdgpu: add sparse buffers to CS ... and implement the corresponding fence handling. v2: - add missing bit in amdgpu_bo_is_referenced_by_cs_with_usage - remove pipe_mutex_* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:18 +02:00
Nicolai Hähnle	667da4eaed	winsys/amdgpu: sparse buffer creation / destruction / commitment This is the bulk of the buffer allocation logic. It is fairly simple and stupid. We'll probably want to use e.g. interval trees at some point to keep track of commitments, but Mesa doesn't have an implementation of those yet. v2: - remove pipe_mutex_* - fix total_backing_pages accounting - simplify by using the new VA_OP_CLEAR/REPLACE kernel interface Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:18 +02:00
Nicolai Hähnle	e348248647	winsys/amdgpu: add sparse buffer data structures v2: - remove pipe_mutex_* - use a simple page commitment array Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:18 +02:00
Nicolai Hähnle	f3e514361c	winsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:18 +02:00
Nicolai Hähnle	ae4f442304	winsys/amdgpu: build handles and flags list late on submit thread This probably has only minor performance effects, but it simplifies some subsequent code slightly. Ideally, it could also be used to simplify the handling of slab buffers in the same way, but unfortunately that's not possible as long as we need indices for relocations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:17 +02:00
Nicolai Hähnle	0e476f6c03	winsys/amdgpu: share common code in amdgpu_add_fence_dependencies Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:17 +02:00
Nicolai Hähnle	1c125fdef0	winsys/amdgpu: extract amdgpu_do_add_real_buffer We will use it for delayed adding of sparse buffers' backing buffers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:17 +02:00
Nicolai Hähnle	a338f427ac	winsys/radeon: sparse buffers will not be supported Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:17 +02:00
Nicolai Hähnle	c2637a17d9	radeon/winsys: add sparse buffer interface Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:17 +02:00
Nicolai Hähnle	d9bc4d8305	st/mesa: plumbing for sparse buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:16 +02:00
Nicolai Hähnle	2599b23f7c	st/mesa: enable ARB_sparse_buffer when supported Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:16 +02:00
Nicolai Hähnle	634266c952	trace: add resource_commit pass-through v2: fix return type to bool (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:16 +02:00
Nicolai Hähnle	0e1c75acae	ddebug: add resource_commit pass-through v2: fix return type to bool (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:16 +02:00
Nicolai Hähnle	d6e6fa01a5	gallium: add sparse buffer interface and capability v2: - explain the resource_commit interface in more detail Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:37:04 +02:00
Nicolai Hähnle	4e6feacf6a	mesa: implement sparse buffer commitment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:31:02 +02:00
Nicolai Hähnle	d6fcbe1c2a	mesa: implement sparse storage buffer allocation v2: - spec quote and style (Ian) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:31:01 +02:00
Nicolai Hähnle	94227684c4	mesa: implement SPARSE_BUFFER_PAGE_SIZE_ARB Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:31:01 +02:00
Nicolai Hähnle	d085c7ce7c	mesa: Add GL_ARB_sparse_buffer boilerplate Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:31:01 +02:00
Nicolai Hähnle	a0970de839	configure.ac: require libdrm_amdgpu 2.4.77 The sparse buffer implementation requires amdgpu_bo_va_op_raw. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-05 10:30:42 +02:00
Matt Turner	d5ee55f028	mesa: Replace program locks with atomic inc/dec. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-05 14:54:49 +10:00
Jason Ekstrand	060a6434ec	anv: Advertise larger heap sizes Instead of just advertising the aperture size, we do something more intelligent. On systems with a full 48-bit PPGTT, we can address 100% of the available system RAM from the GPU. In order to keep clients from burning 100% of your available RAM for graphics resources, we have a nice little heuristic (which has received exactly zero tuning) to keep things under a reasonable level of control. Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>	2017-04-04 18:33:52 -07:00
Jason Ekstrand	651ec926fc	anv: Add support for 48-bit addresses This commit adds support for using the full 48-bit address space on Broadwell and newer hardware. Thanks to certain limitations, not all objects can be placed above the 32-bit boundary. In particular, general and state base address need to live within 32 bits. (See also Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order to handle this, we add a supports_48bit_address field to anv_bo and only set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit for all client-allocated memory objects but leave it false for driver-allocated objects. While this is more conservative than needed, all driver allocations should easily fit in the first 32 bits of address space and keeps things simple because we don't have to think about whether or not any given one of our allocation data structures will be used in a 48-bit-unsafe way. Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>	2017-04-04 18:33:52 -07:00
Jason Ekstrand	439da38d18	anv: Replace anv_bo::is_winsys_bo with a uint32_t flags Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>	2017-04-04 18:33:52 -07:00
Jason Ekstrand	f938354362	i965/blorp: Align vertex buffers to 64B Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-04 18:33:52 -07:00
Jason Ekstrand	5d1ba2cb04	anv/blorp: Align vertex buffers to 64B This fixes issues seen when adding support for full 48-bit addresses. The 48-bit addresses themselves have nothing to do with it other than that it caused the kernel to place buffers slightly differently so they interacted differently with the caches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-04 18:33:52 -07:00
Jason Ekstrand	c964f0e485	anv: Query the kernel for reset status When a client causes a GPU hang (or experiences issues due to a hang in another client) we want to let it know as soon as possible. In particular, if it submits work with a fence and calls vkWaitForFences or vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be able to trust the results of that rendering. In order to provide this guarantee, we have to ask the kernel for context status in a few key locations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-04 18:33:52 -07:00
Jason Ekstrand	82573d0f75	anv: Check for device loss at the end of WaitForFences It's possible that the device could have been lost while we were waiting. We should let the user know if this has happened. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-04 18:33:51 -07:00
Jason Ekstrand	c6f69eea6a	anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex When the shader does not set one of these values, they are supposed to get a default value of 0. We have hardware bits in 3DSTATE_CLIP for this but haven't been setting them. This fixes the intermittent failure of dEQP-VK.geometry.layered.3d.render_to_default_layer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-04 18:33:51 -07:00
Jason Ekstrand	3503b2714b	i965/fs: Always provide a default LOD of 0 for TXS and TXL We already provide a default LOD for textureQueryLevels and texture() on non-fragment stages. However, there are more cases where one is needed such as textureSize(gsampler2DMS*) in SPIR-V. Instead of trying to list out all of the cases one at a time, just provide the default for all TXS and TXL operations. This fixes a shader validation error in the new Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-04-04 18:33:35 -07:00
Kenneth Graunke	c5bf7cb529	mesa: Require mipmap completeness for glCopyImageSubData(), sometimes. This patch makes glCopyImageSubData require mipmap completeness when the texture object's built-in sampler object has a mipmapping MinFilter. Fixes (on i965): dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-04-04 17:35:18 -07:00
Vinson Lee	c161a10462	libgl-xlib: Link with libunwind. Fix linking error. CXXLD libGL.la ../../../../src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `debug_backtrace_capture': src/gallium/auxiliary/util/u_debug_stack.c:59: undefined reference to `_Ux86_64_getcontext' src/gallium/auxiliary/util/u_debug_stack.c:60: undefined reference to `_ULx86_64_init_local' src/gallium/auxiliary/util/u_debug_stack.c:62: undefined reference to `_ULx86_64_step' src/gallium/auxiliary/util/u_debug_stack.c:71: undefined reference to `_ULx86_64_get_proc_info' src/gallium/auxiliary/util/u_debug_stack.c:73: undefined reference to `_ULx86_64_get_proc_name' src/gallium/auxiliary/util/u_debug_stack.c:65: undefined reference to `_ULx86_64_step' Fixes: `70c272004f` ("gallium/util: libunwind support") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100562 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-04-04 16:47:41 -07:00
Jason Ekstrand	1fde054b8f	intel/isl: Refactor and clarify gen8 alignment calculations Adding the actual table from the docs makes it clearer exactly what the restrictions are. In particular, it becomes clear that compressed textures ignore the alignment parameters in RENDER_SURFACE_STATE. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-04-04 14:51:57 -07:00
Francisco Jerez	0de17f52a5	drirc: Set glsl_zero_init for Kerbal Space Program. This fixes the stripes of garbage rendered on the floor of the vehicle assembly building among other rendering issues. The reason for the misrendering seems to be that some of the GLSL shaders used by the application use variables before initializing them, incorrectly assuming that they will be implicitly set to zero by the implementation. Acked-by: Matt Turner <mattst88@gmail.com>	2017-04-04 14:13:03 -07:00
Lionel Landwerlin	e8d9b76f63	intel: tools: add aubinator_error_decode tool This is pretty much the same tool as what i-g-t has, only with a more fancy decoding of the instructions/registers. It also doesn't support anything before gen4. v2 (from Matt): Drop authors Remove undefined automake variable v3: Fix incorrect offsets for dword > 1 (Jordan) v4: Fix decompression error with large blobs (Jordan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Lionel Landwerlin	567d77885e	intel: genxml: add RING_BUFFER_CTL registers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Lionel Landwerlin	6f260ff049	intel: genxml: add FAULT_REG register Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Lionel Landwerlin	ca2771fa18	intel: genxml: add gen7 ERR_INT register v2: add register to gen7.5 (Matt) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Lionel Landwerlin	84613bf6d5	intel: genxml: add ACTHD registers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Lionel Landwerlin	0f195f22aa	intel: genxml: add GFX_ARB_ERROR_RPT register Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Lionel Landwerlin	d1a7a54d77	intel: genxml: add INSTDONE registers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-04 21:22:26 +01:00
Marek Olšák	18b12bf533	targets: export radeon winsys_create functions to silence LLVM warning It silences the following radeonsi LLVM warning due to a previous commit adding an LLVM workaround: "mesa: for the -simplifycfg-sink-common option: may only occur zero or one times!" Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-04 22:15:47 +02:00
Constantine Kharlamov	6ee486899b	r600g: check rasterizer primitive states like in radeonsi Specifically, non-line primitives skipped, and defaulting to reset on each packet. The skip of non-line primitives saves ≈110 resetting of PA_SC_LINE_STIPPLE register per frame in Kane&Lynch2. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-04-04 22:15:47 +02:00
Constantine Kharlamov	7ade08e2a8	r600g: extract a code into a r600_emit_rasterizer_prim_state() Also change gs_output_prim type: unsigned → pipe_prim_type. The idea of the code is mostly taken from radeonsi. The new code operating on prev/curr rast_primitives saves ≈15 reloads of PA_SC_LINE_STIPPLE per frame in Kane&Lynch2 Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-04-04 22:15:47 +02:00
Constantine Kharlamov	fa8bc90990	r600g/radeonsi: use the correct types (taken from pipe_draw_info) Note: si_shader.h also has a "type" variable that should be changed to "enum pipe_prim_type"; however, it triggers a bunch of warnings about unhandled switches, so, not knowing the correct way to handle them, I decided to leave it as is. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-04-04 22:15:47 +02:00
Constantine Kharlamov	ef62a7651c	r600g: remove duplicate memset by using a pointer, and constify args Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-04-04 22:15:47 +02:00
Elie TOURNIER	ba5b1ab3e0	glsl: remove unused file udivmod64 appears in src/compiler/glsl/builtin_int64.h and src/compiler/glsl/udivmod.h The second file seems unused. Fix commit `6b03b345eb` This change doesn't affect shader-db. Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-04-04 18:37:42 +01:00
Marek Olšák	6ca46c3d77	radeonsi: access gallivm through ctx in most places Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 16:55:21 +02:00
Marek Olšák	04e4fe594b	radeonsi: use ctx->types instead of bld->types etc. even vec_type is f32. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 16:55:19 +02:00
Marek Olšák	7a5e6dcba5	radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 16:55:16 +02:00
Marek Olšák	7216e1d8af	gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	295f4f56cb	gallium: decrease the size of pipe_vertex_element - 16 -> 8 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	e6428092f5	gallium: decrease the size of pipe_resource - 64 -> 48 bytes Some other changes needed here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	3dfe61ed6e	gallium: decrease the size of pipe_box - 24 -> 16 bytes Also: pipe_transfer: 48 -> 40 bytes. pipe_blit_info = 176 -> 160 bytes. v2: add a comment at pipe_box Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	9869a3b3ba	gallium: decrease the size of pipe_sampler_view - 48 -> 32 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	4648bc2a8f	gallium: decrease the size of pipe_surface - 48 -> 40 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	eb0fd0e5f8	gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	19bc74f513	gallium: decrease the size of pipe_stream_output_info - 532 -> 268 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	15ff2f7aa9	gallium: decrease the size of pipe_rasterizer_state - 36 -> 32 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	18e760346a	amd/addrlib: second update for Vega10 + bug fixes Highlights: - Display needs tiled pitch alignment to be at least 32 pixels - Implement Addr2ComputeDccAddrFromCoord(). - Macro-pixel packed formats don't support Z swizzle modes - Pad pitch and base alignment of PRT + TEX1D to 64KB. - Fix support for multimedia formats - Fix a case "PRT" entries are not selected on SI. - Fix wrong upper bits in equations for 3D resource. - We can't support 2d array slice rotation in gfx8 swizzle pattern - Set base alignment for PRT + non-xor swizzle mode resource to 64KB. - Bug workaround for Z16 4x/8x and Z32 2x/4x/8x MSAA depth texture - Add stereo support - Optimize swizzle mode selection - Report pitch and height in pixels for each mip - Adjust bpp/expandX for format ADDR_FMT_GB_GR/ADDR_FMT_BG_RG - Correct tcCompatible flag output for mipmap surface - Other fixes and cleanups Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	3e7d62a774	radeonsi: use i32_0 and i32_1 more Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	29adaa19ac	radeonsi: remove most uses of lp_build_const* Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	7cec96a038	radeonsi: clean up 'radeon_bld' references Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 11:14:43 +02:00
Marek Olšák	0fb5a505fa	radeonsi: fix broken texture filtering on SI-CIK since GFX9 changes Don't clear state[7] on SI-CIK, and only do the meta stuff on VI+. Fixes: `5abf60076c` ("radeonsi/gfx9: image descriptor changes in mutable fields") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100531 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-04 11:14:43 +02:00
Juan A. Suarez Romero	1bcdf74cdd	bin/get-fixes-pick-list.sh: fix typo Replace "nore" by "more". Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-04 09:05:44 +02:00
Mauro Rossi	72175bd2a5	android: intel: genxml: fix genX_xml.h generation rules Recent changes in Makefile.sources merged the aubinator files in a unique list of generated files and genxml/genX_xml.h is now needed to avoid the following building error: ninja: error: '.../genxml/genX_xml.h', needed by '.../genxml/genX_xml.h', missing and no known rule to make it build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed Fixes: `0f83c05` "intel: genxml: compress all gen files into one" Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-04-04 09:10:46 +03:00
Jason Ekstrand	405ef7bb33	intel/vec4: Add some fall through comments Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-03 16:58:35 -07:00
Bartosz Tomczyk	64b3aa7ad8	mesa/glthread: Avoid unnecessary batch reallocation Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-04 09:56:52 +10:00
Bas Nieuwenhuizen	6e5e8a2e49	radv: Increase descriptor limits. We supported more generally. Decreased the dynamic buffers though, as we only support 16 for uniform+storage. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-04-04 01:47:47 +02:00
Bartosz Tomczyk	95720851e2	mesa/glthread: fix misaligned address access Address sanitizer reports lot of misaligned access: SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 for type 'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment 0x631000104866: note: pointer points here 92 88 00 00 00 00 00 00 4a 03 0c 00 93 88 00 00 00 00 00 00 02 01 0c 00 40 8d 00 00 00 00 00 00 ^ SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 in main/marshal_generated.c:28726:12: runtime error: member access within misaligned address 0x6310003fc874 for type 'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment 0x6310003fc874: note: pointer points here 01 00 00 00 7a 02 20 00 00 00 00 00 be be be be be be be be be be be be be be be be be be be be ^ SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 in main/marshal_generated.c:28726:12: runtime error: store to misaligned address 0x6310003fc87c for type 'GLint' (aka 'int'), which requires 8 byte alignment 0x6310003fc87c: note: pointer points here 00 00 00 00 be be be be be be be be be be be be be be be be be be be be be be be be be be be be Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-04 09:39:03 +10:00
Bartosz Tomczyk	bcb63ee63e	glsl: Fix blob memory leak Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-04 09:22:29 +10:00
Bas Nieuwenhuizen	a4c4efad89	radv: Rework guard band calculation. We want the guardband_x/y to be the largest scalars such that each viewport scaled by that amount is still a subrange of [-32767, 32767]. The old code has a couple of issues: 1) It used scissor instead of viewport_scissor, potentially taking into account a viewport that is too small and therefore selecting a scale that is too large. 2) Merging the viewports isn't ideal, as for example viewports with boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k, while their union [0, 1001] only allows a scale of ~32. The new code just determines the guardband per viewport and takes the minimum. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-04-03 23:03:46 +02:00
Bas Nieuwenhuizen	d64f689f61	radv: Enable VK_KHR_incremental_present. Just enabling the driver-independent implementation that Jason did. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-03 23:00:07 +02:00
Jason Ekstrand	0817110969	anv: Implement VK_KHR_incremental_present Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-04-03 13:51:08 -07:00
Jason Ekstrand	be1ecd8c6e	vulkan/wsi/wayland: Pass damage through to the compositor Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-04-03 13:51:08 -07:00
Jason Ekstrand	f82b6c6272	vulkan/wsi: Plumb present regions through the common code Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-04-03 13:51:08 -07:00
Jason Ekstrand	3598a2907c	vulkan/wsi: Fix some line wrapping Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-04-03 13:51:08 -07:00
Dave Airlie	22b116171f	radv: fix interp at sample code. Interp at sample needs to use the center, since the sample positions it retrieves are relative to the center. This fixes a bunch of CTS tests with multisample_interpolation. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:55:21 +10:00
Dave Airlie	1171b304f3	radv: overhaul fragment shader sample positions. The current code was broken, and I decided to redesign it instead. This puts the sample positions for all samples into the queue constant descriptor buffer after all the spill/ring descriptors. It then uses a single offset register to point how far into the samples the samples for num_samples are. This saves one user sgpr and means we only generate the sample position data in the rare single case where we need it currently. This doesn't fix the failing CTS tests without the followup fix. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:55:15 +10:00
Lionel Landwerlin	471c1bc7cc	aubinator/gen_decoder/i965: decode instructions from dword 0 Some packets like 3DSTATE_VF_STATISTICS, 3DSTATE_DRAWING_RECTANGLE, 3DPRIMITIVE, PIPELINE_SELECT, etc... have configurable fields in dword0, we probably want to print those. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-03 20:45:34 +01:00
Lionel Landwerlin	04f2e80257	intel: gen_decoder: store pointer to current decoded field in iterator Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-03 20:45:34 +01:00
Dave Airlie	1e9e747d00	radv/ac: fix texture derivative ordering The ordering NIR gives us is correct for the hw, this fixes: dEQP-VK.glsl.texture_functions.texturegrad.* (mainly triggered on isampler/usampler 3d textures.). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:39:10 +10:00
Dave Airlie	303d22f319	radv/ac: round cube array coordinate before fixup. This fixes: dEQP-VK.glsl.texture_functions.texture.samplercubearray* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:39:07 +10:00
Dave Airlie	5821f676ee	radv: move to using common buffer load format. Get rid of usage of SI.vs.load.input. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-04 05:37:52 +10:00
Brian Paul	b98ec1e920	util: fix MSVC warning in u_align_u32() To silence C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary minus operator applied to unsigned type, result still unsigned Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-03 13:09:05 -06:00
Brian Paul	960f640c7a	util: #include "c99_compat.h" to fix Windows build Otherwise, we were getting the definition for 'inline' by chance from some other preceding #include. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-03 13:09:05 -06:00
Brian Paul	0fb2c16b3b	util: s/SHA1_H/MESA_SHA1_H/ To follow the convention of other header include guards. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-03 13:09:05 -06:00
Brian Paul	7348df81b8	svga: add comment on svga_buffer_hw_storage_map() Trivial.	2017-04-03 13:09:05 -06:00
Rhys Kidd	1572d11d89	travis: Support LLVM 3.8+ on Trusty-based Travis-CI via apt-get not apt addon Per comments by Travis-CI, the apt addon is only really needed for the container-based Precise builds, as they don't yet support Trusty on that platform. Mesa currently uses Trusty fully-virtualized environment (due to sudo: required). See further: https://docs.travis-ci.com/user/trusty-ci-environment/#Fully-virtualized-via-sudo%3A-required https://github.com/travis-ci/apt-source-whitelist/pull/205#issuecomment-216054237 Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-04-03 19:43:12 +01:00
Grazvydas Ignotas	a6a38a038b	util/u_atomic: provide 64bit atomics where they're missing There are still some distributions trying to support unfortunate people with old or exotic CPUs that don't have 64bit atomic operations. When compiling for such a machine, gcc conveniently inserts a library call to a helper, but its implementation is missing and we get a linker error. This allows us to provide our own implementation, which is marked weak to prefer a better implementation, should one exist. v2: changed copyright, some style adjustments v3: [mattst88] Print results with AC_MSG_CHECKING/AC_MSG_RESULT Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-03 10:52:41 -07:00
Rob Clark	70c272004f	gallium/util: libunwind support It's kinda sad that (a) we don't have debug_backtrace support on !X86 and that (b) we re-invent our own crude backtrace support in the first place. If available, use libunwind instead. The backtrace format is based on what xserver and weston use, since it is nice not to have to figure out a different format. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-03 11:32:17 -04:00
Rob Clark	c3c884c49c	gallium/util: clean up stack frame printing Prep work for next patch. Ideally 'struct debug_stack_frame' would be opaque, but it is embedded in a bunch of places. But at least we can treat it opaquely. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-04-03 11:32:17 -04:00
Samuel Pitoiset	0c0b29591c	st/mesa: add st_convert_image() Should be used by the state tracker when glGetImageHandleARB() is called in order to create a pipe_image_view template. v3: - move the comment to *.c v2: - make 'st' const - describe the function Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 11:29:31 +02:00
Samuel Pitoiset	90534e9dba	st/mesa: make 'st' const in st_mesa_format_to_pipe_format() This avoids a compilation warning since st_convert_image() requires 'st' to be const. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 11:29:18 +02:00
Bartosz Tomczyk	8d919ba384	mesa/glthread: Call unmarshal_batch directly in glthread_finish Call it directly when batch queue is empty. This avoids costly thread synchronisation. This commit improves performance of games that have previously regressed with mesa_glthread=true. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-03 10:33:31 +10:00
Timothy Arceri	dbdd7231c2	mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled We could re-enable it also but I haven't tested that yet, and I'm not sure we care much anyway. V2: don't disable it from within the call itself. We need a custom marshalling function or we get stuck waiting for thread to finish. V3: tidy up redundant code copied from generated version. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-03 09:31:11 +10:00
Grazvydas Ignotas	a0f0f3958e	amd/addrlib: fix optimized build warnings All the -Wunused-but-set-variable ones. Found a way to do it with a oneliner. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 00:48:26 +02:00
Grazvydas Ignotas	8e42038d87	radeonsi: use unreachable to fix a warning si_state.c: In function ‘si_make_texture_descriptor’: si_state.c:3240:25: warning: ‘num_format’ may be used uninitialized si_state.c:3240:12: warning: ‘data_format’ may be used uninitialized Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 00:46:35 +02:00
Constantine Kharlamov	dc6b3c031e	r600g: Add more (un)likely functions 1-st is obvious because of assert, 2-nd stolen from si_draw_vbo(), and 3-rd is just a small refactoring. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 00:36:25 +02:00
Constantine Kharlamov	807de52054	r600g: Remove intermediate assignment of pipe_draw_info It removes the need to copy the whole struct on every call for no reason. Comparing objdump -d output for the original and this patch, both compiled with -O2, shows the function shrinks by 16 bytes. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 00:36:25 +02:00
Constantine Kharlamov	4408e1ca53	r600g: Use separate index_bias variable Needed to get rid of a separate struct allocation in the next patch, because the one in the argument is a constant and doesn't allow changing its fields. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-04-03 00:36:25 +02:00
Ilia Mirkin	cb518f2fb2	nv30: fp/rast may be null when validating fb/scissor due to clear Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-02 11:03:00 -04:00
Ilia Mirkin	1184fba86e	nvc0: fragprog may not be set when e.g. clearing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-02 10:58:32 -04:00
Ilia Mirkin	7a0c1eee0c	nv50: don't assume a rast is set when validating for clears Clears can happen before a rast is set, which can in turn cause scissors and fragprog to be validated. Make sure that we handle this case. Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-02 10:58:32 -04:00
Dave Airlie	03a67fbbf7	radv: fix order of the guardband register emission. y is vert, x is horiz. Noticed in visual inspection compared to radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-02 20:17:30 +10:00
Edward O'Callaghan	f9387a223d	mesa/main: Fix memset in formatquery.c v2: We explicitly set each member to -1 over using a confusing memset(). Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-02 15:18:38 +10:00
Samuel Pitoiset	515165ff0e	radeonsi: add load_image_desc() Similar to load_sampler_desc(). Same deal for bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-01 18:07:49 +02:00
Samuel Pitoiset	2f44402386	radeonsi: rework the load_sampler_desc() helpers Will be more convenient for bindless because the 64bit handle is actually the base_ptr of the descriptor (ie. 'list' will be fetched from TGSI_FILE_CONSTANT/TGSI_FILE_TEMPORARY instead). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-01 18:07:49 +02:00
Samuel Pitoiset	8a3ef8c65d	gallivm: add lp_build_emit_fetch_src() helper lp_build_emit_fetch() is useful when the source type can be inferred from the instruction opcode. However, for bindless samplers/images we can't do that easily because tgsi_opcode_infer_src_type() returns TGSI_TYPE_FLOAT for TEX instructions, while we need TGSI_TYPE_UNSIGNED64 if the resource register is bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-01 18:07:49 +02:00
Andres Gomez	8b10bf273d	docs: add news item and link release notes for 17.0.1 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-04-01 18:51:40 +03:00
Andres Gomez	f4d2f3aa30	docs: add sha256 checksums for 17.0.3 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `71d2f05a9e`)	2017-04-01 18:50:08 +03:00
Andres Gomez	5fa3f63036	docs: add release notes for 17.0.3 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `7f34ecae7f`)	2017-04-01 18:50:06 +03:00
Erik Faye-Lund	86a9ddfef7	glsl: ir_explog_to_explog2 is no more Since `63684a9a` ("glsl: Combine many instruction lowering passes into one.", Thu Nov 18 2010), we no longer have anything called ir_explog_to_explog2. So it's only confusing to have those references there. Update with the appropriate method, so people can grep for it in the current tree if they encounter it. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-01 13:39:52 +02:00
Erik Faye-Lund	99d8b933fd	gallium/docs: remove documentation of removed arg geom was removed in `e968975` ("gallium: remove the geom_flags param from is_format_supported", Tue Mar 8 00:01:58 2011 +0100), but the documentation of it was left over. Let's bring the documentation up to date. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-01 13:39:52 +02:00
Erik Faye-Lund	c33807463e	st/mesa: avoid aliasing violation in st_cb_perfmon.c Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-01 13:39:52 +02:00
Michal Srb	52f9ccefcb	st: Add cubeMapFace parameter to st_finalize_texture. st_finalize_texture always accesses image at face 0, but it may not be set if we are working with cubemap that had other face set. This fixes crash in piglit same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-01 09:03:23 +02:00
Jason Ekstrand	d6fccb4c09	vulkan: Bump the header and XML to the latest public version	2017-03-31 22:41:43 -07:00
Karol Herbst	baaae8cb81	nv50/ir: also do PostRaLoadPropagation for FMA Helps Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3934925 -> 3934327 (-0.02%) total gprs used in shared programs : 481563 -> 481563 (0.00%) total local used in shared programs : 27469 -> 27469 (0.00%) total bytes used in shared programs : 36061888 -> 36056504 (-0.01%) local gpr inst bytes helped 0 0 228 228 hurt 0 0 0 0 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:16 -04:00
Karol Herbst	7d007824a3	gm107/ir: add LIMM form of mad v2: renamed commit reordered modifiers add assert(dst == src2) v3: reordered modifiers again v5: no rounding bit for limms Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:15 -04:00
Karol Herbst	ad638514e3	gk110/ir: add LIMM form of mad v2: renamed commit reordered modifiers add assert(dst == src2) v3: removed wrong neg mod emission Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:14 -04:00
Karol Herbst	d346b8588c	nv50/ir: implement mad post ra folding for nvc0+ changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640: score: 1026 -> 1045 changes for shader-db: total instructions in shared programs : 3943335 -> 3934925 (-0.21%) total gprs used in shared programs : 481563 -> 481563 (0.00%) total local used in shared programs : 27469 -> 27469 (0.00%) total bytes used in shared programs : 36139384 -> 36061888 (-0.21%) local gpr inst bytes helped 0 0 3587 3587 hurt 0 0 0 0 v2: removed TODO reordered to show changes without RA modification removed stale debugging print() call v3: remove predicate checks enable only for gf100 ISA Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:13 -04:00
Karol Herbst	d6ce325147	nv50/ir: restructure and rename postraconstantfolding pass we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message v4: rename it to PostRaLoadPropagation Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:12 -04:00
Karol Herbst	f2a4d881fe	nvc0/ir: also do ConstantFolding for FMA Helps mainly Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3941587 -> 3940749 (-0.02%) total gprs used in shared programs : 481511 -> 481460 (-0.01%) total local used in shared programs : 27469 -> 27481 (0.04%) total bytes used in shared programs : 36123344 -> 36115776 (-0.02%) local gpr inst bytes helped 2 48 243 243 hurt 2 3 32 32 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:10 -04:00
Karol Herbst	fac921db63	nvc0/ir: disable support for LIMMs on MAD/FMA I hit an assert in the emitter while toying around with optimizations, because ConstantFolding immediated a big int into a mad. There is special handling for FMA/MAD in insnCanLoad, which is broken. With this patch the special path should no longer be hit. Anyway, the constraints for the LIMMs can't be guaranteed in SSA form and I have patches pending to use it via a post-SSA optimization pass. As a result, immediates get immediated for int mad/fmas as well. changes in shader-db: total instructions in shared programs : 3943335 -> 3941587 (-0.04%) total gprs used in shared programs : 481563 -> 481511 (-0.01%) total local used in shared programs : 27469 -> 27469 (0.00%) total bytes used in shared programs : 36139384 -> 36123344 (-0.04%) Signed-off-by: Karol Herbst <karolherbst@gmail.com> [imirkin: remove extra bit from insnCanLoad as well] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 23:57:08 -04:00
Lyude	31970ab9a6	nvc0: Add support for NV_fill_rectangle for the GM200+ This enables support for the GL_NV_fill_rectangle extension on the GM200+ for Desktop OpenGL. Signed-off-by: Lyude <lyude@redhat.com> Changes since v1: - Fix commit message - Add note to reldocs Changes since v2: - Remove unnecessary parens in nvc0_screen_get_param() - Fix sorting in release notes - Don't execute FILL_RECTANGLE method on pre-GM200+ GPUs Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 21:41:36 -04:00
Lyude	82e0c5f484	st/mesa: Add support for NV_fill_rectangle Signed-off-by: Lyude <lyude@redhat.com> Changes since v1: - Fix commit name Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 21:41:32 -04:00
Lyude	1cc7352c4c	gallium: Add NV_fill_rectangle to pipe state Signed-off-by: Lyude <lyude@redhat.com> Changes since v1: - Fix accidental widening of bitfields Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 21:41:29 -04:00
Lyude	ffe2bd676f	gallium: Add a cap to check if the driver supports fill_rectangle Changes since v1: - Add pipe caps for etnaviv, freedreno, swr and virgl Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 21:41:24 -04:00
Lyude	54af467334	mesa: Add support for GL_NV_fill_rectangle Since we don't have the bits required to support this in OpenGLES yet, this only enables support for Desktop OpenGL Signed-off-by: Lyude <lyude@redhat.com> Changes since v1: - Simplify _mesa_PolygonMode() a little bit - Fix formatting in OpenGL spec excerpts - Move polygon mode checking into _mesa_valid_to_render() Changes since v3: - Improve error message for invalid drawings with GL_FILL_RECTANGLE_NV Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 21:41:20 -04:00
Lyude	a7cb2b40ed	glapi: Add GL_NV_fill_rectangle Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-31 21:41:08 -04:00
Marek Olšák	150736b5c3	gallium: remove support for predicates from TGSI (v2) Never used. v2: gallivm: rename "pred" -> "exec_mask" etnaviv: remove the cap gallium: fix tgsi_instruction::Padding Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-04-01 00:06:41 +02:00
Dave Airlie	c011fe7452	radv: enable tessellation shaders. This enables tessellation shaders and sets some values for the maximums. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:17:25 +10:00
Dave Airlie	cb1518e96b	radv/ac: setup lds for tessellation This seems to get lost in the rebases, should fix the tessellation demos, crash in llvm. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:17:15 +10:00
Dave Airlie	3f0d69af20	radv: add ia_multi_vgt_param tessellation support. This just ports the relevant radeonsi pieces. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:17:08 +10:00
Dave Airlie	b4495b71c6	radv/cmd: emit tessellation state. This emits the tessellation shaders and state to the command stream. It contains the logic to emit the LS/HS shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:57 +10:00
Dave Airlie	60fc0544e0	radv/pipeline: handle tessellation shader compilation So tess shaders have some circular dependencies, TCS needs the TES primitive mode TES needs the TCS vertices out This builds the nir for each shader first to get the info, executes a tes specific nir pass, then builds the LLVM shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:51 +10:00
Dave Airlie	aaabdd6bc6	radv/ac: handle writing out tess factors. This ports the code from radeonsi to build the if/endif, and ports the tess factor emission code. This code has an optimisation TODO that we can deal with later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:47 +10:00
Dave Airlie	94f9591995	radv/ac: add support for TCS/TES inputs/outputs. This adds support for the tessellation inputs/outputs to the shader compiler, this is one of the main pieces of the patch. It is very similar to the radeonsi code (post merge we should consider if there are better sharing opportunities). The main difference from radeonsi is that we can have "compact" varyings for clip/cull/tess factors, and we have to add special handling for these. This consists of treating the const index from the deref differently depending on the compactness. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:42 +10:00
Dave Airlie	5ab1289b48	radv/ac: add clip support for tess eval shader. As this may be the last shader to emit clip distances. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:37 +10:00
Dave Airlie	326b9bc6dc	radv/ac: hook up tessellation intrinsics. This just adds support for the nir intrinsics that tessellation uses. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:32 +10:00
Dave Airlie	d8ab71b207	radv/ac: hook up shader information handling for tessellation This hooks up the tessellation shader info to the nir values and ctx generated ones. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:27 +10:00
Dave Airlie	4c60c68bd1	radv/pipeline: start calculating tess stage. This calculates the pipeline state for tessellation. It moves the gs ring calculation down to below where the tessellation shaders will be compiled, as it needs the info from those shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:19 +10:00
Dave Airlie	823b55a8a9	radv: add tessellation support to variant code. This just fills out the rsrc registers for tess shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:14 +10:00
Dave Airlie	f239f59778	radv: add tessellation support to shader naming Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:08 +10:00
Dave Airlie	5b40eab00a	radv: add tess ctrl stage barrier workaround for SI. This just ports the workaround from radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:16:04 +10:00
Dave Airlie	3a633cc2cb	radv/ac: add support for patch inputs to unique index code. This adds support for tessellation patch inputs to the code that finds the unique parameter index. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:57 +10:00
Dave Airlie	aeb49bc2b9	radv: port polaris vgt vertex reuse workaround. This ports the VGT_VERTEX_REUSE register settings for Polaris GPUs from radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:51 +10:00
Dave Airlie	46a820b383	radv: configure tessellation distribution register. This just takes the radeonsi values. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:45 +10:00
Dave Airlie	60326a7afc	radv/ac: setup tessellation shader inputs. This just configures all the register inputs for the tessellation related stages. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:41 +10:00
Dave Airlie	3968162751	radv/ac: setup tess rings on compiler side. This just sets up the necessary pointers on the compiler side for the rings needed for tessellation. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:35 +10:00
Dave Airlie	46e52df34d	radv: add tessellation ring allocation support. (v2) This patch adds support for the offchip rings for storing tessellation factors and attribute data. It includes the register setup for the TF ring v2: always do tess ring size calcs (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:30 +10:00
Dave Airlie	bbfb62df16	radv: add support for some device specific tess information. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:26 +10:00
Dave Airlie	2b3c4bcc1f	radv/ac: add tess changes to shader keys/info This adds the tess pieces for shader keys and shader info, it adds the necessary bits to the vertex key/info as well. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:22 +10:00
Dave Airlie	a4b039db04	radv: add tess shader stage user data support. This just adds support for tess to the shader stage conversion and emits the per-stage descriptors/constants for tess stages. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:15 +10:00
Dave Airlie	a5136a97f7	radv: use defines for ring descriptor offsets. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:15:12 +10:00
Dave Airlie	0604284e3f	radv: add helper function to denote if tess is enabled on a pipeline. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:59 +10:00
Dave Airlie	97e0ff30c0	radv: handle clip dist in es outputs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:53 +10:00
Dave Airlie	6279646306	radv: drop unneeded start Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:39 +10:00
Dave Airlie	a58d03a5a2	radv: fixup geometry clip emission since using the geom pass Fixes: `2b35b60d`: radv: move to using nir clip/cull merge pass. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-04-01 07:14:38 +10:00
Marek Olšák	744317c9d2	radeonsi/gfx9: allow CMASK fast clear with RB+ Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	ea59521475	radeonsi/gfx9: don't compare src_va w/ dst_va for CP_DMA_CLEAR src_va contains the clear value in this case. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	e3cb67dc6b	radeonsi/gfx9: fix 1D array fetches with derivs, bias, or Z compare value Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	6ab2042761	radeonsi/gfx9: fix and enable single-sample CMASK fast clear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	d4bb4583b0	radeonsi/gfx9: fix and enable MSAA compression Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	06d725ab2f	radeonsi/gfx9: disable CE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	35aaccaf81	radeonsi/gfx9: fix linear mipmap CPU access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	322eb13f09	radeonsi: add tests verifying that VM faults don't hang GFX9 hangs instead of writing VM faults to dmesg. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	283c31afa1	radeonsi: unify HS max_offchip_buffers workarounds Vulkan doesn't set more than 508. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:57 +02:00
Marek Olšák	829bd77235	radeonsi: adjust checking for SC bug workarounds no change in behavior, just making sure that no later chips will use the workarounds Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 21:41:56 +02:00
Brian Paul	2936f5c37e	glsl: use -O1 optimization for builtin_functions.cpp with MinGW Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code with -O2 or -O3 causing a random driver crash when running programs that use GLSL. Most Mesa demos in the glsl/ directory trigger the bug, but not the fragcoord.c test. Use a #pragma to force -O1 for this file for later MinGW versions. Luckily, this is basically one-time setup code. I suspect the bug is related to the sheer size of this file. This should let us move to newer versions of MinGW-w64 for Mesa. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 13:36:25 -06:00
Brian Paul	15bb0511d6	tnl: remove unused var to silence warning Trivial.	2017-03-31 13:30:54 -06:00
Neha Bhende	2e24a11f1d	st/wgl: Replace variable name hdc with hDrawDC Reviewed-by: Brian Paul <brianp@vmware.com>	2017-03-31 13:30:54 -06:00
Brian Paul	7d0aac2392	st/wgl: add support for WGL_ARB_make_current_read This adds the wglMakeContextCurrentARB() and wglGetCurrentReadDCARB() functions. Signed-off-by: Brian Paul <brianp@vmware.com>	2017-03-31 13:30:54 -06:00
Brian Paul	7753f040fa	stw/wgl: add null context check in wglBindTexImageARB() To avoid dereferencing a null pointer in case wglMakeCurrent() wasn't called. Found while debugging SWKOTOR game. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-03-31 13:30:53 -06:00
Marek Olšák	7d2fa8dc10	radeonsi: decompress DCC in set_sampler_view instead of create_sampler_view (v2) v2: don't add a new decompress helper function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 20:57:53 +02:00
Marek Olšák	8c7d1ded19	radeonsi: decompress DCC in set_framebuffer_state instead of create_surface (v2) for threaded gallium, which can't use pipe_context in create_surface v2: don't add a new decompress helper function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 20:57:53 +02:00
Nicolai Hähnle	d10fbe5159	st/glsl_to_tgsi: fix 64-bit integer bit shifts Fix a bug that was caused by a type mismatch in the shift count between GLSL and TGSI. I briefly considered adjusting the TGSI semantics, but since both LLVM and AMD GCN require both arguments to be of the same type, it makes more sense to keep TGSI as-is -- it reflects the underlying implementation better. I'm also sending out piglit tests that expose this error. v2: use the right number of components for the temporary register Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 18:15:50 +02:00
Nicolai Hähnle	c22841d8d2	tgsi: fix printing of 64-bit integer immediates Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 18:15:47 +02:00
Lionel Landwerlin	74a80d579d	intel: genxml: fix out of tree builds v2: use Emil's recommendation change rule to closer to genxml/genX_bits.h Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-31 15:29:57 +01:00
Thomas Hellstrom	18e2aa063c	gbm/dri: Check dri extension version before flush after unmap The commit mentioned below required the __DRI2FlushExtension to have version 4 or above, for GBM functionality. That broke GBM with some classic dri drivers. Relax that requirement so that we only flush after unmap if we have version 4 or above. Drivers that require the flush for correct functionality should implement the desired version. Fixes: `ba8df228` ("gbm/dri: Flush after unmap") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Dylan Baker <dylan@pnwbakers.com>	2017-03-31 10:25:46 +02:00
Nicolai Hähnle	02112c3ef7	radeonsi: implement ARB_shader_group_vote Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:27 +02:00
Nicolai Hähnle	cd3f386069	radeonsi: enable ARB_shader_clock Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:27 +02:00
Nicolai Hähnle	2290535d62	radeonsi: emit TGSI_OPCODE_CLOCK Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:26 +02:00
Nicolai Hähnle	65b542a7cc	st/mesa: implement ARB_shader_clock Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:26 +02:00
Ilia Mirkin	94ec847cb0	tgsi: add CLOCK opcode Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:26 +02:00
Nicolai Hähnle	d0c7f924a3	gallium: add PIPE_CAP_TGSI_CLOCK Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:25 +02:00
Nicolai Hähnle	44125b29d1	glsl: fix clockARB builtin function The underlying intrinsic is defined to always have a uvec2 return type. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 07:56:25 +02:00
Tapani Pälli	3535b87a1a	anv: change BLOCK_POOL_MEMFD_SIZE to 1GB This allows us to run 32bit Vulkan apps on Android, ftruncate call would fail on 2GB (max size being 2GB - 1). Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-31 08:43:28 +03:00
Tapani Pälli	2398770c87	android: add libmesa_genxml as dep to libmesa_isl This is to fix following compile error with libmesa_isl: mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found Fixes: `f0eaf38` ("genxml: New generated header genX_bits.h (v6)") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-31 08:42:54 +03:00
Timothy Arceri	3e524cfa47	mesa: remove MESA_GLSL=opt This is unused. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 13:43:38 +11:00
Timothy Arceri	2caa3aa1f4	mesa: remove MESA_GLSL=no_opts env option This is confusing because it only applies to GL_ARB_vertex/fragment_program, and because of that it's also not very useful. If someone requires this for debugging they can just make an ad-hoc code change. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 13:43:38 +11:00
Timothy Arceri	94224950dd	mesa: move FLUSH_VERTICES() call to meta There is no need for this to be in the common code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 13:43:38 +11:00
Timothy Arceri	2e70de7d2f	mesa/vbo: remove redundant _mesa_is_bufferobj() calls This is already called inside the vbo_exec_vtx_{unmap,map}() functions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-31 11:54:37 +11:00
Timothy Arceri	3ef1ff6270	mesa/glthread: add async support to ARB_gpu_shader_int64 uniform functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 11:54:36 +11:00
Timothy Arceri	eb3df0e838	mesa/glthread: add async support to ARB_gpu_shader_fp64 uniform functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-31 11:54:35 +11:00
Lionel Landwerlin	469da094e1	aubinator: enable snb/ilk through --gen Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-03-31 01:25:33 +01:00
Lionel Landwerlin	0f83c05149	intel: genxml: compress all gen files into one Combining all the files into a single string didn't make any difference in the size of the aubinator binary. With this change we now also embed gen4/4.5/5 descriptions, which increases the aubinator size by ~16Kb. v2 (Lionel): rebase makefiles Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-03-31 01:24:56 +01:00
Bas Nieuwenhuizen	0f3de89a56	radv: Use the guard band. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen	8a53e6e4c5	radv: Prepare for not using the guard band for lines & points. Vulkan clipping is defined in terms of vertices, the scissor based clipping happens on pixels. There is a difference with points and lines, as a vertex can be outside the viewport while some pixels are in. On Vulkan those pixels shouldn't be drawn, while they would be with the guardband. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen	76603aa90b	radv: Drop the default viewport when 0 viewports are given. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen	4083a2ddcb	radv: Set proper viewport & scissor for meta draws. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-30 22:21:14 +02:00
Lyude	42f2bccd11	mesa: Fix trailing whitespace in polygon.c Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-30 11:59:51 -07:00
Lyude	043ee96059	mesa: Fix gross indenting in _mesa_PolygonMode() Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-30 11:59:51 -07:00
Lyude	a1ce8a3fe2	r300: Fix indenting in r300_get_param() Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-30 11:59:51 -07:00
Lyude	e5c6c421c4	vc4: Fix indenting in vc4_screen_get_param() Signed-off-by: Lyude <lyude@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-30 11:59:51 -07:00
Kenneth Graunke	e113dfabad	intel: Add INTEL_CFLAGS to aubinator CFLAGS. It still needs intel_aub.h. Fixes the build.	2017-03-30 11:58:00 -07:00
Jason Ekstrand	fbcf92a278	nir: Add support for 8 and 16-bit types Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-03-30 11:34:45 -07:00
Jason Ekstrand	28e41506a6	nir/constant_expressions: Don't switch on bit size when not needed For opcodes such as the nir_op_pack_64_2x32 for which all sources and destinations have explicit sizes, the bit_size parameter to the evaluate function is pointless and should do nothing. Previously, we were always switching on the bit_size and asserting if it isn't one of the sizes in the list. This generates way more code than needed and is a bit cruel because it doesn't let us have a bit_size of zero on an ALU op which shouldn't need a bit_size. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-03-30 11:34:45 -07:00
Jason Ekstrand	b69b44d222	nir/constant_expressions: Pull the guts out into a helper block Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-03-30 11:34:45 -07:00
Kenneth Graunke	f5e5c0c101	i965: Stop using legacy dri_bufmgr_* and intel_* names. Eric renamed these from dri_bufmgr_* and intel_bufmgr_* to drm_intel_* in libdrm commit 4b9826408f65976a1a13387beda748b65e03ec52, circa 2008, but we've been using the legacy names this whole time. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-30 11:16:34 -07:00
Emil Velikov	3df993e1a2	intel: automake: move INTEL_CFLAGS as applicable Only common/decoder.[ch] requires it [for intel_aub.h]. v2: The code was moved from intel/tools to intel/common, update accordingly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-30 19:07:28 +01:00
Emil Velikov	4ffb394961	intel: android: remove libdrm_intel requirement The only part which requires libdrm_intel tools/aubinator is not built on Android. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-30 19:07:23 +01:00
Marek Olšák	331714d72e	Partially revert "amd/addrlib: silence warnings" to fix builds with DEBUG This partially reverts commit `8a74140a21`.	2017-03-30 19:17:39 +02:00
Marek Olšák	681adbc18c	ddebug: implement clear_texture Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 18:53:42 +02:00
Marek Olšák	83d3e6fbff	radeonsi: fix an unused-variable warning in a release build	2017-03-30 17:22:25 +02:00
Marek Olšák	bb2e05885d	vdpau: fix a maybe-uninitialized warning	2017-03-30 17:14:47 +02:00
Marek Olšák	65732a8ff6	softpipe: fix a maybe-uninitialized warning /home/marek/dev/mesa-main/src/gallium/drivers/softpipe/sp_compute.c:178: warning: 'grid_size' may be used uninitialized in this function [-Wmaybe-uninitialized]	2017-03-30 17:14:47 +02:00
Marek Olšák	9f5dbbe030	gallivm: fix a maybe-uninitialized warning /home/marek/dev/mesa-main/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:3598: warning: 'level' may be used uninitialized in this function [-Wmaybe-uninitialized] out1 = lp_build_cmp(&leveli_bld, PIPE_FUNC_GREATER, level, last_level); ^	2017-03-30 17:14:47 +02:00
Marek Olšák	3b1934d9b6	gallium/radeon: s/dcc_disable/disable_dcc/ Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:39 +02:00
Marek Olšák	45a71d5de5	radeonsi: handle incompatible DCC formats in resource_copy_region Required because of later commits. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2017-03-30 16:09:39 +02:00
Marek Olšák	b05b8587ae	radeonsi: remove a workaround for inexact *8_SNORM blits All tests pass on Fiji now. This prevents DCC disablement due to incompatible DCC formats due to the fallback. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2017-03-30 16:09:39 +02:00
Marek Olšák	a955ee788f	gallium/radeon: add and use a new helper vi_dcc_enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:37 +02:00
Marek Olšák	f7bd51626e	gallium/radeon: formalize that r600_query_hw_add_result doesn't need a context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:36 +02:00
Marek Olšák	d76c306162	radeonsi: don't make a copy of pipe_index_buffer in draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:32 +02:00
Marek Olšák	abb25fb18e	gallium/util: use const in u_index_modify helpers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:29 +02:00
Samuel Pitoiset	7d99f48b5e	winsys/amdgpu: remove AMDGPU_INFO_NUM_EVICTIONS This is now exposed with libdrm_amdgpu 2.4.76. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 15:27:13 +02:00
Marek Olšák	675af982e1	radeonsi: add Vega10 PCI IDs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Boyuan Zhang	cb8b84e3d0	radeon/uvd: set correct vega10 db pitch alignment Create new function to get correct alignment based on Asics, and change the corresponding decode message buffer and dpb buffer size calculations Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-30 14:44:33 +02:00
Leo Liu	5eba761fee	radeon/vce: add vce support for firmware 53.19.4 v2: squashed with other similar commits Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-30 14:44:33 +02:00
Leo Liu	ed48b399f1	radeon/vce: adapt gfx9 surface to vce Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-30 14:44:33 +02:00
Leo Liu	6c7870fee8	winsys/surface: add height pitch for gfx9 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-30 14:44:33 +02:00
Leo Liu	c89e771c9c	radeon/uvd: clear message buffer when reuse As required by firmware Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-30 14:44:33 +02:00
Leo Liu	c836f2ce28	radeon/uvd: adapt gfx9 surface to uvd Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-30 14:44:33 +02:00
Leo Liu	9d5db4e8f4	radeon/uvd: add uvd soc15 register Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	474468fbf9	radeonsi/gfx9: disable features that don't work Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	8ea3da0706	radeonsi/gfx9: only allow GL 3.1 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	7695ea0c02	radeonsi/gfx9: add linear address computations for texture transfers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	172b05a37e	radeonsi/gfx9: don't generate LS and ES states these shaders don't exist on GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	eb22f5bf6f	radeonsi/gfx9: SPI_SHADER_USER_DATA changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	f4ab7a5415	winsys/amdgpu: set/get BO tiling flags for GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	7d88233f84	radeonsi/gfx9: handle pitch and offset overrides for texture_from_handle Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	de55e57e29	radeonsi/gfx9: set/validate GFX9 BO metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	bd1da6b339	radeonsi/gfx9: add radeon_surf.gfx9.surf_offset Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	3685a12bad	radeonsi/gfx9: don't write mipmap level offsets to BO metadata GFX9 doesn't have (usable) mipmap offsets. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	9c100bd693	radeonsi/gfx9: flush CB & DB caches with an EOP TS event Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	6e0d64712a	radeonsi/gfx9: use ACQUIRE_MEM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	81aa21d732	radeonsi/gfx9: only use CE RAM for most-used descriptors because the CE RAM size decreased to 4 KB. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	86f13c7363	radeonsi/gfx9: emit FLUSH_DFSM where required Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ad93d72c34	radeonsi/gfx9: emit BREAK_BATCH in emit_framebuffer_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	405bacd820	radeonsi/gfx9: fix MIP0_WIDTH & MIP0_HEIGHT for compressed texture blits Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	354285afa0	radeonsi/gfx9: fix textureSize/imageSize for 1D textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	566defad13	radeonsi/gfx9: add a workaround for 1D depth textures The same workaround is used by Vulkan. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	fc3c503b5d	radeonsi/gfx9: enable clamping for Z UNORM formats promoted to Z32F so that shaders don't have to do it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	5abf60076c	radeonsi/gfx9: image descriptor changes in mutable fields Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	c8ffec4f4b	radeonsi/gfx9: FMASK image descriptor changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	d60f72a9f0	radeonsi/gfx9: image descriptor changes in immutable fields The border color swizzle logic was copied from Vulkan. It doesn't make any sense to me, but it passes all piglits except the stencil ones. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	dfd2b54948	radeonsi/gfx9: DB changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	94819a3e6c	radeonsi/gfx9: CB changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	272b50a6f4	radeonsi/gfx9: do DCC clears on non-mipmapped textures only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	aba8e0ea68	radeonsi/gfx9: update can_sample_z/s flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	054dcbe42c	radeonsi/gfx9: pass correct parameters to buffer_get_handle Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	710aaed52b	radeonsi/gfx9: update si_set_optimal_micro_tile_mode Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	7fcad40ca5	radeonsi/gfx9: don't check array_mode for allowing TC-compatible HTILE GFX9 supports this with all modes except linear. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	6f09b0d076	radeonsi/gfx9: update HTILE/CMASK/FMASK allocators Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	281542c690	radeonsi/gfx9: stub testdma - array_mode_to_string Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	a0e8b73594	radeonsi/gfx9: update r600_print_texture_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	b25d7c2cbf	gallium/radeon: move pre-GFX9 radeon_bo_metadata.* to u.legacy.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	9b365d497a	winsys/amdgpu: set num_tile_pipes, pipe_interleave_bytes for GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	493de7f935	winsys/amdgpu: wire up new addrlib for GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	e572835fea	winsys/amdgpu: update amdgpu_addr_create for GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	a71139470c	winsys/amdgpu: rename GFX6 surface functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	9ca33ab78e	gallium/radeon: add GFX9 surface info to radeon_surf Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ba2e7c68ce	gallium/radeon: move pre-GFX9 radeon_surf.* members to radeon_surf.u.legacy.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	641b79774a	radeonsi/gfx9: allow Z16_UNORM for TC-compatible HTILE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	a4f0a1099f	radeonsi/gfx9: draw changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	b39fade67c	radeonsi/gfx9: pad shader binaries by 128 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	5271d12a6e	radeonsi/gfx9: trivial shader and ring changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	0aae4f4764	radeonsi/gfx9: sampler state changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	71eca0780a	radeonsi/gfx9: add a scissor bug workaround Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	b576df4017	radeonsi/gfx9: rasterizer changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	be8eba0625	radeonsi/gfx9: disable the 2-bit format fetch fix Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	31b1042276	radeonsi/gfx9: set NUM_RECORDS correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	5f4659260e	radeonsi/gfx9: ELEMENT_SIZE change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	d214b95e9a	radeonsi/gfx9: enable ETC2 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	6d21fd51b6	radeonsi/gfx9: disable RB+ on Vega10 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	2862300d9e	radeonsi/gfx9: init_config changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	b054718218	radeonsi/gfx9: don't set PA_SC_RASTER_CONFIG* The registers don't exist on GFX9. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	de7967a27a	radeonsi/gfx9: Gather4 no longer needs the workaround Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	71ad666414	radeonsi/gfx9: CP DMA changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	7690196135	radeonsi/gfx9: query changes - EVENT_WRITE and SET_PREDICATION Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ea9cf0a322	radeonsi/gfx9: EVENT_WRITE_EOP -> RELEASE_MEM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	3e3d4f5e1d	radeonsi/gfx9: INDIRECT_BUFFER change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	9680a75489	radeonsi/gfx9: enable SDMA buffer copying & clearing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	c9b004af58	radeonsi/gfx9: handle GFX9 in a few places Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	92112ec296	radeonsi/gfx9: don't read back non-existent SRBM registers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ef97cc0cae	radeonsi/gfx9: add IB parser support Both GFX6 and GFX9 fields are printed next to each other in parsed IBs. The Python script parses both headers like one stream and tries to merge all definitions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	9338ab0afd	radeonsi/gfx9: set the LLVM processor, require LLVM 5.0 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	68d6d097f1	radeonsi/gfx9: add GFX9 and VEGA10 enums Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	5691e14735	amd: GFX9 packet changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	ecbdfbeb05	amd: define event types for GFX9 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	00e777b61c	amd: add texture format definitions for GFX9 the DATA_FORMAT and NUM_FORMAT fields are the same, but some of the enums differ, thus add GFX6 and GFX9 suffixes, so that the IB parser can show enums for both. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	e6c520362d	amd: resolve remaining definition conflicts with gfx9d.h Add _GFX6 and _GFX9 suffixes to conflicting definitions. sid.h and gfx9d.h can now be included in the same file. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	7e7043c31c	amd: normalize register definition formatting This resolves trivial conflicts with gfx9d.h caused by different formatting. Some fields are also renamed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	db04d4ccaa	amd: import GFX9 register definitions Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	a3556c0f06	radeonsi: code shuffling in si_init_depth_surface use fewer local variables, re-order the assignments, so that the GFX9 diff is smaller here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	8a74140a21	amd/addrlib: silence warnings	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	7f160efcde	amd/addrlib: import gfx9 support	2017-03-30 14:44:33 +02:00
Kevin Furrow	047d6daf10	amd/addrlib: Not all ETC2 formats are 128bpp... add new ETC2 formats to differentiate between 64 and 128bpp formats.	2017-03-30 14:44:33 +02:00
Kevin Furrow	1360018c1c	amd/addrlib: Fix selection of swizzle modes for 3D compressed images.	2017-03-30 14:44:33 +02:00
Kevin Furrow	9705e3b72c	amd/addrlib: Add support for ETC2 and ASTC formats.	2017-03-30 14:44:33 +02:00
Joe Ma	a489cdb20f	amd/addrlib: Bump version to 6.02	2017-03-30 14:44:33 +02:00
Frans Gu	e736edf63d	amd/addrlib: Adjust slice size after pitch and actual height adjustment	2017-03-30 14:44:33 +02:00
Frans Gu	588e5bbf3d	amd/addrlib: Apply input pitch after internal pitch aligning	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	11f1306207	amdgpu/addrlib: Bump version to 6.01 Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	a136926eef	amdgpu/addrlib: Separate 2 dcc related workarounds by different flags 1) dccCompatible for padding MSAA surface to support fast clear 2) dccPipeWorkaround for padding surface to support dcc	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	48bf5d0800	amdgpu/addrlib: Fix the issue that tcCompatible HTILE slice size is not calculated correctly	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	33c25655c1	amdgpu/addrlib: Add a new output flag to notify client that the returned tile index is for PRT on SI If this flag is set for mip0, client should set prt flag for sub mips, so that address lib can select the correct tile index for sub mips.	2017-03-30 14:44:33 +02:00
Xavi Zhang	fa906a888b	amdgpu/addrlib: add matchStencilTileCfg and tcCompatible fixes Usage: the client first calls AddrComputeSurfaceInfo() on the depth surface with the flag "matchStencilTileCfg". AddrLib then uses a 2DThin1 tile index for depth whenever possible and does not downgrade unless the alignment requirement cannot be met. 1. If there is a matching 2DThin1 tile index for stencil that guarantees both surfaces share the same tile config parameters, return the stencil 2DThin1 tile index as well. 2. If 2DThin1 tile mode cannot guarantee this and the TcCompatible flag was set, ignore that flag and try 2DThin1 tile mode for depth and stencil again. 3. If 2DThin1 tile mode still cannot give depth and stencil the same tile config parameters, downgrade the depth surface tile mode to 1DThin1. 4. If the depth surface's tile mode is 1DThin1, return the 1DThin1 tile index for stencil. 5. If the depth surface's tile mode is PRT, return an invalid tile index for stencil, since their tile config parameters will never match. The client driver then checks the returned stencil tile index; if it is not the invalid tile index, it calls AddrComputeSurfaceInfo() on the stencil surface with the returned stencil tile index to get the full output information. Note that the client must set the flag "useTileIndex" when AddrLib is created.	2017-03-30 14:44:33 +02:00
Frans Gu	6764d96eaa	amdgpu/addrlib: Adjust bank equation bit order based on macro tile aspect ratio settings This way, we get a valid equation for 2D_THIN1 tile mode. Add flag "preferEquation" to return equation index without adjusting input tile mode.	2017-03-30 14:44:33 +02:00
Frans Gu	ed1aca8e8f	amdgpu/addrlib: do some tile mode conversions to display surface	2017-03-30 14:44:33 +02:00
Xavi Zhang	cb8844392c	amdgpu/addrlib: Check prt flag for PRT_THIN1 extra padding for DCC.	2017-03-30 14:44:33 +02:00
Frans Gu	fe216415c6	amdgpu/addrlib: Add new flags minimizePadding and maxBaseAlign 1) minimizePadding - Use 1D tile mode if padded size of 2D is bigger than 1D 2) maxBaseAlign - Force PRT tile mode if macro block size is bigger than requested alignment. Also, related changes to tile mode optimization for needEquation.	2017-03-30 14:44:33 +02:00
Xavi Zhang	4dd4700612	amdgpu/addrlib: Always returns pixelPitch in original pixels	2017-03-30 14:44:33 +02:00
Sabre Shao	eb3036ed46	amdgpu/addrlib: fix crash on allocation failure	2017-03-30 14:44:33 +02:00
Frans Gu	680f91e5d4	amdgpu/addrlib: Add flag to report if a surface can have dcc ram	2017-03-30 14:44:33 +02:00
Roy Zhan	ca88f83222	amdgpu/addrlib: support non-power2 height alignment (for linear surface)	2017-03-30 14:44:33 +02:00
Frans Gu	c867a2b222	amdgpu/addrlib: Fix family setting for VI and CZ ASICs	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	b328e47d3d	amdgpu/addrlib: style cleanup Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	fbc9ba7559	amdgpu/addrlib: Pad pitch to multiples of 256 for DCC surface on Fiji The change also modifies function CiLib::HwlPadDimensions to report adjusted pitch alignment.	2017-03-30 14:44:33 +02:00
Xavi Zhang	145750efba	amdgpu/addrlib: Fix number of // Find ^/{80,99}$ and replace them to 100 "/" Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	4e2668ecd1	amdgpu/addrlib: Cleanup. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Xavi Zhang	d1ecb70ba3	amdgpu/addrlib: Use namespaces Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Kevin Zhao	8912862a40	amdgpu/addrlib: Adjust 99 "" to 100 "" alignment Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Frans Gu	acaeae2861	amdgpu/addrlib: Add a new tile mode ADDR_TM_UNKNOWN This can be used by address lib client to ask address lib to select tile mode.	2017-03-30 14:44:33 +02:00
Xavi Zhang	90029b958e	amdgpu/addrlib: Stylish cleanup. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Roy Zhan	554c1b9f2d	amdgpu/addrlib: Disable tcCompatible when depth surface is not macro tiled Experiments show that 1D tiling and TcCompatible cannot work together.	2017-03-30 14:44:33 +02:00
Xavi Zhang	120a5d0e42	amdgpu/addrlib: fix pixel index calculation of thick micro tiling	2017-03-30 14:44:33 +02:00
Xavi Zhang	199912a9bc	amdgpu/addrlib: Add a flag to skip calculating indices This is useful for debugging and for special cases where stencil surfaces do not need to be texture-fetch compatible.	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	10f7d1cb03	amdgpu/addrlib: add equation generation 1. Add a new surface flag, needEquation, that the client driver can use to force the surface tile setting to be equation compatible. Override 2D/3D macro tile mode to PRT_* tile mode if this flag is TRUE and num slice > 1. 2. Add numEquations and pEquationTable to the ADDR_CREATE_OUTPUT structure to return the number of equations and the equation table to the client driver. 3. Add equationIndex to the ADDR_COMPUTE_SURFACE_INFO_OUTPUT structure to return the equation index to the client driver. Please note that use of the address equation has the following restrictions: 1) The surface can't be splittable 2) The surface can't have a non-zero tile swizzle value 3) A surface with > 1 slices must have PRT tile mode, which disables slice rotation	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	3e44337bd6	amdgpu/addrlib: rename ComputeSurfaceThickness to Thickness	2017-03-30 14:44:33 +02:00
Xavi Zhang	79dcda5116	amdgpu/addrlib: add define HAVE_TSERVER	2017-03-30 14:44:33 +02:00
Frans Gu	7293a020bd	amdgpu/addrlib: Add new interface to support macro mode index query	2017-03-30 14:44:33 +02:00
Roy Zhan	c16e1e2041	amdgpu/addrlib: add explicit Log2NonPow2 function	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	4a4b7da141	amdgpu/addrlib: Fix invalid access to m_tileTable Sometimes client driver passes valid tile info into address library, in this case, the tile index is computed in function HwlPostCheckTileIndex instead of CiAddrLib::HwlSetupTileCfg. We need to call HwlPostCheckTileIndex to calculate the correct tile index to get tile split bytes for this case.	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	9e40e09089	amdgpu/addrlib: add ADDR_ANALYSIS_ASSUME It helps fix analysis warnings in MSC.	2017-03-30 14:44:33 +02:00
XiaoYuan Zheng	6164f23a91	amdgpu/addrlib: add tcCompatible htile addr from coordinate support.	2017-03-30 14:44:33 +02:00
Carlos Xiong	3bd1380ab2	amdgpu/addrlib: force all zero tile info for linear general.	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	8b110f0319	amdgpu/addrlib: Add a member "bpp" for input of method AddrConvertTileIndex and AddrConvertTileInfoToHW When a client queries tile info from a tile index and expects accurate tileSplit info, it must provide the bits per pixel, since this is needed to compute tileSplitBytes; otherwise, if bpp is 0, Addrlib returns the value of "tileBytes" instead, which is also the current logic. If a client doesn't need tileSplit info, it's OK to pass bpp with value 0.	2017-03-30 14:44:33 +02:00
Frans Gu	ca6a38fd6a	amdgpu/addrlib: Refine the PRT tile mode selection Switch the tile index based on logic instead of hardcoded threshold for different ASIC.	2017-03-30 14:44:33 +02:00
Xavi Zhang	2bf243f7c6	amdgpu/addrlib: add dccRamSizeAligned output flag This flag indicates to the client whether this level's DCC memory is aligned. Not aligned means there is padding at the end.	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	e443b48966	amdgpu/addrlib: Change comment alignment Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	e06aeaf19f	amdgpu/addrlib: style changes and minor cleanups Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	cb5d22a3f3	amdgpu/addrlib: AddrLib inheritance refactor Add one more abstraction layer into inheritance system. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	52a1288a15	amdgpu/addrlib: rearrange code in preparation of refactoring No code changes. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Xavi Zhang	f12d430c59	amdgpu/addrlib: add disableLinearOpt flag	2017-03-30 14:44:33 +02:00
Xavi Zhang	b5d8120a07	amdgpu/addrlib: Add GetMaxAlignments	2017-03-30 14:44:33 +02:00
Xavi Zhang	3c3d620cf3	amdgpu/addrlib: Let Kaveri go general stereo right eye offset padding path Kaveri (2-pipe) macro tiling mode table was initially set to all 4-aspect-ratio so the swizzling path did not work for it and then we chose to pad the offset. We now discover the root cause is that if ratio > 2, the swizzling path does not work. So we can safely use the same path for Kaveri.	2017-03-30 14:44:33 +02:00
Xavi Zhang	3614999878	amdgpu/addrlib: Rewrite tile mode optimization code Note: remove reference to degrade4Space and use opt4Space instead.	2017-03-30 14:44:33 +02:00
Carlos Xiong	c12e35065a	amdgpu/addrlib: Add a flag "tcCompatible" to surface info output structure. Even if surface info input flag "tcCompatible" is enabled, tc compatible may not be supported if tile split happens for depth surfaces. Add a new flag in output structure to notify client to disable tc compatible in this case.	2017-03-30 14:44:33 +02:00
Xavi Zhang	2ffb30c2af	amdgpu/addrlib: Make comments shorter Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
XiaoYuan Zheng	3c7bd4e013	amdgpu/addrlib: add new flag nonSplit Flag tcCompatible has different usage in CI and VI. Add a new flag "nonSplit" for CI.	2017-03-30 14:44:33 +02:00
Xiao-Tao Zai	47de94a794	amdgpu/addrlib: allow tileSplitBytes greater than row size Carrizo row size is 1K, while tileSplitBytes is 2K for a 4xAA 32bpp depth surface. Remove the sanity check that tileSplitBytes must not be greater than row size. There could be performance loss, but it may be offset by non-split depth, which enables tc-compatible read.	2017-03-30 14:44:33 +02:00
Carlos Xiong	d52e0bbfe6	amdgpu/addrlib: Change to compute TC compatible stencil info Change the logic to compute tc compatible stencil info via depth's tileIndex instead of using depth's tileInfo. So the clients can get the stencil's tileInfo computed from macroModeTable. If the stencil tileInfo is the same as the depth tileInfo, then stencil is tc compatible; otherwise, stencil is not tc compatible. The current suggestion is to create another stencil buffer with the tc compatible tileInfo, and use depth-to-color copy to decompress and tile convert the rendered stencil to tc compatible stencil (and use the new stencil buffer to program TC).	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	6c65f256e2	amdgpu/addrlib: rename SiAddrLib/CiAddrLib to match internal spelling Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	6e44087e77	configure.ac: require libdrm_amdgpu 2.4.76 for Vega Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:42:06 +02:00
Samuel Pitoiset	e7850bb7f0	st/glsl_to_tgsi: use glsl_type::sampler_index() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 13:15:34 +02:00
Samuel Pitoiset	784d3a7066	glsl: allow glsl_type::sampler_index() with images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 13:15:16 +02:00
Nicolai Hähnle	257ee3f7ef	st/mesa: improve error messages and fix security warning Debian, Ubuntu set default build flag: -Werror=format-security CC state_tracker/st_cb_texturebarrier.lo state_tracker/st_cb_eglimage.c: In function ‘st_egl_image_get_surface’: state_tracker/st_cb_eglimage.c:64:7: error: format not a string literal and no format arguments [-Werror=format-security] _mesa_error(ctx, GL_INVALID_VALUE, error); ^~~~~~~~~~~ state_tracker/st_cb_eglimage.c:71:7: error: format not a string literal and no format arguments [-Werror=format-security] _mesa_error(ctx, GL_INVALID_OPERATION, error); ^~~~~~~~~~~ Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl> Fixes: `83e9de25f3` ("st/mesa: EGLImageTarget* error handling")	2017-03-30 11:24:36 +02:00
Kenneth Graunke	e4dc005bce	i965: Combine intel_batchbuffer_reloc and intel_batchbuffer_reloc64 These two functions do the exact same thing. One returns a uint64_t, and the other takes the same uint64_t and truncates it to a uint32_t. We only need the uint64_t variant - the caller can truncate if it wants. This patch gives us one function, intel_batchbuffer_reloc, that does the 64-bit thing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-30 00:15:28 -07:00
Kenneth Graunke	5177231670	i965: Use WARN_ONCE instead of open coding it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-30 00:15:09 -07:00
Harish Krupo	36cb2003f1	android: pass sse4.1 flag as appropriate We have functions which depend on sse4.1 support, but we didn't pass the right compile flag for it. This patch fixes it. Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-03-30 08:02:49 +03:00
Dave Airlie	a930c2c612	radv: fix mask attribs properly. some days it just doesn't pay to get out of bed. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-30 13:09:30 +10:00
Dave Airlie	aa27a9f687	radv: fix regression with mask attrib setting code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-30 12:07:32 +10:00
Dave Airlie	2b35b60df1	radv: move to using nir clip/cull merge pass. Doing this before tessellation makes doing some bits of tessellation a bit cleaner. It also cleans up a bit of the llvm generator code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-30 11:04:56 +10:00
George Kyriazis	5079c277b5	swr: [scons] Fix windows build Fix codegen build break that was introduced earlier v2: update rules for gen_knobs.cpp and gen_knobs.h v3: Introduce bldroot and revert generator file changes, making patch simpler. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-03-29 18:52:07 -05:00
Craig Stout	1da7a11de8	anv/cmd_buffer: fix host memory leak push_constants must be free'd. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100452 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-29 14:32:32 -07:00
Timothy Arceri	16debc652a	mesa/glthread: fallback to sync if count validation fails The old code would sync and then throw a cryptic error message. There is no need for a custom error; we can just fall back to the real function and have it do proper validation. Fixes piglit test glsl-uniform-out-of-bounds, which was returning the wrong error code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 08:23:00 +11:00
Timothy Arceri	18f4c93b02	mesa/glthread: add async support to glProgramUniform*() functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 08:22:51 +11:00
Timothy Arceri	1ea73b9c61	mesa/glthread: print out syncs when MARSHAL_MAX_CMD_SIZE is exceeded Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 08:19:07 +11:00
Jason Ekstrand	9aba81b160	anv/batch_chain: Handle another OOM in cmd_buffer_execbuf Found by inspection while rebasing other patches. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-29 09:39:49 -07:00
Philipp Zabel	83e9de25f3	st/mesa: EGLImageTarget* error handling Stop trying to specify texture or renderbuffer objects for unsupported EGL images. Generate the error codes specified in the OES_EGL_image extension. EGLImageTargetTexture2D and EGLImageTargetRenderbuffer would call the pipe driver's create_surface callback without ever checking that the given EGL image is actually compatible with the chosen target texture or renderbuffer. This patch adds a call to the pipe driver's is_format_supported callback and generates an INVALID_OPERATION error for unsupported EGL images. If the EGL image handle does not describe a valid EGL image, an INVALID_VALUE error is generated. v2: fixed get_surface to actually use the usage and error parameters Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 18:04:42 +02:00
Philipp Zabel	d10172d527	st/mesa: move st_manager_get_egl_image_surface into st_cb_eglimage.c The only callers are here, and we will add generation of GL errors in the following patch. Rename the function to st_egl_image_get_surface, pass the gl_context instead of st_context, and move the cast from GLeglImageOES to void* into st_egl_image_get_surface. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 18:04:12 +02:00
Alejandro Piñeiro	2f8d6bd578	i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+ Technically those hw operations are only available on gen7, as gen8+ support the conversion on the MOV. But, when using the builder to implement nir operations (example: nir_op_fquantize2f16), it is not needed to do the gen check. This check is done later, on the final emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the specific operation accordingly. So in the middle, during optimization phases those hw operations can be around for gen8+ too. Without this patch, several (at least 95) vulkan-cts quantize tests crashes when using INTEL_DEBUG=optimizer. For example: dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert v2: simplify the code using GEN_GE (Ilia Mirkin) v3: tweak brw_instruction_name instead of changing opcode_descs table, that is used for validation (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-29 17:34:15 +02:00
Marek Olšák	a2db9f9ff4	mesa: remove dd_function_table::BindProgram Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 15:44:00 +02:00
Marek Olšák	e81ee82119	r200: remove BindProgram Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 15:44:00 +02:00
Marek Olšák	bbb5561007	i915: remove BindProgram The same thing is done in i915_update_program called by i915InvalidateState. Why do it twice. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 15:44:00 +02:00
Marek Olšák	96a1c2406d	mesa: don't use _NEW_TEXTURE mainly in mesa/main v2: add missing %s Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 15:44:00 +02:00
Marek Olšák	d68150f15d	mesa: split _NEW_TEXTURE into _NEW_TEXTURE_OBJECT & _NEW_TEXTURE_STATE No performance testing has been done, because it makes sense to make this change regardless of that. Also, _NEW_TEXTURE is still used in many places, but the obvious occurrences are replaced here. It's now possible to split _NEW_TEXTURE_OBJECT further. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 15:44:00 +02:00
Marek Olšák	226ff6aa30	mesa: inline _mesa_update_texture Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-29 15:44:00 +02:00
Jose Fonseca	bb9faba172	appveyor: Update dependencies. - Use explicit versions everywhere. - Avoid deprecate `--egg` pip option. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-03-29 11:53:03 +01:00
Jose Fonseca	ecfafdcbf5	c11/threads: Include thr/xtimec.h for xtime definition when building with MSVC. MSVC has been including a xtime definition in thr/xtimec.h ever since MSVC 2013 (which is the minimum we require for building Mesa), and including it prevents duplicate definitions when it gets included by LLVM. In fact, it looks that MSVC has been including a partial C11 threads implementation too for some time, which we should consider migrating to once we eliminate the use of _MTX_INITIALIZER_NP in our tree. Thanks to the anonymous helper from https://bugs.freedesktop.org/show_bug.cgi?id=100201#c4 for spotting this. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100201 CC: "17.0" <mesa-stable@lists.freedesktop.org>	2017-03-29 11:53:03 +01:00
Timothy Arceri	e44cba540e	mesa: update lower_jumps tests after bug fix This change updates the tests to reflect the IR after the following bug fix. Fixes: `c1096b7f1d` ("glsl: fix lower jumps for returns when loop is inside an if") Tested-by: Michel Dänzer <michel.daenzer@amd.com> Bugzilla: https://bugs.freedesktop.org/100441	2017-03-29 20:53:06 +11:00
Thomas Hellstrom	ba8df2286a	gbm/dri: Flush after unmap Drivers may queue dma operations on the context at unmap time so we need to flush to make sure the data gets to the bo. Ideally the application would take care of this, but since there appears to be no exported gbm flush functionality we need to explicitly flush at unmap time. This fixes a problem where kmscube on vmwgfx in rgba textured mode would render using an uninitialized texture rather than the intended rgba pattern. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-03-29 09:17:21 +02:00
Bas Nieuwenhuizen	3df410069a	radv: Enable sparseBinding feature. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-29 08:50:55 +02:00
Bas Nieuwenhuizen	b20af5c8d7	radv/amdgpu: Use reference counting for bos. Per the Vulkan spec, memory objects may be deleted before the buffers and images using them are deleted, although those resources then cannot be used except for deletion themselves. For the virtual buffers, we need to access them on resource destruction to unmap the regions, so this results in a use-after-free. Implement reference counting to avoid this. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-29 08:50:48 +02:00
Bas Nieuwenhuizen	e527e62e75	radv: Implement sparse memory binding. v2: Only submit when semaphores are specified. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-29 08:50:41 +02:00
Bas Nieuwenhuizen	6154efc193	radv: Implement sparse image creation. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-29 08:50:37 +02:00
Bas Nieuwenhuizen	ef0e505d02	radv: Implement sparse buffer creation. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-29 08:50:33 +02:00
Bas Nieuwenhuizen	715df30a4e	radv/amdgpu: Add winsys implementation of virtual buffers. v2: - Added comments. - Fixed a double unmap bug. - Actually unmap the non-edge old ranges. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-29 08:50:17 +02:00
Bas Nieuwenhuizen	78ee8b3f84	radv: Assert when setting 0 registers in a sequence. To catch more of those hangs early. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-03-29 01:58:16 +02:00
Jason Ekstrand	f3673db3d6	anv/cmd_buffer: Refactor flush_pipeline_select_* While having the _3d and _gpgpu versions is nice, there's no reason why we need to have duplicated logic for tracking the current pipeline. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-03-28 14:57:09 -07:00
Jason Ekstrand	6baae9625d	anv: Flush caches prior to PIPELINE_SELECT on all gens The programming note that says we need to do this still exists in the SkyLake PRM and, from looking at the bspec, seems like it may apply to all hardware generations SNB+. Unfortunately, this isn't particularly clear cut since there is also language in the bspec that says you can skip the flushing and stall to get better throughput. Experimentation with the "Car Chase" benchmark in GL seems to indicate that some form of flushing is still needed. This commit makes us do the full set of flushes regardless of hardware generation. We can always reduce the flushing later. Reported-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:57:08 -07:00
Jason Ekstrand	0fe3dcce4c	anv/cmd_buffer: Fix bad indentation A bunch of code was indented in such a way that it looked like it went with the if statement above but it definitely didn't. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:57:06 -07:00
Jason Ekstrand	01a65dc43b	anv/cmd_buffer: Apply flush operations prior to executing secondaries This fixes rendering issues in the Vulkan port of skia on some hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:56:55 -07:00
Jason Ekstrand	9319ef96fd	anv/blorp: Use anv_get_layerCount everywhere Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:41:48 -07:00
Jason Ekstrand	1b8fa8dd79	anv: Make anv_get_layerCount a macro Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:41:47 -07:00
Dave Airlie	93d61e4945	radv: only emit ps_input_cntl if we have any to output Otherwise we get GPU hangs. Reported-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 20:12:10 +01:00
Adam Jackson	f208bdc0d2	glx: Remove #include <GL/glxint.h> We're not using anything in it, and we don't want to inherit struct definitions from some other package anyway. Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-03-28 14:48:12 -04:00
Julien Isorce	7ee91af300	r600g: check NULL return from r600_aligned_buffer_create Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 18:27:55 +01:00
Julien Isorce	699cce3493	st_cb_bitmap: check NULL return from u_upload_alloc Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 18:27:55 +01:00
Julien Isorce	4a5e779b5f	si_compute: check NULL return from u_upload_alloc Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 18:27:53 +01:00
Julien Isorce	c5fe99eec2	r600g: check NULL return from u_upload_alloc Like done in si_state_draw.c::si_draw_vbo u_upload_alloc can fail, i.e. set output param *ptr to NULL, for 2 reasons: alloc fails or map fails. For both there is already a fprintf/stderr in radeon_create_bo and radeon_bo_do_map. In src/gallium/drivers/ it is a common usage to just avoid to crash by doing a silent check. But defer fprintf where the error comes from, libdrm calls. Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 17:54:15 +01:00
Tim Rowley	749cf3be6e	swr: fix llvm-5.0.0 build bustage Handle rename of llvm AttributeSet to AttributeList in the same fashion as ac_llvm_helper.cpp. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-28 11:46:58 -05:00
Tim Rowley	79d92a72d5	swr: [rasterizer jitter] fix llvm-5.0.0 build bustage Add CreateAlignmentAssumptionHelper to gen_llvm_ir_macros.py ignore list. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-28 11:46:58 -05:00
Chad Versace	d1032a047b	isl: Drop unused isl_surf_init_info::min_pitch Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-28 09:44:44 -07:00
Chad Versace	6cbc13d94c	intel: Fix requests for exact surface row pitch (v2) All callers of isl_surf_init() that set 'min_row_pitch' wanted to request an exact row pitch, as evidenced by nearby asserts, but isl lacked API for doing so. Now that isl has an API for that, update the code to use it. v2: Assert that isl_surf_init() succeeds because the callers assume it. [for jekstrand] Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v1) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2017-03-28 09:44:44 -07:00
Chad Versace	e9017d58dc	isl: Let isl_surf_init's caller set the exact row pitch (v2) The caller does so by setting the new field isl_surf_init_info::row_pitch. v2: Validate the requested row_pitch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2017-03-28 09:44:44 -07:00
Chad Versace	23802dafc2	isl: Validate the calculated row pitch (v5) Validate that isl_surf::row_pitch fits in the below bitfields, if applicable based on isl_surf::usage. RENDER_SURFACE_STATE::SurfacePitch RENDER_SURFACE_STATE::AuxiliarySurfacePitch 3DSTATE_DEPTH_BUFFER::SurfacePitch 3DSTATE_HIER_DEPTH_BUFFER::SurfacePitch v2: -Add a Makefile dependency on generated header genX_bits.h. v3: - Test ISL_SURF_USAGE_STORAGE_BIT too. [for jekstrand] - Drop explicit dependency on generated header. [for emil] v4: - Rebase for new gen_bits_header.py script. - Replace gen_10x with gen_device_info*. v5: - Drop FINISHME for validation of GEN9 1D row pitch. [for jekstrand] - Reformat bit tests. [for jekstrand] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)	2017-03-28 09:44:44 -07:00
Chad Versace	f0eaf38db2	genxml: New generated header genX_bits.h (v6) genX_bits.h contains the sizes of bitfields in genxml instructions, structures, and registers. It also defines some functions to query those sizes. isl_surf_init() will use the new header to validate that requested pitches fit in their destination bitfields. What's currently in genX_bits.h: - Each CONTAINER::Field from gen.xml that has a bitsize has a macro in genX_bits.h: #define GEN{N}_CONTAINER_Field_bits {bitsize} - For each set of macros whose name, after stripping the GEN prefix, is the same, genX_bits.h contains a query function: static inline uint32_t __attribute__((pure)) CONTAINER_Field_bits(const struct gen_device_info *devinfo); v2 (Chad Versace): - Parse the XML instead of scraping the generated gen*_pack.h headers. v3 (Dylan Baker): - Port to Mako. v4 (Jason Ekstrand): - Make the _bits functions take a gen_device_info. v5 (Chad Versace): - Fix autotools out-of-tree build. - Fix Android build. Tested with git://github.com/android-ia/manifest. - Fix macro names. They were all missing the "_bits" suffix. - Fix macros names more. Remove all double-underscores. - Unindent all generated code. (It was floating in a sea of whitespace). - Reformat header to appear human-written not machine-generated. - Sort gens from high to low. Newest gens should come first because, when we read code, we likely want to read the gen8/9 code and ignore the gen4 code. So put the gen4 code at the bottom. - Replace 'const' attributes with 'pure', because the functions now have a pointer parameter. - Add --cpp-guard flag. Used by Android. - Kill class FieldCollection. After Jason's rewrite, it was just a dict. v6 (Chad Versace): - Replace `key not in d.keys()` with `key not in d`. [for dylan] Co-authored-by: Dylan Baker <dylan@pnwbakers.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v5) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v6)	2017-03-28 09:44:44 -07:00
Tim Rowley	3974cfea25	swr: [rasterizer core] Disable inline function expansion Disable expansion in windows Debug builds. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:44 -05:00
Tim Rowley	1c7224c85f	swr: [rasterizer common] Use C++ thread_local keyword Allows use of thread_local objects with constructors. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:39 -05:00
Tim Rowley	aee5276375	swr: [rasterizer core] SIMD16 Frontend WIP Implement widened clipper and binner interfaces for SIMD16. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:33 -05:00
Tim Rowley	aea737e12e	swr: [rasterizer core] Don't bind single-threaded contexts Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:27 -05:00
Tim Rowley	4cd0b1bb2c	swr: [rasterizer core] Enable SIMD16 Make the AVX512 insert/extract intrinsics KNL-compatible Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:21 -05:00
Tim Rowley	ec51e8ecfe	swr: [rasterizer jitter] Clean up EngineBuilder construction Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:14 -05:00
Tim Rowley	89b83f4b1e	swr: [rasterizer codegen] add cmdline to archrast gen files Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:24:09 -05:00
Tim Rowley	549b9d2e9f	swr: [rasterizer core] SIMD16 Frontend WIP Fix GS and streamout. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-03-28 11:23:45 -05:00
Tim Rowley	fee3fc018b	swr: [rasterizer codegen] Refactor codegen Move common codegen functions into gen_common.py. v2: change gen_knobs.py to find the template file internally, like the rest of the gen scripts. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-28 11:23:04 -05:00
Juan A. Suarez Romero	caa616ccc4	tests/cache_test: allow crossing mount points When using an overlayfs system (like a Docker container), rmrf_local() fails because part of the files to be removed are in different mount points (layouts). And thus cache-test fails. Letting crossing mount points is not a big problem, specially because this is just for a test, not to be used in real code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-28 18:00:39 +02:00
Emil Velikov	0f9a0cb5f5	glcpp/tests/glcpp-test-cr-lf: error out if we cannot find any tests Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	d8096b75aa	glcpp/tests/glcpp-test-cr-lf: correctly set/use srcdir/abs_builddir Otherwise manual invocation of the script from elsewhere than `dirname $0` will fail. With these all the artefacts should be created in the correct location, and thus we can remove the old (and slightly strange) clean-local line. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	cf77cdce83	glcpp/tests: update testname in help string Rather than hardcoding glcpp/other use `basename "$0"` which expands appropriately. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	4ea4fbf93a	glcpp/tests/glcpp-test: error out if we cannot find any tests Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	182d48ceb9	glcpp/tests/glcpp-test: print only the test basename Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	addf62946d	glcpp/tests/glcpp-test: set srcdir/abs_builddir variables Current definitions work fine for the manual invokation of the script, although the whole script does not consider that one can run it OOT. The latter will be handled with latter patches, although it will be extensively using the two variables. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	ee8aea3572	glsl/tests/optimization-test: 'echo' only folders which has generators The current "let's print any folder which exists" is simply confusing. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	79a95f19e6	glsl/tests/optimization-test: print only the test basedir/name The relative/absolute path brings little to no benefit in being printed as testname. Trim it out. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:24 +01:00
Emil Velikov	33cd136fa2	glsl/tests/optimization-test: error if zero tests were executed We don't want to lie ourselves that 'everything is fine' when no tests were found/ran. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	421115a729	glsl/tests/optimization-test: pass glsl_test as argument Rather than hardcoding the binary location (which ends up wrong in a number of occasions) in the python script, pass it as argument. This allows us to remove a couple of dirname/basename workarounds that aimed to keep this working, and succeeded in the odd occasion. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	7d2a1394bb	glsl/tests/optimization-test: error out if we fail to generate any tests v2: use -eq over a string comparison (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	86a937d264	glsl/tests/optimization-test: correctly manage srcdir/builddir At the moment we look for generator script(s) in builddir while they are in srcdir, and we proceed to generate the tests and expected output in srcdir, which is not allowed. To untangle: - look for the generator script in the correct place - generate the files in builddir, by extending create_test_cases.py to use --outdir With this in place the test passes `make check' for OOT builds - would that be as standalone or part of `make distcheck' Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	a7d9f0a361	glsl/tests/optimisation-test: ensure that compare_ir is available Bail out early if the script is not where we expect it to be. v2: use -f instead of -e. latter returns true on folder(s) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	9083c625f5	glsl/tests/optimization-test: correctly set compare_ir Now that we have srcdir we can use it to correctly manage/point to the script. Effectively fixing OOT invokation of `make check'. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	44b6422258	glsl/tests/optimization-test: add fallback srcdir/abs_builddir defines There is no robust way to detect either one, so simply hope for the best and warn just in case. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	05bc5b35a7	glsl/tests/optimisation-test: make sure that $PYTHON2 is set/available Otherwise we'll fail when invoking the script outside of "make check" v2: use -ne over a string comparison (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	bd4be79fc5	glsl/tests/warnings-test: print only the test basename Spamming the log with the (in some cases extremely long) test location is of limited use. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:23 +01:00
Emil Velikov	1c58d08bd9	glsl/tests/warnings-test: error if zero tests were executed We don't want to lie ourselves that 'everything is fine' when no tests were found/ran. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:22 +01:00
Emil Velikov	493fa69e37	glsl/tests/warnings-test: correctly manage srcdir/builddir Before this commit, we would effectively fail to run any of the test in a OOT builds. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:22 +01:00
Emil Velikov	81ccc7a484	glsl/tests/warnings-test: add fallback srcdir/abs_builddir defines There is no robust way to detect either one, so simply hope for the best and warn just in case. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:22 +01:00
Emil Velikov	4b366b171d	glsl/tests/warnings-test: error out if glsl_compiler is missing ... or non-executable, in particular. v2: use test -x (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:22 +01:00
Emil Velikov	1d93fa7be4	glsl: automake: export abs_builddir for the tests We're going to use them with the next commits to determine where to put the generated tests and/or built binaries. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:22 +01:00
Emil Velikov	841f0d2c58	glsl/tests: automake: cleanup all artefacts during clean-local With later commits we'll fix the generators to produce the files in the correct location. That in itself will cause an issue since the files will be left dangling and make distcheck will fail. v2: Use -r only as needed (Eric) Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-28 15:31:22 +01:00
Nayan Deshmukh	3472be2bfd	st/va: remove assert for single slice we anyway allow for multiple slices v2: do not remove assert to check for buf->size Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-28 12:08:54 +02:00
Nicolai Hähnle	21ba6543be	radeonsi: use DMA for clears with unaligned size Only a small tail needs to be uploaded manually. This is only partly a performance measure (apps are expected to use aligned access). Mostly it is preparation for sparse buffers, which the old code would incorrectly have attempted to map directly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 10:22:14 +02:00
Nicolai Hähnle	f0d9af772e	radeonsi: CP DMA clear supports unaligned destination addresses Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 10:22:12 +02:00
Nicolai Hähnle	d9014952f5	radeonsi: remove the early-out for SDMA in si_clear_buffer This allows the next patches to be simple while still being able to make use of SDMA even in some unusual cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-28 10:22:01 +02:00
Dave Airlie	239a9224a3	radv: move shader stages calculation to pipeline. With tess this becomes a bit more complex. so move to pipeline for now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:33 +10:00
Dave Airlie	0232ea8025	radv: move pa_cl_vs_out_cntl calculation to pipeline This also takes the side band setting code from radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:29 +10:00
Dave Airlie	92e9c14a6a	radv: move calculating fragment shader i/os to pipeline. There is no need to calculate this on each command submit. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:20 +10:00
Dave Airlie	4b467c759e	radv: move shader_z_format calculation to pipeline. No need to recalculate this every time. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:17 +10:00
Dave Airlie	8996fdbf61	radv: move db_shader_control calculation to pipeline. There is no need to recalculate this every time. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:14 +10:00
Dave Airlie	cd33a5c1cb	radv: move vgt_gs_mode value to pipeline. No need to recalculate this everytime. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:08 +10:00
Dave Airlie	d43691ce77	radv: add parameter to emit_waitcnt. This is just a precursor for tess support, which needs to pass different values here. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:40:03 +10:00
Dave Airlie	931a8d0c9a	radv: rework vertex/export shader output handling In order to facilitate adding tess support, split the vs/es output info into a separate block, so we make it easier to have the tess shaders export the same info. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:39:59 +10:00
Dave Airlie	ae0551b4b3	radv: fix ia_multi_vgt_param for instanced vs indirect draw. The logic was different than radeonsi, fix it up before adding tess support. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:39:55 +10:00
Dave Airlie	a8b8e542c2	radv: handle NULL multisample state. If rasterization is disabled, we can get a NULL multisample state. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 17:39:38 +10:00
Bas Nieuwenhuizen	a8c51b1cd9	radv: flush DB cache before and after HTILE decompress. It reads and writes the DB cache, and we haven't flushed dst caches yet, so DB cache may be stale. Also the user might be shader read (and probably is), so also flush after. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org> Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver")	2017-03-28 02:51:40 +02:00
Anuj Phogat	f5c32b0762	i965: Delete tile resource mode code Yf/Ys tiling never got used in i965 due to not delivering the expected performance benefits. So, this patch is deleting this dead code in favor of adding it later in ISL when we actually find it useful. ISL can then share this code between vulkan and GL. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-27 16:17:18 -07:00
Anuj Phogat	bcee124ef7	i965: Delete fast copy blit code Fast copy blit was primarily added to support Yf/Ys detiling. But, Yf/Ys tiling never got used in i965 due to not delivering the expected performance benefits. Also, replacing legacy blits with fast copy blit didn't help the benchmarking numbers. This is probably due to a h/w restriction that says "start pixel for Fast Copy blit should be on an OWord boundary". This restriction causes many blit operations to skip fast copy blit and use legacy blits. So, this patch is deleting this dead code in favor of adding it later when we actually find it useful. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-27 16:17:18 -07:00
Kenneth Graunke	088449487e	i965: Require Kernel 3.6 for Gen4-5 platforms. We've already required Kernel 3.6 on Gen6+ since Mesa 9.2 (May 2013, commit `92d2f5acfa`). It seems reasonable to require it for Gen4-5 as well, bumping the requirement from 2.6.39. This is necessary for glClientWaitSync with a timeout to work, which is a feature we expose on Gen4-5. Without it, we would fall back to an infinite wait, which is pretty bad. See kernel commit 172cf15d18889313bf2c3bfb81fcea08369274ef in 3.6+. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-27 15:57:50 -07:00
Timothy Arceri	99dd3d1c3b	glsl: fix spelling of embedded in comment	2017-03-28 09:56:27 +11:00
Timothy Arceri	c1096b7f1d	glsl: fix lower jumps for returns when loop is inside an if Previously we would just escape the loop and move everything following the loop inside the if to the else branch of a new if with a return flag conditional. However everything outside the if the loop was nested in would still get executed. Adding a new return to the then branch of the new if fixes this and we just let a follow pass clean it up if needed. Fixes: tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop2.shader_test Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-03-28 09:54:31 +11:00
Dave Airlie	b640dfcd05	radv: don't emit no color formats. (v3) If we had no rasterization, we'd emit SPI color format as all 0's; the hw dislikes this, so add the workaround from radeonsi. Found while debugging tessellation. v2: handle at pipeline stage, we have to handle it after we process the fragment shader. (Bas) v3: simplify even further, remove old fallback. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-28 08:39:14 +10:00
Vinson Lee	f1f1cb41d0	mesa/tests: Link main-test with CLOCK_LIB. Fix 'make check' linking error with glibc < 2.17. CXXLD main-test ../../../../src/mesa/.libs/libmesa.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano': src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2017-03-27 14:36:34 -07:00
Matt Turner	7dccd38b40	i965/fs: Don't emit SEL instructions for type-converting MOVs. SEL can only convert between a few integer types, which we basically never do. Fixes fs/vs-double-uniform-array-direct-indirect-non-uniform-control-flow Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Francisco Jerez <currojerez@riseup.net>	2017-03-27 10:59:42 -07:00
Xu Randy	004468de14	anv/blorp: Fix a crash in CmdClearColorImage We should use anv_get_layerCount() to access layerCount of VkImageSubresourceRange in anv_CmdClearColorImage and anv_CmdClearDepthStencilImage, which handles the VK_REMAINING_ARRAY_LAYERS (~0) case. Test: Sample multithreadcmdbuf from LunarG can run without crash Signed-off-by: Xu Randy <randy.xu@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-03-27 07:43:17 -07:00
Brian Paul	804676f384	mesa: simplify code around 'variable_data' in marshal.c Remove needless pointer increments, unneeded vars, etc. Untested. Plus, fix a couple comments. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-27 08:30:43 -06:00
Brian Paul	b71ef173a5	st/mesa: move duplicated st_ws_framebuffer() function into header file Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-27 08:30:43 -06:00
Andres Gomez	6255cc654d	glsl: Interface Block instances don't need linking validation From page 45 (page 52 of the PDF) of the GLSL ES 3.00 v.6 spec: " When instance names are present on matched block names, it is allowed for the instance names to differ; they need not match for the blocks to match." From page 51 (page 57 of the PDF) of the GLSL 4.30 v.8 spec: " When instance names are present on matched block names, it is allowed for the instance names to differ; they need not match for the blocks to match." Therefore, no cross linking validation is needed for the instance name of an Interface Block. With this patch, no link error is reported for a program like this: "# VS layout(binding = 1) Block1 { vec4 color; } uni_block; ... # FS layout(binding = 2) Block2 { vec4 color; } uni_block; ..." Fixes GL45-CTS.enhanced_layouts.ssb_layout_qualifier_conflict Signed-off-by: Andres Gomez <agomez@igalia.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-27 12:47:21 +03:00
Andres Gomez	40b09ed15c	glsl: UBOs and SSBOs must match the binding qualifier too From page 140 (page 147 of the PDF) of the GLSL ES 3.10 v.4 spec: " 9.2 Matching of Qualifiers The following tables summarize the requirements for matching of qualifiers. It applies whenever there are two or more matching variables in a shader interface. Notes: 1. Yes means the qualifiers must match. ... 9.2.1 Linked Shaders \| Qualifier \| Qualifier \| in/out \| Default \| uniform \| buffer\| \| Class \| \| \| Uniforms \| Block \| Block \| ... \| Layout \| binding \| N/A \| Yes \| Yes \| Yes \|" From page 93 (page 110 of the PDF) of the GL 4.2 (Core Profile) spec: " 2.11.7 Uniform Variables ... Uniform Blocks ... When a named uniform block is declared by multiple shaders in a program, it must be declared identically in each shader. The uniforms within the block must be declared with the same names and types, and in the same order. If a program contains multiple shaders with different declarations for the same named uniform block differs between shader, the program will fail to link." From page 129 (page 150 of the PDF) of the GL 4.3 (Core Profile) spec: " 7.8 Shader Buffer Variables and Shader Storage Blocks ... When a named shader storage block is declared by multiple shaders in a program, it must be declared identically in each shader. The buffer variables within the block must be declared with the same names, types, qualification, and declaration order. If a program contains multiple shaders with different declarations for the same named shader storage block, the program will fail to link." Therefore, if the binding qualifier differs between two linked Uniform or Shader Storage Blocks of the same name, a link error should happen. With this patch, a link error is reported for a program like this: "# VS layout(binding = 1) Block { vec4 color; } uni_block1; ... # FS layout(binding = 2) Block { vec4 color; } uni_block2; ..." Signed-off-by: Andres Gomez <agomez@igalia.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-27 12:47:00 +03:00
Andres Gomez	bf15b2b515	glsl: on UBO/SSBOs link error reset the number of active blocks to 0 While it's legal to have an active blocks count > 0 on link failure, unless we actually assign memory for the blocks array we can end up segfaulting in calls such as glUniformBlockBinding(). To avoid having to NULL check these api calls we simply reset the block count to 0 if the array was not created. Signed-off-by: Andres Gomez <agomez@igalia.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-27 12:45:59 +03:00
Samuel Iglesias Gonsálvez	c4c02471f4	anv: enable sampling from fast-cleared images on SKL A resolve is not needed on Skylake in this case. We were forcing a resolve because we set the input_aux_usage to ISL_AUX_USAGE_NONE. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-03-27 06:32:24 +02:00
Grazvydas Ignotas	b97faea162	glsl, st/shader_cache: check the whole sha1 for zero The checks were only looking at the first byte, while the intention seems to be to check if the whole sha1 is zero. This prevented all shaders with first byte zero in their sha1 from being saved. This shaves around a second from Deus Ex load time on a hot cache. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-27 15:05:10 +11:00
Grazvydas Ignotas	f2d4d11611	glsl/shader_cache: restore evicted shader keys Even though the programs themselves stay in cache and are loaded, the shader keys can be evicted separately. If that happens, unnecessary compiles are caused that waste time, and no matter how many times the program is re-run, performance never recovers to the levels of first hot cache run. To deal with this, we need to refresh the shader keys of shaders that were recompiled. An easy way to currently observe this is running Deus Ex, then piglit and Deus Ex again, or deleting just the cache index. The latter causes over a minute of lost time on all later Deus Ex runs; with this patch it returns to normal after 1 run. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-27 09:10:16 +11:00
Axel Davy	bdf035ea6f	st/nine: Use atomics for available_texture_mem Resource dtor can be executed in the worker thread. Use atomic to avoid threading safety issues. CC: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr> Tested-by: James Harvey <lothmordor@gmail.com>	2017-03-26 23:10:38 +02:00
Axel Davy	bd85bb51c7	st/nine: Resolve deadlock in surface/volume dtors when using csmt Surfaces and Volumes can be freed in the worker thread. Without this patch, pending_uploads_counter could be non-zero in the Surfaces or Volumes dtor, leading to deadlock. Instead, properly decrease the counter before releasing the item. Also avoid another potential deadlock if the item is not properly unlocked: do not call UnlockRect, which would cause a deadlock, but free directly using the deadlock-safe nine_context_get_pipe_multithread. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99246 CC: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr> Tested-by: James Harvey <lothmordor@gmail.com>	2017-03-26 23:10:38 +02:00
Axel Davy	31f8b3babb	st/nine: Fix user vertex data uploader with csmt Fix regression caused by `abb1c645c4` The patch made csmt use context.pipe instead of secondary_pipe, leading to thread safety issues. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-03-26 23:10:38 +02:00
Jose Fonseca	2ba991cbcd	scons: Fix dependencies of marshal_generated.[ch]. These generated source files depend not only upon gl_and_es_API.xml, but all other XML files that are included by it. This change updates the generation rules to depend on all gen/*.xml files, as is done for other SCons generation rules, and should fix broken incremental SCons builds due to missing dependencies. Trivial.	2017-03-26 21:30:34 +01:00
Vinson Lee	641f629536	glsl: Link tests with CLOCK_LIB. Fix 'make check' linking errors with glibc < 2.17. CXXLD glsl/glsl_test glsl/.libs/libglsl.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano': src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2017-03-25 01:23:04 -07:00
Timothy Arceri	425671f616	mesa/glthread: add custom marshalling for ClearBufferfv() This is one of the main causes of syncs in Civ6. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-25 13:39:12 +11:00
Grazvydas Ignotas	b9e92334f7	util/disk_cache: don't deadlock on premature EOF If we get EOF earlier than expected, the current read loops will deadlock. This may easily happen if the disk cache gets corrupted. Fix it by using a helper function that handles EOF. Steps to reproduce (on a build with asserts disabled): $ glxgears $ find ~/.cache/mesa/ -type f -exec truncate -s 0 '{}' \; $ glxgears # deadlock Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-25 13:08:37 +11:00
Chad Versace	7414326164	genxml: Add 3DSTATE_DEPTH_BUFFER to gen5.xml isl will use this for validating the depth buffer pitch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-24 19:07:05 -07:00
Grazvydas Ignotas	7d8ee4b4d0	tests/cache_test: mark arguments const While at it, also fix up a failure message to not reference timestamp and gpu dirs as those are no longer being made. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-25 12:46:18 +11:00
Rob Clark	d87ef8f77c	freedreno: free compiler when screen is destroyed Drop ir3_compiler_destroy(), since it is only ralloc_free() and we shouldn't really have an ir3 dependency in core. If some future hw has a new compiler, as long as all its resources are ralloc()d then things will all just work. (In practice, I suppose you never really see this leak, but removing it at least cleans up some noise in valgrind.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-03-24 18:01:47 -04:00
Jason Ekstrand	e6621746dc	genxml: Whitespace fixes Some field names had extra spaces and some had places where we should have had a space but didn't. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-03-24 15:00:37 -07:00
Jason Ekstrand	34c3f6a27f	genxml: Replace "[N]" with "N" Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-03-24 15:00:37 -07:00
Jason Ekstrand	c2af555d6e	genxml/gen6: Remove a couple of bogus values Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-03-24 15:00:37 -07:00
Jason Ekstrand	ec27402a8f	genxml/gen8: Remove BLACK_LEVEL_CORRECTION_STATE We've never used it, it only exists on gen8, and the name of the struct contains piles of bad characters. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-03-24 15:00:37 -07:00
Jason Ekstrand	a6df637d26	genxml: Rename two MCS fields to Auxiliary Surface on gen7 This makes gen7 more consistent with gen8+ Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-03-24 15:00:37 -07:00
Rob Clark	c03f6f12bb	freedreno: fix memory leak Otherwise blitter would still hold a ref to, for example, sampler-views. To reproduce: glmark2 -b desktop:duration=2 --run-forever Fixes: `a8e6734` ("freedreno: support for using generic clear path") Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-03-24 17:49:00 -04:00
Chad Versace	b3f81e06d4	genxml: Fix gen_zipped_file.py dependency The gen_xml.h files depend on gen_zipped_file.py, not the gen_pack.h files. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-24 14:38:22 -07:00
Chad Versace	c7c6c53adb	genxml: Define GENXML_XML_FILES in Makefile.sources The future header genX_bits.h will depend on GENXML_XML_FILES. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-24 14:38:15 -07:00
Jan Vesely	14b543bdc9	clover: use pipe_resource references v2: buffers are created with one reference. v3: add pipe_resource reference to mapping object v4: rename to pres and drop inline initializers CC: "17.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-03-24 15:57:47 -04:00
Kenneth Graunke	0a60ff4d8c	i965: Fix symbolic size of next_offset[] array. It's indexed by buffer, not stream. BRW_MAX_SOL_BUFFERS and MAX_VERTEX_STREAMS happen to both be 4, so there's no actual bug. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-24 12:21:50 -07:00
Kenneth Graunke	652d521408	i965: Remove pointless NULL check from Gen6 primitive counting code. We create the BO when creating a transform feedback object, and only destroy it when deleting that object. So it won't be NULL. CID: 1401410 Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-24 12:21:06 -07:00
Marek Olšák	61926733f9	radeonsi: don't crash on compute shader compile failure Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-24 18:25:05 +01:00
Marek Olšák	518d834162	radeonsi: don't hang on shader compile failure Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-24 18:25:05 +01:00
Nicolai Hähnle	eebd0cd560	radeonsi: fix dvec[34] attributes sourced from current attribute state The state tracker no longer uploads those attributes for us, so we must conservatively upload the size of the largest attribute, which is a dvec4. Fixes a regression of GL45-CTS.gpu_shader_fp64.varyings and GL45-CTS.vertex_attrib_64bit.limits_test. Fixes: `9b91e0b54c` ("radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-24 17:35:21 +01:00
Emil Velikov	15603055fb	anv: automake: ensure that the destination directory is created Earlier commit unintentionally dropped the mkdir, as it was rebased. Some versions of autotools will not create the output directory for generated sources. Thus the issue went unnoticed by the original author. Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Steven Newbury <steve@snewbury.org.uk> Reported-by: Steven Newbury <steve@snewbury.org.uk> Fixes: `1610b3dede` ("anv: don't pass xmlfile via stdin anv_entrypoints_gen.py") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-24 12:02:04 +00:00
Samuel Pitoiset	43f5a2c915	glsl_to_tgsi: don't rely on glsl types when visiting tex instructions Instead add is_cube_shadow like is_cube_array. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-24 11:12:27 +01:00
Iago Toral Quiroga	129fd58131	anv/query: handle out of host memory without crashing in compute_query_result() We don't need to make the caller (CmdCopyQueryPoolResults) aware of the problem since compute_query_result() only emits state. The caller is also expected to hit OOM in this scenario right after calling this function, but it is already handling it safely. Fixes: dEQP-VK.api.out_of_host_memory.cmd_copy_query_pool_results Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-24 09:39:44 +01:00
Iago Toral Quiroga	ddb2bb3ed4	anv/pipeline: make FragCoord include sample positions when sample shading We need to know if sample shading has been requested during shader compilation since that affects the way fragment coordinates are computed. Notice that the semantics of fragment coordinates only depend on whether sample shading has been requested, not on whether more than one sample will actually be produced (that is, minSampleShading and rasterizationSamples do not affect this behavior). Because this setting affects the code we generate for the shader, we also need to include it in the WM prog key. Notice we don't need to alter the OpenGL code because it doesn't ever use this behavior, so the key's value is always false (the default). Fixes: dEQP-VK.glsl.builtin_var.fragcoord_msaa.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-24 08:11:53 +01:00
Iago Toral Quiroga	023ea3772d	nir/lower_wpos_center: support adding sample position to fragment coordinate According to section 14.6 of the Vulkan specification: "When sample shading is enabled, the x and y components of FragCoord reflect the location of the sample corresponding to the shader invocation." So add a boolean parameter to the lowering pass to select this behavior when we need it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-24 08:11:53 +01:00
Iago Toral Quiroga	4da1832c00	anv: return VK_ERROR_DEVICE_LOST immediately when device is known to be lost If we know the device has been lost we should return this error code for any command that can report it before we attempt to do anything with the device. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-24 08:11:53 +01:00
Iago Toral Quiroga	50c8d2c1f7	anv/device: keep track of 'device lost' state The Vulkan specs say: "A logical device may become lost because of hardware errors, execution timeouts, power management events and/or platform-specific events. This may cause pending and future command execution to fail and cause hardware resources to be corrupted. When this happens, certain commands will return VK_ERROR_DEVICE_LOST (see Error Codes for a list of such commands). After any such event, the logical device is considered lost. It is not possible to reset the logical device to a non-lost state, however the lost state is specific to a logical device (VkDevice), and the corresponding physical device (VkPhysicalDevice) may be otherwise unaffected. In some cases, the physical device may also be lost, and attempting to create a new logical device will fail, returning VK_ERROR_DEVICE_LOST." This means that we need to track if a logical device has been lost so we can have the commands referenced by the spec return VK_ERROR_DEVICE_LOST immediately. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-24 08:11:53 +01:00
Iago Toral Quiroga	70194c9f1a	anv/device: return VK_ERROR_DEVICE_LOST for errors during queue submissions So that we don't have to do things like rolling back address relocations in case we ran into OOM after computing them, etc. Also, make sure that if the queue submission comes with a fence, we set it up correctly so it behaves according to the spec after returning VK_ERROR_DEVICE_LOST. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-24 08:11:53 +01:00
Timothy Arceri	adced4a2f9	mesa/marshal: add custom BufferData/BufferSubData marshalling GL_AMD_pinned_memory requires memory to be aligned correctly, so we skip marshalling in this case. Also copying the data defeats the purpose of EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD. Fixes GL_AMD_pinned_memory piglit tests when glthread is enabled. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-24 11:23:06 +11:00
Timothy Arceri	0a32b52a27	util/disk_cache: write cache entry keys to file header This can be used to deal with key hash collisions from different versions (should we find that to actually happen) and to find which mesa version produced the cache entry. V2: use blob created at cache creation. v3: remove left over var from v1. Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-24 11:20:09 +11:00
Grazvydas Ignotas	5136c09e70	util/disk_cache: hash pointer size and gpu name into cache keys This allows us to get rid of the arch and gpu name directories. v2: (Timothy Arceri) don't use an opaque data type to store pointer size and gpu name. v3: (Timothy Arceri) use blob to store driver keys just make sure to store null terminator for strings, and make sure blob is defined by disk_cache and not its users. v4: (Timothy Arceri) fix typo, and make ptr_size a uint8_t. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-24 11:20:09 +11:00
Grazvydas Ignotas	feb716239e	util/disk_cache: hash timestamps into the cache keys Instead of using a directory, hash the timestamps into the cache keys themselves. Since there is no more timestamp directory, there is no more need for deleting the cache of other mesa versions and we rely on eviction to clean up the old cache entries. This solves the problem of using several incarnations of disk_cache at the same time, where one deletes a directory belonging to the other, like when both OpenGL and gallium nine are used simultaneously (or several different mesa installations). v2: using additional blob instead of trying to clone sha1 state v3: (Timothy Arceri) don't use an opaque data type to store timestamp. V4: (Timothy Arceri) use blob to store driver keys just make sure to store null terminator for strings, and make sure blob is defined by disk_cache and not its users. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100091	2017-03-24 11:20:09 +11:00
Miklós Máté	7ceb1a4fa8	mesa: set thread name for glthread Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-24 10:00:19 +11:00
Matt Turner	7499bc7fd7	i965: Replace OPT_V() with OPT(). We want to be able to check the progress of each pass and dump the NIR for debugging purposes if it changed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	1be91bd9d8	i965/fs: Return progress from demote_sample_qualifiers(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	fd3351246c	i965/fs: Return progress from move_interpolation_to_top(). And mark as static at the same time. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	e0f8daeb86	i965: Return progress from brw_nir_lower_uniforms(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	ef71af7356	nir: Return progress from nir_convert_from_ssa(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	abc8a702d0	nir: Return progress from nir_lower_io(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	a934b00222	nir: Return progress from nir_lower_regs_to_ssa(). And from nir_lower_regs_to_ssa_impl() as well. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	b0e72defc2	nir: Return progress from nir_lower_samplers(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	01548f9f01	nir: Return progress from nir_lower_atomics(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	0bd615d961	nir: Return progress from nir_lower_clamp_color_outputs(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	9dbf91f5c0	nir: Return progress from nir_lower_clip_fs(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	4e4927cd95	nir: Return progress from nir_lower_clip_vs(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:44 -07:00
Matt Turner	6077cc75aa	nir: Return progress from nir_move_vec_src_uses_to_dest(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	a539e05d00	nir: Return progress from nir_lower_to_source_mods(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	5a7e4ae23d	nir: Return progress from nir_lower_clip_cull_distance_arrays(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	19345fc160	nir: Return progress from nir_lower_var_copies(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	b831b8d2e1	nir: Return progress from nir_lower_load_const_to_scalar(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	adb157ddfd	nir: Return progress from nir_lower_64bit_pack(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	0012a6144a	nir: Return progress from nir_lower_doubles(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	c597f87739	nir: Return progress from nir_lower_vars_to_ssa(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	7d41bf8d7b	nir: Fix syntax. et is not an abbreviation. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	70c0455974	nir: Fix misspellings. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Matt Turner	d6e2bdfed3	nir: Stop using apostrophes to pluralize. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 14:34:43 -07:00
Leo Liu	54f9f34181	st/omx/enc: use PIPE_USAGE_STAGING for output buffer Work around an unknown bug inside transfer_map on certain ASICs. Also tested with unaffected ASICs, where the performance actually improved slightly. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-23 14:43:42 -04:00
Daniel Stone	378025ca8b	gbm: Use unsigned for BO offset getter The actual offset returned is uint32_t, however int64_t was used as the return type from gbm_bo_get_offset to allow negative returns to signal errors to the caller. In case of an error getting the offset, the user will also be unable to get the handle/FD, and thus have nothing to offset into. This means that returning 0 as an error value is harmless, allowing us to change the return type to uint32_t in order to avoid signed/unsigned confusion in callers. Signed-off-by: Daniel Stone <daniels@collabora.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Jason Ekstrand <jason@jlekstrand.net>	2017-03-23 15:28:41 +00:00
Eric Engestrom	ec0313fd58	REVIEWERS: add autogen.sh to the autoconf group Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-03-23 14:50:51 +00:00
Eric Engestrom	0adc9832f5	docs/submittingpatches: add mention about legal disclaimers Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-23 14:21:48 +00:00
Julien Isorce	48b5f1cca7	r600_shader.c: fix indentation Introduced by `ad13bd2e51` Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-23 13:21:37 +00:00
Topi Pohjolainen	90633079ec	glx: Prefer library path given by pkgconfig over the system Recent change to use drmGetDevices2() made me realize that a build configured using PKG_CONFIG_PATH=my_drm_lib_path/pkgconfig ./autogen.sh considers the libdrm path gotten from pkgconfig only during make. When invoking "make install" the relink command puts the system library ahead of the path gotten from pkgconfig (and starts to fail as the system libdrm isn't new enough). This change forces the relink command to respect pkgconfig settings. This looks to me like the issue in https://bugs.freedesktop.org/show_bug.cgi?id=100259, with Emil et al. considering it a libtool bug. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [Emil Velikov: add inline comment] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-23 12:30:19 +00:00
Tapani Pälli	4f69573178	intel: move gen_decoder.* to DECODER_FILES This patch adds DECODER_FILES for libintel_common so that platforms such as Android, which do not currently use this functionality, can opt out. Fixes: `7d84bb3` ("intel: Move tools/decoder.[ch] to common/gen_decoder.[ch].") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-23 14:05:19 +02:00
Tapani Pälli	bcae4eb502	android: fix vulkan build issues with anv_entrypoints Patch fixes entrypoint generation for libmesa_anv_entrypoints that still used old style of calling generator script. Also small fixes to libmesa_vulkan_common where there was a typo in target name (vulknan) and files were generated to wrong folder. Fixes: `8211e3e6` ("anv: Generate anv_entrypoints header and code in one command") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-23 14:04:44 +02:00
Mauro Rossi	0ff8ac1b55	android: i965: generate code for OA counter queries Automake generation rules are replicated for android. $* macro was expected to return "hsw" but instead gives "hsw.{h,c}" so $(basename $*) is used as a workaround to set the correct --chipset option for brw_oa.py script. Build tested with nougat-x86 Fixes: `e565505` "i965: Add script to gen code for OA counter queries" Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Robert Bragg <robert@sixbynine.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-23 08:20:18 +02:00
Tapani Pälli	dc9ebc6ef1	android: rename Intel Vulkan library to match desktop one Original naming was following Vulkan HAL naming scheme for no good purpose and we need same binary name for build-id code. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-23 08:19:16 +02:00
Boyan Ding	51b7fae1ae	nouveau: enable glsl/tgsi on-disk cache v2: Fix argument to nouveau_screen_get_name() Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-22 22:51:35 -04:00
Eric Engestrom	e8875c7a87	REVIEWERS: add myself as a reviewer for EGL and docs Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-23 00:35:45 +00:00
Dylan Baker	4ee675d537	anv: Remove dead prototype from entrypoints Spotted by Emil. v2: - Add this patch Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	860beb99a6	anv: use cElementTree in anv_entrypoints_gen.py It's written in C rather than pure python and is strictly faster; the only reason not to use it is that its classes cannot be subclassed. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	9050138af7	anv: don't use Element.get in anv_entrypoints_gen.py This has the potential to mask errors, since Element.get works like dict.get, returning None if the element isn't found. I think the reason that Element.get was used is that vulkan has one extension that isn't really an extension, and thus is missing the 'protect' field. This patch changes the behavior slightly by replacing get with explicit lookup in the Element.attrib dictionary, and using xpath to only iterate over extensions with a "protect" attribute. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	4d4697f868	anv: use dict.get in anv_entrypoints_gen.py Instead of using an if and a check, use dict.get, which does the same thing, but more succinctly. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	96a5f2a5ac	anv: anv_entrypoints_gen.py: use reduce function. Reduce is its own reward. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	dd3830d11b	anv: anv_entrypoints_gen.py: rename hash to cal_hash. hash is a built-in name in python; it's the interface to access an object's hash protocol. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	8211e3e60d	anv: Generate anv_entrypoints header and code in one command This produces the header and the code in one command, saving the need to call the same script twice, which parses the same XML file. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	383032c700	anv: anv_entrypoints_gen.py: directly write files instead of piping This changes the output to be written as a file rather than being piped. This had one critical advantage, it encapsulates the encoding. This prevents bugs where a symbol (generally unicode like © [copyright]) is printed and the system being built on doesn't have a unicode locale. v2: - Update Android.mk v3: - Don't generate both files at once - Fix Android.mk - drop --outdir, since the filename is passed in as an argument Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	a2a2bad2e2	anv: convert C generation to template in anv_entrypoints_gen.py This produces a file that is identical except for whitespace, there is a table that has 8 columns in the original and is easy to do with prints, but is ugly using mako, so it doesn't have columns; the data is not inherently tabular. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	0d8e22c5e4	anv: convert header generation in anv_entrypoints_gen.py to mako This produces an identical file except for whitespace. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	ba1085c694	anv: Update "do not edit" comments with proper filename This does two things, first it updates both the .h and the .c file to have the same do not edit string. Second, it uses __file__ to ensure that even if the file is moved or renamed that the name will be correct. One thing to note is the use of '{{' and '}}' in the C template. This is to instruct python to print a literal '{' and '}' respectively, rather than treating the contents as a formatter specifier. v3: - add this patch Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	ed9339bf26	anv: split main into two functions in anv_entrypoints_gen.py This is groundwork for the next patches; it will allow porting the header and the code to mako separately, and will also allow both to be run simultaneously. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	1610b3dede	anv: don't pass xmlfile via stdin anv_entrypoints_gen.py It's slow, and has the potential for encoding issues. v2: - pass xml file location via argument - update Android.mk Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	8017da8dd2	anv: make constants capitals in anv_entrypoints_gen.py Again, it's standard python style. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	08a6d3b4ba	anv: Use python style in anv_entrypoints_gen.py These are all fairly small cleanups/tweaks that don't really deserve their own patch. - Prefer comprehensions to map() and filter(), since they're faster - replace unused variables with _ - Use 4 spaces of indent - drop semicolons from the end of lines - Don't use parens around if conditions - don't put spaces around brackets - don't import modules as caps (ET -> et) - Use docstrings instead of comments v2: - Replace comprehensions with multiplication Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Dylan Baker	abd72f2e35	anv: anv_entrypoints_gen.py: use a main function This is just good practice. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-03-22 16:22:00 -07:00
Alex Smith	bc5d587a80	radv: Invalidate L2 for TRANSFER_WRITE barriers CP DMA and PKT3_WRITE_DATA (in CmdUpdateBuffer) don't (currently) write through L2. Therefore, to make these writes visible to later accesses we must invalidate L2 rather than just writing it back, to avoid the possibility that stale data is read through L2. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-23 09:20:31 +10:00
Vinson Lee	bb32ea4fc6	glsl: Link glsl_compiler with CLOCK_LIB. Fix linking error on CentOS 6. CXXLD glsl_compiler glsl/.libs/libstandalone.a(lt16-libmesautil_la-u_queue.o): In function `u_thread_get_time_nano': src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime' Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 14:58:18 -07:00
Timothy Arceri	6a9020f8dc	util/disk_cache: use rand_xorshift128plus() to get our random int Otherwise for apps that don't seed the regular rand() we will always remove old cache entries from the same dirs. V2: assume bits returned by rand are independent uniformly distributed bits and grab our hex value without taking the modulus of the whole value, this also fixes a bug where 'f' was always missing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-23 08:16:29 +11:00
Timothy Arceri	dd00a3c923	util/rand_xor: add function to seed rand V2: pass the seed to the seed function so that we can isolate its uses. Stop leaking fd when urandom couldn't be read. Reviewed-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-23 08:16:29 +11:00
Timothy Arceri	53660c2366	util: move rand_xorshift128plus() to utils V2: pass the seed to rand_xorshift128plus() so that we can isolate its uses. Reviewed-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-23 08:16:29 +11:00
Samuel Pitoiset	e11049f2c3	drirc: add force_glsl_abs_sqrt() for "Spec Ops: The Line" Game ported from D3D9 which expects sqrt() to compute the absolute value as explained in the spec. This gets rid of the NaN values as well as the black squares with RadeonSI. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97338 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-22 22:02:20 +01:00
Samuel Pitoiset	7a0ecbfffd	st/glsl_to_tgsi: enable lower_sqrt() conditionally It relies on the force_glsl_abs_sqrt driconf option. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-22 22:02:20 +01:00
Samuel Pitoiset	737c734cd4	glsl: lower sqrt(abs()) and inversesqrt(abs()) if requested Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-22 22:02:12 +01:00
Samuel Pitoiset	448f4c0c89	driconf: add force_glsl_abs_sqrt option This will allow forcing computation of the absolute value for sqrt() and inversesqrt() in order to follow D3D9 behaviour for buggy apps that rely on it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-22 22:01:01 +01:00
Tim Rowley	08f864abd9	swr: [rasterizer jitter] fix llvm >= 5.0 build break Function::getArgumentList() doesn't exist anymore, switch to using arg_begin() (existed back to at least llvm-3.6.0). Reviewed-by: Vedran Miletić <vedran@miletic.net> CC: <mesa-stable@lists.freedesktop.org>	2017-03-22 13:45:35 -05:00
Rob Herring	7a5b5f5226	Android: drop Android 4.4 (KitKat) support Any users of KitKat are likely using an older version of Mesa and KitKat support adds complexity to the make files. Dropping support allows removing the MESA_LOLLIPOP_BUILD make variable in various make files. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 17:53:31 +00:00
Rob Herring	0e1ff22d55	Android: kill off {MESA_}ANDROID_VERSION defines aka Android 4.1 and older The Android version defines are only needed for versions less than 4.2 which aren't really supported or tested. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 17:52:57 +00:00
Rob Herring	6bfad7c659	Android: fix libz dependency for host targets Commit `6facb0c08f` ("android: fix libz dynamic library dependencies") added libz as a dependency, but this breaks host targets as the host dependency is libz-host. As no host lib needs libz, just remove the dependency for them. Fixes: `6facb0c08f` "android: fix libz dynamic library dependencies" Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 17:52:35 +00:00
Rob Herring	6f8f97a9b2	Android: remove host libmesa_util The host libmesa_util is never used for Android builds, so remove it. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 17:52:23 +00:00
Rob Herring	5410c60112	Android: clean-up trailing '\' in make variables Fixed with the following command: perl -pe 'BEGIN{undef $/;} s/ \\\n\n/\n\n/smg' $(find . -name 'Android.*') Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 17:52:06 +00:00
Emil Velikov	50a9b0cb43	mesa/main: remove unused strndup.h include No longer needed as of commit `ac257f1070` ("glsl: calculate TOP_LEVEL_ARRAY_SIZE and STRIDE when adding resources") Reported-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 17:51:07 +00:00
Emil Velikov	68b545fa27	util: automake: beautify sources list Remove trailing tabs and sort alphabetically. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:23 +00:00
Emil Velikov	e0129f3142	util/strndup: move header inclusion as applicable Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:23 +00:00
Emil Velikov	e325fc12db	util: inline strndup implementation in the header Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:23 +00:00
Emil Velikov	d542d2fc13	util: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	43a9ca8eb4	mesa/program: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	f66fe28d9f	mesa/main: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	2438c0a236	intel/compiler: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	868324419e	intel/common: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	8c8761b237	i965: consistently use ifndef guards over pragma once The only remaining case is the brw_oa.py generator which pipes the generated file to stdout. That will be resolved with later commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	b04916285e	st/wgl: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	1385e58805	egl/dri2: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	b27a883205	spirv: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	e3de145fa2	nir: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	b08aee305e	glsl: consistently use ifndef guards over pragma once Throughout the glsl headers we had an odd mix of guards, be that "ifndef", "pragma once", neither or both. Simplify things by using the more common ones (ifndef) and annotating all the sources, barring the generated builtin header - builtin_int64.h. The final header - udivmod64.h - is [seemingly] unused and on its way out (a patch to purge it is on the mailing list). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:22 +00:00
Emil Velikov	b0bfb5f89c	compiler: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:21 +00:00
Emil Velikov	b9d035e75b	radv: consistently use ifndef guards over pragma once Namely: annotate the single file which is not using a ifndef guard - vk_format.h Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:21 +00:00
Emil Velikov	95ab07c586	ac: consistently use ifndef guards over pragma once Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:21 +00:00
Emil Velikov	3b277bae66	i965: make brw_setup_image_uniform_values static Used only internally. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-22 16:55:21 +00:00
Emil Velikov	7e79e895a6	docs/releasing: do not pass any arguments to autogen.sh This should just work (tm) with the default options. Plus the one we pass is already the default, so just drop it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2017-03-22 16:55:21 +00:00
Emil Velikov	559ca99ce1	mesa: more unused linux/version.h include The header provides the LINUX_VERSION_CODE and KERNEL_VERSION macros, neither of which is used by any part of mesa. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 16:55:21 +00:00
Marek Olšák	84012262ea	ac: fix build with LLVM 5.0svn Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-22 17:54:42 +01:00
Marek Olšák	6e2b9fd071	gallivm: remove lp_add_attr_dereferenceable in favor of amd/common Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-22 17:54:40 +01:00
Jason Ekstrand	7ab03ba725	anv/device: Move push descriptor query handling The query is a properties query so it needs to be handled in GetPhysicalDeviceProperties2, not GetPhysicalDeviceFeatures2. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-22 09:44:54 -07:00
Jason Ekstrand	c942faf8f3	anv/image: Return early when unbinding an image Found by inspection. Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-22 09:44:54 -07:00
Grazvydas Ignotas	10d3702a36	util/sha1: harmonize _mesa_sha1_* wrappers Rather than using 3 different ways to wrap _mesa_sha1_() to SHA1() functions (a macro, prototype with implementation in .c and an inline function), make all 3 inline functions. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 11:33:51 +00:00
Emil Velikov	64b9a37c3b	anv: android: remove unused include/vulkan include Spotted while skimming through similar hunks for the Autotools build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-03-22 11:33:40 +00:00
Emil Velikov	1fa6a33e4d	anv: automake: use the local headers over any system provided ones At the moment, we would honour any system headers - vulkan_intel.h in particular over the ones in-tree. Thus, if one does incremental build of mesa, without the vulkan.h already installed (or at least not in the same directory as vulkan_intel.h) the build will fail. In the future we might want to upstream the vulkan_intel.h within vulkan.h or use other ways to make vulkan_intel.h obsolete. In either case, the more robust thing is to rely on our own copy. v2: Move AM_CPPFLAGS just above LIBDRM_CFLAGS (Grazvydas, Jason) Tested-by: Grazvydas Ignotas <notasas@gmail.com> Fixes: `ee8044fd` "intel/vulkan: Get rid of recursive make" Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-22 11:32:54 +00:00
Nicolai Hähnle	c11dcfb5e9	mesa/main: fix MultiDrawElements[BaseVertex] validation of primcount primcount must be a GLsizei as in the signature for MultiDrawElements or bad things can happen. Furthermore, an error should be flagged when primcount is negative. Curiously, this code used to work somewhat correctly even when primcount was negative, because the loop that checks count[i] would iterate out of bounds and almost certainly hit a negative value at some point. Found by an ASAN error in GL45-CTS.gtf32.GL3Tests.draw_elements_base_vertex.draw_elements_base_vertex_primcount Note that the OpenGL spec seems to have s/primcount/drawcount/ at some point, and the code still reflects the old language. v2: provide the correct spec quotes (pointed out by Ian) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-03-22 12:12:29 +01:00
Nicolai Hähnle	c2dfff280b	mesa: Avoid out-of-bounds stack read via _mesa_Materiali MATERIALFV may end up reading up to 4 floats from the passed parameter. This should really set a GL_INVALID_ENUM error in the cases where it matters, but does anybody really care? Found by ASAN in piglit gl-1.0-beginend-coverage. v2: fix a trivial compiler warning Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2017-03-22 12:12:11 +01:00
Vinson Lee	bd6f0dcafc	configure.ac: Do not strip away space after regex word match. Fixes: `62c48ccb41` ("configure.ac: Use POSIX compatible regex for word boundary.") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2017-03-22 00:30:22 -07:00
Vinson Lee	62c48ccb41	configure.ac: Use POSIX compatible regex for word boundary. Fixes build error on Mac OS X. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100236 Suggested-by: Jan Beich <jbeich@freebsd.org> Suggested-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-21 23:56:17 -07:00
Chad Versace	44ac618a41	isl: Refactor row pitch calculation (v2) The calculations of row_pitch, the row pitch's alignment, surface size, and base_alignment were mixed together. This patch moves the calculation of row_pitch and its alignment to occur before the calculation of surface_size and base_alignment. This simplifies a follow-on patch that adds a new member, 'row_pitch', to struct isl_surf_init_info. v2: - Also extract the row pitch alignment. - More helper functions that will later help validate the row pitch. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v2) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2017-03-21 15:56:16 -07:00
Chad Versace	c2b706f8af	isl: Drop misplaced comment about padding isl has a giant comment that explains the hardware's padding requirements. (Hint: Cache lines and page faults). But the comment is in the wrong place, in isl_calc_linear_row_pitch(), which is unrelated to padding. The important parts of that comment were copied to isl_apply_surface_padding() long ago. So drop the misplaced comment. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 15:56:13 -07:00
Ben Widawsky	0e55e46540	i965/dri: Turn on support for image modifiers All the plumbing is in place so the extension just needs to be advertised. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:12 -07:00
Ben Widawsky	cd6bd7f123	i965/dri: Handle X-tiled modifier This doesn't really "do" anything because the default tiling for the winsys buffer is X tiled. We do however want the X tiled modifier to work correctly from the API perspective, which would imply that if you set this modifier, and later do a get_modifier, you get back at least X tiled. Running with a modified kmscube, here are the bandwidth measurements. Linear: Read bandwidth: 1039.31 MiB/s Write bandwidth: 1453.56 MiB/s Y-tiled: Read bandwidth: 458.29 MiB/s Write bandwidth: 542.12 MiB/s X-tiled: Read bandwidth: 575.01 MiB/s Write bandwidth: 606.25 MiB/s Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:12 -07:00
Ben Widawsky	7ce0405826	i965/dri: Handle Y-tiled modifier This patch begins introducing how we'll actually handle the potentially many modifiers coming in from the API, how we'll store them, and the structure in the code to support it. Prior to this patch, the Y-tiled modifier would be entirely ignored. It shouldn't actually be used until this point because we've not bumped the DRIimage extension version (which is a requirement to use modifiers). Measuring later in the series with kmscube: Linear: Read bandwidth: 1048.44 MiB/s Write bandwidth: 1483.17 MiB/s Y-tiled: Read bandwidth: 471.13 MiB/s Write bandwidth: 589.10 MiB/s Similar functionality was introduced and then reverted here: commit `6a0d036483` Author: Ben Widawsky <ben@bwidawsk.net> Date: Thu Apr 21 20:14:58 2016 -0700 i965: Always use Y-tiled buffers on SKL+ v2: Use last set bit instead of first set bit in modifiers to address bug found by Daniel Stone. v3: Use the new priority modifier selection thing. This nullifies the bug fixed by v2 also. v4: Get rid of modifier compaction which originally served another purpose and now serves none (Jason) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:12 -07:00
Ben Widawsky	d78a36ea62	i965/dri: Handle the linear fb modifier At image creation create a path for dealing with the linear modifier. This works exactly like the old usage flags where __DRI_IMAGE_USE_LINEAR was specified. During development of this patch series, it was decided that a lack of modifier was an insufficient way to express the required modifiers. As a result, 0 was repurposed to mean a modifier for a LINEAR layout. NOTE: This patch was added for v3 of the patch series. v2: Rework the algorithm for modifier selection to go from a bitmask based selection to this priority value. v3: Make DRM_FORMAT_MOD_INVALID allowed at selection as a way of identifying no modifiers found (because 0 is LINEAR) (Jason) v4: Remove the logic to prune unknown modifiers (like those from other vendors) and simply handle it in select_best_modifier (Jason) Requested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:12 -07:00
Ben Widawsky	79f619ca70	i965/dri: Enable modifier queries New to the patch series after reordering things for landing smaller chunks. This will essentially enable modifiers from clients that were just enabled in previous patches. A client could use the modifiers by setting all of them at create, but had no way to actually query them after creating the surface (ie. stupid clients could be broken before this patch, but in more ways than this). Obviously, there are no modifiers being actually stored yet - so this patch shouldn't do anything other than allow the API to get back 0 (or the LINEAR modifier). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:12 -07:00
Ben Widawsky	fc1e9f0cb2	i965/dri: Store the screen associated with the image I intend to need to get to the devinfo structure, and storing the screen is an easy way to do that. It seems to be the consensus that you cannot share an image between multiple screens. Scape-goat: Rob Clark <robdclark@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:11 -07:00
Ben Widawsky	2a16de9e4b	gbm: Disallow INVALID modifiers returned upon image creation v2: Add a TODO about modifier validation (Jason) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:11 -07:00
Ben Widawsky	962b31da95	i965/dri: Disallow image with INVALID modifier Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-21 14:48:03 -07:00
Kenneth Graunke	b15038a289	i965: Shut up major()/minor() warnings. Recent glibc generates this warning: brw_performance_query.c:1648:13: warning: In the GNU C Library, "minor" is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use "minor", include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro "minor", you should undefine it after including <sys/types.h>. min = minor(sb.st_rdev); So, include sys/sysmacros.h to shut up the warning. v2: Use the AC_HEADER_MAJOR defines to figure out the right header (thanks to Jonathan Gray for helping me not break non-glibc systems) Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Emil Velikov <emli.velikov@collabora.com>	2017-03-21 14:10:17 -07:00
Kenneth Graunke	0c3fbf8028	i965: Drop AUB_TRACE_* stuff. This was used for aubdumping (deleted a while ago) and INTEL_DEBUG=bat decoding (deleted recently). While we're changing parameters, delete the wrapper macro and make the actual function brw_state_batch instead of __brw_state_batch. This subsumes a patch by Emil Velikov to drop this from BLORP. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-21 13:49:18 -07:00
Kenneth Graunke	705c38e96f	i965: Use aubinator/genxml for INTEL_DEBUG=bat state decoding. This deletes all of our handwritten code in favor of autogenerated genxml-based decoding. This should be much more usable, as the old code isn't entirely accurate - we updated some things for new generations, but not everything. Aubinator has one annoying limitation: it has no idea how many entries to print when encountering e.g. 3DSTATE_BINDING_TABLE_POINTERS_VS. It picks an arbitrary number, which may skip decoding valid data, and may print extra garbage entries. We do a better job here by making brw_state_batch track the size of the data stored at a particular batchbuffer offset. Then, we can divide by the structure size to obtain the exact number of entries. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-21 13:49:15 -07:00
Kenneth Graunke	5fab46572f	i965: Use aubinator/genxml for INTEL_DEBUG=bat commands. This should give substantially better decoding, as the public libdrm decoder hasn't been properly maintained in years. For now, we reuse the existing state dumping mechanism. We'll improve that in the next patch. To avoid increasing the size of the driver, we restrict this feature to debug builds of Mesa. There's probably very little use for it in release builds anyway. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-21 13:49:13 -07:00
Kenneth Graunke	7d84bb32aa	intel: Move tools/decoder.[ch] to common/gen_decoder.[ch]. This way they become part of libintel_common.la so I can use them in the i965 driver. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-21 13:49:10 -07:00
Kenneth Graunke	2b074bb7e5	intel: Add a INTEL_DEBUG=color option. This will be used for color output in debug messages. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-21 13:48:53 -07:00
Vinson Lee	1fa432741c	nir: Add positional argument specifiers. Fix build with Python < 2.7. File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module> from nir_opcodes import opcodes File "src/compiler/nir/nir_opcodes.py", line 178, in <module> unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size), ValueError: zero length field name in format Fixes: `762a6333f2` ("nir: Rework conversion opcodes") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2017-03-21 13:38:00 -07:00
Julien Isorce	ad13bd2e51	r600_shader.c: check returned value of eg_get_interpolator_index Like done in another place in that same file. CID 1250588 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-21 18:14:26 +00:00
Timothy Arceri	020b3f0c46	util/disk_cache: fix build on platforms where shader cache is disabled	2017-03-21 11:51:03 +11:00
Grazvydas Ignotas	b9a370f2b4	util/disk_cache: add a write helper Simplifies the write code a bit and handles EINTR. V2: (Timothy Arceri) Drop EINTR handling. To do it properly we would need a retry limit but it's probably best to just avoid trying to write if we hit EINTR and try again next time we see the program. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-21 11:51:03 +11:00
Grazvydas Ignotas	af73acca2b	tests/cache_test: use the blob key's actual first byte There is no need to hardcode it, we can just use blob_key[0]. This is needed because the next patches are going to change how cache keys are computed. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-21 11:15:52 +11:00
Grazvydas Ignotas	529a767041	util/disk_cache: use a helper to compute cache keys This will allow to hash additional data into the cache keys or even change the hashing algorithm easily, should we decide to do so. v2: don't try to compute key (and crash) if cache is disabled Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-21 11:15:52 +11:00
Dave Airlie	021c87fa24	radv: move KHR_get_physical_device_properties2 to instance props. This is an instance property not a device one. Fixes: dEQP-VK.api.info.device.extensions Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-21 10:05:49 +10:00
Dave Airlie	93e62898cc	radv: drop illegal DB format error. We'll get this if we have a stencil only setup. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-21 10:05:49 +10:00
Kenneth Graunke	72c89522c2	i965: Add autogenerated OA files to .gitignore.	2017-03-20 16:28:04 -07:00
Tim Rowley	fe325e6423	swr: [rasterizer] Cleanup naming of codegen files All template files and generated files are prefixed with gen_. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	cf8fa67364	swr: [rasterizer codegen] Remove BOM from knob_defs.py Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	8a5069e81f	swr: [rasterizer codegen] Rewrite gen_llvm_types.py to use mako Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	5d0b3b05a2	swr: [rasterizer codegen] Fix generation of knobs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	4ed72758db	swr: [rasterizer codegen] Change backend template comment style Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	2776d94545	swr: [rasterizer codegen] Rewrite gen_llvm_ir_macros.py to use mako Don't create/use cpp files, header only now. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	9538ba9bd1	swr: [rasterizer codegen] Quiet gen_backends.py execution Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	97cbabc8fb	swr: [rasterizer scripts] Put codegen scripts into a separate directory Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:54 -05:00
Tim Rowley	7046695a0e	swr: [rasterizer core] Fix trifan regression from `9d3442575f` Fixes piglit triangle-rasterization-overdraw. SIMD16 path not working. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:05:22 -05:00
Tim Rowley	4cb69e817c	swr: [rasterizer core] SIMD16 Frontend WIP - fix tessellation crashes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	ab3f4449c3	swr: [rasterizer jitter] Fix LogicOp blend jit after assert changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	8cd8240cfc	swr: [rasterizer] Convert more SWR_ASSERT(false, ...) to SWR_INVALID(...) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	ab032fb436	swr: [rasterizer core] Fix typo in SIMD16 code path Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	d011ba74ee	swr: [rasterizer core/common] Fix the native AVX512 build under ICC Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	2f513d8d83	swr: [rasterizer core] Allow no arguments to SWR_INVALID macro Turns out this is somewhat tricky with gcc/g++. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	0b066b2bf3	swr: [rasterizer] Slight assert refactoring Make asserts more robust. Add SWR_INVALID(...) as a replacement for SWR_ASSERT(0, ...) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	f445b6de9c	swr: [rasterizer] Backend code adjustments Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	e4d1294afb	swr: [rasterizer archrast] Fix the early and late depthstencil events The coverage and stencil mask arguments were reversed. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	a508c2c2ac	swr: [rasterizer core] Implement double pumped SIMD16 TESS Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	2cbac00221	swr: [rasterizer archrast/core/scripts] Fix archrast multithreading issue Per pixel stats are cached but were not always being flushed as threads moved from one draw context to the next. Added an explicit flush to allow all archrast objects to flush any cached events. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	0a36a7cf04	swr: [rasterizer archrast] Remove redundant data from archrast files If count can be derived from other counts then this can be done in post processing scripts. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	1cc885d1d1	swr: [rasterizer archrast/scripts] Further archrast cleanups Removed redundant data being written out to file Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	1399fbd6fd	swr: [rasterizer core] Fix RECT_LIST primitive assembly The bug would make the 3rd component of attributes on the second triangle of a RECT be invalid. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	ade5351900	swr: [rasterizer common] Add InterpolateComponentFlat utility Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	ab04221bf1	swr: [rasterizer archrast] Fix performance issue with archrast stats Performance is now 50x faster with archrast now that we're properly filtering out all of the rdtsc begin/end. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	b228d2db18	swr: [rasterizer core] Implement SIMD16 GS and STREAMOUT Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	5830a0a6f8	swr: [rasterizer archrast] Add additional API events Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	d2759c1eb3	swr: [rasterizer core/scripts] Autogen backend initialization function(s) Autogen functions that instantiate different BackendPixelRate templates. Functions get split into separate files after reaching a user defined threshold (currently 512 per file) to speed up compilation. This change will enable the addition of more template flags in the pixel back end. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	2c820d22cf	swr: [rasterizer core] backend.h declares gBackendPixelRateTable Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	50d491e22d	swr: [rasterizer core] Finish SIMD16 PA OPT including tessellation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	9d3442575f	swr: [rasterizer core] Finish SIMD16 PA OPT except tessellation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Tim Rowley	7b94e5e1fa	swr: [rasterizer core] Support sparse numa id values on all OSes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-20 18:04:53 -05:00
Kenneth Graunke	5e29af5f77	i965: Skip register write detection when possible. Detecting register write support by trial and error introduces a stall at screen creation time, which it would be nice to avoid. Certain command parser versions guarantee this will work (see the giant comment in intelInitScreen2 below, or a few commits ago): - Ivybridge: version >= 1 (kernel v3.16) - Baytrail: version >= 2 (kernel v3.19) - Haswell: version >= 7 (kernel v4.8) For simplicity, we don't bother with version 1 in this patch. This assumes that the user hasn't disabled aliasing PPGTT via a kernel command line parameter. Don't do that - you're only breaking things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-20 15:58:05 -07:00
Kenneth Graunke	31693a13f8	i965: Set screen->cmd_parser_version to 0 if we can't write registers. If we can't write registers, then the effective command parser version is 0 - it may exist, but it's not usefully enabling anything. See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8) where the kernel starts doing this for us. This makes us do more or less the same thing on older kernels. This should preserve a bit of sanity by allowing us to perform a screen->cmd_parser_version >= N check to determine that we really can use the features promised by command parser version N. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-20 15:58:05 -07:00
Kenneth Graunke	4a2ad6b145	i965: Document the sad story of the kernel command parser. This should help us figure out the complexities of which kernel versions we need to get various features on various platforms. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-20 15:58:05 -07:00
Kenneth Graunke	9b324e4dca	i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough. In commit `d2590eb65f` I enabled GL 4.5 on Haswell...but failed to check if we could do indirect compute shader dispatch...and query buffer objects. Indirect compute shader dispatch requires command parser version 5 (kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in Linux v4.4). On earlier kernels we would have disabled ARB_compute_shader, which is a mandatory part of OpenGL 4.3+. Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG, which mean command parser version 7 (Linux v4.8). On earlier kernels we would have disabled ARB_query_buffer_object, which is a mandatory part of OpenGL 4.4+. The new version support looks like: - Kernel 4.1 and older => OpenGL 3.3 - Kernel 4.2-4.3 => OpenGL 4.2 - Kernel 4.4-4.7 => OpenGL 4.3 - Kernel 4.8+ => OpenGL 4.5 Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-20 15:58:05 -07:00
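The kernel-to-GL-version table in the commit message above can be read as a small lookup. The following is only a sketch of that table: the function names are hypothetical, and the commit message describes gating on the command parser version, not on the kernel version number used here for illustration.

```c
/* Illustrative only: encodes the Haswell table from the commit message.
 * hsw_max_gl_version() and kernel_at_least() are hypothetical names;
 * the driver itself is described as checking the command parser version. */

/* Returns nonzero if kernel major.minor is at least want_major.want_minor. */
static int kernel_at_least(int major, int minor, int want_major, int want_minor)
{
    return major > want_major || (major == want_major && minor >= want_minor);
}

/* Returns the highest GL version (major * 10 + minor) listed for Haswell. */
int hsw_max_gl_version(int major, int minor)
{
    if (kernel_at_least(major, minor, 4, 8))
        return 45;  /* query buffer objects usable (command parser v7) */
    if (kernel_at_least(major, minor, 4, 4))
        return 43;  /* indirect compute dispatch usable (command parser v5) */
    if (kernel_at_least(major, minor, 4, 2))
        return 42;
    return 33;      /* kernel 4.1 and older */
}
```

Checking the thresholds from newest to oldest keeps each branch a single comparison and mirrors the order in which features become available.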
Constantine Kharlamov	99d400b78f	r600g/sb: Fix memory leak by reworking uses list (rebased) The author is Heiko Przybyl(CC'ing), the patch is rebased on top of Bartosz Tomczyk's one per Dieter Nützel's comment. Tested-by: Constantine Charlamov <Hi-Angel@yandex.ru> v2: Resend the patch again through git-email. The prev. rebase was sent through Thunderbird, which screwed up tab characters, making the patch not apply. -------------- When fixing the stalls on evergreen I introduced leaking of the useinfo structure(s). Sorry. Instead of allocating a new object to hold 3 values where only one is actually used, rework the list to just store the node pointer. Thus no allocating and deallocation is needed. Since use_info and use_kind aren't used anywhere, drop them and reduce code complexity. This might also save some small amount of cycles. Thanks to Bartosz Tomczyk for finding the bug. Reported-by: Bartosz Tomczyk <bartosz.tomczyk86 at gmail.com <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>> Signed-off-by: Heiko Przybyl <lil_tux at web.de <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>> Supersedes: https://patchwork.freedesktop.org/patch/135852 Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-03-20 23:23:50 +01:00
Marek Olšák	827ae79b2c	radeonsi: check the IR type before waiting for a compute compilation fence This should fix OpenCL getting stuck. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100288 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-20 23:17:14 +01:00
Kenneth Graunke	4084083124	aubinator: Move the guts of decode_group() to decoder.c. This lets us use it outside of the aubinator binary itself. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	aa1ef0b984	aubinator: Drop spec parameter to decode_group(). No longer necessary - the iterator gets it from the group. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	b2c0c1d9a5	aubinator: Make the iterator store a pointer to structure descriptions. When the iterator encounters a structure field, it now looks up the gen_group for that structure definition and saves a pointer to it. This lets us drop a lot of ridiculous code in the caller, which looked at item->value (<struct NAME dword>), strtok'd the structure name back out, and looked it up itself. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	a1aa78cb45	aubinator: Track the current field's starting dword offset. The iterator code already computed this value, then we stored it in the structure name, strtok'd it back out, and also manually computed it when printing dword headers. Just put the value in the struct and use it. Way simpler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	e6f7357cab	aubinator: Drop decode_structure() helper. It made more sense when decode_group() took a bunch of extra options, but now that there's only one...we may as well pass 0 and call it a day. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	a8d4184b00	aubinator: Drop unused print_dword_headers flag. I added this flag in `65a9d5eabb` but it was completely unused. Both callers appear to have printed dword headers, so we can just drop the flag and continue doing it unconditionally. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	7f21cb56b8	aubinator: Store a pointer from gen_group back to gen_spec. When decoding a structure field within a group, we may want to look up that structure type. Having a gen_spec pointer makes it easy to do so. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Kenneth Graunke	2c6c760a4b	aubinator: Store enum textual name in iter->value. gen_field_iterator_next() produces a string representing the value of the field. For enum values, it also produced a separate "description" string containing the textual name of the enum. The only caller of this function combines the two, printing enums as "<numeric value> (<textual enum name>)". We may as well just store that in item->value directly, eliminating the description field, and a layer of wrapping. v2: Use non-overlapping source and destination strings in snprintf. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-20 11:20:51 -07:00
Julien Isorce	a6e2124402	si_descriptor: move velems nullity check before dereference CID 1399479: Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking velems suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-20 18:01:51 +00:00
Julien Isorce	521860b2a9	radeon_drm_bo: explicitly check return value of drmCommandWriteRead CID 1313492 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-20 18:01:51 +00:00
Julien Isorce	dac124466a	si_pipe: remove nullity check after dereference sscreen cannot be NULL CID 1354483 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-20 18:01:41 +00:00
Julien Isorce	ce27b27c38	radeon: initialize hole variable before calling container_of Like in a few other places in that radeon_drm_bo.c file. CID 715739. Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-20 16:47:31 +00:00
Nanley Chery	7c50f9903f	intel: Correct the BDW surface state size The PRMs state that this packet is 16 DWORDS long. Ensure that the last three DWORDS are zeroed as required by the hardware when allocating a null surface state. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-20 09:43:44 -07:00
Bartosz Tomczyk	f4b23589da	r600g: Fix out of bounds access The fc_sp variable should indicate the number of elements in the fc_stack array, but fc_sp was incremented at the beginning of the fc_pushlevel function. This led to a situation where idx=0 was never used and the last (32nd) element was stored outside the fc_stack array. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-20 17:32:53 +01:00
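The off-by-one described in the r600g commit above is the classic difference between incrementing a stack pointer before or after using it as an index. The sketch below is not the r600g code: the struct, the function names, and the capacity of 32 (inferred from the commit message) are assumptions for illustration.

```c
#include <stdbool.h>

#define FC_STACK_SIZE 32  /* assumed capacity, inferred from the commit message */

/* Hypothetical stand-in for the driver's flow-control stack. */
struct fc_stack {
    int entries[FC_STACK_SIZE];
    unsigned sp;  /* number of elements currently stored */
};

/* Buggy pattern: increment first, then use sp as the index.
 * The first push lands in slot 1 and the 32nd push would use index 32. */
unsigned push_index_preincrement(unsigned *sp)
{
    *sp += 1;
    return *sp;
}

/* Fixed pattern: use sp as the index, then increment. */
unsigned push_index_postincrement(unsigned *sp)
{
    unsigned idx = *sp;
    *sp += 1;
    return idx;
}

/* Bounds-checked push using the fixed pattern. */
bool fc_push(struct fc_stack *s, int value)
{
    if (s->sp >= FC_STACK_SIZE)
        return false;
    s->entries[s->sp++] = value;
    return true;
}
```

With the post-increment form, sp always equals the number of stored elements, so the valid indices are exactly 0 through 31.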
Constantine Kharlamov	f9190f3e65	r600g: update sb documentation v2: s/r600/r600g in the title Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-20 17:11:15 +01:00
Constantine Kharlamov	64cbbd2888	r600g: make condition clearer The second check in the old code looked pretty much unreachable, esp. because it's not obvious that "max_entries" could be zero. To find out that it was intentional I had to run some checks, and to dig into the old versions of the file. So, rewrite the check to make the intention clear. v2: s/r600/r600g in the title, and per Dieter Nützel's comment wrap lines of condition. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-03-20 17:11:15 +01:00
Emil Velikov	36e029d356	docs: add news item and link release notes for 13.0.6/17.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-20 14:25:18 +00:00
Emil Velikov	54fd78f637	docs: add sha256 checksums for 17.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `9b66351f5b`)	2017-03-20 14:20:32 +00:00
Emil Velikov	887ad468b5	docs: add release notes for 17.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `373d88a711`)	2017-03-20 14:20:31 +00:00
Emil Velikov	9bad99742f	docs: add sha256 checksums for 13.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `879d24c497`)	2017-03-20 14:20:26 +00:00
Emil Velikov	0babb9e091	docs: add release notes for 13.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `fcef88d13a`)	2017-03-20 14:20:25 +00:00
Xu,Randy	57595cb073	anv/genX: Solve the vkCreateGraphicsPipelines crash The crash is due to NULL pColorBlendState, which is legal if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments. Test: Sample subpasses from LunarG can run without crash Signed-off-by: Xu,Randy <randy.xu@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-20 08:31:18 +02:00
Dave Airlie	e70e7cc7ff	radv: fix logic for when to flush on multiple CS emission The current code evaluated to always true, we only want to flush on the first submit. Rename the variable to do_flush, and only emit on the first iteration. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-20 14:17:43 +10:00
Jason Ekstrand	fcca6a83cd	spirv: Implement IsInf using an integer comparison Since we already do fabs on the one source, we're guaranteed to get positive infinity if we get any infinity at all. Since +inf only has one IEEE 754 representation, we can use an integer comparison and avoid all of the ordered/unordered issues. Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-20 14:08:19 +10:00
Dave Airlie	e0208949d1	radv/meta: fix image clears for r4g4 format. This just uses an 8-bit clear and packs the values. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-20 13:41:31 +10:00
Dave Airlie	10c2b588c4	Revert "radv: fallback to an in-memory cache when no pipline cache is provided" This reverts commit `2845a108a9`. This breaks VK-GL-CTS randomly. ./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4* bounces around here from 6/6 to 3/6 or 4/6 to hanging. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-20 13:41:31 +10:00
Timothy Arceri	72fa447d45	mesa: disable glthread when glNewList() is called glNewList() swaps dispatch tables, and we don't have anything in place to handle that in glthread. Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-03-20 10:22:20 +11:00
Dave Airlie	d06e168b87	radv: fix primitive reset index emission This was meant to be checking the index type to get the correct index not the last emitted one. This fixes: dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-20 08:47:03 +10:00
Grazvydas Ignotas	274aaa331c	util/disk_cache: check rename result I haven't seen this causing problems in practice, but for correctness we should also check if rename succeeded to avoid breaking accounting and leaving a .tmp file behind. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-20 08:24:46 +11:00
Grazvydas Ignotas	67911fa4b8	util/disk_cache: delete .tmp if target exists At the time of target file check, .tmp file is already created and file lock is held, so we should remove the .tmp, like in other error paths. With this, piglit no longer leaves large amount of empty .tmp files behind, which waste directory entries and may interfere with eviction. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-20 08:24:38 +11:00
Grazvydas Ignotas	bd93cea691	util/disk_cache: fix stored_keys index It seems there is a bug because: - 20 bytes are compared, but only 1 byte stored_keys step is used - entries can overlap each other by 19 bytes - index_mmap is ~1.3M in size, but only first 64K is used With this fix for Deus Ex: - startup time (from launch to Feral logo): ~38s -> ~16s - disk_cache_has_key() hit rate: ~50% -> ~96% Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-20 08:14:31 +11:00
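The numbers in the stored_keys commit above are consistent with an index of 64K slots of 20 bytes each, addressed with a 1-byte step instead of a 20-byte step. The sketch below shows only that arithmetic: the names are hypothetical, and the 64K slot count is inferred from the message's "~1.3M" and "first 64K" figures, not taken from the source.

```c
#include <stddef.h>
#include <stdint.h>

#define KEY_SIZE   20u          /* bytes compared per key, per the commit message */
#define INDEX_BITS 16u          /* assumed: 64K slots */
#define INDEX_SLOTS (1u << INDEX_BITS)
#define INDEX_MASK  (INDEX_SLOTS - 1u)

/* Total bytes needed for the index: 65536 * 20 = 1310720 (~1.3M). */
size_t index_size_bytes(void)
{
    return (size_t)INDEX_SLOTS * KEY_SIZE;
}

/* Buggy addressing: byte offset equals the slot number, so adjacent
 * slots overlap by 19 bytes and only the first 64K bytes are reachable. */
size_t slot_offset_buggy(uint32_t key_prefix)
{
    return (size_t)(key_prefix & INDEX_MASK);
}

/* Fixed addressing: scale the slot number by the key size. */
size_t slot_offset_fixed(uint32_t key_prefix)
{
    return (size_t)(key_prefix & INDEX_MASK) * KEY_SIZE;
}
```

With the scaled offset, the last slot ends exactly at the end of the mapping and no two slots share bytes.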
Ilia Mirkin	663e7c25f5	nv30: create uploader after pipe->screen is set Fixes crashes after recent upload rework. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-19 01:24:06 -04:00
Ilia Mirkin	0e9232dbcc	nv50,nvc0: enable TEX_LZ and TXF_LZ There should be minimal gain, if any, for nvc0, but nv50 may end up noticing more often that the lod argument is uniform. This, in turn, will remove the need for some unnecessary transformations, which were being hit due to the checks being done pre-ssa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-18 20:37:52 -04:00
Ilia Mirkin	dab88e9af7	st/mesa: set result writemask based on ir type This prevents textureQueryLevels, which maps as LODQ, from ending up with a xyzw writemask, which is illegal. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-18 20:16:45 -04:00
Karol Herbst	09f16de7e6	nvc0/ir: treat FMA like MAD for operand propagation Helps mainly Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3901147 -> 3842505 (-1.50%) total gprs used in shared programs : 471258 -> 467359 (-0.83%) total local used in shared programs : 27405 -> 27361 (-0.16%) total bytes used in shared programs : 35749888 -> 35214176 (-1.50%) local gpr inst bytes helped 17 1829 4091 4091 hurt 4 44 3 3 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-03-18 20:15:45 -04:00
Alan Swanson	a7eb7984bf	util/disk_cache: pass predicate functions file stats directly (v4) Since switching to LRU eviction the only user of these predicate functions now resolves directory entry stats itself so pass them directly saving calling fstat and strlen twice (and the expensive strlen is skipped entirely if access time is newer). v2: Update for empty cache dir detection changes v3: Fix passing string length to predicate with the +1 for NULL termination and also pass sb as pointer v4: Missed ampersand for passing sb as pointer Reviewed-by: Grazvydas Ignotas <notasas@gmail.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-18 14:32:57 +11:00
Timothy Arceri	bf8bc6190e	glsl: use set for copy propagation kills Previously each time we saw a variable we just created a duplicate entry in the list. This is particularly bad for loops where we add everything twice, and then throw nested loops into the mix and the list was growing exponentially. This stops the glsl-vs-unroll-explosion test which has 16 nested loops from reaching the test's mem usage limit in this pass. The test now hits the mem limit in opt_copy_propagation_elements() instead. I suspect this was also part of the reason this pass can be so slow with some shaders. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-03-18 14:21:09 +11:00
Timothy Arceri	9e42b93f33	st/dri: wait for thread to finish before unbinding context Fixes a bunch of piglit crashes that hit an assert() when trying to delete the framebuffer. The assert() was triggered because WinSysDrawBuffer was set to NULL before glDeleteFramebuffers() was called. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-18 14:15:52 +11:00
Timothy Arceri	40bc1afc94	glsl: don't leak memory when trying to count loop iterations Suggested-by: Damian Dixon <damian.dixon@gmail.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789	2017-03-18 14:12:40 +11:00
Jason Ekstrand	1d5f4f46da	genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field This is way more convenient than having two separate dword fields. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 15:31:19 -07:00
Jason Ekstrand	ced61fd53e	anv: Turn on inherited queries It all just works since it's just a hardware register so we might as well turn it on. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Ilia Mirkin	e675f57d4f	anv: Implement pipeline statistics queries In the end, pipeline statistics queries look a lot like occlusion queries only with between 1 and 11 begin/end pairs being generated instead of just the one. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	dda54890f3	anv: Disable VF statistics for blorp and SOL memcpy In order to get accurate statistics, we need to disable statistics for blits, clears, and the surface state memcpy at the top of each secondary command buffer. There are two possible approaches to this: 1) Disable before the blit/memcpy and re-enable afterwards 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it part of pipeline state and then just disable statistics before blits and memcpy operations. Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't really matter which path we take. We choose the second option as it's more consistent with the way the rest of the statistics are enabled and disabled. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	9576cea519	anv/pipeline: Enable clipper statistics Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	2a616242cd	genxml: s/Clipper Statistics Enable/Statistics Enable/ It's in 3DSTATE_CLIP, so it doesn't really need the extra detail. This matches what we do for VS, FS, etc. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	149d10d38a	anv/query: Rework store_query_result The new version is a nice GPU parallel to cpu_write_query_result and it nicely handles things like dealing with 32 vs. 64-bit offsets in the destination buffer. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	c773ae88df	anv/query: Break GPU query calculation into a helper Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	7de73f0c94	genxml: Add pipeline statistics registers on gen7+ Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	0557dfdb4a	anv/query: Add a helper for writing a query pool result Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:50 -07:00
Jason Ekstrand	bce4a935c6	anv/query: Use a variable-length slot size Not all queries are the same. Even the two queries we support today require a different amount of data per slot. Once we introduce pipeline statistics queries, the size will vary wildly. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:49 -07:00
Jason Ekstrand	1c797af2c6	anv/query: Move the available bits to the front We're about to make slots variable-length and always having the available bits at the front makes certain operations substantially easier once we do that. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:12:47 -07:00
Jason Ekstrand	9d43afa3dc	anv/query: Let 32-bit values wrap From the Vulkan 1.0.39 Specification: "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a 32-bit value, the value may either wrap or saturate." So we can either clamp or wrap. Wrapping is both easier and what the user gets if they use vkCmdCopyQueryPoolResults and we should be consistent. We could make vkCmdCopyQueryPoolResults clamp but it's annoying and ends up burning extra batch for something the spec clearly doesn't require. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 12:11:35 -07:00
Alex Deucher	c2a97fb7ae	radeonsi: add new polaris12 pci id Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-03-17 14:13:17 -04:00
Marek Olšák	4b064d16e5	gallium/radeon: formalize that create_batch_query doesn't need pipe_context Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-17 18:30:21 +01:00
Marek Olšák	be6173e7d6	gallium/radeon: formalize that create_query doesn't need pipe_context for threaded gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-17 18:30:21 +01:00
Marek Olšák	04e6977e5d	gallium/radeon: reference pipe_resource in pipe_transfer for threaded gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-17 18:30:21 +01:00
Marek Olšák	03127bb6d5	radeonsi: compile all TGSI compute shaders asynchronously required by threaded gallium Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-17 18:30:21 +01:00
Marek Olšák	e9c6953ddb	radeonsi: require that compiler threads are enabled threaded gallium can't use pipe_context's LLVM target machine, because create_shader_selector can be called from a non-driver thread. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-17 18:30:21 +01:00
Marek Olšák	080f322f06	trace: remove leftover assertions after pipe_resource wrapping removal	2017-03-17 18:30:21 +01:00
Marek Olšák	6c0a28084d	gallium/u_upload: make the first persistent mapping unsynchronized This is simpler for drivers.	2017-03-17 18:30:21 +01:00
Robert Bragg	a27b62e794	anv/device: init timestampPeriod from devinfo Now that there's a timebase_scale in gen_device_info which is effectively the 'period' this switches anv_GetPhysicalDeviceProperties to using this common device info to initialize the timestampPeriod device limit. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-17 16:10:22 +00:00
Robert Bragg	344d1a4015	i965: Allow a per gen timebase scale factor Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock with the convenient property of being able to scale by an integer (80) to nanosecond units. For Skylake the frequency is 12MHz or a scale factor of 83.333333. This updates gen_device_info to track a floating point timebase_scale factor and makes corresponding _queryobj.c changes to no longer assume a scale factor of 80 works across all gens. Although the gen6_ code could have been left alone, the changes keep the code more comparable, and it now shares a few utility functions for scaling raw timestamps and calculating deltas. The utility for calculating deltas takes into account 32 or 36bit overflow depending on the current kernel version. Note: this leaves the timestamp handling of ARB_query_buffer_object untouched, which continues to use an incorrect scale of 80 on Skylake for now. This is more awkward to solve since the scaling is currently done using a very limited uint64 ALU available to the command parser that doesn't support multiply or divide where it's already taking a large number of instructions just to effectively multiply by 80. This fixes piglit arb_timer_query-timestamp-get on Skylake v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-17 15:45:19 +00:00
Jason Ekstrand	28b134c75c	anv/device: Remove a use of a compound literal Older versions of GCC don't like compound literals in static const variable declarations because they don't think it's an actual constant value. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 08:40:30 -07:00
Robert Bragg	76dc49f3fb	i965: bounds checks while concatenating sysfs paths This adds some missing return value checks for all uses of snprintf in brw_performance_query.c. This also switches a use of strncpy + strncat for snprintf for consistency and to avoid the chance of the strncpy leaving an unterminated string in the dest buffer if the src is too long. This issue with strncpy was picked up by Coverity. CID: 1402201 Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-17 13:40:29 +00:00
Emil Velikov	f8b1b9404e	mesa: automake: add all headers to the tarball. Fixes: `d8d81fbc31` ("mesa: Add infrastructure for a worker thread to process GL commands.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-17 13:10:09 +00:00
Emil Velikov	d9a41ce8aa	mapi: automake: add all python scripts to EXTRA_DIST Otherwise it'll be missing in the tarball and make distcheck will fail. Fixes: `05dd4a1104` ("glapi: Generate GL API marshalling code from the XML.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-17 13:10:09 +00:00
Jonathan Gray	9e8d6ba1d6	glapi: avoid using $< in non-suffix make rules Using $< in non-suffix make rules is a GNU extension. Explicitly use the name of the python script to fix the build on OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-17 13:06:26 +00:00
Alex Smith	ce4058dafd	radv/ac: Fix shared memory offset calculation The index passed to get_shared_memory_ptr is an attribute slot index, i.e. the index of a vec4 within LDS. Therefore this must be scaled by sizeof(vec4) to give the LDS byte offset. Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: <mesa-stable@lists.freedesktop.org>	2017-03-17 09:35:48 +01:00
James Legg	e88cac1df0	radv: Fix using more than 4 bound descriptor sets Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when using more than 4 descriptor sets. radv claims support for 8. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-17 09:12:43 +01:00
Tapani Pälli	70d25cae8b	util/build-id: check dlpi_name before strstr call According to dl_iterate_phdr man page first object visited is the main program where dlpi_name is an empty string. This fixes segfault on Android when using build-id as identifier. Fixes: `d4fa083e11` ("util: Add utility build-id code.") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-17 07:34:26 +02:00
Tapani Pälli	4d4558411d	android: fix segfault within swap_buffers Function droid_swap_buffers may get called without dri2_surf->buffer set, in these cases we don't have a back buffer set either. Patch fixes segfault seen with 3DMark that uses android.opengl.GLSurfaceView for rendering its UI. backtrace: #00 pc 00013f88 /system/lib/egl/libGLES_mesa.so (droid_swap_buffers+104) #01 pc 000117b2 /system/lib/egl/libGLES_mesa.so (dri2_swap_buffers+50) #02 pc 000058b2 /system/lib/egl/libGLES_mesa.so (eglSwapBuffers+386) #03 pc 00011329 /system/lib/libEGL.so (eglSwapBuffersWithDamageKHR+553) #04 pc 000118e7 /system/lib/libEGL.so (eglSwapBuffers+55) #05 pc 000754dc /system/lib/libandroid_runtime.so v2: do like other backends, call get_back_bo (Emil Velikov) Fixes: `2acc69d` ("EGL/Android: Add EGL_EXT_buffer_age extension") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-17 07:30:34 +02:00
Timothy Arceri	72ab7bb765	radv: make sure gs copy shader is retrieved from the cache with the variant Apps can limit the size of the cache via VkAllocationCallbacks so we can't be sure that both are always in the cache. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-17 16:17:10 +11:00
Timothy Arceri	2845a108a9	radv: fallback to an in-memory cache when no pipeline cache is provided Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-17 16:17:10 +11:00
Timothy Arceri	315e8a9321	radv: always create a fallback pipeline cache This will be used as an in-memory cache when a pipeline cache is not provided by the app. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-17 16:17:10 +11:00
Timothy Arceri	4ffdab78b9	radv: move cache check inside insert and search functions This will allow us to use fallback in-memory and on-disk caches should the app not provide a pipeline cache. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-17 16:17:10 +11:00
Timothy Arceri	124ec417f9	st/mesa: call glthread_destroy() before _vbo_DestroyContext() Otherwise we have a race condition between vbo calls in the glthread and the _vbo_DestroyContext() call. This fixes a bunch of piglit crashes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-17 09:47:02 +11:00
Jason Ekstrand	08df015b9d	anv/GetQueryPoolResults: Actually implement the spec The Vulkan spec is fairly clear about when we should and should not write query pool results. We're also supposed to return VK_NOT_READY if VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries which are not yet finished. This fixes rendering corruptions on The Talos Principle where geometry flickers in and out due to bogus query results being returned by the driver. These issues are most noticeable on Sky Lake GT4 when running on "ultra" settings. Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182 Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-16 15:08:18 -07:00
Jason Ekstrand	81840130c0	anv/query: Invalidate the correct range Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-16 15:08:17 -07:00
Jason Ekstrand	4bbb4b95b8	anv/query: Fix the location of timestamp availability Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>	2017-03-16 15:08:17 -07:00
Jason Ekstrand	9e60f59e62	genxml: Add XML version tags There's not much point to having them or not having them but this reduces some pointless diff from the version we can auto-generate Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-16 15:08:17 -07:00
Kenneth Graunke	f51a320b12	aubinator: Use fprintf for output. This will make it easier to choose an output file. For now, it remains stdout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-16 10:48:44 -07:00
Kenneth Graunke	65a9d5eabb	aubinator: Reuse decode_structure code for handling commands The code for decoding structures and commands was almost identical. The only differences are: we print dword headers for commands, and we skip the first one (with the command opcode and lengths). So, generalize decode_structure to add a starting DWord, and a flag for printing the DWord headers, and reuse it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-16 10:48:41 -07:00
Kenneth Graunke	f0aa8fd4e4	aubinator: Delete redundant NULL check. handle_struct_decode() is just a wrapper around decode_structure() with a NULL check. But the only caller already does that NULL check. So, just use decode_structure() directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-16 10:48:37 -07:00
Kenneth Graunke	65138ce019	aubinator: Fix indentation. Three space, not four. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-16 10:48:32 -07:00
Topi Pohjolainen	bd25d9670b	i965/gen8+: Do full stall when switching pipeline just as earlier gens do. CC: "17.0 13.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 18:44:15 +02:00
Jonathan Gray	46707bc27b	i965: remove unneeded asm/unistd.h include Fix the build on OpenBSD by removing an unneeded include for asm/unistd.h. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-16 13:56:40 +00:00
Emil Velikov	e6bef50f4c	i965: automake: remove spurious white space Unintentionally introduced by yours truly with the i965 compiler move. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-16 13:55:42 +00:00
Jonathan Gray	d2bb0c8590	i965: avoid using a GNU make pattern rule % pattern rules are a GNU extension. As there is only one file here avoid patterns and globbing entirely to fix the build on non-GNU make. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> v2 [Emil Velikov: brw_oa.py dependency] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-16 13:55:23 +00:00
Emil Velikov	ccb89e72aa	docs/releasing: document how to squash/announce queued patches In the odd case where a patch needs to be fixed, squash the appropriate fix and document how. Add a note in the pre-release notes, such that devs can quickly spot it. v2: Grammar/typo fixes (Eric). Use upstream commit [SHA] as reference. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-16 13:22:40 +00:00
Emil Velikov	0f988add50	docs/releasing: release.sh is located in xorg/util-modular Correct the silly typo s/macros/modular/ and add a reference to the repository. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-16 13:18:13 +00:00
Emil Velikov	79562033b5	docs/releasing: remove "git clean" step release.sh from master, does not require the tree to be clean. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-16 13:18:11 +00:00
Emil Velikov	c81c563fbb	mapi: remove Xlib/xcb include in gl_marshal.py The only use of the header is to provide the _X_INLINE macro. We already require (and provide where needed) 'inline', plus it's used in the file already. So replace the macro and drop the include. This fixes the build on platforms which lack the header - from X-less Linuxes to Androids. Fixes: `05dd4a1104` ("glapi: Generate GL API marshalling code from the XML.") Reported-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100223 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-16 13:12:26 +00:00
Eric Engestrom	8a82f551cd	docs/specs: update Khronos registries URLs The registries were migrated to git and are now hosted on GitHub. The old svn is now read-only, and will not be updated anymore. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2017-03-16 11:50:40 +00:00
Iago Toral Quiroga	ca34a3125f	anv: improve error reporting when creating pipelines Specifically, report 'out of memory' errors that might have happened while emitting the pipeline's batch. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	1d7468311d	anv: handle errors in emit_binding_table() and emit_samplers() These can fail to allocate device memory, however, the driver can recover from this error by allocating a new binding table block and trying again. v2: - Instead of tracking the errors in these functions and making callers reset the batch's status before attempting to allocate a new block for the binding table, simply make callers responsible for setting the error status if they fail to allocate memory during the second attempt (Jason). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	dd8348c8be	anv: handle errors while allocating new binding table blocks Also, we had a couple of instances in flush_descriptor_sets() where we were returning a VkResult directly upon error, but the return value of this function is not a VkResult but a uint32_t dirty mask, so simply return 0 in these cases which reduces the amount of work the driver will do after the error has been raised. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	be52f9693a	anv/blorp: make anv_cmd_buffer_alloc_blorp_binding_table() return a VkResult Instead of asserting inside the function, and then use that information to return early from its callers upon failure. v2: - Make sure that clear_color_attachment() and clear_depth_stencil_attachment() get the VkResult as well so they avoid executing the batch if an error happened. (Topi) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	a578b06d7b	anv/device: assert that commands submitted to a queue are not bogus Any errors that may have happened during the command buffer recording are reported by vkEndCommandBuffer() and it is the application's responsibility to not submit broken commands to a queue. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	a752c4ecda	anv/cmd_buffer: skip vkCmdExecuteCommands() on broken command buffers v2: Assert on secondary commands, applications should've called vkEndCommandBuffer() and received an error for them before (Jason) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	801493051e	anv/cmd_buffer: skip vkCmdDispatch() on broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	18ec3fa2a9	anv/cmd_buffer: skip vkCmdDraw*() on broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	fb9d563fb9	anv: handle memory allocation errors during queue submissions Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	c04dbd6b3e	anv/cmd_buffer: handle out of memory during vkCmdPushConstants Fixes: dEQP-VK.api.out_of_host_memory.cmd_push_constants Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	94a4f0c255	anv/cmd_buffer: handle allocation errors during vkCmdBeginRenderPass() Fixes: dEQP-VK.api.out_of_host_memory.cmd_begin_render_pass Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	d823f381a5	anv/cmd_buffer: skip vkCmdEndRenderPass() for broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	6743456699	anv/cmd_buffer: skip vkCmdNextSubpass() for broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	8174e63869	anv/cmd_buffer: report tracked errors in vkEndCommandBuffer() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	68d88f0237	anv: handle failures when growing reloc lists Growing the reloc list happens through calling anv_reloc_list_add() or anv_reloc_list_append(). Make sure that we call these through helpers that check the result and set the batch error status if needed. v2: - Handling the crashes is not good enough, we need to keep track of the error, for that, keep track of the errors in the batch instead (Jason). - Make reloc list growth go through helpers so we can have a central place where we can do error tracking (Jason). v3: - Callers that need the offset returned by anv_reloc_list_add() can compute it themselves since it is extracted from the inputs to the function, so change the function to return a VkResult, make anv_batch_emit_reloc() also return a VkResult and let their callers do the error management (Topi) v4: - Let anv_batch_emit_reloc() return a uint64_t as it originally did, there is no real benefit in having it return a VkResult. - Do not add an is_aux parameter to add_surface_state_reloc(), instead do error checking for aux in add_image_view_relocs() separately. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	d4bdd871dc	anv: avoid crashes when failing to allocate batches Most of the time we use macros that handle this situation transparently, but there are some cases where we need to handle this explicitly. This patch makes sure we don't crash, notice that error handling takes place in the function that actually failed the allocation, anv_batch_emit_dwords(), which will set the status field of the batch so it can be used at a later moment to report the error to the user. v2: - Not crashing is not good enough, we need to keep track of the error (Topi, Jason). Iago: now that we track errors in the batch, this is being handled. - Added guards in a few more places that needed it (Iago) v3: - Check result of anv_batch_emitn() for NULL before calling memset() in emit_vertex_input() (Topi) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	31f5049ff1	anv: handle allocation failure in anv_batch_emit_dwords() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	9e69409fcf	anv: handle allocation failure in anv_batch_emit_batch() v2: - Call the error handler (Topi) Fixes: dEQP-VK.api.out_of_host_memory.cmd_execute_commands Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	a8ce8e3542	anv: add anv_batch_set_error() and anv_batch_has_error() helpers The anv_batch_set_error() helper will track the first error that happened while recording a command buffer. The helper returns the currently tracked error to help the job of internal functions that may generate errors that need to be tracked and return a VkResult to the caller. We will use the anv_batch_has_error() helper to guard parts of the driver that are not safe to execute if an error has been generated while recording a particular command buffer. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	d0195bd067	anv/cmd_buffer: add a status field to anv_batch The vkCmd*() functions do not report errors, instead, any errors should be reported by the time we call vkEndCommandBuffer(). This means that we need to make the driver robust against inconsistent and/or incomplete command buffer states through the command recording process, particularly, avoid crashes due to access to memory that we failed to allocate previously. The strategy used to do this is to track the first error that occurred while recording a command buffer in the batch associated with it. We use the batch to track this information because the command buffer may not be visible to all parts of the driver that can produce errors we need to be aware of (such as allocation failures during batch emissions). Later patches will use this error information to guard parts of the driver that may not be safe to execute. v2: Move the field from the command buffer to the batch so we can track errors from batch emissions (Jason) v3: Registering errors in the command buffer's batch during anv_create_cmd_buffer() is unnecessary, since the command buffer is freed at the end of the function in that case (Topi) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	6dd06f54eb	anv/cmd_buffer: report errors in vkBeginCommandBuffer() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	88b539c4a0	anv: do not try to ref/unref NULL shaders This situation can happen if we failed to allocate memory for the shader. v2: - We shouldn't see NULL shaders in anv_shader_bin_ref so we should not check for that (Jason). Make sure that callers don't attempt to call this function with a NULL shader and assert that this never happens (Iago). v3: - All callers to anv_shader_bin_unref seem to check for NULL before calling, so just assert that it is not NULL (Topi) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	bad3a2e911	anv/blorp: return early if we failed to create the shader binary Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	e2f707ce5b	intel/blorp: make upload_shader() return a bool indicating success or failure For now we always return true, follow-up patches will handle fail scenarios. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	808503b8f8	anv: remove unnecessary function prototype. The function is defined right after the prototype declaration. Also, the prototype for it is included in anv_genX.h which is included via anv_private.h. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Timothy Arceri	04a9ca2700	mapi: don't include X11/Xlib-xcb.h on non PTHREAD platforms Should fix the last of the glthread build issues on windows.	2017-03-16 15:45:40 +11:00
Timothy Arceri	4a32d473fd	mesa: fix glthread marshal build issues on platforms without PTHREAD	2017-03-16 15:33:08 +11:00
Timothy Arceri	643b0fd7e9	mesa: fix glthread build issues on platforms without PTHREAD	2017-03-16 14:48:09 +11:00
Marek Olšák	c83562ccaa	gallium: implement the backend of threaded GL dispatch Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Gregory Hainaut	93bdad3253	mesa/glthread: restore the dispatch table when incompatible gl calls are detected While a context only has a single glthread, the context itself can be attached to several threads. Therefore the dispatch table must be updated in all threads before the destruction of glthread. In other words, glthread can only be destroyed safely when the context is deleted. Fixes remaining crashes in the glx-multithread-makecurrent* tests. V2: (Timothy Arceri) updated gl_API.dtd marshal_fail description. Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Gregory Hainaut	70e715eea6	mesa/glthread: don't set a dispatch table if we aren't the owner Fix crashes when glxMakeCurrent is called. Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	012bfebc07	mesa: Track the current vertex/element array buffers for glthread. We want to support glthread on GLES contexts with reasonable apps, and on desktop for apps that use VBOs but haven't completely moved to core GL. To do so, we have to deal with the "the user may or may not pass user pointers to draw calls" problem. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	238d027ed6	mesa: Disable glthread when glBegin() is called. glBegin() swaps dispatch tables, and we don't have any code in place for handling that in glthread (which also messes with dispatch tables), and I don't particularly care to at this point. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	cd1c003b18	mesa: Add an attribute for conditions to turn off threading. The threading for GL core is in place, but there are so few applications actually using a core GL context that it would be nice to extend support back. However, some of the features of compat GL (particularly user vertex arrays) would be so expensive to track state for that we want to be able to disable threading when we discover that the app is using them. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	43d4f7a227	mesa: Add support for asynchronous glDraw* on GL core. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	b18755a457	mesa: Add support for NULL arguments like in glBufferData() in marshalling. This will let us support things like glBufferData() that should be asynchronous. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	47f819d3cb	mesa: Statically allocate glthread command buffer in the batch struct. This avoids an extra pointer dereference in the marshalling functions, which, with the instruction count being in the low 30s, could actually matter for main-thread performance. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Eric Anholt	1d6b71c5c6	glapi: Mark vertex attrib pointer functions as async. These don't actually read data out of the pointers, they set the pointers (or offsets in a VBO) to be used in a later draw call. v2: Don't forget glVertexAttribIPointer, and don't bother with annotations on aliases. v3: Mark CompressedTexSubImage1D as sync also. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:19 +11:00
Paul Berry	a4a5de6f18	mesa: Custom thread marshalling for Flush. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	154a4f2679	mesa: Custom thread marshalling for ShaderSource. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Eric Anholt	efd63e234a	mesa: Connect the generated GL command marshalling code to the build. v2: Rebase on the Begin/End changes, and just disable this feature on non-GL-core. v3: (Timothy Arceri) enable for non-GL-core contexts. Remove unrelated safe_mul() hunk. while loop style fix. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Marek Olšák	db06e91de2	Revert "mesa: make _mesa_alloc_dispatch_table() static" This reverts commit `4009d22b61`. glthread needs it. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	ef30ce97a6	mesa: Create pointers for multithread marshalling dispatch table. This patch splits the context's CurrentDispatch pointer into two pointers, CurrentClientDispatch, and CurrentServerDispatch, so that when doing multithread marshalling, we can distinguish between the dispatch table that's being used by the client (to serialize GL calls into the marshal buffer) and the dispatch table that's being used by the server (to execute the GL calls). Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Eric Anholt	d8d81fbc31	mesa: Add infrastructure for a worker thread to process GL commands. v2: Keep an allocated buffer around instead of checking for one at the start of every GL command. Inline the now-small space allocation function. v3: Remove duplicate !glthread->shutdown check, process remaining work before shutdown. v4: Fix leaks on destroy. V5: (Timothy Arceri) fix order of source files in makefile Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Eric Anholt	a76a3cf664	mesa: Validate count parameters when marshalling. Otherwise, for example, glDeleteBuffers(-1, &bo) gets you a segfault instead of GL_INVALID_VALUE. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	05dd4a1104	glapi: Generate GL API marshalling code from the XML. This is not yet used in the build, just generated. v2: Add missing build dependencies. v3: Avoid mixing declarations and code, remove logic for avoiding emitting code that the compiler's optimizer can deal with anyway. v4: (Timothy Arceri) move safe_mul() generation here from a later patch. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Eric Anholt	f05524ffaa	glapi: Mark compressed teximage functions as sync. Without doing some additional tracking, we won't know whether the data will be immediate user data, or will be loaded from a PBO. The normal teximage functions will be sync by default because they don't know up front what the size of their image data is. But for compressed teximage, we have the count information, so they would end up async by default. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	f5052f45a2	glapi: Annotate functions with "marshal" attribute. Several API functions require special treatment in order to be marshalled to a background thread. Others can't be safely executed in a background thread and need to be executed synchronously (e.g. since they return data through a pointer argument). This annotation will be used when code generating thread marshalling code, to ensure that each function is marshalled in the correct way. Note that PixelMap functions are marked as synchronous for now since their pointer may be relative to buffer on the GPU, so we'll need special logic to marshal them properly. v2: Move description of attribute types to a comment in the dtd file. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Eric Anholt	3b7b6adf3a	egl: Implement __DRI_BACKGROUND_CALLABLE v2: (Timothy Arceri) use C99 initializers. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	6b70d9fce3	glx: Implement __DRI_BACKGROUND_CALLABLE v2: Marek: Add DRI3 support. v3: (Timothy Arceri) use C99 initializers. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	77630841da	mesa: Add SetBackgroundContext to dd_function_table Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	5bc527d39d	dri: Update dri_util to keep track of __DRI_BACKGROUND_CALLABLE Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Paul Berry	e043b2a1a0	dri_interface: Add new marshalling interfaces to dri_interface.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Marek Olšák <maraeo@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-16 14:14:18 +11:00
Roland Scheidegger	e1f9e9bafd	gallivm: (trivial) remove duplicated line pointed out by clang (stored value never read)	2017-03-16 04:03:29 +01:00
Roland Scheidegger	9d104dfd55	draw: (trivial) remove an unnecessary lp_build_alloca() pointed out by clang (stored value never read)	2017-03-16 04:03:29 +01:00
Ilia Mirkin	e893b3a367	swr: support layer output in geometry shaders This makes bin/gl-3.2-layered-rendering-gl-layer-render fail only with 2DMS_ARRAY, which is expected given the lackluster MSAA support. However all the regular types pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-15 21:03:11 -04:00
Bas Nieuwenhuizen	ad4dee521d	Revert "radv: Emit cache flushes before CP DMA." This reverts commit `cce43f6d8c`. Redundant, as the flush already happens at si_cp_dma_prepare. Acked-by: Dave Airlie <airlied@redhat.com>	2017-03-16 00:55:03 +01:00
Francisco Jerez	e6469ec43b	gallium/tgsi: Treat UCMP sources as floats to match the GLSL-to-TGSI pass expectations. Currently the GLSL-to-TGSI translation pass assumes it can use floating point source modifiers on the UCMP instruction. See the bug report linked below for an example where an unrelated change in the GLSL built-in lowering code for atan2 (`e9ffd12827`) caused the generation of floating-point ir_unop_neg instructions followed by ir_triop_csel, which is translated into UCMP with a negate modifier on back-ends with native integer support. Allowing floating-point source modifiers on an integer instruction seems like rather dubious design for a transport IR, since the same semantics could be represented as a sequence of MOV+UCMP instructions instead, but supposedly this matches the expectations of TGSI back-ends other than tgsi_exec, and the expectations of the DX10 API. I take no responsibility for future headaches caused by this inconsistency. Fixes a regression of piglit glsl-fs-tan-1 on softpipe introduced by the above-mentioned glsl front-end commit. Even though the commit that triggered the regression doesn't seem to have made it to any stable branches yet, this might be worth back-porting since I don't see any reason why the bug couldn't have been reproduced before that point. Suggested-by: Roland Scheidegger <sroland@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99817 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-03-15 15:47:14 -07:00
Grazvydas Ignotas	eb5a61f77a	util/disk_cache: do eviction before creating .tmp cache_put() first creates a .tmp file and then tries to do eviction. The recently added LRU eviction code selects non-empty directory with the oldest access time, but that may easily be the one with just the new .tmp file, especially on Linux where atime is updated lazily (with "relatime" mount option, which is the default). So when cache is small, if random doesn't hit another dir LRU keeps selecting the same dir with just the .tmp and not deleting anything. To fix this (and the tests), do eviction earlier. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-16 09:36:18 +11:00
Tim Rowley	a7ce0490e4	swr: validate backend state numAttributes General protection and prevents us from smashing the stack on the first clear state validation (`a7b8d50bcb`). Fixes crash using icc. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-15 15:08:59 -05:00
Ben Widawsky	8378c576ab	gbm: Export a get modifiers This patch originally had i965 specific code and was named: commit 61cd3c52b868cf8cb90b06e53a382a921eb42754 Author: Ben Widawsky <ben@bwidawsk.net> Date: Thu Oct 20 18:21:24 2016 -0700 gbm: Get modifiers from DRI To accomplish this, two new query tokens are added to the extension: __DRI_IMAGE_ATTRIB_MODIFIER_UPPER __DRI_IMAGE_ATTRIB_MODIFIER_LOWER The query extension only supported 32b queries, and modifiers are 64b, so we needed two of them. NOTE: The extension version is still set to 13, so none of this will actually be called. v2: Error handling of queryImage (Emil) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-15 10:36:05 -07:00
Ben Widawsky	5c6e0d1c7d	i965: introduce modifier selection. Nothing special here other than a brief introduction to modifier selection. Originally this was part of another patch but was split out from gbm: Introduce modifiers into surface/bo creation by request of Emil. Requested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-15 10:36:05 -07:00
Ben Widawsky	191ff914a2	egl/drm: Use modifiers for backbuffer creation Split into a separate patch from the previous patch as requested by Emil. Requested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-15 10:36:05 -07:00
Ben Widawsky	63bd2ae745	gbm: Introduce modifiers into surface/bo creation The idea behind modifiers like this is that the user of GBM will have some mechanism to query what properties the hardware supports for its BO or surface. This information is directly passed in (and stored) so that the DRI implementation can create an image with the appropriate attributes. A getter() will be added later so that the user of GBM will be able to query what modifier should be used. Only in surface creation, the modifiers are stored until the BO is actually allocated. In regular buffer allocation, the correct modifier can (and, in future patches, will) be chosen at creation time. v2: Make sure to check if count is non-zero in addition to testing if calloc fails. (Daniel) v3: Remove "usage" and "flags" from modifier creation. Requested by Kristian. v4: Take advantage of the "INVALID" modifier added by the GET_PLANE2 series. v5: Don't bother with storing modifiers for gbm_bo_create because that's a synchronous operation and we can actually select the correct modifier at create time (done in a later patch) (Jason) v6: Make modifier condition outside the check so that dri_use will work properly (Jason) Cc: Kristian Høgsberg <krh@bitplanet.net> References (v4): https://lists.freedesktop.org/archives/intel-gfx/2017-January/116636.html Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-15 10:36:05 -07:00
Ben Widawsky	5e7d8d3961	i965: Implement basic modifier image creation This is just a stub for now and will be filled in later. This was split out of an earlier patch Requested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-15 10:36:05 -07:00
Ben Widawsky	d075cce258	dri: Add an image creation with modifiers Modifiers will be obtained or guessed by the client and passed in during image creation/import. In guessing, a client might decide to simply pass along all known modifiers. This requires bumping the DRIimage version. As of this patch, the modifiers aren't plumbed all the way down, this patch simply makes sure the interface level stuff is correct. v2: Don't allow usage + modifiers v3: Make NAND actually NAND. Bug introduced in v2. (Jason) v4: - s/obtains/obtained (Jason) - Pull out i965 implementation into a later patch (Emil) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-15 10:36:04 -07:00
Marek Olšák	0550f3d631	radeonsi: implement TGSI opcodes TEX_LZ and TXF_LZ This massively decreases VGPR spilling for DiRT Showdown, because we no longer have to use v4i32 for 2D fetches when level == 0. We now use v2i32 for those cases. DiRT Showdown - Spilled VGPRs: -26 (-81%) This surprisingly doesn't have any useful effect on performance (+ 0.05%).	2017-03-15 18:17:41 +01:00
Marek Olšák	a7cc9b0fcf	glsl_to_tgsi: use TEX_LZ and TXF_LZ when available	2017-03-15 18:17:41 +01:00
Marek Olšák	46cbb00f53	glsl_to_tgsi: remove a redundant statement it's the same as the last "else".	2017-03-15 18:17:41 +01:00
Marek Olšák	cca0389c72	gallium: add TGSI opcodes TEX_LZ and TXF_LZ for better code generation in radeonsi	2017-03-15 18:17:41 +01:00
Marek Olšák	bf3cdf0fd3	gallium: add PIPE_CAP_TGSI_TEX_TXF_LZ	2017-03-15 18:17:41 +01:00
Samuel Pitoiset	7751ed39e4	radeonsi: disable sinking common instructions down to the end block Initially this was a workaround for a bug introduced in LLVM 4.0 in the SimplifyCFG pass that caused image intrinsics to disappear (because they were badly sunk). Finally, this is a win because it decreases SGPR spilling and increases the number of waves a bit. Although shader-db results are good, I think we might want to remove it in the future once the issue is fixed. For now, enable it for LLVM >= 4.0. This also fixes a rendering issue with the speedometer in Dirt Rally. More information can be found here https://reviews.llvm.org/D26348. Thanks to Dave Airlie for the patch. v2: - add a FIXME comment - use if (HAVE_LLVM >= 0x0400) instead Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-15 14:24:40 +01:00
Samuel Pitoiset	74265fd03c	tgsi: add missing compute shader entry in tgsi_get_processor_name() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-15 14:16:29 +01:00
Samuel Pitoiset	38ee3246d2	radeonsi: clean up tex_fetch_ptrs() Will also help when the src sampler register will be TGSI_FILE_CONSTANT for bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-15 14:16:26 +01:00
Emil Velikov	8a5680f248	configure.ac: bump pthread-stubs requirement On platforms that require it, we bump the requirement to 0.4 or later. Due to an issue with the project [design] any version earlier than it, is bound to cause issues. For the specifics see the pthread-stubs README Cc: Uli Schlachter <psychon@znc.in> Cc: Jonathan Gray <jsg@jsg.id.au> Cc: Jean-Sébastien Pédron <dumbbell@FreeBSD.org> Cc: François Tigeot <ftigeot@wolfpond.org> Cc: Tobias Nygren <tnn@NetBSD.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-03-15 11:49:27 +00:00
Emil Velikov	eec0cd71cd	glx: don't expose systemTimeExtension for DRI2/DRI3/DRISW Used/applicable to only dri1 drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-03-15 11:48:50 +00:00
Emil Velikov	b1fb6e8d8c	anv: do not open random render node(s) drmGetDevices2() provides us with enough flexibility to build heuristics upon. Opening a random node on the other hand will wake up the device, regardless if it's the one we're interested or not. v2: Rebase, explicitly require/check for libdrm v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia) v4: Rebase Cc: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-15 11:38:05 +00:00
Emil Velikov	743315f269	radv: do not open random render node(s) drmGetDevices2() provides us with enough flexibility to build heuristics upon. Opening a random node on the other hand will wake up the device, regardless if it's the one we're interested or not. v2: Rebase. v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia) Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-15 11:38:02 +00:00
Emil Velikov	8ff2937dfa	radv/winsys: use drmGetDevice2 API Analogous to previous commit v2: Add explicit require_libdrm check. Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-15 11:38:00 +00:00
Emil Velikov	858170e8a4	winsys/amdgpu: use drmGetDevice2 API Analogous to previous commit Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98502 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-15 11:37:58 +00:00
Emil Velikov	a50c4eb2a0	loader: use drmGetDevice[s]2 API This allows us to fetch the device list/info w/o the revision field. At the moment retrieving the latter wakes up the device. Note: kernel patch to resolve that should be in 4.10. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-15 11:37:55 +00:00
Emil Velikov	2c72e78ff5	autoconf/scons: bump libdrm to 2.4.75 We'll be using the drmGetDevice[s]2 API in src/loader with next patch. v2: Rebase. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-03-15 11:37:39 +00:00
Emil Velikov	0fd61fb639	util/sha1: drop _mesa_sha1_{update, format} return type Unused/unchecked by any of the callers. v2: Fix the glsl cases that have crept in since v1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:18:45 +00:00
Emil Velikov	a9a4028fd7	util/sha1: rework _mesa_sha1_{init,final} Rather than having an extra memory allocation [that we currently do not check and act accordingly] just make the API take a pointer to a stack allocated instance. This and follow-up steps will effectively make the _mesa_sha1_foo simple define/inlines around their SHA1 counterparts. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:18:43 +00:00
Emil Velikov	c96127e873	util/sha1: add non-typedef name for the SHA1_CTX struct Using typedef(s) is not always the answer and makes it harder for people to do clever (or one might call nasty) things with the code. Add a struct name which we will use with follow-up commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:53 +00:00
Bas Nieuwenhuizen	ef43eeb09f	radv: Remove unused descriptor set field. Trivial. Signed-off-by: Bas Nieuwenhuizen <basni@google.com>	2017-03-15 09:06:52 +01:00
Dave Airlie	686d060458	r600: refactor binding code for attach buffer to CB. This refactors out the code and fixes it up to be used for images later. It uses the code in the current RAT binding for compute. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 14:33:26 +10:00
Dave Airlie	222e42e45f	r600: refactor out CB setup. This moves the code to create CB info out into a separate function so it can be reused in images code to create RATs. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 14:33:23 +10:00
Dave Airlie	0cf717821e	r600: refactor texture resource words setup code. This refactors out the code to setup a texture resource so we can reuse it later from the images code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 14:33:06 +10:00
Dave Airlie	95a976b651	r600: factor out the code to initialise a buffer resource. This takes the code required to initialise a buffer resource out of the texture buffer code, into its own function. This is going to be used for the image support later. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 14:32:48 +10:00
Dave Airlie	cf2af021b9	r600g: make framebuffer atom rely on dual src blend state. In order to make ARB_shader_image_load_store, we have to share the CB space with RATs, so we should only steal the dual src space if we have dual src enabled. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 14:32:44 +10:00
Jason Ekstrand	d142c7436c	intel/debug: Add a common INTEL_DEBUG=nohiz option The GL driver had a driconf option (which doesn't make much sense) and the Vulkan driver had a hand-rolled environment variable. Instead, let's tie both into the INTEL_DEBUG mechanism and unify things. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-14 21:00:09 -07:00
Jason Ekstrand	c09bb956ca	anv/image: Move handling of INTEL_VK_HIZ This makes it so that you don't get an "Implement gen7 HiZ" perf warning when you manually disable HiZ on gen8. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-14 21:00:09 -07:00
Timothy Arceri	304b35b0e9	radv: trivial tidy ups Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-15 11:45:04 +11:00
Alan Swanson	b7e03d87e4	util/disk_cache: scale cache according to filesystem size Select higher of current 1G default or 10% of filesystem where cache is located. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:11 +11:00
Alan Swanson	f1e9671442	util/disk_cache: actually enforce cache size Currently only a one in one out eviction so if at max_size and cache files were to constantly increase in size then so would the cache. Restrict to limit of 8 evictions per new cache entry. V2: (Timothy Arceri) fix make check tests Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:11 +11:00
Alan Swanson	af09b86732	util/disk_cache: use LRU eviction rather than random eviction Still using fast random selection of two-character subdirectory in which to check cache files rather than scanning entire cache. v2: Factor out double strlen call v3: C99 declaration of variables where used Reviewed-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-15 11:15:11 +11:00
Timothy Arceri	c2793e2c89	util/disk_cache: don't fallback to an empty cache dir on evict If we fail to randomly select a two letter cache dir, don't select an empty dir on fallback. In real world use we should never hit the fallback path but it can be hit by tests when the cache is set to a very small max value. Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:11 +11:00
Timothy Arceri	50989f87e6	util/disk_cache: use a thread queue to write to shader cache This should help reduce any overhead added by the shader cache when programs are not found in the cache. To avoid creating any special function just for the sake of the tests we add a one second delay whenever we call disk_cache_put() to give it time to finish. V2: poll for file when waiting for thread in test V3: fix poll delay to really be 100ms, and simplify the wait function Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:11 +11:00
Timothy Arceri	fc5ec64ba3	util/disk_cache: add helpers for creating/destroying disk cache put jobs V2: Make a copy of the data so we don't have to worry about it being freed before we are done compressing/writing. Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:11 +11:00
Timothy Arceri	e2c4435b07	util/disk_cache: add thread queue to disk cache Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-15 11:15:10 +11:00
Dave Airlie	7372e3cf5f	radv/ac: workaround regression in llvm 4.0 release LLVM 4.0 released with a pretty messy regression, that will hopefully get fixed in the future. This workaround was proposed by Tom, and it fixes the CTS regressions here at least, I'm not sure if this will cause any major side effects, but correctness over speed and all that. radeonsi should possibly consider the same workaround until an llvm fix can be found. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 09:51:53 +10:00
Dave Airlie	3ece76f03d	radv/ac: gather4 cube workaround integer This fix is extracted from amdgpu-pro shader traces. It appears the gather4 workaround for integer types doesn't work for cubes, so instead it forces a float scaled sample, then converts to integer. It modifies the descriptor before calling the gather. This also produces some ugly asm code for reasons specified in the patch, llvm could probably do better than dumping sgprs to vgprs. This fixes: dEQP-VK.glsl.texture_gather.basic.cube.rgba8* Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-15 09:51:53 +10:00
Bas Nieuwenhuizen	407fa77669	radv: Set driver version to mesa version; I couldn't really find an encoding in the spec. I'm not sure it prescribes VK_MAKE_VERSION format, but vulkan.gpuinfo.org interprets it that way by default. vulkaninfo gives the raw number, so we could alternatively do something like 17001000, but that doesn't show up right on vulkan.gpuinfo.org again. Looking at that site, the -pro driver also uses VK_MAKE_VERSION, so keeping consistency is probably best. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-03-15 00:37:56 +01:00
Bas Nieuwenhuizen	ed28ae71f5	radv: Increase api version to 1.0.42. I've skimmed the changes from 1.0.5 to 1.0.42 and I think we have all changes. We're still not conformant of course, but this should not regress stuff. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-03-15 00:37:56 +01:00
Jason Ekstrand	2e98db68e4	util/vk: Add helpers for finding an extension struct Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-15 08:22:02 +10:00
Alex Smith	e0cc32b85b	radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer Need to flush before updating the buffer to ensure that the copy is ordered after previous accesses (assuming the app has performed the appropriate barriers). This fixes potential issues due to draws prior to an update reading the new buffer content, despite having the necessary barriers between them. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-14 22:17:03 +01:00
Bas Nieuwenhuizen	cce43f6d8c	radv: Emit cache flushes before CP DMA. The flushes could be due to TRANSFER barriers. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-14 22:16:34 +01:00
Jan Beich	fe56c745b8	Convert sed(1) syntax to be compatible with FreeBSD and OpenBSD BSD regex library doesn't support extended RE escapes (e.g. \+) and shorthand character classes (e.g. \s, \S) and SVR4-style word delimiters[1] (on DragonFly and NetBSD). Both GNU and BSD sed support -E and -r to enable extended RE but OS X still lacks -r. [1] https://www.illumos.org/issues/516 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> (GNU sed)	2017-03-14 17:07:04 +00:00
Jason Ekstrand	aed2714145	anv: Properly enumerate physical devices when none are present	2017-03-14 09:08:07 -07:00
Jason Ekstrand	9d559ba39d	nir/constant_expressions: Refactor helper functions Apart from avoiding some unneeded size cases, this shouldn't have any actual functional impact. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	762a6333f2	nir: Rework conversion opcodes The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent: - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>. - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting. - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>. Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	7107b32155	i965/fs: Re-arrange conversion operations Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	bab4610e9c	i965/vec4: Get rid of the type parameter from to/from_double Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	702d1af8ba	glsl/nir: Use nir_type_conversion_op Using the helper is way better than hand-coding the universe. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	6eb051e36f	nir: Rewrite nir_type_conversion_op The original version was very convoluted and tried way too hard to not just have the nested switch statement that it needs. Let's just write the obvious code and then we know it's correct. This fixes a bunch of missing cases particularly with int64. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	9084b1db30	nir: Add a get_nir_type_for_glsl_base_type helper Reviewed-by: Eric Anholt <eric@anholt.net>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	a136884139	nir/validate: Rework ALU bit-size rule validation The original bit-size validation wasn't capable of properly dealing with instructions with variable bit sizes. An attempt was made to handle it by looking at source and destinations but, because the validation was done in validate_alu_(src\|dest), it didn't really have the needed information. The new validation code is much more straightforward and should be more correct. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	370d68babc	nir/validate: Validate that bit sizes and components always match We've always required bit sizes to match but the rules for number of components have been a bit loose. You've never been allowed to source from something with less components than you consume, but more has always been fine. This changes the validator to require that they match exactly. The fact that they don't always match has been a source of confusion in NIR for quite some time and it's time we got rid of it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	e9a45a3d5d	nir: Make image_size a variable-width intrinsic Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	b377be9213	i965/fs: Use num_components from the SSA def in image intrinsics Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-03-14 07:36:40 -07:00
Jason Ekstrand	0bf0365393	nir/lower_tex: Use tex_instr_dest_size for txs destinations Using coord_components of the source texture is correct for everything except cube maps where it's off by one. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-03-14 07:36:20 -07:00
Jason Ekstrand	fffa4111df	nir/spirv: Restrict the number of channels in texture coordinates Some SPIR-V texturing instructions pack more than the texture coordinate into the coordinate source. We need to mask off the unused channels. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-03-14 07:36:20 -07:00
Jason Ekstrand	3c312be7b3	nir/copy_prop: Respect the source's number of components In the near future we are going to require that the num_components in a src dereference match the num_components of the SSA value being dereferenced. To do that, we need copy_prop to not remove our MOVs from a larger SSA value into an instruction that uses fewer channels. Because we suddenly have to know how many components each source has, this makes the pass a bit more complicated. Fortunately, copy propagation is the only pass that cares about the number of components read by any given source so it's fairly contained. Shader-db results on Sky Lake: total instructions in shared programs: 13318947 -> 13320265 (0.01%) instructions in affected programs: 260633 -> 261951 (0.51%) helped: 324 HURT: 1027 Looking through the hurt programs, about a dozen are hurt by 3 instructions and the rest are all hurt by 2 instructions. From a spot-check of the shaders, the story is always the same: They get a vec4 from somewhere (frequently an input) and use the first two or three components as a texture coordinate. Because of the vector component mismatch, we have a mov or, more likely, a vecN sitting between the texture instruction and the input. This means that the back-end inserts a bunch of MOVs and split_virtual_grfs() goes to town. Because the texture coordinate is also used by some other calculation, register coalesce can't combine them back together and we end up with an extra 2 MOV instructions in our shader. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-03-14 07:36:20 -07:00
Jason Ekstrand	60d1aac28a	nir/intrinsics: Make load_barycentric_input take a 2-component coor Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-14 07:36:20 -07:00
Jason Ekstrand	678fd00f2f	anv/blorp: Only set a clear color for resolves if fast-cleared Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-03-14 07:36:20 -07:00
Jason Ekstrand	273b720310	anv/blorp: Turn off AUX after doing a CCS_D resolve For render passes with multiple subpasses on gen7, we only fast-clear at the top but an input attachment use can cause us to do a resolve in the middle of the render pass. Once we've done so, we no longer have a fast-cleared surface so we can just set aux_usage to NONE. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-03-14 07:36:20 -07:00
Tapani Pälli	773d510c66	android: add '/vulkan' to libmesa_anv_entrypoints path otherwise generated entrypoint headers are not found during build Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-14 07:48:30 +02:00
Tapani Pälli	4734322574	android: add src/intel/compiler to libmesa_intel_compiler includes fixes build error when brw_nir.h not found in the generated file brw_nir_trig_workarounds.c. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-14 07:48:22 +02:00
Gwan-gyeong Mun	8f22552a4f	anv: Add missing error-checking to anv_CreateDevice (v3) This patch adds missing error-checking and fixes resource leak in allocation failure path on anv_CreateDevice() v2: Fixes from Jason Ekstrand's review a) Add missing destructors for all of the state pools on allocation failure path b) Add missing destructor for batch bo pools on allocation failure path v3: Fixes from Emil Velikov's review Add missing destructor for queue and scratch_pool on allocation failure path Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 21:29:43 -07:00
Dave Airlie	b8ee70384a	radv: setup llvm target data layout Ported from radeonsi, pointed out by Tom. "This prevents LLVM from using sext instructions for local memory offsets and allows the backend to fold immediate offsets into the instruction. This also prevents some incorrect code generation for ptrtoint and inttoptr instructions." Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <tstellar@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-14 10:33:59 +10:00
Alex Smith	c19607d59d	radv: Reinitialise loaderMagic when allocating a cached command buffer This must be set to ICD_LOADER_MAGIC by vkAllocateCommandBuffers, which was being done when allocating a new buffer but not when reusing an existing one in the cache. This would hit an assertion and crash in debug builds of the Vulkan loader. Fixes: `682248db45` ("radv: Cache command buffers in command pool.") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-13 23:42:36 +01:00
Marek Olšák	cdbe4990cd	gallium/radeon: disable the shader cache if dumping shaders otherwise, cached shaders aren't dumped. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-13 23:34:52 +01:00
Marek Olšák	71a2e4e945	radeonsi: mark all bound shader buffer ranges as initialized This should prevent cases when a buffer was incorrectly mapped without synchronization just because this wasn't done. Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-13 23:34:52 +01:00
Marek Olšák	686cd76a4c	st/mesa: disable the shader cache if dumping shaders otherwise, cached shaders aren't dumped. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-13 23:34:52 +01:00
Chad Versace	c5a0829e1f	anv: Use vk_outarray in vkGetPhysicalDeviceQueueFamilyProperties No intended change in behavior. Just a refactor. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 15:08:15 -07:00
Chad Versace	876f0ecd2f	anv: Use vk_outarray in vkEnumeratePhysicalDevices (v2) No intended change in behavior. Just a refactor. v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 15:08:15 -07:00
Chad Versace	62160536a0	util/vulkan: Add vk_outarray (v2) This is a wrapper for a Vulkan output array. A Vulkan output array is one that follows the convention of the parameters to vkGetPhysicalDeviceQueueFamilyProperties(). v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 15:08:11 -07:00
Lionel Landwerlin	bf47e5ba53	intel: genxml: prevent missing ; with address fields dwords Before this change, the generator could print this kind of thing: const uint32_t v0 = __gen_uint(values->ValidBit, 0, 0) \| __gen_uint(values->FaultType, 1, 2) \| __gen_uint(values->SRCIDofFault, 3, 10) \| __gen_uint(values->GTTSEL, 11, 1) \| dw[0] = __gen_combine_address(data, &dw[0], values->VirtualAddressofFault, v0); This change fixes the trailing '\|'. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 17:23:49 +00:00
Julien Isorce	9df3f28a8b	gallium/hud: check NULL return from u_upload_alloc Fixes the following segmentation fault: signal SIGSEGV: invalid address (fault address: 0x0) frame #0: 0x00007fffe718e117 radeonsi_dri.so hud_draw_background_quad hud_context.c:170 167 168 assert(hud->bg.num_vertices + 4 <= hud->bg.max_num_vertices); 169 -> 170 vertices[num++] = (float) x1; 171 vertices[num++] = (float) y1; 172 173 vertices[num++] = (float) x1; (lldb) bt * frame #0: 0x00007fffe718e117 radeonsi_dri.so`hud_draw_background_quad frame #1: 0x00007fffe718f458 radeonsi_dri.so`hud_draw frame #2: 0x00007fffe712967f radeonsi_dri.so`dri_flush Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-13 17:20:21 +01:00
Julien Isorce	d08c0930af	winsys/radeon: check null return from radeon_cs_create_fence in cs_flush Follow-up of patch: "radeon_cs_create_fence: check null return from radeon_winsys_bo_create" radeon_drm_cs_flush radeon_cs_create_fence radeon_winsys_bo_create Signed-off-by: Julien Isorce <jisorce@oblong.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-13 17:19:29 +01:00
Julien Isorce	d09edb0146	winsys/radeon: check null in radeon_cs_create_fence Fixes the following segmentation fault: radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c -> if (!bo->handle) (gdb) bt 0 radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c 1 0x00007fffe73575de in radeon_cs_create_fence radeon_drm_cs.c 2 0x00007fffe7358c48 in radeon_drm_cs_flush radeon_drm_cs.c Signed-off-by: Julien Isorce <jisorce@oblong.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-13 17:17:30 +01:00
Juan A. Suarez Romero	192de3f051	vulkan/wsi: include builddir for generated headers wayland-drm-client-protocol.h is generated in builddir, so when builddir != srcdir the header is not found, and compilation of wsi_common_wayland.c will fail. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-13 16:04:20 +01:00
Jason Ekstrand	dd4db84640	anv: Use on-the-fly surface states for dynamic buffer descriptors We have a performance problem with dynamic buffer descriptors. Because we are currently implementing them by pushing an offset into the shader and adding that offset onto the already existing offset for the UBO/SSBO operation, all UBO/SSBO operations on dynamic descriptors are indirect. The back-end compiler implements indirect pull constant loads using what basically amounts to a texelFetch instruction. For pull constant loads with constant offsets, however, we use an oword block read message which goes through the constant cache and reads a whole cache line at a time. Because of these two things, direct pull constant loads are much faster than indirect pull constant loads. Because all loads from dynamically bound buffers are indirect, the user takes a substantial performance penalty when using this "performance" feature. There are two potential solutions I have seen for this problem. The alternate solution is to continue pushing offsets into the shader but wire things up in the back-end compiler so that we use the oword block read messages anyway. The only reason we can do this is because we know a priori that the dynamic offsets are uniform and 16-byte aligned. Unfortunately, thanks to the 16-byte alignment requirement of the oword messages, we can't do some general "if the indirect offset is uniform, use an oword message" sort of thing. This solution, however, is recommended for a few reasons: 1. Surface states are relatively cheap. We've been using on-the-fly surface state setup for some time in GL and it works well. Also, dynamic offsets with on-the-fly surface state should still be cheaper than allocating new descriptor sets every time you want to change a buffer offset which is really the only requirement of the dynamic offsets feature. 2. This requires substantially less compiler plumbing. Not only can we delete the entire apply_dynamic_offsets pass but we can also avoid having to add architecture for passing dynamic offsets to the back-end compiler in such a way that it can continue using oword messages. 3. We get robust buffer access range-checking for free. Because the offset and range are baked into the surface state, we no longer need to pass ranges around and do bounds-checking in the shader. 4. Once we finally get UBO pushing implemented, it will be much easier to handle pushing chunks of dynamic descriptors if the compiler remains blissfully unaware of dynamic descriptors. This commit improves performance of The Talos Principle on ULTRA settings by around 50% and brings it nicely into line with OpenGL performance. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-13 07:58:00 -07:00
Jason Ekstrand	6b644e571e	anv: Stall before fast-clear operations During initial CCS bring-up, I discovered that you have to do a full CS stall prior to doing a CCS resolve as well as afterwards. It appears that the same is needed for fast-clears as well. This fixes rendering corruptions on The Talos Principle on Sky Lake GT4. The issue hasn't been demonstrated on any other hardware; however, given that this appears to be a "too many things in the pipe" problem, having it be easier to reproduce on a system with more EUs makes sense. The issue with resolves is demonstrable on a GT3 or GT2 so this is probably also a problem on all GTs. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-03-13 07:57:03 -07:00
Jason Ekstrand	5e44ef4a76	anv: Accurately advertise dynamic descriptor limits The number of dynamic descriptors is limited by both the number of descriptors and the total number of dynamic things. Because there isn't a single "maximum dynamic things" limit, we need to divide by two so that they can create the maximum of both UBOs and SSBOs. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-13 07:57:03 -07:00
Jason Ekstrand	d36b463817	anv: Add a helper for working with VK_WHOLE_SIZE for buffers Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2017-03-13 07:57:03 -07:00
Rob Clark	f805593b12	freedreno/ir3: fragz cannot be half precision Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-03-13 10:33:07 -04:00
Rob Clark	b1df639db6	freedreno/ir3: optimize less in glsl Rely on nir for optimization, to reduce compile times. Very minimal impact on shader-db: total instructions in shared programs: 104170 -> 104199 (0.03%) total dwords in shared programs: 209664 -> 209728 (0.03%) total full registers used in shared programs: 7156 -> 7161 (0.07%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 24222 -> 24224 (0.01%) half full const instr dwords helped 12 107 103 112 98 hurt 11 104 105 115 102 But shader db runtime dropped from ~29.3s user to ~20.4s user. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-03-13 10:33:07 -04:00
Lionel Landwerlin	3278cd7610	aubinator/genxml: use gzipped files to store embedded genxml This reduces the size of the aubinator binary from ~1.4Mb to ~700Kb. We can now drop the checks on xxd in configure. v2: Fix incorrect makefile dependency (Lionel) v3: use $(PYTHON2) (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-13 13:36:31 +00:00
Lionel Landwerlin	351c951e09	intel: genxml: add script to generate gzipped genxml v2 (from Dylan): Add main function Add missing Copyright Use print_function v3: Add actual license (Dylan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-13 13:36:27 +00:00
Jose Fonseca	b822d9c2b7	util/u_thread.h: Include stdint.h for int64_t definition. Fixes MinGW build. Trivial.	2017-03-13 12:23:11 +00:00
Iago Toral Quiroga	e8eeb759b7	intel: fix compiler build compiler/brw_vec4_gs_visitor.cpp:744:39: error: ‘GEN7_MAX_GS_OUTPUT_VERTEX_SIZE_BYTES’ was not declared in this scope output_vertex_size_bytes <= GEN7_MAX_GS_OUTPUT_VERTEX_SIZE_BYTES); Fixes: `d0d4a5f43b` ("i965: split EU defines to brw_eu_defines.h") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-13 13:09:24 +01:00
Christian König	8dee325752	svga: handle P016 format as well Fixes: `62cff79378` ("gallium: add P016 format") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100180 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-13 12:49:41 +01:00
Emil Velikov	b82bd31c54	configure.ac: require pthread-stubs only where available The project is a thing only for BSD platforms. Or in other words - for any other platforms building/installing pthread-stubs results only in a pthread-stub.pc file. And even where it provides a DSO, there's a fundamental design issue with it - see the pthread-stubs mailing list for the specifics. v2: Update comment above the switch statement (Jon Turney). Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Acked-by: Gary Wong <gtw@gnu.org> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Randy Fishel <randy.fishel@oracle.com> Cc: Niveditha Rau <niveditha.rau@oracle.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-13 11:30:07 +00:00
Emil Velikov	9aebdb5d08	configure.ac: do not require the i965 driver for ANV As of last few commits we have the two split, thus we no longer require the i965 in order to have the ANV driver. Even though ANV does not link against libdrm nor libdrm_intel, we still require those as dependencies due to the headers they provide. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:35 +00:00
Jason Ekstrand	ee8044fd33	intel/vulkan: Get rid of recursive make v2 [Emil Velikov] - Various fixes and initial stab at the Android build. - Keep the generation rules/EXTRA_DIST outside the conditional Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:35 +00:00
Jason Ekstrand	7f9bbcfb7b	intel/tools: Use a makefile included from intel/Makefile.am Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:35 +00:00
Emil Velikov	aa09c9552c	intel/compiler: whitespace cleanups Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:35 +00:00
Emil Velikov	bdc5036464	intel/compiler: link all tests against gtest, even test_eu_compact At the moment all the tests but test_eu_compact are actual C++ gtests. To simplify things, we can move the gtest.la to the common TEST_LIBS. As we're here, we can change the test extension [to .cpp] to avoid using the confusing dummy.cpp. Add a nice comment in the makefile for posterity. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:35 +00:00
Emil Velikov	f282ace678	i965: remove i965_symbols_test reference from .gitignore The test/binary was removed back in 2012. With that one gone, we can drop the .gitignore file all together. Cc: Eric Anholt <eric@anholt.net> Fixes: `c885039442` ("i965: Drop the missing symbols link test.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:35 +00:00
Jason Ekstrand	700bebb958	i965: Move the back-end compiler to src/intel/compiler Mostly a dummy git mv with a couple of noticeable parts: - With the earlier header cleanups, nothing in src/intel depends on files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes throughout] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	d0d4a5f43b	i965: split EU defines to brw_eu_defines.h Split out the EU defines from the 'generic' ones, as the former are more compiler oriented. With a later commit we'll move brw_eu_defines.h alongside the compiler infra to src/intel/. Pulling all the defines in there seems overzealous. Some defines are used by both i965 and the i965 compiler. Those are moved to brw_eu_defines.h, and annotated accordingly. The i965 users were updated to have the extra include to indicate that. With future work we might provide a better split, but for now this seems reasonable. Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	a72ac98160	util/bitscan: use correct signature for ffs/ffsll Otherwise we'll get errors such as error: conflicting types for ‘ffs’ error: conflicting types for ‘ffsll’ We might want to improve the heuristics and provide a definition only when a native one is missing. We can address that at a later stage. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	fb0832b86d	i965: add missing brw_defines.h include in brw_program.c File is using MI_LOAD_REGISTER_IMM, GEN7_CACHE_MODE_1 and others as defined in the header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	2eefb903d5	i965: add missing brw_defines.h include in brw_program.c File is using the PIPE_CONTROL_* macros as defined in the header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	1d80407a6a	i965: add missing #include <assert.h> in brw_inst.h Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	077078ce77	i965: move brw_define.h ifndef guard to the top Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	8c432645bb	i965: remove unused macros from brw_defines.h The following three groups are used by neither the DRI module nor the compiler. BRW_POLYGON__FACING BRW_POLYGON_FACING_ BRW_STATELESS_BUFFER_* Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	7784b3c846	i965: remove unused brw_program.h include Neither of the changed files requires the brw_program.h include. Since we're about to move them [to src/intel/compiler] with the next commit there's no point in having the include. Let alone the very confusing compiler include directive [-I${top_srcdir}/src/mesa/drivers/dri/i965/] that one would have to use. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00
Emil Velikov	c54c379b96	i965: remove duplicate declaration of brw_mark_surface_used Function was made static and moved to another header with earlier commit. Fixes: `760c8a1d95` ("i965: Make mark_surface_used a static inline in brw_compiler.h") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:33 +00:00
Emil Velikov	b69a03e12a	i965: remove dead brw_new_shader() declaration Cc: Timothy Arceri <tarceri@itsqueeze.com> Fixes: `194537ebe4` ("mesa/glsl/i965: remove Driver.NewShader()") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:33 +00:00
Emil Velikov	a032002dc9	i965: remove unused brw_cs.h include Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:33 +00:00
Jason Ekstrand	e042f5fcbc	anv: Stop including brw_context.h Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:33 +00:00
Jason Ekstrand	4ec5922afa	intel/isl: Stop linking libi965_compiler.la into tests Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:33 +00:00
Jason Ekstrand	12f348bc98	vulkan/wsi: Generate wayland protocol headers separately from EGL Previously, we were depending on EGL for generating the headers and providing the protocol symbols. However, since neither Vulkan driver actually wants to link against EGL, this is kind of pointless. It also creates a weird build dependency. v2 [Jason] - Add missing wsi/ prefix, MKDIR_GEN v3 [Emil Velikov] - include BUILT_SOURCES/generation rules outside of conditional Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:33 +00:00
Emil Velikov	1d135e2561	radv/wsi: Don't include wayland headers Unused and we'll rework the way wayland-drm-client-protocol.h is generated with later commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Dave Airlie <airlied@redhat.com>	2017-03-13 11:16:30 +00:00
Jason Ekstrand	4ea9bbe1f6	anv/wsi: Don't include wayland headers Unused and we'll rework the way wayland-drm-client-protocol.h is generated with later commit. v2 [Emil] - Also remove wayland-client.h Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:30 +00:00
Emil Velikov	d1042ef1dc	configure.ac: provide a fall-back define for WAYLAND_SCANNER In some cases, we can end up calling WAYLAND_SCANNER even when there's no binary. Do follow the other's approach set by AX_PROG_FLEX/BISON and set the variable to : Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:30 +00:00
Emil Velikov	c1b5ed853f	wayland: move .gitignore where applicable Strictly speaking things work as-is, but let's move the file alongside the artefacts it references. Analogous to all other places in mesa. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:30 +00:00
Christian König	5369b5a91d	st/va: add config support for 10bit decoding v2 Advertise 10bpp support if the driver supports decoding to a P016 surface. v2: Advertise 10bpp for the decoder as well. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:44 +01:00
Christian König	e9d3e29bb3	st/va: add support for allocating 10bpp surfaces We support P010 and P016 as targets for 10bpp video decoding. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:41 +01:00
Christian König	e58a1e8f68	st/va: add support for P010 and P016 formats v3 No hardware I know of can actually support P010 natively. But we can easily support P016 and as long as nobody decodes anything into the lower 6 bits it doesn't make any difference to P010. v2: allow P0160 for post processing as well v3: fix post processing once more Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:38 +01:00
Christian König	f1d1deb015	st/va: clear the video surface on allocation This makes debugging of decoding problems quite a bit easier. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:35 +01:00
Christian König	1ce68af07b	st/va: cleanup error handling in vlVaCreateSurfaces2 No need to have that twice. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:32 +01:00
Christian König	88f3451083	radeon/uvd: enable 10bit HEVC decode v2 Just use whatever the state tracker allocated. v2: fix msb mode Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:29 +01:00
Christian König	3e1e441aa0	radeon/UVD: fix the decoding target pitch calculation The firmware expects the value in pixels, not bytes. This didn't make a difference so far because we only used 8bpp surfaces. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:25 +01:00
Christian König	cee591a224	vl/video_buffer: add support for P016 Just simply the description of the planes. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:22 +01:00
Christian König	62cff79378	gallium: add P016 format Same layout as NV12, but 16bit per channel instead of 8. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2017-03-13 08:51:07 +01:00
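To make "same layout as NV12, but 16bit per channel" concrete, here is a rough C sketch of the plane sizes for a tightly packed P016 buffer. It assumes even dimensions and no pitch alignment, which a real driver would add; the struct and function names are hypothetical and not taken from gallium.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a tightly packed P016 buffer. */
struct p016_layout {
    size_t y_pitch;   /* bytes per row of the luma plane */
    size_t y_size;    /* bytes in the luma plane */
    size_t uv_pitch;  /* bytes per row of the interleaved chroma plane */
    size_t uv_size;   /* bytes in the chroma plane */
    size_t total;     /* bytes in the whole buffer */
};

/* NV12-style layout: a full-resolution Y plane followed by one
 * half-resolution plane of interleaved U and V samples. Each sample
 * takes 2 bytes here instead of NV12's 1 byte. */
static struct p016_layout p016_compute_layout(size_t width, size_t height)
{
    struct p016_layout l;

    assert(width % 2 == 0 && height % 2 == 0);

    l.y_pitch = width * 2;
    l.y_size = l.y_pitch * height;
    l.uv_pitch = (width / 2) * 2 * 2; /* U and V per chroma pixel, 2 bytes each */
    l.uv_size = l.uv_pitch * (height / 2);
    l.total = l.y_size + l.uv_size;
    return l;
}
```

Under these assumptions the total is exactly twice the size of the equivalent NV12 buffer.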
Kenneth Graunke	920ab07566	i965: Delete unused last_ring local. Dead since `071d80bde2`, and causing warnings.	2017-03-12 22:57:46 -07:00
Bas Nieuwenhuizen	7c282b3ca1	radv: Store shaders in VRAM. Less IFETCH latency on misses. Shader code is write once read many, so GTT doesn't make much sense anyway. If it turns out to fragment the CPU visible VRAM too much, we can upload with SDMA. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-13 02:14:29 +01:00
Dave Airlie	e27fdbcb4c	radv/ac: move to new image intrinsics. This hooks up radv to the new image intrinsic builders. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-13 09:44:53 +10:00
Dave Airlie	3b49cee8fa	radv: disabled scaled formats for transfers. These really are only supported for vertex buffers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-13 09:36:49 +10:00
Timothy Arceri	13d69a8519	util/u_queue: make u_queue accessible to cpp Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-13 09:50:26 +11:00
Timothy Arceri	df1d5fc442	glsl: don't use ralloc for blob creation There is no need to use ralloc here. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-13 09:50:19 +11:00
Timothy Arceri	ca76a2ba1b	gallium/util: replace pipe_thread_setname() with u_thread_setname() They do the same thing we just moved the function to be accessible to all of Mesa. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:04 +11:00
Timothy Arceri	14e6b86952	gallium/util: replace pipe_thread_get_time_nano() with u_thread_get_time_nano() They do the same thing we just moved the function to be accessible to all of Mesa. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:04 +11:00
Timothy Arceri	f8cc4c25b8	gallium/util: replace pipe_thread_create() with u_thread_create() They do the same thing we just moved the function to be accessible to all of Mesa. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:04 +11:00
Timothy Arceri	b822d9dd67	gallium/util: move u_queue.{c,h} to src/util This will allow us to use it outside of gallium for things like compressing shader cache entries. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:03 +11:00
Timothy Arceri	04ec4db8b5	gallium/util: make use of new u_thread.h in u_queue.{c,h} Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:03 +11:00
Timothy Arceri	fbfe887253	util: add u_thread.h This is a minimal copy of os_thread.h from gallium in order to move u_queue.{c,h} to this directory. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:03 +11:00
Timothy Arceri	a3b820308b	gallium/util: use standard malloc/calloc/free in u_queue.c This will help us moving the file to the shared src/util dir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:49:03 +11:00
Timothy Arceri	94a6457724	gallium/util: move u_string.h to src/util/u_string.h This will help us move u_queue.c here eventually and also provide string function wrappers for anyone wishing to port disk_cache.c to windows. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:43:06 +11:00
Timothy Arceri	d55d1e9805	gallium/util: remove unused header from u_string.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:43:06 +11:00
Timothy Arceri	ff8aad66bd	gallium/util: remove unused util_strbuf* Looks like they have been unused since 2008 `b8a7eef242`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:43:06 +11:00
Timothy Arceri	b4b1dcb2c1	gallium/util: remove unused util_memmove() This is not used anywhere and Visual Studio looks to have supported memmove() for a long time if not always. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:43:06 +11:00
Timothy Arceri	b607aad8e1	glsl: don't recompile a shader on fallback unless needed Because we optimistically skip compiling shaders if we have seen them before, we may need to compile them later at link time if they haven't yet been used in a specific combination to create a program. Rather than always recompiling we take advantage of the gl_compile_status enum introduced in the previous patch to only compile when we have previously skipped compilation. This helps with regressions in app start-up times on cold cache runs, compared with no cache. Deus Ex: Mankind Divided start-up times: cache disabled: ~3m15s cold cache master: ~4m23s cold cache with this patch: ~3m33s Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:26:08 +11:00
Timothy Arceri	bfa95997c4	mesa/glsl: introduce new gl_compile_status enum This will allow us to tell if a shader really has been compiled or if the shader cache has just seen it before. Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-03-12 17:24:40 +11:00
Matt Turner	3d253d330a	i965: Initialize compaction tables in unit test. Fixes: `fa4b792e83` "i965: Move brw_init_compaction_tables() to brw_create_compiler()." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100154	2017-03-10 23:16:39 -08:00
Matt Turner	fa4b792e83	i965: Move brw_init_compaction_tables() to brw_create_compiler(). ... so that we can avoid threading complications or unnecessary compaction table initializations (which just consists of setting some pointers based on devinfo->gen). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-10 17:58:11 -08:00
Emil Velikov	32be87852b	bin/get-fixes-pick-list.sh: do not mandate bash Silly thinko on my end, as I was writing the script. There is nothing bash specific in there. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:49 +00:00
Emil Velikov	0e94217999	bin/shortlog_mesa.sh: remove the final bashism Remove the typeset built-in and toggle to /bin/sh Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	3aa5f51c27	bin/bugzilla_mesa.sh: rework the looping method We don't use DRYRUN (and no other scripts have one) so just drop it. This allows us to rework the loop to the more commonly used "git .... \| while read foo; do ... done" That in itself gets rid of the only remaining bashism and we can toggle the shebang to /bin/sh. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	1c3a1d74ec	wayland-egl/wayland-egl-symbols-check: do not mandate bash Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	f7e7708d75	gbm/gbm-symbols-check: do not mandate bash Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	5a0e4f4837	egl/egl-symbols-check: do not mandate bash There's nothing bash specific in the script. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	a3782f2b7a	glsl/tests: remove any bashisms Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	05c1d6d564	dri: use correct shebang for gen-symbol-redefs.py This is a python2 script and the generic "python" may point to python3. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	fb187d2232	util: remove shebang from format_srgb.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	2e8c683f5e	xmlpool: remove shebang from gen_xmlpool.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	6d9ad29451	genxml: remove shebang from gen_pack_header.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:48 +00:00
Emil Velikov	e4c7911150	nir: remove shebang from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	a497f44645	st/xa: suffix xa-indent{,.sh} and add a shebang line Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	c79c54ae34	gallium/tools: use correct shebang for python scripts These are python2 scripts and the generic "python" may point to python3. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	e7b01d9fc8	gallium/tools: do not hardcode bash location It is not guaranteed to be in /bin Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	6f341b9dfd	gallium/tests: remove execute bit from TGSI shader - vert-uadd.sh Just like the dozens of other shaders, the file is parsed by a separate tool and not executed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	68c38b2431	mapi/gen: remove shebang from python scripts All of those should be executed $PYTHON2/python2 [or equivalent] hence why they are missing the execute bit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	9a502f5c47	mapi: do not mandate bash for es*api/ABI-check Seemingly there is nothing bash specific in these. Debian's checkbashisms does not spot anything, and neither does running them in zsh. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	d73603fcdd	bin/perf-annotate-jit: add .py suffix To provide direct feedback about the file in question. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	f03e7af7b9	i965: remove shebang from brw_nir_trig_workarounds.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	1a39f3187c	i965: remove execute bit from brw_nir_trig_workarounds.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:47 +00:00
Emil Velikov	be4ce4937e	mesa: remove shebang from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	d2af6f6ee0	mesa: remove execute bit from main/format_parser.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	a1d186cb70	amd: remove shebang from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	f6180a5ab7	amd: remove execute bit from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	168d801149	gallium: remove shebang from python scripts Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	2ea1ce2701	gallium: remove execute bit from the python script(s) Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	5d15fe446d	svga: remove shebang from svgadump/svga_dump.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	bd9bb86bc3	svga: remove execute bit from svga_dump.py The file is used to generate svgadump/svga_dump.c... in theory at least. Atm. the file is checked in-tree but that is about to change later commits. As we get to that we'll use $PYTHON2 or equivalent as used throughout the tree. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	cc9533c53f	freedreno: remove shebang from ir3_nir_trig.py Analogous to earlier commit(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	55ffbbf571	freedreno: remove execute bit from ir3_nir_trig.py The file is meant to be called with $(PYTHON2) and not executed directly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	56e58e01e4	glsl: remove shebang from python scripts All of the scripts are [must be] executed via $PYTHON2 [or equivalent] hence why they are missing the execute bit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:46 +00:00
Emil Velikov	eca18d440d	glsl/tests: remove execute bit from compare_ir python script Nearly all the python scripts used in-tree are invoked via $PYTHON2 or equivalent. As such, having the execute bit is not needed and generally ill-advised. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:45 +00:00
Emil Velikov	7473fcd40b	glsl/tests: suffix .sh/.py files as applicable This makes it easier/clearer as to: - if the file should have the execute bit set (.py should not) - do we need the shebang in the first place and if so what it should be Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:45 +00:00
Emil Velikov	32d153c428	mesa: drop the execute bit from gl.xml This is a spec file which is parsed by scripts. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:45 +00:00
Emil Velikov	45a37c98e7	mapi/glapi: remove unused next_available_offset.sh Afaict there was no [documented] users since it was introduced. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-10 14:12:45 +00:00
Ben Widawsky	2ee34bd5dc	gbm: Export a per plane getter for offset Unlike stride, there was no previous offset getter, so it can be right on the first try. v2: Return EINVAL when plane is greater than total planes to make it match the similar APIs. Avoid leak after fromPlanar (Daniel) Make sure when getting offsets we consider dumb images (Daniel) v3: Use Jason's recommendation for handling the non-planar case. v4: Return int64_t so we can get real errors v5: Add an assertion for dumb BOs (Jason) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-09 15:35:44 -08:00
Ben Widawsky	7f6209e46f	gbm: Export a per plane getter for stride v2: Preserve legacy behavior when plane is 0 (Jason Ekstrand) EINVAL when input plane is greater than total planes (Jason Ekstrand) Don't leak the image after fromPlanar (Daniel) Move bo->image check below plane count preventing bad index succeeding (Daniel) v3: Fix DRIimage leak (using Jason's recommended change) Make plane 0 return planar stride. This might break legacy behavior (Jason) v4: Move bogus hunk for get_handle_for_plane to the right patch (Jason) Fix error handling path to be cleaner (Jason) v5: Add assert for dumb BOs to make sure plane == 0 (Jason) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-09 15:35:44 -08:00
Ben Widawsky	ed4cf2440d	gbm: Create a gbm_device getter for stride This will be used so we can query information per plane. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-09 15:35:44 -08:00
Ben Widawsky	f9567ab435	gbm: Export a getter for per plane handles v2: Make the error return be -1 instead of 0 because I think 0 is actually valid. v3: Set errno to EINVAL when the specified plane is above the total planes. (Jason Ekstrand) Return the bo's handle if there is no image ie. for dumb images like cursor (Daniel) v4: - Add assertions about plane == 0 (Jason) - Add a comment about new restriction on planar dumb bo which is not an earlier patch in the series. - Correctly refactor from v2 in this patch; it ended up rebased into the wrong patch. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-09 15:35:44 -08:00
Ben Widawsky	42eacddfc0	gbm: Export a plane getter function This will be used by clients that need to know the number of planes allocated for them on behalf of the GL or other API. The best current example of this is when an extra "plane" is allocated to store compression data for the primary plane. v2: Return 1 for cases where there is no image, ie. dumb bo (Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-09 15:35:44 -08:00
Ben Widawsky	770b06588f	gbm: Explicitly disallow a planar dumb BO As more GBM functionality support planes is being evaluated, it becomes clear that a dumb bo can never actually be planar. It's questionable whether it was ever feasible to do this, and later functionality will implicitly assume a dumb BO is non-planar. v2: Include stdbool.h Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-03-09 15:35:44 -08:00
Anuj Phogat	29e2ba0756	i965: Rename brw_format_for_mesa_format() to brw_isl_format_for_mesa_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-09 09:47:30 -08:00
Robert Bragg	a678b79ef4	i965: Add more Haswell OA metrics sets This extends the brw_oa_hsw.xml to expose these additional queries: - Compute Metrics Basic Gen7.5 - Compute Metrics Extended Gen7.5 - Memory Reads Distribution Gen7.5 - Memory Writes Distribution Gen7.5 - Metric set Sampler Balance Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:51 +00:00
Robert Bragg	458468c136	i965: Expose OA counters via INTEL_performance_query This adds support for exposing basic Observation Architecture performance counters on Haswell. This support is based on the i915 perf kernel interface which is used to configure the OA unit, allowing Mesa to emit MI_REPORT_PERF_COUNT commands around queries to collect counter snapshots. To take into account the small chance that some of the 32bit counters could wrap around for long queries (~50 milliseconds for a GT3 Haswell @ 1.1GHz) the implementation also collects periodic metrics. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:50 +00:00
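The wrap-around concern above comes down to how deltas between 32-bit counter snapshots are accumulated. The following C sketch shows the general technique under the assumption that at most one wrap occurs between two snapshots (which is what periodic sampling is meant to guarantee); it is an illustration, not a copy of the driver code.

```c
#include <stdint.h>

/* Add the delta between two 32-bit counter snapshots to a 64-bit
 * accumulator. Unsigned subtraction is performed modulo 2^32, so the
 * result is still correct if the counter wrapped once between the
 * 'start' and 'end' snapshots. */
static void accumulate_uint32(uint64_t *accumulator,
                              uint32_t start, uint32_t end)
{
    *accumulator += (uint32_t)(end - start);
}
```

If the counter could wrap more than once between snapshots, the delta would be ambiguous, which is why intermediate snapshots are needed for long queries.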
Robert Bragg	a98ffe2477	exec_list: Add a foreach_list_typed_from macro This allows iterating list nodes from a given start point instead of necessarily the list head. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:50 +00:00
Robert Bragg	e56550565e	i965: Add script to gen code for OA counter queries Avoiding lots of error prone boilerplate and easing our ability to add + maintain support for multiple OA performance counter queries for each generation: This adds a python script to generate code for building up performance_queries from the metric sets and counters described in brw_oa_hsw.xml as well as functions to normalize each counter based on the RPN expressions given. Although the XML file currently only includes a single metric set, the code generated assumes there could be many sets. The metrics as described in XML get translated into C structures which are registered in a brw->perfquery.oa_metrics_table hash table keyed by the GUID of the metric set in XML. v2: numerous python style improvements (Dylan) v3: Makefile.am fixups (Emil) v4: Pattern rule for codegen + orthogonal .c and .h rules (Robert) Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:44 +00:00
Robert Bragg	f46e58e018	i965: extend query/counter structs for OA queries In preparation for generating code from brw_oa_hsw.xml for describing OA performance counter queries this adds some OA specific members to brw_perf_query that our generated code will initialize: - The oa_metric_set_id is the ID we will pass to DRM_IOCTL_I915_PERF_OPEN, and is an ID obtained via sysfs under: /sys/class/drm/<card>/metrics/<guid>/id - The oa_format is the OA report layout we will request from the kernel - The accumulator offsets determine where the different groups of A, B and C counters are located within an intermediate 64bit 'accumulator' buffer. Additionally brw_perf_query_counter now has 64bit or float _read() callback members for OA counters. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 12:53:07 +00:00
Robert Bragg	eaab41c9db	i965: brw_context.h additions for OA unit query codegen In preparation for generating code from the XML performance counter meta data, this makes some additions to brw_context.h for this code to be able to reference. It adds a brw->perfquery.oa_metrics_table hash table for indexing built up query descriptions by the GUID that is expected to be advertised by the kernel (via sysfs) to be able to use that query. It adds an 'OA_COUNTERS' brw_query_kind to be assigned to queries built up by generated code. It adds a brw->perfquery.sys_vars structure to have a consistent place to represent the different system variables like $EuCoresTotalCount and $EuSlicesTotalCount that are referenced by OA counter normalization equations. Although extending + referencing gen_device_info for these variables was considered, these are some of the (mostly minor) reasons for going with a dedicated structure: - Currently we only need this info for the performance_query backend and it might be a bit tedious to go back and initialize the state for pre-Haswell devinfo structures. - Considering the $SubsliceMask then the requirement for how multiple per-slice masks are packed only comes from how the variables are referenced by availability tests in XML, and might not be a good general representation for tracking subslice masks if another use case arises. - If we used gen_device_info then we'd likely want to avoid making assumptions about the C types during codegen and adding explicit casts, while that's not necessary with a dedicated struct with all members being uint64_t. - This structure and the code for initializing it is currently shared (just through copy & paste) with a few other projects dealing with OA counters, and that's been convenient so far. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 12:53:07 +00:00
Robert Bragg	b79268174b	i965: XML description of Haswell OA metric set In preparation for exposing Gen Observation Architecture performance counters via INTEL_performance_query this adds an XML description for an initial 'Render Metrics Basic Gen7.5' query and corresponding counters. The intention is to auto generate code for building a query from these counters as well as the code for normalizing the individual counters. Note that the upstream for this XML data is currently GPU Top: https://github.com/rib/gputop The files are maintained under gputop-data/ and they are themselves derived from files in an internal 'MDAPI XML' schema. There are scripts under gputop-scripts/ and make rules in gputop-data/Makefile.xml for maintaining these files. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 12:53:07 +00:00
Pierre Moreau	655c395f65	nv50/ir: check for origin insn in findOriginForTestWithZero Function arguments do not have an "origin" instruction, causing a NULL-pointer dereference without this check. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-09 12:42:46 +01:00
Samuel Pitoiset	d54b498694	mesa/main: make use of lookup_samplerobj_locked() There is no need to check sampler == 0 twice. This removes now unused _mesa_lookup_samplerobj_locked(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-09 11:01:37 +01:00
Samuel Pitoiset	58b4ae0411	mesa/main: inline {begin,end}_samplerobj_lookups() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-09 11:01:31 +01:00
Grazvydas Ignotas	8cd83a6c81	glsl/blob: clear padding bytes Since blob is intended for serializing data, it's not a good idea to leave padding holes with uninitialized data, which may leak heap contents and hurt compression if the blob is later compressed, like done by shader cache. Clear it. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-09 20:41:02 +11:00
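The idea of clearing padding when aligning a serialization buffer can be sketched as follows. This is a simplified stand-in, assuming a fixed-capacity buffer; `struct byte_buf` and `buf_align` are hypothetical names and do not reflect the actual blob API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity serialization buffer. */
struct byte_buf {
    unsigned char data[64];
    size_t size; /* number of bytes written so far */
};

/* Advance 'size' to the next multiple of 'alignment' and zero the
 * skipped bytes, so no stale memory contents end up in the output.
 * Returns 1 on success, 0 if the buffer is too small. */
static int buf_align(struct byte_buf *b, size_t alignment)
{
    size_t aligned = (b->size + alignment - 1) / alignment * alignment;

    if (aligned > sizeof(b->data))
        return 0;

    memset(b->data + b->size, 0, aligned - b->size);
    b->size = aligned;
    return 1;
}
```

Zeroed padding is deterministic, so identical inputs serialize to identical bytes, and runs of zeros tend to compress well.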
Grazvydas Ignotas	61bbb25a08	util/disk_cache: fix size subtraction on 32bit Negating size_t on 32bit produces a 32bit result. This was effectively adding values close to UINT_MAX to the cache size (the files are usually small) instead of intended subtraction. Fixes 'make check' disk_cache failures on 32bit. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-09 20:26:30 +11:00
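The 32-bit pitfall described above can be reproduced on any host with a small C sketch. Here `uint32_t` stands in for a 32-bit `size_t` (an assumption made so the effect shows up on 64-bit machines too, where `int` is 32 bits wide), and the function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for size_t on a 32-bit target. */
typedef uint32_t size32_t;

/* Buggy pattern: the negation is evaluated in 32 bits and wraps to
 * 2^32 - file_size before being widened, so the total grows by almost
 * UINT32_MAX instead of shrinking. */
static uint64_t cache_size_sub_buggy(uint64_t total, size32_t file_size)
{
    return total + -file_size;
}

/* One way to avoid it: widen to 64 bits before negating, so the
 * wrap-around happens modulo 2^64 and the addition acts as a
 * subtraction. */
static uint64_t cache_size_sub_fixed(uint64_t total, size32_t file_size)
{
    return total + -(uint64_t)file_size;
}
```

For a cache size of 1000 and a file size of 100, the buggy version adds 4294967196 while the fixed version yields 900.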
Grazvydas Ignotas	926bcacfd3	util/disk_cache: fix compressed size calculation It incorrectly doubles the size on each iteration. Fixes: `85a9b1b5` "util/disk_cache: compress individual cache entries" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-09 20:26:23 +11:00
Lionel Landwerlin	f81ede4699	glsl: builtin: always return clones of the builtins Builtins are created once and allocated using their own private ralloc context. When reparenting IR that includes builtins, we might steal bits of builtins. This is problematic because these builtins might now be freed when the shader that includes them last is disposed. This might also lead to inconsistent ralloc trees/lists if shaders are created on multiple threads. Rather than including builtins directly into a shader's IR, we should include clones of them in the ralloc context of the shader that requires them. This fixes double free issues we've been seeing when running shader-db on a big multicore (72 threads) server. v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better reflect how this function is used. (Ken) v3: Rename ctx to mem_ctx (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 08:30:36 +00:00
Kenneth Graunke	071d80bde2	i965: Delete render ring prelude. This was a hook I came up with when trying to do the initial performance counter work years ago. Nothing's used it for a long time, and the upcoming performance counter support doesn't want it either. So, goodbye render ring prelude. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-03-08 23:01:21 -08:00
Vinson Lee	d64ded7b50	swr: s/uint/enum pipe_render_cond_flag/ Fix build error. swr_context.cpp: In function ‘void swr_blit(pipe_context, const pipe_blit_info)’: swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive] ctx->render_cond_mode); ~~~~~^~~~~~~~~~~~~~~~ Fixes: `b0d3938430` ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-03-08 21:43:07 -08:00
Bas Nieuwenhuizen	7d6e1a341a	radv: Don't flush the CB before doing a fast clear eliminate. The only way we write CMASK/DCC compressed textures through shaders is fast clears and CMASK/DCC inits, which have their own flushes. Hence the CB cache is always up to date. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:28 +01:00
Bas Nieuwenhuizen	8700329785	radv: Don't emit cache flushes on subpass switch. I think we should only flush right before an action (draw/dispatch etc.), as otherwise it is too easy to issue redundant flushes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:23 +01:00
Bas Nieuwenhuizen	9251f8b35e	radv: Only flush for the needed stages, and before the flushes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:19 +01:00
Bas Nieuwenhuizen	f92a118434	radv: Don't invalidate CB/DB for images that aren't modified outside CB/DB. Without stores, the only writes are fast clears, transfers and metadata initialization, each of which has the appropriate invalidations already. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:14 +01:00
Bas Nieuwenhuizen	0567ab0407	radv: Flush more caches after writes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:10 +01:00
Bas Nieuwenhuizen	7a600bbc81	radv: Don't flush for fixed-function reading. The data should always be in memory after a src flush. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:05 +01:00
Bas Nieuwenhuizen	dd094e4ff9	radv: Invalidate the correct caches for CB/DB dst barriers. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:01 +01:00
Bas Nieuwenhuizen	b075eb7d47	radv: Determine cache flushes per object. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:34:42 +01:00
Samuel Pitoiset	2568d9d0cd	mesa/main: remove unused _mesa_new_texture_image() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-03-09 01:57:20 +01:00
Dave Airlie	e6902be900	radv/ac: fixup texture coord to have right number of channels. Jason has patches to add validation to this area, this should fix radv shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-09 09:17:11 +10:00
Timothy Arceri	0e34966340	st/nine: pass NULL to ureg_get_tokens() The number of tokens is never used and the pointer is NULL checked so just pass NULL. Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-03-09 09:29:07 +11:00
Matt Turner	a45cd8107d	docs: ARB_shader_atomic_counter_ops is enabled on i965/gen7+. This extension was enabled in commit `40dd45d0c6` ("i965: Enable ARB_shader_atomic_counter_ops") but the commit failed to update the release notes or features.txt. The release notes ship has sailed, since the commit was in 13.0.	2017-03-08 13:58:52 -08:00
Eric Anholt	19f571ba6d	vc4: Fix math with a condition flag set. Math results land in r4, regardless of the condition. To implement them, we just need to ensure that the results are moved out of r4 (as often happens anyway, the value is live across another math instruction), so that we can attach the condition to the MOV. Fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.93 and a couple others, that were assertion failing that their conditions hadn't been handled during the QIR->QPU stage.	2017-03-08 13:44:17 -08:00
Eric Anholt	615f6653b0	vc4: Fix register pressure cost estimates when a src appears twice. This ended up confusing the scheduler for things like fabs (implemented as fmaxabs x, x) or squaring a number, and it would try to avoid scheduling them because it appeared more expensive than other instructions. Fixes failure to register allocate in dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db effects (+.35% max temps)	2017-03-08 13:44:17 -08:00
Eric Anholt	0fca01d027	vc4: Report to shader-db how many threads a fragment shader has. Doing instruction count analysis when we emit the thread switches that will save us from tons of stalls is kind of missing the point.	2017-03-08 13:44:17 -08:00
Eric Anholt	61359324c1	Revert "vc4: Lazily emit our FS/VS input loads." This reverts commit `292c24ddac`. It broke a lot of GLES2 deqp, and I see at least one problem that will require some serious rework to fix.	2017-03-08 13:44:17 -08:00
Marek Olšák	ab12a126fd	radeonsi: fix elimination of literal VS outputs broken when switched to the new intrinsics. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-08 19:56:36 +01:00
Fabio Estevam	78c5772633	loader: Move non-error message to debug level Currently when running mesa on imx6 the following loader warnings are seen: # kmscube -D /dev/dri/card1 MESA-LOADER: device is not located on the PCI bus MESA-LOADER: device is not located on the PCI bus MESA-LOADER: device is not located on the PCI bus Using display 0x1920948 with EGL version 1.4 As this is not an error message, change it to debug level in order to have a cleaner log output. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-08 16:35:00 +00:00
Mauro Rossi	61c38d14b7	android: r600: fix libmesa_amd_common dependency Adding libmesa_amd_common dependency and exporting its headers, avoids the following building error: external/mesa/src/gallium/drivers/r600/evergreen_compute.c:29:10: fatal error: 'ac_binary.h' file not found ^ 1 error generated. Fixes: `3bbbb63` "automake: r600: radeonsi: correctly manage libamd_common.la linking" Fixes: `503fb13` "radeon/ac: switch to ac_shader_binary_config_start()" v2 [Emil Velikov: drop unneeded LOCAL_EXPORT_C_INCLUDE_DIRS] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-08 16:27:23 +00:00
Emil Velikov	1fe4d638a1	gallium/targets: rework the empty targets removal Earlier commit added extra tracking and we've attempted to remove the vdpau/other folder if empty. V2 of said commit dropped the pipe to /dev/null and the explicit "true" override. Sadly both of those are needed since there's no guarantee that the folder will be empty before we [mesa] make install. Since we're bringing those two back, there's no need to track if we've installed anything, and simply do "rm -d foo/ &>/dev/null || true" Tested-by: Andy Furniss <adf.lists@gmail.com> Reported-by: Andy Furniss <adf.lists@gmail.com> Fixes: `1cd4fde053` ("gallium/targets: don't leave an empty target directory(ies)") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-08 16:23:07 +00:00
Brian Paul	2f3f5728f7	util/indices: minor clean-ups Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:21 -07:00
Brian Paul	a0927da006	radeonsi: s/uint/enum pipe_shader_type/ This can probably be done in more places in the driver. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	b0d3938430	gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	2b9ab605aa	gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	73bafb5ee3	gallium: s/unsigned/enum pipe_shader_type/ for get_compiler_options() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	1564a768ae	virgl: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	6614b060fb	swr: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	f676c700cc	softpipe: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	0fc5110a6e	llvmpipe: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	4aec68176d	freedreno: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	7532ed106f	etnaviv: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	b4191b712b	draw: s/unsigned/enum pipe_shader_type/ and some s/uint/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	ed66c9d7b8	cso: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	637e5719b5	gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Tapani Pälli	db5f9c3177	anv: change BLOCK_POOL_MEMFD_SIZE to exactly 2GB This is what the comment above the definition says, and the change fixes an issue with 32-bit builds where BLOCK_POOL_MEMFD_SIZE is used as the ftruncate parameter and the constant currently gets converted from 4294967296 to 0. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-08 07:57:55 +02:00
Matt Turner	58b69eedd3	Revert "configure.ac: Use PKG_CHECK_VAR for wayland-scanner." This reverts commit `8a26e94439`.	2017-03-07 21:24:05 -08:00
Matt Turner	0b361f9d35	Revert "configure.ac: Use PKG_CHECK_VAR for libclc." This reverts commit `706074cc96`.	2017-03-07 21:24:05 -08:00
Chris Wilson	05520ba490	i965: Remove use of deprecated drm_intel_aub routines With mesa/drm commit cd2f91e18db087edf93fed828e568ee53b887860 Author: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Date: Fri Jul 31 10:47:50 2015 -0700 intel: Drop aub dumping functionality the drm_intel_aub routines are mere stubs and do nothing. Likewise remove our invocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-07 16:40:03 -08:00
Jason Ekstrand	4483c5d57c	spirv: Silence unused variable warnings in release mode Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-03-07 15:22:16 -08:00
Jason Ekstrand	0421813588	anv: Make the framebuffer-renderpass format assert non-fatal This should let Dota 2 run on debug builds though it will spew errors like mad. Hopefully, Valve will get this fixed sooner rather than later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Jason Ekstrand	33301d949f	anv: Drop the anv_validate block helper Over the course of driver development, we've come up with a number of different schemes for adding giant blocks of asserts inside the driver. This one is only being used once in anv_pipeline.c and the way it's being used actually generates compiler warnings in release builds. This commit drops the anv_validate macro and just puts the contents of the one validation function inside of a "#ifdef DEBUG" guard. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Jason Ekstrand	a316d8f406	anv: Get rid of the stub() macros Except for a few unimplemented things on gen7, we don't really have stubs anymore so we should drop this. This commit replaces the few gen7 stub() calls with explicitly labeled finishme's and makes the sparse binding stuff silently no-op or return a FEATURE_NOT_PRESENT error. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Jason Ekstrand	1488d079cb	anv: Remove a pointless finishme We've been supporting multiple shaders per module for some time now. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Jason Ekstrand	1a43792783	anv: Convert the HiZ finishme's to perf_warn Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Jason Ekstrand	201fc83df7	anv: Add a performance warning helper This acts identically to anv_finishme except that it only dumps out these nice log messages if you run with INTEL_DEBUG=perf. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Timothy Arceri	20234cfe3a	st/mesa: don't propagate uniforms when restoring from cache We will have already loaded the uniforms when the parameter list was restored from cache. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-08 09:45:48 +11:00
Damien Grassart	e25c92a72d	radv: remove duplicate initialization of alphaToOne feature Fixes a GCC warning when compiling with -Wextra: radv_device.c:463:47: warning: initialized field overwritten [-Woverride-init] Signed-off-by: Damien Grassart <damien@grassart.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-08 06:00:34 +10:00
Dave Airlie	d81bd2f754	radv: disable mip point pre clamping. No idea what this does, but disabling it fixes a bunch of failing CTS tests in the lod area, so let's go with that. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-08 05:50:46 +10:00
Fredrik Höglund	162beb2abb	radv/ac: fix multiple descriptor sets with dynamic buffers The dynamic_offset_offset in the descriptor set binding layout is relative to the dynamic_offset_start for the set in the pipeline layout. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-07 20:23:32 +01:00
Fredrik Höglund	71bb1a9c3c	radv: fix the size of the dynamic_buffers array A buffer descriptor is 16 bytes, not 16 dwords. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-07 20:23:26 +01:00
Fredrik Höglund	0941d1a574	radv: fix the dynamic buffer index in vkCmdBindDescriptorSets This fixes the wrong dynamic buffer descriptors being updated when firstSet > 0. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-07 20:23:04 +01:00
Matt Turner	69063d0561	configure.ac: Ensure libomxil-bellagio exists before invoking pkg-config. I was already tired of seeing the message Package libomxil-bellagio was not found in the pkg-config search path. Perhaps you should add the directory containing `libomxil-bellagio.pc' to the PKG_CONFIG_PATH environment variable No package 'libomxil-bellagio' found on every configure, but I just got a distro bug reported where the user was confused by this message and thought it indicated a bug. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-07 07:27:45 -08:00
Matt Turner	86c023f973	configure.ac: Ensure libva is enabled before invoking pkg-config. PKG_CHECK_VAR can only check --variable=$NAME, so it cannot handle modversion. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-07 07:27:45 -08:00
Matt Turner	706074cc96	configure.ac: Use PKG_CHECK_VAR for libclc. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-07 07:27:45 -08:00
Matt Turner	8a26e94439	configure.ac: Use PKG_CHECK_VAR for wayland-scanner. Available since pkg-config-0.28 and pkgconf-0.8.10. The removal of the AC_PATH_PROG is intentional. Use pkg-config. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-07 07:27:45 -08:00
Matt Turner	f73903f09b	configure.ac: Fix error message in radeon_llvm_check(). It printed the version of LLVM ($1): configure: error: 3.6.0 requires libelf when using llvm instead of the driver name ($2): configure: error: r600 requires libelf when using llvm Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-03-07 07:27:45 -08:00
Matt Turner	e457e6abec	build: Replace NEED_RADEON_LLVM with HAVE_GALLIUM_LLVM. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-07 07:27:45 -08:00
Bas Nieuwenhuizen	6424795f52	radv: Use the subresource range in HTILE initialization. v2: fix levelCount assert. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-07 09:58:33 +01:00
Bas Nieuwenhuizen	3b455c1cb7	radv: Use winsys HTILE info. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-07 09:58:27 +01:00
Bas Nieuwenhuizen	dbecbab5aa	radv/amdgpu: Let addrlib calculate the HTILE parameters. Still not sure we can support miptrees when sampling from HTILE enabled textures. Added the tcCompatible winsys stuff while I'm at it. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-07 09:58:21 +01:00
Dave Airlie	03f5405fc2	amd/common: document PREDICATION OP 3 as 64-bit bool. This just documents some info for possible future use. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 15:20:01 +10:00
Dave Airlie	b26249781e	radv: handle z offset for 3d image <-> buffer copies. This fixes: dEQP-VK.pipeline.render_to_image.3d.huge.depth.r8g8b8a8_unorm Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 04:02:00 +00:00
Dave Airlie	c5947e9787	radv: move fast clear before resolve into own loop. Don't fast clear inside the meta loop as things get confused, fixes a crash in: dEQP-VK.api.copy_and_blit.resolve_image.whole_array_image.2_bit Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 04:01:53 +00:00
Bas Nieuwenhuizen	0ab2dd361f	radv: Disable HTILE for textures with multiple layers/levels. It has issues and the fix I'm working on is too complicated for stable, so disable for now. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: 13.0 17.0 <mesa-stable@lists.freedesktop.org>	2017-03-06 23:58:57 +01:00
Dave Airlie	6bae1e44a9	radv: Properly handle destroying NULL devices and instances Ported from anv: 3d33a23e anv: Properly handle destroying NULL devices and instances Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	5c45d2051a	radv/ac: introduce i1true/i1false to context. This uses these in a few places, and fixes one or two cases which were using da as 32-bit instead of bool. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	ca884aef86	radv/ac: handle Z export using new builder. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	bf2be50774	radv/ac: move to using common ac_get_image_intr_name. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Dave Airlie	10ae83a9c2	radeonsi/ac: move get_image_intr_name to common This code is used in radv, so move to common build code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-07 08:17:03 +10:00
Timothy Arceri	7eb85b8204	gallium/util: remove unused header from u_queue.c Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 09:12:16 +11:00
Timothy Arceri	60a2c2507d	gallium/util: remove unused pipe_thread_destroy() Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 09:12:16 +11:00
Timothy Arceri	d82d8be614	gallium/util: replace pipe_thread_wait() with thrd_join() Replace done using: find ./src -type f -exec sed -i -- \ 's:pipe_thread_wait(\([^)]*\)):thrd_join(\1, NULL):g' {} \; Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 09:12:16 +11:00
Timothy Arceri	da40ac65c7	gallium/util: remove PIPE_THREAD_ROUTINE() This was made unnecessary with `fd33a6bcd7`. This was mostly done with: find ./src -type f -exec sed -i -- \ 's:PIPE_THREAD_ROUTINE(\([^,]*\), \([^)]*\)):int\n\1(void \*\2):g' {} \; With some small manual tidy ups. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 09:12:16 +11:00
Timothy Arceri	e92293a601	gallium/util: replace pipe_condvar with cnd_t pipe_condvar was made unnecessary with `fd33a6bcd7`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 09:07:33 +11:00
Timothy Arceri	e5375ba028	gallium/util: replace pipe_thread with thrd_t pipe_thread was made unnecessary with `fd33a6bcd7`. V2: fix compile error in u_queue.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:53:27 +11:00
Timothy Arceri	628e84a58f	gallium/util: replace pipe_mutex_unlock() with mtx_unlock() pipe_mutex_unlock() was made unnecessary with `fd33a6bcd7`. Replaced using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_unlock(\([^)]*\)):mtx_unlock(\&\1):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:53:05 +11:00
Timothy Arceri	ba72554f3e	gallium/util: replace pipe_mutex_lock() with mtx_lock() pipe_mutex_lock() was made unnecessary with `fd33a6bcd7`. Replaced using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_lock(\([^)]*\)):mtx_lock(\&\1):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:52:38 +11:00
Timothy Arceri	be188289e1	gallium/util: replace pipe_mutex_destroy() with mtx_destroy() pipe_mutex_destroy() was made unnecessary with `fd33a6bcd7`. Replace was done with: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_destroy(\([^)]*\)):mtx_destroy(\&\1):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:52:16 +11:00
Timothy Arceri	75b47dda0c	gallium/util: replace pipe_mutex_init() with mtx_init() pipe_mutex_init() was made unnecessary with `fd33a6bcd7`. Replace was done using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_init(\([^)]*\)):(void) mtx_init(\&\1, mtx_plain):g' {} \; Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:52:07 +11:00
Timothy Arceri	acdcaf9be4	gallium/util: remove pipe_static_mutex() This was made unnecessary with `fd33a6bcd7`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:48:16 +11:00
Timothy Arceri	2efddc63ee	gallium/util: replace pipe_mutex with mtx_t pipe_mutex was made unnecessary with `fd33a6bcd7`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:48:11 +11:00
Timothy Arceri	464d4806c1	gallium/util: replace pipe_condvar_broadcast() with cnd_broadcast() pipe_condvar_broadcast() was made unnecessary with `fd33a6bcd7`. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:23:26 +11:00
Timothy Arceri	5e56c2c79d	gallium/util: replace pipe_condvar_signal() with cnd_signal() pipe_condvar_signal() was made unnecessary with `fd33a6bcd7`. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:23:26 +11:00
Timothy Arceri	74c879ac75	gallium/util: replace pipe_condvar_wait() with cnd_wait() pipe_condvar_wait() was made unnecessary with `fd33a6bcd7`. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:23:26 +11:00
Timothy Arceri	1e0314281a	gallium/util: replace pipe_condvar_destroy() with cnd_destroy() pipe_condvar_destroy() was made unnecessary with `fd33a6bcd7`. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:23:26 +11:00
Timothy Arceri	3f58242863	gallium/util: replace pipe_condvar_init() with cnd_init() pipe_condvar_init() was made unnecessary with `fd33a6bcd7`. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:23:26 +11:00
Marek Olšák	63d7a12fad	st/dri: reduce dri_fill_st_options() params Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-03-07 08:16:46 +11:00
Marek Olšák	696c5115b9	st/dri: use local pointer to st_context_iface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2017-03-07 08:16:39 +11:00
Gregory Hainaut	2ab5eccf5d	glapi: fix typo in count_scale 2*4=8 Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-07 08:11:40 +11:00
Kenneth Graunke	7782936cbc	i965: Return NULL from initScreen2, not false. This returns a pointer, not a boolean. No actual effect, but cleaner. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-06 12:38:15 -08:00
Kenneth Graunke	b5b123ac8f	i965: Make a devinfo local variable. screen->devinfo.gen is annoying to type and linewrap. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-06 12:38:15 -08:00
Kenneth Graunke	951f56cd43	i965: Delete vestiges of resource streamer code. We never actually used the resource streamer in any shipping build of Mesa. We have no plans to do so in the future. We looked into using it in Vulkan, and concluded that it was unusable. We're not the only ones to arrive at the conclusion that it's not worth using. So, drop the last vestiges of resource streamer support and move on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-06 12:38:15 -08:00
Kenneth Graunke	4dc785728a	i965: Drop duplicate #defines now that we've bumped libdrm requirements. We've updated our libdrm requirement, and it will already provide these. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-06 12:38:15 -08:00
Samuel Pitoiset	4317cd96d3	getteximage: fix _mesa_GetTextureSubImage() Oops. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100088 Fixes: `5ae54c0cf7` ("getteximage: avoid to lookup textures with id 0") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-06 21:36:56 +01:00
Grazvydas Ignotas	ff494fe999	ralloc: don't leave out the alignment factor Experimentation shows that without alignment factor gcc and clang choose a factor of 16 even on IA-32, which doesn't match what malloc() uses (8). The problem is it makes gcc assume the pointer is 16 byte aligned, so with -O3 it starts using aligned SSE instructions that later fault, so always specify a suitable alignment factor. Cc: Jonas Pfeil <pfeiljonas@gmx.de> Fixes: `cd2b55e5` "ralloc: Make sure ralloc() allocations match malloc()'s alignment." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Tested by: Mike Lothian <mike@fireburn.co.uk> Tested by: Jonas Pfeil <pfeiljonas@gmx.de>	2017-03-06 11:28:48 -08:00
Grazvydas Ignotas	b384c23b9e	i965: don't require 64bit cmpxchg There are still some distributions trying to support unfortunate people with old or exotic CPUs that don't have 64bit atomic operations. The only thing preventing compile of the Intel driver for them seems to be initialization of a debug variable. v2: use call_once() instead of unsafe code, as suggested by Matt Turner Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>	2017-03-06 11:07:20 -08:00
Alex Smith	290d7e892d	radv: Emit pending flushes before executing a secondary command buffer If we have any pending flushes on the primary command buffer, these must be performed before executing the secondary buffer. This fixes potential corruption when the contents of a subpass which clears any of its render targets are given in a secondary buffer: the flushes after a fast clear would not have been performed until the vkCmdEndRenderPass call. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>	2017-03-06 19:46:14 +01:00
Samuel Pitoiset	052c81faa1	mesa/main: remove useless check in _mesa_IsSampler() _mesa_lookup_samplerobj() returns NULL if sampler is 0. v2: use _mesa_lookup...(...) != NULL Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-06 18:01:38 +01:00
Samuel Pitoiset	5ae54c0cf7	getteximage: avoid to lookup textures with id 0 This fixes the following assertion when the key is 0. main/hash.c:181: _mesa_HashLookup_unlocked: Assertion `key' failed. Fixes: `633c959fae` ("getteximage: Return correct error value when texure object is not found") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-06 18:01:38 +01:00
Marek Olšák	5ac6ab701f	docs/relnotes/17.1.0: document the new LLVM requirement	2017-03-06 17:35:36 +01:00
Marek Olšák	c416d8a3bc	gallium/radeon: don't monitor SDMA busyness on EG/Cayman/SI It's always busy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99955 Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	7e1faa79d3	radeonsi: drop support for LLVM 3.6 & 3.7 They are too old. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	d5d74fe2b5	radeonsi: set the convergent attribute where needed Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	ef883fc554	gallivm,ac: add LP_FUNC_ATTR_CONVERGENT Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Marek Olšák	9b08f044be	radeonsi: fix LLVM 3.9 - don't use non-matching attributes on declarations Call site attributes are used since LLVM 4.0. This also reverts commit `b19caecbd6` "radeon/ac: fix intrinsic version check", because this is the correct fix. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 14:13:04 +01:00
Mark Thompson	6398a09213	st/omx: Set end-of-frame flag on bitstream output buffers Since all output buffers are whole frames, this should always be set. Technically, setting this flag is optional (see OpenMAX IL section 3.1.2.7.1), but some clients assume that it will be used and therefore buffer indefinitely thinking that all output buffers are fragments of the first frame when it is not set. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-06 14:05:43 +01:00
Mark Thompson	6d95358aac	st/omx: Fix port format enumeration From OpenMAX IL section 4.3.5: "The value of nIndex is the range 0 to N-1, where N is the number of formats supported by the port. There is no need for the port to report N, as the caller can determine N by enumerating all the formats supported by the port. Each port shall support at least one format. If there are no more formats, OMX_GetParameter returns OMX_ErrorNoMore (i.e., nIndex is supplied where the value is N or greater)." Only one format is supported, so N = 1 and OMX_ErrorNoMore should be returned if nIndex >= 1. The previous code here would return the same format for all values of nIndex, resulting in an infinite loop when a client attempts to enumerate all formats. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-06 14:05:17 +01:00
Mark Thompson	0798fddb50	st/va: Fix forward/backward referencing for deinterlacing The VAAPI documentation is not very clear here, but the intent appears to be that a forward reference is forward from a frame in the past, not forward to a frame in the future (that is, forward as in forward prediction, not as in a forward reference in source code). This interpretation is derived from other implementations, in particular the i965 driver and the gstreamer client. In order to match those other implementations, this patch swaps the meaning of forward and backward references as they currently appear for motion-adaptive deinterlacing. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-03-06 14:05:05 +01:00
Mark Thompson	c93a157078	st/va: Support fractional framerate in misc parameter Signed-off-by: Mark Thompson <sw@jkqxz.net> Acked-by: Christian König <christian.koenig@amd.com>	2017-03-06 14:04:59 +01:00
Andy Furniss	012b6d3fe7	st/va encode handle ntsc framerate rate control Tested with ffmpeg and gst-vaapi. Without this bits per frame is set way too low for fractional framerates. v2: Mark Thompson: simplify calculation. Use float. Signed-off-by: Andy Furniss <adf.lists@gmail.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-03-06 14:04:24 +01:00
Bas Nieuwenhuizen	f3dc318464	radv: Use the new L2 writeback flag. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 09:16:05 +01:00
Bas Nieuwenhuizen	66e12d4073	radv: Add L2 writeback. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 09:15:51 +01:00
Timothy Arceri	6b657cecd5	util/disk_cache: fix make check Fixes make check after `11f0efec2e` which caused disk cache to create an additional directory.	2017-03-06 16:39:55 +11:00
Dave Airlie	2e73ccb485	radv/ac: use bitfield extract new intrinsics. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:33 +10:00
Dave Airlie	9c7309b09b	radv/ac: move to new kill build. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:33 +10:00
Dave Airlie	a2652719f3	radv/ac: move to using new export intrinsics. This uses the new code in build to do exports. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:33 +10:00
Dave Airlie	2830ece0fc	radv/ac: switch to new intrinsics for pkrtz and clamp. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 15:27:32 +10:00
Dave Airlie	cc59e24a6b	radv: drop Z24 support. This isn't exposed in -pro, the hw docs say it is deprecated, so let's not bother with it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-05 23:32:36 +00:00
Grazvydas Ignotas	6aaadd8728	radv: use VK_NULL_HANDLE for handles Avoids warnings on 32bit. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-06 00:10:42 +01:00
Grazvydas Ignotas	a5446e3187	radv: check for upload alloc failure Mainly to avoid gcc's complaints about uninitialized ptr and offset use later in that code. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-06 00:10:42 +01:00
Grazvydas Ignotas	666fe622e1	radv: don't use uninitialized value on failure Mainly to avoid a warning. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-06 00:10:42 +01:00
Grazvydas Ignotas	5458b02305	radv: avoid casting warnings on 32bit Use the same helpers as for other handle<->pointer conversions. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-06 00:10:42 +01:00
Bas Nieuwenhuizen	fb7e4e16e7	radv/amdgpu: Add some debug flags. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 00:10:23 +01:00
Bas Nieuwenhuizen	682248db45	radv: Cache command buffers in command pool. So that we don't keep allocating BOs for the IBs and upload buffers. We run some risk of memory increase with e.g. a bimodal size distribution of command buffers, but I haven't noticed a significant increase with dota2 and talos. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-06 00:07:51 +01:00
Timothy Arceri	e3a01a5d1b	Revert "glsl: Switch to disable-by-default for the GLSL shader cache" This reverts commit `0f60c6616e`. Piglit and all games tested so far seem to be working without issue. This change will allow wide user testing and we can decide before the next release if we need to turn it off again. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-06 09:38:07 +11:00
Timothy Arceri	ee8d2e2804	docs: update envvars.html to reflect having a cache per arch	2017-03-06 09:33:20 +11:00
Timothy Arceri	11f0efec2e	util/disk_cache: support caches for multiple architectures Previously we were deleting the entire cache if a user switched between 32 and 64 bit applications. V2: make the check more generic, it should now work with any platform we are likely to support. V3: Use suggestion from Emil to make even more generic/fix issue with __ILP32__ not being declared on gcc for regular 32-bit builds. Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-03-06 09:27:01 +11:00
Grazvydas Ignotas	175d4aa8f5	util/disk_cache: mark read-only arguments const No functional changes. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-06 09:23:17 +11:00
Dave Airlie	b19caecbd6	radeon/ac: fix intrinsic version check Reported-by: 375gnu@gmail.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100068 Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-06 06:05:58 +10:00
Bas Nieuwenhuizen	a247215469	radv: Merge fast clear flushes. Don't flush multiple times if we clear multiple attachments. Also allows doing the depth clear in parallel with the fast color clears. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-05 20:40:31 +01:00
Tim Rowley	a01a104216	relnotes: [swr] note addition of gs, increased llvm requirement Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-05 07:33:49 -06:00
Tim Rowley	bb8a4242ff	docs: update features.txt for swr geometry shaders Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-05 07:33:49 -06:00
Tim Rowley	c307092557	swr: [rasterizer core] fix primID provoking vertex for GS Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-05 07:33:49 -06:00
Tim Rowley	f1d7284117	swr: implement geometry shaders Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-05 07:33:49 -06:00
Tim Rowley	08a82363ba	configure.ac: increase required swr llvm to 3.9.0 GS implementation uses the masked.{gather,store} intrinsics, introduced in llvm-3.9.0. swr llvm version requirement in automake and scons now match (scons already needed >= 3.9). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-03-05 07:33:49 -06:00
Kenneth Graunke	6f71d9adc1	i965: Clamp texture buffer size to GL_MAX_TEXTURE_BUFFER_SIZE. The OpenGL 4.5 specification's description of TexBuffer says: "The number of texels in the texture image is then clamped to an implementation-dependent limit, the value of MAX_TEXTURE_BUFFER_SIZE." We set GL_MAX_TEXTURE_BUFFER_SIZE to 2^27. For buffers with a byte element size, this is the maximum possible size we can encode in SURFACE_STATE. If you bind a buffer object larger than this as a texture buffer object, we'll exceed that limit and hit an isl assert: assert(num_elements <= (1ull << 27)); To fix this, clamp the size in bytes to MaxTextureSize / texel_size. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-04 22:46:50 -08:00
Emil Velikov	eaf4a106bd	automake: move wayland-drm prior to Vulkan Earlier commit was picked from a larger series, but did not consider that it removed the vulkan <> wayland-drm interdependency. Rather than reverting everything, temporarily move wayland-drm further up to resolve the issue. Since it [wayland-drm] does not have any in-mesa dependencies that's perfectly safe. Cc: Vedran Miletić <vedran@miletic.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100060 Fixes: `e135ce6f08` ("vulkan: Build common Vulkan code earlier") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Javier Jardón <jjardon@gnome.org>	2017-03-04 23:44:14 +00:00
Mauro Rossi	6facb0c08f	android: fix libz dynamic library dependencies Fixes a series of libz related building errors: target SharedLib: gallium_dri_32 (out/target/prod...SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so) external/elfutils/libelf/elf_compress.c:117: error: undefined reference to 'deflateInit_' ... external/elfutils/libelf/elf_compress.c:244: error: undefined reference to 'inflateEnd' clang++: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `85a9b1b` "util/disk_cache: compress individual cache entries"	2017-03-04 21:47:26 +00:00
Timothy Arceri	28fd6556c3	svga: pass NULL to ureg_get_tokens() The number of tokens is never used and the pointer is NULL checked so just pass NULL. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-03-05 08:15:51 +11:00
Ilia Mirkin	8e6d67685e	nvc0: take extra pushbuf space into account for pushbuf_space calls See detailed explanation of why this is needed in commit `eb60a89bc3`. This spot was missed/overlooked. Basically as a result of the fact that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8 to the requested amount, we have to be mindful of that when doing bare nouveau_pushbuf_space calls. Reportedly this fixes some crashes when replaying a hitman trace taken on radeonsi. Fixes: `eb60a89bc3` ("nouveau: take extra push space into account for pushbuf_space calls") Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Reported-by: Karol Herbst <nouveau@karolherbst.de> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-04 17:48:27 +01:00
Ilia Mirkin	32dd8d59b6	nvc0: increase alignment to 256 for texture buffers on fermi When binding as textures, the alignment can be 16. However when binding as an image, the address has to be aligned to 256. (Also when binding as an RT, but that can't happen with GL or current gallium APIs.) Reported-by: Roy Spliet <nouveau@spliet.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-04 17:48:27 +01:00
Tapani Pälli	66b62be4bb	android: fix outdir for gen_enum_to_str files When files are being generated, the value of the $intermediates variable can be completely random; this makes sure that outdir is the wanted one. Fixes: `3f2cb699` ("android: vulkan: add support for libmesa_vulkan_util") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-04 16:38:33 +00:00
Xiaosong Wei	2acc69da8c	EGL/Android: Add EGL_EXT_buffer_age extension This patch implements the EGL_EXT_buffer_age extension for Android. https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_buffer_age.txt Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-04 16:37:12 +00:00
Emil Velikov	2b1e22f9d8	docs: add news item and link release notes for 17.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-04 15:56:58 +00:00
Emil Velikov	1b19304f3f	docs: add sha256 checksums for 17.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5c9273152c`)	2017-03-04 15:55:10 +00:00
Emil Velikov	6a4f6a49d4	docs: add release notes for 17.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `8fee1d348c`)	2017-03-04 15:55:09 +00:00
Emil Velikov	1cd4fde053	gallium/targets: don't leave an empty target directory(ies) Some drivers do not support certain targets - for example nouveau doesn't do VAAPI, while freedreno doesn't do any of the video backends. As such if we enter vdpau when building freedreno/ilo/etc, a vdpau/ folder will be created, empty library will be built and almost immediately removed. Thus keeping an empty vdpau/ folder around. There are two ways to fix this. * add substantial tracking in configure/makefiles so that we never end up in targets/vdpau Downsides: Error prone, as the configure checks and the 'include gallium/drivers/foo/Automake.inc' can easily get out of sync. * remove the folder, if empty, alongside the empty library. Downsides: In the latter case vdpau/ might be empty before the mesa build has started, yet we'll remove it either way. This patch implements the latter option, as the downside isn't that significant, plus the patch is way shorter ;-) v2: use has_drivers to track since TARGET_DRIVERS can contain space, hence neither string comparison nor -n/-z works correctly. Gentoo Bugzilla: https://bugs.gentoo.org/545230 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-04 15:26:43 +00:00
Emil Velikov	342e5fdb64	radv: use enum_to_str util functions. Port of `e9dcb17962` vulkan/util: Add generator for enum_to_str functions Cc: Bas Nieuwenhuizen <basni@google.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-03-04 15:05:14 +00:00
Jason Ekstrand	e135ce6f08	vulkan: Build common Vulkan code earlier Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-04 14:46:53 +00:00
Jason Ekstrand	b3135c3cf3	anv: Advertise shaderInt64 on Broadwell and above Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-03 13:59:29 -08:00
Jason Ekstrand	bc456749bd	nir/int64: Properly handle imod/irem The previous implementation was fine for GLSL which doesn't really have a signed modulus/remainder. They just leave the behavior undefined whenever either source is negative. However, in SPIR-V, there is a defined behavior for negative arguments. This commit beefs up the pass so that it handles both correctly. Tested using a hacked up version of the Vulkan CTS test to get 64-bit support. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-03 13:59:27 -08:00
Jason Ekstrand	9745bef308	nir/builder: Add an int64 immediate helper Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-03 13:59:24 -08:00
Kenneth Graunke	46cd549c2b	genxml: Fill out Gen4 and G45 XML. This is a work in progress - some things may still need fixing. But it should be in pretty decent shape. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-03 10:23:17 -08:00
Marek Olšák	7f1446a8a1	ac: normalize build helper names s/emit/build/ Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 17:30:07 +01:00
Marek Olšák	8bde7fb3fc	ac: replace SI.vs.load.input with amdgcn.buffer.load.format Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 17:30:07 +01:00
Marek Olšák	94811dc66c	radeonsi: move SI.vs.load.input building into amd/common Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 17:30:07 +01:00
Marek Olšák	52660484c1	radeonsi: detect and mark loads/stores from read-only/write-only memory	2017-03-03 17:29:56 +01:00
Marek Olšák	97e21cfa25	ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0 ADD_TID doesn't work. Needs more investigation. v2: remove leftover dead code Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2017-03-03 15:29:30 +01:00
Marek Olšák	684339827c	radeonsi: use the writeonly LLVM attribute	2017-03-03 15:29:30 +01:00
Marek Olšák	8cfdbba6c7	ac: remove offen parameter from ac_build_buffer_store_dword Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	1bc88c02c0	radeonsi: enable TC L2 for tessellation offchip stores Vulkan does the same thing.	2017-03-03 15:29:30 +01:00
Marek Olšák	27439dfdae	radeonsi: merge and simplify tbuffer_store functions Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	b46e412c2e	radeonsi: set noalias on input shader pointers	2017-03-03 15:29:30 +01:00
Marek Olšák	d4324ddb89	radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	9c09592086	radeonsi: move kill intrinsic building into amd/common just a cleanup Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	e729dc7c46	radeonsi: set readnone on reads from read-only memory	2017-03-03 15:29:30 +01:00
Marek Olšák	25c7969a5a	radeonsi: replace SI.buffer.load.dword with amdgcn.buffer.load	2017-03-03 15:29:30 +01:00
Marek Olšák	653ac0b389	radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz	2017-03-03 15:29:30 +01:00
Marek Olšák	4b2e5b9389	ac: replace old image intrinsics with new ones Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	c6a3911e5d	radeonsi: remove last use of llvm.SI.resinfo and move one function up to reuse the code.	2017-03-03 15:29:30 +01:00
Marek Olšák	ad18d7f040	radeonsi: move image intrinsic building to amd/common Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	2b3ebe307c	ac: replace SI.export with amdgcn.exp.* Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	369f4a8726	radeonsi: move llvm.SI.export building to amd/common Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	9af03318aa	ac: unify build_type_name_for_intr functions Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	f8c823b103	radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as well It was harmless, because we also set unorm in the sampler state. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	b5744310d4	gallivm, ac: add writeonly and inaccessiblememonly attributes Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Marek Olšák	455c79b24f	tgsi/scan: record load/store/atomic image usage Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-03 15:29:30 +01:00
Eric Anholt	3958c01762	glapi: Fix a comment typo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-03 20:29:12 +11:00
Alejandro Piñeiro	a54f0ad6d3	mesa/main: TextureSubImage generates INVALID_OPERATION on wrong target Equivalent TexSubImage methods generate INVALID_ENUM. From OpenGL 4.5 spec, section 8.6 Alternate Texture Image Specification Commands: "An INVALID_ENUM error is generated by TexSubImage if target does not match the command, as shown in table 8.15." And: "An INVALID_OPERATION error is generated by TextureSubImage if the effective target of texture does not match the command, as shown in table 8.15." Fixes: GL45-CTS.direct_state_access.textures_copy_errors v2: slightly change commit summary (Samuel) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-03 08:14:53 +01:00
Ben Widawsky	d844d8e4d5	i965: Add Kaby Lake brandstrings While here, use the spacing defined in Ark. https://ark.intel.com/products/codename/82879/Kaby-Lake Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-03-02 21:00:02 -08:00
Grazvydas Ignotas	4dc42ae792	tgsi/ureg: return correct token count in ureg_get_tokens Valgrind reports that the shader cache writes uninitialized data to disk. Turns out ureg_get_tokens() is returning the count of allocated tokens instead of how many are actually used, so the cache writes out unused space at the end. Use the real count instead. This change should not cause regressions elsewhere because the only ureg_get_tokens() user that cares about token count is the shader cache. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-03 12:11:55 +11:00
Timothy Arceri	6084855528	radeonsi: add support for an on-disk shader cache V2: - when loading from disk cache also binary insert into memory cache. - check that the binary loaded from disk is the correct size. If not delete the cache item and skip loading from cache. V3: - remove unrequired variable Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-03 12:09:08 +11:00
Timothy Arceri	85a9b1b562	util/disk_cache: compress individual cache entries This reduces the cache size for Deus Ex from ~160M to ~30M for radeonsi (these numbers differ from Grigori's results below probably due to different graphics quality settings). I'm also seeing the following improvements in minimum fps in the Shadow of Mordor benchmark on an i5-6400 CPU@2.70GHz, with a HDD: no-cache: ~10fps with-cache-no-compression: ~15fps with-cache-and-compression: ~20fps Note: The with cache results are from the second run after closing and opening the game to avoid the in-memory cache. Since we mainly care about decompression I went with Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson who has benchmarked decompression speeds. Grigori Goronzy provided the following stats for Deus Ex: Mankind Divided start-up times on an Athlon X4 860k with a SSD: No Cache 215 sec Cold Cache zlib BEST_COMPRESSION 285 sec Warm Cache zlib BEST_COMPRESSION 33 sec Cold Cache zlib BEST_SPEED 264 sec Warm Cache zlib BEST_SPEED 33 sec Cold Cache no compression 266 sec Warm Cache no compression 34 sec The total cache size for that game is 48 MiB with BEST_COMPRESSION, 56 MiB with BEST_SPEED and 170 MiB with no compression. These numbers suggest that it may be ok to go with Z_BEST_SPEED but we should gather some actual decompression times before doing so. Other options might be to do the compression in a separate thread, this might allow us to use a higher compression algorithm such as LZMA. Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-03-03 12:09:08 +11:00
Timothy Arceri	5afde61752	util/disk_cache: add support for detecting corrupt cache entries V2: fix pointer increments for writing/reading crc Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>	2017-03-03 12:09:08 +11:00
Samuel Pitoiset	9fc86d4f53	glsl: fix subroutine mismatch between declarations/definitions Previously, when q.subroutine was set to 1, a new subroutine declaration was added to the AST, while 0 meant a subroutine definition has been detected by the parser. Thus, setting the q.subroutine flag in both situations is obviously wrong because a new type identifier is added instead of trying to match the declaration. To fix it up, introduce ast_type_qualifier::is_subroutine_decl() to differentiate declarations and definitions easily. This fixes a regression with: arb_shader_subroutine/compiler/direct-call.vert Cc: Mark Janes <mark.a.janes@intel.com> Fixes: `be8aa76afd` ("glsl: remove unecessary flags.q.subroutine_def") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-03 00:57:57 +01:00
Matt Turner	10f2c86aa3	genxml: Depend on Makefile.am for generated sources. Depending on the generated Makefile means that all generated sources are recreated after ./configure. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-02 15:49:00 -08:00
Matt Turner	7d1195c1e4	clover: Work around build failure with AltiVec. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504 Acked-by: Francisco Jerez <currojerez@riseup.net>	2017-03-02 15:49:00 -08:00
Nanley Chery	d7d64f1091	anv/image: Allow HiZ on input attachment-capable depth/stencil images While an input attachment may only take on one of those two layouts, other depth/stencil attachments that use the same image may have HiZ-enabled layouts. Improves the average frame rate on a release candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my SKL GT4. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	76b8cc2a1c	anv/cmd_buffer: Centralize automatic layout transitions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	0a72b5f3cb	anv/cmd_buffer: Add attachment transitioning functions This is needed to transition input attachments. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	9950774f8b	anv/blorp: Encapsulate subpass id querying Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	c78a959bcf	anv/cmd_buffer: Enable render pass awareness v2: Update cmd_state_reset (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	c0223d052b	anv/pass: Store subpass attachment reference list We'll loop through this array when performing automatic layout transitions. v2: Adjust formatting of an assignment (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	8f6a17c8e7	anv/pass: Fix size of anv_render_pass:subpass_attachments Don't allocate space for resolve attachments if the subpass has none. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	608d17b80e	anv: Store the user's VkAttachmentReference We will be using the image layout. Store the full struct directly from the user. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	6326f0f4be	anv/cmd_buffer: Remove extra resolve for certain depth buffers Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer when reading from a depth image subresource that is in the general layout. Remove this unneeded resolve. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	ea744912b3	anv/cmd_buffer: Conditionally choose the sampled image surface state Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	5408d3fd05	anv/descriptor_set: Store aux usage of sampled image descriptors v2: Rebase onto latest changes v3: Account for NULL image_view in aux_usage assignment Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	efc2222323	anv/image: Create an additional surface state for sampling This will be used to sample a depth input attachment without having to pass through the HiZ buffer. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	f3621f4e71	anv/image: Simplify setup of HiZ sampler surface state Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	258af3a856	anv/image: Remove extra dependency on HiZ-specific variable surf_usage is only useful to image views that may use HiZ buffers. Storage image views don't use HiZ buffers. v2: Update commit message and add an assertion. Fixes: `055ff2ec52` ("anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ") Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	54d29ee65f	anv: Update the HiZ sampling helper Validate the inputs, verify that this image has a depth buffer, use gen_device_info instead of v2: - Add parenthesis (Jason Ekstrand) - Make parameters const - Use gen_device_info instead of gen - Pass aspect to missed function in transition_depth_buffer Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	172747a963	anv/cmd_buffer: Replace layout_to_hiz_usage() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	425e33bcdb	anv/image: Add anv_layout_to_aux_usage() This function supersedes layout_to_hiz_usage(). v2: - Don't find the optimal buffer for layout transitions (Jason Ekstrand). - Pass the devinfo instead of the gen (Jason Ekstrand) - Update the function documentation. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	178f9e5f29	anv/pass: Avoid accessing attachment array out of bounds Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Jonas Pfeil	cd2b55e536	ralloc: Make sure ralloc() allocations match malloc()'s alignment. The header of ralloc needs to be aligned, because the compiler assumes that malloc returns will be aligned to 8/16 bytes depending on the platform, leading to degraded performance or alignment faults with ralloc. Fixes SIGBUS on Raspberry Pi at high optimization levels. This patch is not perfect for MSVC, as maybe in the future the alignment for the most demanding data type might change to more than 8. v2: Commit message reword/typo fix, and add a bigger explanation in the code (by anholt) Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2017-03-02 13:01:45 -08:00
Bruce Cherniak	a7b8d50bcb	swr: fix crash in swr_update_derived following st/mesa state changes Recent change to st/mesa state update logic caused major regressions to swr validation code. swr uses the same validation logic (swr_update_derived) for both draw and Clear calls. New st/mesa state update logic results in certain state objects not being set/bound during Clear. This was causing null ptr exceptions. Creation of static dummy state objects allows setting these pointers during Clear validation, without interfering with relevant state validation. Once fixed, new logic also highlighted an error in dirty bit checking for fragment shader and clip validation. (The alternative is to have a simplified validation routine for Clear, which we may do at some point.) Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-03-02 13:39:56 -06:00
Bruce Cherniak	74aa6fd9a0	docs: update features.txt for GL_ARB_clear_texture with swr Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-03-02 13:39:56 -06:00
Bruce Cherniak	dd649a541d	swr: enable clear_texture with util_clear_texture Passes corresponding piglit tests. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-02 13:39:52 -06:00
Gregory Hainaut	b36050143f	doc: GL_ARB_buffer_storage is supported on llvmpipe/swr At least, the extension is exported (gallium capability PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT is 1) Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-02 17:31:04 +00:00
Emil Velikov	b23db2b840	automake: i965: list correct header in Makefile.source Fixes: `7ac47b1af7` ("i965: Add a header for brw_vec4_vs_visitor") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-02 17:30:33 +00:00
Brian Paul	b95ead850b	svga: fix crash regression since `e027935a79` During the first update of the hw_clear_state atoms, we may not yet have a current rasterizer state object. So, svga->curr.rast may be NULL and we crash. Add a few null pointer checks to work around this. Note that these are only needed in the state update functions which are called for 'clear' validation. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-03-02 10:11:19 -07:00
Brian Paul	69fb8f3cae	svga: s/unsigned/pipe_prim_type/ And add some default switch cases to silence compiler warnings. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-03-02 10:11:19 -07:00
Brian Paul	a9ff377d40	svga: whitespace fixes in svga_context.h Trivial.	2017-03-02 10:11:13 -07:00
Brian Paul	49134c0549	svga: whitespace and formatting fixes in svga_stage.c Trivial.	2017-03-02 10:11:04 -07:00
Robert Foss	88becf7302	mesa: Avoid read of uninitialized variable The is_color_attachement variable is later read when handling two separate error cases, where only one of the cases results in the variable being initialized. This can be avoided by giving the variable a safe default value. Coverity-Id: 1398631 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-02 15:45:19 +00:00
Lionel Landwerlin	af5f13e58c	anv: add VK_KHR_descriptor_update_template support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 10:34:06 +00:00
Lionel Landwerlin	9f60ed98e5	anv: add VK_KHR_push_descriptor support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 10:34:06 +00:00
Lionel Landwerlin	12dee851a3	anv: descriptor: make descriptor writing take a stream allocator This allows us to allocate surface states from the command buffer when pushing descriptor sets rather than allocating them through a descriptor set pool. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 10:34:06 +00:00
Lionel Landwerlin	194fa58285	anv: descriptors: extract writing of descriptors elements This will be reused later on. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 10:34:06 +00:00
Lionel Landwerlin	c2d199adec	anv: make layout size computation helper available across compilation units Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 10:34:06 +00:00
Lionel Landwerlin	c83e33e6ee	anv: move buffer_view declaration We will need this declaration closer for readability later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 10:34:06 +00:00
Tomasz Figa	06758c1e8a	mesa: Use _mesa_has_OES_geometry_shader() when validating draws In validate_DrawElements_common() we need to check for OES_geometry_shader extension to determine if we should fail if transform feedback is unpaused. However current code reads ctx->Extensions.OES_geometry_shader directly, which does not take context version into account. This means that if the context is GLES 3.0, which makes the OES_geometry_shader inapplicable, we would not validate the draw properly. To fix it, let's replace the check with a call to _mesa_has_OES_geometry_shader(). Fixes following dEQP tests on i965 with a GLES 3.0 context: dEQP-GLES3.functional.negative_api.vertex_array#draw_elements dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_incomplete_primitive dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced_incomplete_primitive dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements_incomplete_primitive Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-02 00:37:17 -08:00
Kenneth Graunke	58793e514b	i965: Replace BRW_SURFACEFORMAT_* with ISL_FORMAT_*. One less set of enums. Dropped the #defines from brw_defines.h and ran: $ for file in *.cpp *.c *.h; do sed -i \ -e 's/BRW_SURFACEFORMAT_/ISL_FORMAT_/g' \ -e 's/ISL_FORMAT_ASTC_[A-Zxs0-9_]*/\U&/g' $file; \ done Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 00:30:45 -08:00
Chris Wilson	92281b2c7f	i965: Only flush the batchbuffer if we need to zero the SO offsets If we don't have pipelined register access (e.g. Haswell before kernel v4.2), then we can only implement EXT_transform_feedback by resetting the SO offsets between batches. However, if we do have pipelined access to the SO registers on gen7, we can simply emit an inline reset of the SO registers without a full batch flush. v2 [by Ken]: Simplify after recent kernel feature detection changes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-02 00:30:41 -08:00
Iago Toral Quiroga	7ad692d8e2	anv: do not subtract the base layer to compute depth in 3DSTATE_DEPTH_BUFFER According to the PRM description of the Depth field: "This field specifies the total number of levels for a volume texture or the number of array elements allowed to be accessed starting at the Minimum Array Element for arrayed surfaces" However, ISL defines array_len as the length of the range [base_array_layer, base_array_layer + array_len], so it already represents a value relative to the base array layer like the hardware expects. v2: Depth is defined as a U11-1 field, so subtract 1 from the actual value (Jason) This fixes a number of new CTS tests that would crash otherwise: dEQP-VK.pipeline.render_to_image.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 09:04:03 +01:00
Iago Toral Quiroga	64bf78270d	isl: document the meaning of the array_len field in isl_view Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 09:03:42 +01:00
Jacob Lifshay	3d8feb38e8	vulkan/wsi: Improve the DRI3 error message This commit improves the message by telling them that they could probably enable DRI3. More importantly, it includes a little heuristic to check to see if we're running on AMD or NVIDIA's proprietary X11 drivers and, if we are, doesn't emit the warning. This way, users with both a discrete card and Intel graphics don't get the warning when they're just running on the discrete card. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715 Co-authored-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Rene Lindsay <rjklindsay@hotmail.com> Acked-by: Dave Airlie <airlied@redhat.com> Cc: "17.0" <mesa-dev@lists.freedesktop.org>	2017-03-01 19:11:47 -08:00
Jason Ekstrand	424ac809bf	i965: Do int64 lowering in NIR Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	074f5ba0b5	nir: Add a simple int64 lowering pass The algorithms used by this pass, especially for division, are heavily based on the work Ian Romanick did for the similar int64 lowering pass in the GLSL compiler. v2: Properly handle vectors v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the calculations anyway and this is just extra instructions. v4: - Add back in the log2_denom stuff since it's needed for ensuring that the shifts don't overflow. - Rework the looping part of the pass to be easier to expand. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	86e749b1ad	spirv: Use nir_builder for control flow Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	95972cd4fd	nir/lower_indirect: Use nir_builder control-flow helpers Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	3ce8eeb5a1	nir/lower_gs_intrinsics: Use nir_builder control-flow helpers Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	c75f965ab7	glsl/nir: Use nir_builder's new control-flow helpers Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	e27c716ad7	nir/builder: Add support for easily building control-flow Each of the pop functions (and push_else) take a control flow parameter as their second argument. If NULL, it assumes that the builder is in a block that's a direct child of the control-flow node you want to pop off the virtual stack. This is what 90% of consumers will want. The SPIR-V pass, however, is a bit more "creative" about how it walks the CFG and it needs to be able to pop multiple levels at a time, hence the argument. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-03-01 17:00:20 -08:00
Jason Ekstrand	d5b355ce5f	i965: Move intel_debug.h to intel/common/gen_debug.h This is shared between the Vulkan and GL drivers as it's a requirement of the back-end compiler. However, it doesn't really belong in the compiler. We rename the file to match the prefix of the other stuff in common and because libdrm defines an intel_debug.h and this avoids a pile of possible name conflicts. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:14:03 -08:00
Jason Ekstrand	8048c1953c	i965: Reduce cross-pollination between the DRI driver and compiler Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:03 -08:00
Jason Ekstrand	a2195e561a	i965: Move select_clip_planes to brw_vs.c Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-01 16:14:03 -08:00
Jason Ekstrand	818bfdfa15	i965: Delete brw_do_cubemap_normalize This hasn't been used for quite some time now but we never bothered to get rid of it when we dropped GLSL IR support for vec4. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:03 -08:00
Jason Ekstrand	7ac47b1af7	i965: Add a header for brw_vec4_vs_visitor brw_vs.h is not a compiler file but brw_vec4_visitor is definitely a compiler thing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	1c318af743	i965: Move a bunch of pre-compile and link stuff to brw_program.h It's all GL-specific and brw_program.h is not part of i965_compiler. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	fbb9171968	i965: Move image uniform setup to brw_nir_uniforms.cpp It's the only thing that's using it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	820ae39725	i965: Move channel_expressions and vector_splitting to brw_program.h They're GL-specific. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	760c8a1d95	i965: Make mark_surface_used a static inline in brw_compiler.h One of these days, I'd like to see this function go away altogether but for now, let's at least put it near the struct it updates. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	f33d2b5d05	i965: Move BRW_ATTRIB_WA_* defines to brw_compiler.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	4e274bcf66	i965: Move BRW_MAX_DRAW_BUFFERS to brw_compiler.h It does sort-of go with MAX_UBO and friends but MAX_DRAW_BUFFERS is an actual hardware constant based on the number of things we can blend rather than an arbitrary "number of things allowed in GL" like some of the other maximums are. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	2523241660	i965/inst: Stop using fi_type It's a mesa define that's trivial to inline. This removes a dependence on main/imports.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	ffeb738112	i965: Move brw_register_blocks to brw_fs.cpp Its one and only caller is brw_compile_fs which lives there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	5b87c7e0e3	i965: Move SHADER_TIME_STRIDE to brw_compiler.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	f85ef11501	i965: Move SOL binding #defines to brw_compiler.h While we're at it, we also change the GEN6 binding macro to be a start index that gets added to the binding. This makes things a bit more explicit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:02 -08:00
Jason Ekstrand	81e5bdf072	i965/gs: Move MAX_GS_INPUT_VERTICES to brw_vec4_gs_visitor.h Its only users are in brw_vec4_gs_visitor and gen6_vec4_gs_visitor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:01 -08:00
Jason Ekstrand	c6a719b64f	i965/gs: Add the gl_prim_to_hw_prim table to vec4_gs_visitor.cpp It's currently in brw_util.c but that's the only bit of brw_util.c that's shared between the compiler and the rest of the GL driver. It's just a fairly obvious table so the duplication isn't bad. It's certainly less pain than trying to figure out how to share the code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:01 -08:00
Jason Ekstrand	035616cb8e	i965: Don't use MAX_SURFACES in mark_surface_used Vulkan doesn't respect MAX_SURFACES so this assert isn't valid in that case. It should, however, assert that it isn't insanely large. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:01 -08:00
Jason Ekstrand	0d2c9ce1ce	i965: Get rid of BRW_PRIM_OFFSET This is a relic of when we wired up meta to be able to use RECTLIST primitives. It's no longer needed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-01 16:14:01 -08:00
Jason Ekstrand	406321caeb	i965/vue_map: Stop using GLbitfield types Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:13:58 -08:00
Jason Ekstrand	45d3dbebb2	i965: Move assign_common_binding_table_offsets to brw_program This isn't used by Vulkan and is specific to the way the GL driver works. There's no reason to have it in common compiler code. Also, it relies on BRW_MAX_* defines which are defined in brw_context.h Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:13:55 -08:00
Jason Ekstrand	8123402fd1	i965: Move some gen4 WM defines to brw_compiler.h These go in wm_prog_key so they're part of the compiler interface. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:13:27 -08:00
Jason Ekstrand	34ede38194	i965: Move brw_disassemble_inst to brw_eu.h Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:13:26 -08:00
Jason Ekstrand	f9c9d551ea	i965: Move some helpers from brw_context.h to brw_shader.h Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:13:24 -08:00
Jason Ekstrand	b97782c364	i965: Move a couple of #defines from brw_context to brw_compiler Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-03-01 16:13:09 -08:00
Jason Ekstrand	2c58709023	glsl/int64: Fix a typo in imod64 The zy swizzle gives us one component of quotient and one component of remainder. What we wanted was zw for the remainder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-01 15:31:44 -08:00
Jason Ekstrand	e647c4fbd9	util/build-id: Return a pointer rather than copying the data We're about to use the build-id as the starting point for another SHA1 hash in the Intel Vulkan driver, and returning a pointer is far more convenient. Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-03-01 15:31:44 -08:00
Jason Ekstrand	e3d33a23e6	anv: Properly handle destroying NULL devices and instances Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>	2017-03-01 15:31:44 -08:00
Robert Bragg	f3ec9d33c6	mesa: Fix performance query id check The queryid_valid() function asserts that an ID given by an application isn't zero since the spec explicitly reserves an ID of zero as invalid. The implementation was written as if the ID was a signed integer and based on the assumption that queryid_to_index() is simply subtracting one from the ID. It was broken because in fact the ID was stored in an unsigned int and testing for an index >= 0 would always succeed. This adds a spec quote to clarify why zero is considered invalid and checks for zero before even passing the ID to queryid_to_index() for then checking the upper bound. This is a v2 of a patch originally posted by Juha-Pekka (thanks) Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2017-03-01 23:01:48 +00:00
Tobias Klausmann	6d600cf632	amd/common: Fix build with new ac_add_function_attr() Fix usage of ac_add_function_attr() and make it known! common/ac_nir_to_llvm.c: In function 'create_llvm_function': common/ac_nir_to_llvm.c:265:4: error: implicit declaration of function 'ac_add_function_attr' [-Werror=implicit-function-declaration] ac_add_function_attr(main_function, i + 1, AC_FUNC_ATTR_BYVAL); ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-03-01 23:53:38 +01:00
Daniel Stone	a1727aa75e	egl/wayland: Don't use DRM format codes for SHM The wl_drm interface (akin to X11's DRI2) uses the standard set of DRM FourCC format codes. wl_shm copies this, except for ARGB8888/XRGB8888, which use their own definitions. Make sure we only use wl_shm format codes when we're working with wl_shm. Otherwise, using swrast with 32bpp formats would fail with an error. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Daniel Stone <daniels@collabora.com> (v1) Fixes: `cb5e799448` ("egl/wayland: unify dri2_wl_create_surface implementations") v2: [Emil Velikov: move to dri2_wl_create_window_surface] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> (IRC)	2017-03-01 18:36:55 +00:00
Kenneth Graunke	c0e9e61c9a	mesa: Drop unused STATE_TEXRECT_SCALE program statevars. The last user is now gone. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2017-03-01 10:27:38 -08:00
Kenneth Graunke	f356d05393	i965: Drop unused STATE_TEXRECT_SCALE code. In the past, we used this on Gen4-5 to transform non-normalized texture coordinates (for sampler2DRect) to normalized ones. We also used it on Gen6-7.5 for sampler2DRect with GL_CLAMP. Jason dropped this code in `6c8ba59cff` in favor of using nir_lower_tex(), which just does a textureSize() call. But we were still setting up these state references for useless uniform data. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2017-03-01 10:27:36 -08:00
Kenneth Graunke	4061bbccf2	egl: Ensure ResetNotificationStrategy matches for shared contexts. Fixes: dEQP-EGL.functional.robustness.negative_context.invalid_robust_shared_context_creation Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2017-03-01 10:26:42 -08:00
Marek Olšák	940da36a65	gallivm,ac: add function attributes at call sites instead of declarations They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic. We need this to force readnone or inaccessiblememonly on some amdgcn intrinsics. This is only used with LLVM 4.0 and later. Intrinsics only used with LLVM <= 3.9 don't need the LEGACY flag. gallivm and ac code is in the same patch, because splitting would be more complicated with all the LEGACY uses all over the place. v2: don't change the prototype of lp_add_function_attr. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)	2017-03-01 18:59:36 +01:00
Marek Olšák	408f370710	gallivm,ac: remove unused FUNC_ATTR_LAST enums Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-03-01 18:59:36 +01:00
Nicolai Hähnle	40c77bbf83	st/mesa: inform the driver of framebuffer changes before compute dispatches Even though compute shaders cannot access the framebuffer, there is a synchronization issue when a compute dispatch accesses a texture that was previously bound and drawn to as a framebuffer. Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior results if the texture is still attached to the currently bound framebuffer. However, the feedback loop is broken when the application changes the framebuffer binding before a compute dispatch, and the state tracker needs to let the driver know about this. Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-01 18:59:36 +01:00
Nicolai Hähnle	911391bd70	st/glsl_to_tgsi: avoid iterating past the head of the instruction list exec_node::get_prev() does not guard against going past the beginning of the list, so we need to add explicit checks here. Found by ASAN in piglit arb_shader_storage_buffer_object-rendering. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-03-01 18:59:36 +01:00
Marc Dietrich	64b215223f	r600g: fix build without opencl and static llvm libs radeon_llvm_check and friends were never called in the no-opencl case, which ended up with an empty llvm module list. As --enable-opencl always requires --enable-llvm, we can use the latter as the guard. Signed-off-by: Marc Dietrich <marvin24@gmx.de> [Emil Velikov: commit message polish] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-01 13:22:48 +00:00
Samuel Pitoiset	be8aa76afd	glsl: remove unnecessary flags.q.subroutine_def This bit is definitely not necessary because subroutine_list can be used instead. This frees one more bit in the flags.q struct which is nice because arb_bindless_texture will need 4 bits for the new layout qualifiers. No piglit regressions found (including compiler tests) with "-t subroutine". v2: set the subroutine flag for validating illegal flags Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-01 14:15:31 +01:00
Emil Velikov	ca7d2025a7	vulkan: provide vk.xml as argument to the python generator Do not hardcode the file in the python script, but pass it via the build system(s). The latter is the only one that should know about the file location/tree structure. Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-28 18:53:04 +00:00
Emil Velikov	14281c9035	automake: vulkan: rename/reuse VULKAN_UTIL_{GENERATED_,}FILES list Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-28 14:13:09 +00:00
Mauro Rossi	3f2cb699cf	android: vulkan: add support for libmesa_vulkan_util The following changes are implemented: Add src/vulkan/Android.mk to build libmesa_vulkan_util Android.mk: add src/vulkan to SUBDIR to build new module intel/vulkan: fix libmesa_vulkan_util,vk_enum_to_str.h dependencies Add -o OUTPUT_PATH option in src/vulkan/util/gen_enum_to_str.py script Use -o OUTPUT_PATH option in automake generation rules for vk_enum_to_str.{c,h} Fixes: `e9dcb17` "vulkan/util: Add generator for enum_to_str functions" Fixes: `8e03250` "vulkan: Combine wsi and util makefiles" Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov] - Move parser within main() - Use --outdir instead of -o Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-28 01:24:41 +01:00
Emil Velikov	3bbbb63801	automake: r600: radeonsi: correctly manage libamd_common.la linking Since both r600 and radeonsi use code from libamd_common they need to static link it. At the same time, adding a common library to LIB_DEPS is fragile [can lead to multiple symbol definitions] and non-obvious - I had to do a double-take how things work atm. So follow the libradeon.la approach and put common libraries in TARGET_RADEON_COMMON Fixes: `936f5407a7` ("gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600") Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-02-28 10:55:46 +00:00
Emil Velikov	8af447d6f0	glx/tests: automake: add dispatch-index-check to the tarball Otherwise we'll fail at `make distcheck' Fixes: `3cc33e7640` ("glx: add GLXdispatchIndex sort check") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-28 16:18:27 +00:00
Emil Velikov	3935690d58	automake: anv: add missing include $(top_srcdir)/src/vulkan/util Otherwise we'll fail to find the header and `make distcheck` will bail. Fixes: `e9dcb17962` ("vulkan/util: Add generator for enum_to_str functions") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-28 14:08:17 +00:00
Samuel Iglesias Gonsálvez	0dddad5b1b	i965/fs: emit MOV_INDIRECT with the source with the right register type This was hiding bugs as it retyped the source to destination's type. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-03-01 06:50:35 +01:00
Samuel Iglesias Gonsálvez	d8122128bc	i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles When generating the MOV INDIRECT instruction, the source type is ignored and it is set to destination's type. However, this is going to change in a later patch, so we need to explicitly set the proper source type. brw_vec8_grf() creates a float-typed fs_reg by default, when the ICP handle is actually unsigned. This patch fixes these cases before applying the aforementioned patch. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-03-01 06:50:35 +01:00
Samuel Iglesias Gonsálvez	56266df7ed	i965/fs: fix indirect load DF uniforms on BSW/BXT The lowered BSW/BXT indirect move instructions had incorrect source types, which luckily wasn't causing incorrect assembly to be generated due to the bug fixed in the next patch, but would have confused the remaining back-end IR infrastructure due to the mismatch between the IR source types and the emitted machine code. v2: - Improve commit log (Curro) - Fix read_size (Curro) - Fix DF uniform array detection in assign_constant_locations() when it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT. v3: - Move changes in assign_constant_locations() to other patch. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-03-01 06:50:35 +01:00
Samuel Iglesias Gonsálvez	a497ab6838	i965/fs: detect different bit size accesses to uniforms to push them in proper locations Previously, if we had accesses with different sizes to the same uniform, we might not push it aligned with the bigger one. This is a problem in BSW/BXT when we access an array of DF uniform with both direct and indirect addressing because for the latter we use 32-bit MOV INDIRECT instructions. However this problem can happen with other generations and bitsizes. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-03-01 06:50:29 +01:00
Samuel Iglesias Gonsálvez	7427425247	i965/fs: mark last DF uniform array element as 64 bit live one This bug can cause us to miss the end of a contiguous area and push larger areas than the real ones. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-03-01 06:50:10 +01:00
Dave Airlie	e66be3d3bb	radv: fix txs for sampler buffers I messed this up when I wrote it, this fixes: dEQP-VK.memory.pipeline_barrier.uniform_texel_buffer. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-01 08:02:24 +10:00
Marek Olšák	8c838730d0	amd/common: fix ASICREV_IS_POLARIS11_M for Polaris12 Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 21:44:30 +01:00
Bas Nieuwenhuizen	6e9fb1de7f	radv: Don't allocate space for unused immutable samplers. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 20:48:18 +01:00
Bas Nieuwenhuizen	137b06b437	radv/ac: Use constants for immutable samplers. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 20:48:14 +01:00
Bas Nieuwenhuizen	500e6e40f6	radv: Detect if all immutable samplers for a binding are equal. We can then use constants for indexed loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 20:48:10 +01:00
Bas Nieuwenhuizen	dd2a0c7aef	radv: Store the immutable samplers as uint32_t[4]. So we don't need to know about radv_sampler in ac_nir_to_llvm. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-28 20:46:02 +01:00
Brendan King	884f65e185	egl/dri3: implement query surface hook This is a DRI3 version of a change made for DRI2 (`4d6d4f939e`, "egl/dri2: implement query surface hook"), that fixed failures in dEQP-EGL.functional.resize.surface_size.grow and dEQP-EGL.functional.resize.surface_size.shrink. Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Mark Janes <mark.a.janes@intel.com> Cc: Chad Versace <chadversary@chromium.org> Signed-off-by: Brendan King <Brendan.King@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-28 10:11:42 +00:00
Michel Dänzer	936f5407a7	gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600 Fixes build failure with --enable-opencl --enable-xvmc: make[4]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/targets/xvmc' CXXLD libXvMCgallium.la ../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `evergreen_create_compute_state': /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:254: undefined reference to `ac_elf_read' ../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `r600_shader_binary_read_config': /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start' /home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start' collect2: error: ld returned 1 exit status Makefile:760: recipe for target 'libXvMCgallium.la' failed Fixes: `dc4c551a34` ("radeon/ac: switch from radeon_elf_read() to ac_elf_read()") Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-02-28 16:35:21 +09:00
Kenneth Graunke	b8cd78eaa1	i965: Move intel_resolve_map.[ch] from i965_compiler_FILES to i965_FILES I have no idea why these were part of the compiler files. They're miptree related code, and the compiler doesn't appear to use them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-27 22:56:59 -08:00
Timothy Arceri	4d0d81379e	gallium/r600: fix r600 build when OpenCL is enabled Fixes build regression caused by `d90bf4ef3e`	2017-02-28 15:42:18 +11:00
Timothy Arceri	d90bf4ef3e	radeon: remove unused radeon_elf_util.{c,h} We now use the shared code in AMD common instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Timothy Arceri	503fb134e8	radeon/ac: switch to ac_shader_binary_config_start() For radeonsi we could probably switch to ac_shader_binary_read_config(). However the functions have diverged so just share this helper for now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Timothy Arceri	f0aaa4b3a4	radeon/ac: make ac_shader_binary_config_start() available externally The read config functions are different for r600 and radeonsi so we can't just share the one in amd common. So just share this instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Timothy Arceri	dc4c551a34	radeon/ac: switch from radeon_elf_read() to ac_elf_read() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Timothy Arceri	69a687189e	radeon/ac: switch from radeon_shader_binary to ac_shader_binary Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Timothy Arceri	affc8314cb	radeon/ac: add llvm_ir_string to ac_shader_binary struct Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-28 13:20:31 +11:00
Kenneth Graunke	63d1ebca3a	ralloc: Delete autofree handling. There was exactly one user of this, and I just removed it. It also accessed an implicit global context, with no locking. This meant that it was only safe if all callers of ralloc_autofree_context() held the same lock...which is a pretty terrible thing for a utility library to impose. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-27 15:46:12 -08:00
Kenneth Graunke	aa8bb9fc15	compiler: Free types in _mesa_glsl_release_types() rather than autofree. Instead of using ralloc_autofree_context() to install an atexit() handler to ralloc_free(glsl_type::mem_ctx), we can simply free them from _mesa_glsl_release_types(). This is effectively the same, because _mesa_glsl_release_types() is called from _mesa_destroy_shader_compiler(), which is called from Mesa's one_time_fini() function, which Mesa installs as an atexit() handler. The one advantage here is that it ensures the built-in functions are destroyed before the types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-27 15:46:12 -08:00
Jan Vesely	010fecb853	clover: Dump linked binary to a different file this allows to pass the generated files directly to llc or bugpoint v2: add atomic counter ID v3: remove extra scope operator, constify Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-02-27 16:11:48 -05:00
Dave Airlie	800b82ea13	radv: fix depth format in blit2d. For blitting we need to use the depth or stencil format, never the combined. This fixes: dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint and a few others. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-28 06:11:54 +10:00
Dave Airlie	1121ce4525	radv/formats: add fast clear for 8-bit signed ints. These formats are used by some CTS tests, may as well fill them in. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-28 06:11:50 +10:00
Samuel Pitoiset	ec623f77eb	mesa/main: refactor sampler parameter error codepath This is similar to what we do in the texture error codepath. While we are at it, update the specification comment with latest GL 4.5 spec. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-27 19:42:23 +01:00
Samuel Pitoiset	e69fd0b43c	glsl: reject samplers not declared as uniform/function params earlier This improves consistency with image variables and atomic counters which are already rejected the same way. Note that opaque variables can't be treated as l-values, which means only the 'in' function parameter is allowed. v2: rewrite commit message Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2017-02-27 19:42:00 +01:00
Samuel Pitoiset	08a052966f	glsl: use is_sampler() anywhere it's possible Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-02-27 19:41:14 +01:00
Samuel Pitoiset	e12f4edf9c	glsl: use is_image() anywhere it's possible Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-02-27 19:41:11 +01:00
Samuel Pitoiset	46562a062b	glsl: add missing blend_support qualifier in validate_flags() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-02-27 19:40:12 +01:00
Samuel Pitoiset	87ee1729d0	glsl: use an enum for AMD_conservative_depth layout qualifiers The main idea behind this is to free some bits in the flags.q struct because currently all 64-bits are used and we can't add more layout qualifiers without reaching a static assert. In order to do that (mainly for ARB_bindless_texture), use an enumeration for the AMD_conservative_depth layout qualifiers because it's forbidden to declare more than one depth qualifier for gl_FragDepth. Note that ast_type_qualifier::merge_qualifier() will prevent using duplicate layout qualifiers by returning a compile-time error. No piglit regressions found (including compiler tests) with RX480 on RadeonSI. v2: use a switch case Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com> (v1)	2017-02-27 19:39:37 +01:00
Samuel Pitoiset	de2727925a	glsl: add has_shader_image_load_store() Preliminary work for ARB_bindless_texture which can interact with ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-27 19:33:10 +01:00
Samuel Pitoiset	ea8086861f	drirc: add force_glsl_version=440 for The Culling This game uses GLSL 430 but the interpolation qualifiers in some shaders don't match, which ends up in a link error. GLSL 440 spec removed this restriction, force it. This fixes the following link error, as well as serious rendering problems. error: vertex shader output `out_TEXCOORD1' specifies noperspective interpolation qualifier, but fragment shader input specifies no interpolation qualifier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-02-27 19:32:55 +01:00
Jason Ekstrand	76c8327e6e	anv: Bump advertised version to 1.0.42 We've been following the spec changes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-27 09:44:46 -08:00
Jason Ekstrand	54dd42eb94	vulkan: Update registry and headers to 1.0.42 This brings in a bunch of new extensions	2017-02-27 09:44:45 -08:00
Elie TOURNIER	082d5b1aee	nir: Delete unused arg in get_iteration nir_const_value is not needed in get_iteration Signed-off-by: Elie Tournier <tournier.elie@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-27 14:35:16 +00:00
Eric Engestrom	077879cf5e	docs: fix a few typos Noticed a couple, found the rest using vimspell. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-27 14:15:10 +00:00
Grazvydas Ignotas	7f268cf12b	gallium/u_queue: set num_threads correctly if not all threads start If i-th thread could not be created it means we have i threads, not i+1, because we start from 0. Fixes: `404d0d5` "gallium/u_queue: add an option to have multiple worker threads" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-02-27 14:49:46 +01:00
Grazvydas Ignotas	9936121935	gallium/u_queue: fix a crash with atexit handlers Commit `4aea8fe` ("gallium/u_queue: fix random crashes when the app calls exit()") added an atexit handler which calls util_queue_killall_and_wait() for each queue to stop the threads. However the app is also free to use atexit handlers to clean up things, leading to a util_queue_destroy() call which will also call util_queue_killall_and_wait() for the same queue again, causing threads to be joined twice, and that is undefined. This happens with libglut, for example. A simple fix is to just set num_threads to 0 as there are no more valid threads after util_queue_killall_and_wait() returns. Fixes: `4aea8fe` "gallium/u_queue: fix random crashes when the app calls exit()" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-02-27 14:49:15 +01:00
Bas Nieuwenhuizen	43d833ae97	radv: Use correct size for availability flag. Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the availability flag is an integer. We apparently handled this correctly already for the copy to buffer case. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>	2017-02-27 01:33:10 +01:00
Bas Nieuwenhuizen	8ea34a98c0	radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang. PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only calls it when in a primary command buffer and we change GetQueryPoolResults to not need it. CmdCopyQueryPoolResults still needs it so we break that behavior for secondary command buffers. However, that would hang already and using an uninitialized value is better than a hang. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>	2017-02-27 01:33:10 +01:00
Bas Nieuwenhuizen	bb878db7eb	radv: Reset emitted compute pipeline when calling secondary cmd buffer. Otherwise if the new compute pipeline is the same as the last used pipeline before the call, we don't emit it again. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>	2017-02-27 01:33:10 +01:00
Dave Airlie	15f47027ad	radv: add support for NV_dedicated_allocation This adds initial support for NV_dedicated_allocation, then uses it for the wsi image/memory allocation paths internally in the driver. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-27 00:22:51 +00:00
Andres Rodriguez	35189d3279	radv/winsys: fix freeing imported memory. This bo->fd wasn't setting some stuff correctly that could lead to crashes for anything using this path later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-27 00:22:39 +00:00
Dave Airlie	f695735ed6	vulkan/wsi/radv: add initial prime support (v1.1) This is a complete rewrite of my previous rfc patches. This adds the ability to present to a different GPU than rendering using a driver side operation that can copy from the tiled to linear shared image. This does prime support completely in the swapchain present code, and each queue has a precreated command buffer for each image and for each queue family. This means presenting should work on graphics and compute queues and transfer in the future. v1.1: initialise needs_linear_copy in swapchain. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-27 05:42:16 +10:00
Bas Nieuwenhuizen	336b05c49a	radv/ac: Add integer->integer casts. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-26 19:59:27 +01:00
Eric Engestrom	5b5ffb795f	check: add support for running test as standalone Signed-off-by: Eric Engestrom <eric@engestrom.ch>	2017-02-26 13:39:45 +00:00
Eric Engestrom	cd35a119ad	check: make any failure fatal Previously, only the last error code was returned. Using `set -e` makes the script quit on any unhandled error. Signed-off-by: Eric Engestrom <eric@engestrom.ch>	2017-02-26 13:39:43 +00:00
Eric Engestrom	a1e5e55989	check: mark two tests as requiring bash Requirement was removed just before pushing, but it's actually needed for here-strings (`<<<`). Signed-off-by: Eric Engestrom <eric@engestrom.ch>	2017-02-26 13:39:12 +00:00
Mike Lothian	47c49f6190	st/nine: Drop USER_INDEX_BUFFERS check This fixes `4a883966c1` where the PIPE_CAP was removed. Now USER_INDEX_BUFFERS are always enabled remove the check and only check for cmst_active directly. v2: Axel pointed out the code was still needed when cmst was inactive, Rebase on master too v3: Drop struct member user_ibufs also && fixup shortlog (Edward). v4: Fix negation v5: Use the right variable name csmt != cmst Fixes: `4a883966c1` ("gallium: remove PIPE_CAP_USER_INDEX_BUFFERS") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99953 Reported-and-tested-by: Vinson Lee <vlee@freedesktop.org> (v1) Cc: Marek Olšák <marek.olsak@amd.com> Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-25 23:20:18 +11:00
Constantine Charlamov	abb1c645c4	st/nine: make use of common uploaders v4 Make use of common uploaders that landed recently to Mesa v2: fixed formatting, broken due to thunderbird configuration v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP v4: per Axel comment: changed style of the comment	2017-02-25 09:31:10 +01:00
Timothy Arceri	6b4bb24acf	compiler: style clean-ups in blob.h Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2017-02-25 13:30:28 +11:00
Brian Paul	fcf466383a	svga: fix MSVC build error after PIPE_CAP_USER_INDEX_BUFFERS removal Need to specify the zero for the struct initializer. My earlier test of the patch series was with MinGW, not MSVC. Trivial.	2017-02-24 19:07:10 -07:00
Eric Anholt	292c24ddac	vc4: Lazily emit our FS/VS input loads. This reduces register pressure in both types of shaders, by reordering the input loads from the var->data.driver_location order to whatever order they appear first in the NIR shader. These instructions aren't reorderable at our QIR scheduling level because the FS takes two in lockstep to do an interpolation, and the VS takes multiple read instructions in a row to get a whole vec4-level attribute read. shader-db impact: total instructions in shared programs: 76666 -> 76590 (-0.10%) instructions in affected programs: 42945 -> 42869 (-0.18%) total max temps in shared programs: 9395 -> 9208 (-1.99%) max temps in affected programs: 2951 -> 2764 (-6.34%) Some programs get their max temps hurt, depending on the order that the load_input intrinsics appear, because we end up being unable to copy propagate an older VPM read into its only use.	2017-02-24 17:01:29 -08:00
Eric Anholt	f06915d7b7	vc4: Refactor the load_input code out of the intrinsic code. It's going to gain most of ntq_setup_inputs(), so simplify it first.	2017-02-24 16:31:54 -08:00
Eric Anholt	84a304eb96	vc4: Track the last block we emitted at the top level. This will be used for delaying our VPM reads (which must be unconditional) until just before they're used.	2017-02-24 16:31:54 -08:00
Eric Anholt	99d4203ad5	vc4: Emit max number of temps in the shader-db output. We need to be paying attention to optimization's impact on this -- even if we reduce instruction count, increasing max temps in general is likely to cause us to fail to register allocate on some shaders, which means that those won't run at all.	2017-02-24 16:31:54 -08:00
Vinson Lee	30a4b25efe	util/disk_cache: Use backward compatible st_mtime. Fix Mac OS X build error. CC libmesautil_la-disk_cache.lo In file included from disk_cache.c:46: ./disk_cache.h:57:20: error: no member named 'st_mtim' in 'struct stat' timestamp = st.st_mtim.tv_sec; ~~ ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99918 Fixes: `207e3a6e4b` ("util/radv: move _get_function_timestamp() to utils") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-24 16:06:40 -08:00
Vinson Lee	c3f9540a0c	glsl: Fix missing-braces warning. CXX glsl/ast_to_hir.lo glsl/ast_to_hir.cpp: In member function 'virtual ir_rvalue* ast_declarator_list::hir(exec_list, _mesa_glsl_parse_state)': glsl/ast_to_hir.cpp:4846:42: warning: missing braces around initializer for 'unsigned int [16]' [-Wmissing-braces] Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-02-24 16:04:06 -08:00
Marek Olšák	c7878b0167	ac: silence a warning trivial	2017-02-25 00:16:38 +01:00
Marek Olšák	35915af6c9	radeonsi: fix broken tessellation on Carrizo and Stoney Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850 Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	e027935a79	st/mesa: don't update unrelated states in non-draw calls such as Clear If a VAO isn't bound and u_vbuf isn't enabled because of the Core profile, we'll get user vertex buffers in drivers if we update vertex buffers in glClear. So don't do that. This fixes a regression since disabling u_vbuf for Core profiles. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	cc2f92b09f	st/mesa: set blend state for PBO readbacks v2: restore the state Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	a40b76143d	st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	1a36bea445	st/mesa: don't check st->vp in update_clip The clip state is updated before VS, so it can be NULL for the first draw call. Just remove the unnecessary dependency on st->vp. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	d17b8d08a3	trace: remove pipe_resource wrapping Not needed. ddebug does the same thing. The limitation is that drivers can only use pipe_resource::screen through pipe_resource_reference. This unbreaks trace, because pipe_context uploaders aren't wrapped, so trace doesn't understand buffers returned by them. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	4a883966c1	gallium: remove PIPE_CAP_USER_INDEX_BUFFERS all drivers support it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> (VMware driver only)	2017-02-25 00:03:09 +01:00
Marek Olšák	4700f409fb	st/mesa: assume all drivers support user index buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> (VMware driver only)	2017-02-25 00:03:09 +01:00
Marek Olšák	e78ccee933	svga: implement user index buffers Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> (VMware driver only)	2017-02-25 00:03:09 +01:00
Marek Olšák	7fff5b77f1	freedreno: add support for user index buffers Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	19c51e072b	etnaviv: add support for user index buffers Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-25 00:03:09 +01:00
Marek Olšák	f139b6fb4f	gallium/util: add new helpers for user index buffer uploading v3: split from the etnaviv patch; fix new_ib.buffer leak Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> (VMware driver only)	2017-02-25 00:03:09 +01:00
Elie TOURNIER	b10197e3a4	nir: delete magic number Signed-off-by: Elie Tournier <tournier.elie@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-24 13:02:24 -08:00
Roland Scheidegger	c3a94d9195	gallium/util: (trivial) fix util_clear_render_target the format of the rt can be different than the one of the texture, so must propagate the format explicitly to the helper. Broken since `3f9c5d6244` (but unused by st/mesa).	2017-02-24 20:39:56 +01:00
Emil Velikov	9833488974	util: automake: add sha1/README to the tarball Suggested-by: Andreas Boll <andreas.boll.dev@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:38:16 +00:00
Emil Velikov	6854716f37	mapi: remove unused mapi.[ch] The final user of it was st/vega. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2017-02-24 17:37:02 +00:00
Emil Velikov	93369aa928	blorp: automake: add TODO to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2017-02-24 17:37:00 +00:00
Emil Velikov	ab6fa871ef	anv: automake: add TODO to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2017-02-24 17:36:59 +00:00
Emil Velikov	aa63b7fa16	vc4: automake: add the kernel/README to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2017-02-24 17:36:57 +00:00
Emil Velikov	f64a7c74c3	nir: automake: add the README to the tarball Similar to other accompanying documentation we have in-tree. For example glsl/README. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2017-02-24 17:36:45 +00:00
Emil Velikov	e3ad2d40db	radv/entrypoints: Only generate entrypoints for supported features This changes the way radv_entrypoints_gen.py works from generating a table containing every single entrypoint in the XML to just the ones that we actually need. There's no reason for us to burn entrypoint table space on a bunch of NV extensions we never plan to implement. RADV implements VK_AMD_draw_indirect_count, so add that to the list. Port of `114c281e70` "anv/entrypoints: Only generate entrypoints for supported features" Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-02-24 17:36:25 +00:00
Robert Bragg	d1bb7895b9	main/performance_query: s/GLboolean/bool/ Ideally would have caught these when adding the interface but this just switches a few return types for the INTEL_performance_query backend interface to bool instead of GLboolean. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:16:11 +00:00
Eric Engestrom	1534fc6d10	eglapi: replace linear entrypoint search with binary search Tested with dEQP-EGL.functional.get_proc_address.* Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	d25dea0c68	egl: make sure entrypoints list is always sorted Starting with the next commit, badly sorting this list will break eglGetProcAddress(). Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	557f3181bf	egl: distribute all tests Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	f92fd4d7a8	eglapi: move entrypoints list out to its own file This will allow us to make sure the list is always sorted in the next commit. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	2b3cd82e18	eglapi: sort entrypoints list Let's make that comment true. It will also be necessary in a couple of commits (using bsearch). Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	3b69c4a8e8	eglapi: use macro to map entrypoints to functions As of the last 3 commits, there's a function for each entrypoint. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	66d5ec5f3f	eglapi: add entrypoint for eglClientWaitSyncKHR Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	b7f6f3b3e5	eglapi: add entrypoint for eglDestroySyncKHR Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Eric Engestrom	df7fa30aec	eglapi: add entrypoint for eglDestroyImageKHR Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 17:00:50 +00:00
Thomas Hellstrom	7b82efe4ee	st/va: Fix up YV12 to NV12 putImage conversion Use the utility u_copy_nv12_from_yv12 to implement this similarly to how it's been done in the VDPAU state tracker. The old code mixed up planes and fields and didn't correctly handle video surfaces in interlaced format. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2017-02-24 16:44:34 +01:00
Thomas Hellstrom	3a418322ec	st/vdpau: Provide YV12 to NV12 putBits conversion v2 mplayer likes putting YV12 data, and if there is a buffer format mismatch, the vdpau state tracker would try to reallocate the video surface as an YV12 surface. A virtual driver doesn't like reallocating and doesn't like YV12 surfaces, so if we can't support YV12, try an YV12 to NV12 conversion instead. Also advertize that we actually can do the getBits and putBits conversion. v2: A previous version of this patch prioritized conversion before reallocating. This has been changed to prioritize reallocating in this version. Cc: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2017-02-24 16:44:33 +01:00
Leo Liu	5398d006de	configure.ac: check require_basic_egl only if egl enabled Otherwise the configuration fails when building independent libs like vdpau, vaapi or omx Fixes: `1ac40173c2` ("configure.ac: simplify EGL requirements for drivers dependent on EGL") Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-24 09:48:47 -05:00
Eric Engestrom	3cc33e7640	glx: add GLXdispatchIndex sort check Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-24 14:44:58 +00:00
Lars Hamre	caf4252a01	docs: update features.txt for GL_ARB_clear_texture with llvmpipe and softpipe Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-24 15:41:26 +01:00
Lars Hamre	a876b50b20	softpipe: enable clear_texture with util_clear_texture Passes all corresponding piglit tests. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-24 15:41:13 +01:00
Lars Hamre	12f2058b47	llvmpipe: enable clear_texture with util_clear_texture Passes all corresponding piglit tests. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-24 15:40:57 +01:00
Lars Hamre	3f9c5d6244	gallium: implement util_clear_texture v3: have util_clear_texture mirror the pipe function (Roland Scheidegger) v2: rework util clear functions such that they operate on a resource instead of a surface (Roland Scheidegger) Creates a util_clear_texture function for implementing the GL_ARB_clear_texture in softpipe and llvmpipe. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-24 15:40:11 +01:00
Jerome Duval	62e27170a7	haiku/winsys: fix dt prototype args	2017-02-24 14:10:57 +00:00
Jerome Duval	40b0c8666c	haiku: build fixes around debug defines	2017-02-24 14:10:57 +00:00
Dave Airlie	ccb70d6f53	radv: add sample mask output support This adds support to write to sample mask from the fragment shader. We can optimise this later like radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:53 +10:00
Dave Airlie	8282c5c771	radv/ac: refactor our fmask sample index fixup. This refactors out the sample index fixup between txf and image load. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:49 +10:00
Dave Airlie	5e9ead0fa2	radv: fetch sample index via fmask for image coord as well. This follows the txf_ms code, I can't figure out why amdgpu-pro doesn't do this in their shaders, they must know something we don't. This fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:44 +10:00
Dave Airlie	bdcbe7c76b	radv: add sample mask input support Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:35 +10:00
Dave Airlie	58c97a0791	radv: enable location at sample when persample is forced. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:30 +10:00
Dave Airlie	fc430c391b	radv: fix interpolation at wrong place for offset interp The code was interpolating at the offset from the sample, not the offset from the center. Also fix for persample interpolation modes we should force the pixel center to be at the sample. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-24 10:31:19 +10:00
George Kyriazis	dcac48bfee	swr: fix index buffers with non-zero indices Fix issue with index buffers that do not contain a 0 index. 0 index can be a non-valid index if the (copied) vertex buffers are a subset of the user's (which happens because we only copy the range between min & max). Core will use an index passed in from the driver to replace invalid indices. Only do this for calls that contain non-zero indices, to minimize performance cost. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-23 16:36:18 -06:00
George Kyriazis	669d8f626f	swr: add fetch shader cache For now, the cache key is all of FETCH_COMPILE_STATE. Use new/delete for swr_vertex_element_state, since we have to call the constructors/destructors of the struct elements. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-23 16:36:13 -06:00
Timothy Arceri	987d8037ca	st/mesa: free shader cache buffer on fallback Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-02-24 09:01:59 +11:00
Timothy Arceri	c24d0aaa9a	st/mesa: fix crash in shader cache caused by race condition If a thread doesn't load GLSL IR from cache but does load TGSI from cache (that was created by another thread) then it will crash due to expecting gl_program_parameter_list to have been restored from the GLSL IR cache and not be null. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-02-24 09:01:59 +11:00
Jason Ekstrand	261092f7d4	anv: Enable MSAA compression This just enables basic MSAA compression (no fast clears) for all multisampled surfaces. This improves the framerate of the Sascha "multisampling" demo by 76% on my Sky Lake laptop. Running Talos on medium settings with 8x MSAA, this improves the framerate in the benchmark by 80%. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-23 12:10:42 -08:00
Jason Ekstrand	42b10b175d	anv/blorp/clear_subpass: Only set surface clear color for fast clears Not all clear colors are valid. In particular, on Broadwell and earlier, only 0/1 colors are allowed in surface state. No CTS tests are affected outright by this because, apparently, the CTS coverage for different clear colors is pretty terrible. However, when multisample compression is enabled, we do hit it with CTS tests and this commit prevents regressions when enabling MCS on Broadwell and earlier. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-23 12:10:42 -08:00
Pohjolainen, Topi	042cc201f2	intel/isl: Apply render target alignment constraints for MCS v2: Instead of having the same block in isl_gen7,8,9.c add it once into isl.c::isl_choose_image_alignment_el() instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-02-23 12:10:42 -08:00
Lionel Landwerlin	34e29b2ebd	intel/isl: add MCS width constraint 16 samples v3 (Jason Ekstrand): Add a comment explaining why Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-23 12:10:42 -08:00
Jason Ekstrand	3885375195	intel/isl: Return surface creation success from aux helpers The isl_surf_init call that each of these helpers make can, in theory, fail. We should propagate that up to the caller rather than just silently ignoring it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-23 12:10:42 -08:00
Kenneth Graunke	e6e8475b0f	glsl: Raise a link error for non-SSO ES programs with a TES but no TCS. OpenGL allows the TCS to be missing and supplies an implicit passthrough shader, but OpenGL ES does not (see section 7.3 of the ES 3.2 spec, cited above in the code). One open question is how to handle this for ARB_ES3_2_compatibility. This patch raises the link error for all ES shading language programs, but it might make sense to base it on the API. The approach taken in this patch is more restrictive, but should still allow any valid ES programs to work in GL. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-02-23 11:07:06 -08:00
Samuel Iglesias Gonsálvez	a9c488f285	isl/state: fix assert on raw buffer surface state minimum size From IVB PRM, SURFACE_STATE::Height: "For typed buffer and structured buffer surfaces, the number of entries in the buffer ranges from 1 to 2^27 . For raw buffer surfaces, the number of entries in the buffer is the number of bytes which can range from 1 to 2^30." The minimum value is 1, according to the spec. The spec quote was already added into the code by `028f6d8317`. Fixes crashing tests under: dEQP-VK.robustness.buffer_access.* Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-23 11:46:47 +01:00
Iago Toral Quiroga	42b9057447	glsl: enable early_fragment_tests implicitly with post_depth_coverage From ARB_post_depth_coverage: "This extension allows the fragment shader to control whether values in gl_SampleMaskIn[] reflect the coverage after application of the early depth and stencil tests. This feature can be enabled with the following layout qualifier in the fragment shader: layout(post_depth_coverage) in; Use of this feature implicitly enables early fragment tests." And a bit later it also adds: "early_fragment_tests" requests that fragment tests be performed before fragment shader execution, as described in section 15.2.4 "Early Fragment Tests" of the OpenGL Specification. If neither this nor post_depth_coverage are declared, per-fragment tests will be performed after fragment shader execution." Fixes: GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-23 11:21:44 +01:00
Samuel Iglesias Gonsálvez	6ca4347c82	glsl: refactor get_variable_being_redeclared() to return always an ir_variable pointer It will return the current variable ('var') or the earlier declaration ('earlier') in case of redeclaration of that variable. In order to distinguish between both, 'is_redeclaration' boolean will indicate in which case we are. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-02-23 06:56:45 +01:00
Samuel Iglesias Gonsálvez	a73a618933	glsl: fix heap-use-after-free in ast_declarator_list::hir() The get_variable_being_redeclared() function can free 'var' because a re-declaration of an unsized array variable can establish the size, so we set the array type to the 'earlier' declaration and free 'var' as it is not needed anymore. However, the same 'var' is referenced later in ast_declarator_list::hir(). This patch fixes it by picking the ir_variable_mode from the proper ir_variable. This error was detected by Address Sanitizer. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677 Cc: "17.0" <mesa-stable@lists.freedesktop.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2017-02-23 06:56:16 +01:00
Charmaine Lee	043883647a	st/wgl: flush with ST_FLUSH_WAIT before releasing shared contexts Before releasing a shared context, flush the context with ST_FLUSH_WAIT to make sure all commands are executed. This ensures that rendering to any shared resources is completed before they will be referenced by another context. Fixes an intermittent flickering with Photoshop. (VMware bug# 1779340) Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-18 09:36:42 -08:00
Charmaine Lee	d793b54c4e	st: add ST_FLUSH_WAIT to st_context_flush() When st_context_flush() is called with ST_FLUSH_WAIT, the function will return after the fence is completed. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-18 09:36:42 -08:00
Dave Airlie	b71e6538a8	radv/ac: handle gs->copy shader clip distances. This fixes up the clip distance passing between the geometry shader and the copy shader. It packs the clip and cull distances into one or two consecutive slots, avoids wasting space, and makes sure the gs output and copy shader input agree on where things are stored. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:31:41 +10:00
Dave Airlie	bec584ec0e	radv/ac: pass clips properly from vertex->geometry shader stages. This works out the geometry shader clip/cull inputs separately to the outputs, and uses that information to read from the ES->GS ring buffer. It stores the clip/cull distances packed into one or two slots. It fixes the es output emission and gs input reading to match. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:31:37 +10:00
Dave Airlie	c2cfb54f13	radv/ac: rename num clips/cull to output clips/culls As geom shaders can have different ones on entry and exit. Also move to uint8_t as these are never that big. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:31:10 +10:00
Dave Airlie	c2ed2685fd	vulkan/wsi: move image count to shared structure. For prime support I need to access this, so move it in advance. [airlied: fix int->uint32_t] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-23 15:30:32 +10:00
Timothy Arceri	4711e54336	radeon: fix r600 builds when old version of llvm is present Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-23 14:05:55 +11:00
Dylan Baker	fb26e6c0d4	vulkan: Fix gen_enum_to_str in out of tree builds In some configurations the util directory is created when building out of tree, but not others. This patch ensures that it's created. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-and-Tested-by: Mike Lothian <mike@fireburn.co.uk>	2017-02-22 17:08:52 -08:00
Jason Ekstrand	1bd0e9ca33	anv/Makefile: Gather all the genX files into one place While we're here, we also fix the alphabetization of the list of genx_* files. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-22 15:07:18 -08:00
Timothy Arceri	2f3290ac28	r600/radeonsi: enable glsl/tgsi on-disk cache For gpu generations that use LLVM we create a timestamp string containing both the LLVM and Mesa build times, otherwise we just use the Mesa build time. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	27cecafefd	st/mesa: get on-disk shader cache Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	8239eef2f7	ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	4be98ed5fd	gallium: add get_disk_shader_cache() callback V2: Provide more detail in callback description and add description to screen.rst Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	9f506d817e	st/mesa: implement a tgsi on-disk shader cache Implements a tgsi cache for the OpenGL state tracker. V2: add support for compute shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	b9de1c2e02	st/mesa: add sha1 field to st program structs This will be used to share the sha1 computed by the tgsi load function with the tgsi write function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	0d5130bdd0	st/mesa: move set_prog_affected_state_flags() to st_program.c We want to use this in the new tgsi shader cache so we move it here and make it available externally. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-23 09:20:22 +11:00
Timothy Arceri	d258055c8b	util/disk_cache: fix bug with deleting old cache dirs If there was more than a single directory in the .cache/mesa dir then it would only remove one (or none) of the directories. Apparently Valgrind was also reporting: Conditional jump or move depends on uninitialised value Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-23 09:20:22 +11:00
Dylan Baker	8e03250fcf	vulkan: Combine wsi and util makefiles Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-22 13:12:02 -08:00
Dylan Baker	e9dcb17962	vulkan/util: Add generator for enum_to_str functions This adds a python generator to produce enum_to_str functions for Vulkan from the vk.xml API description. It supports extensions as well as core API features, and the generator works with both python2 and python3. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-22 13:12:02 -08:00
Thomas Hellstrom	bda59f6e41	Revert "st/vdpau: Fix multithreading" This reverts commit `f1e5dfbe3c`. For a detailed discussion see https://lists.freedesktop.org/archives/mesa-dev/2017-February/145283.html Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2017-02-22 21:50:15 +01:00
Nayan Deshmukh	b8861911c5	vl: u_upload_alloc might fail to allocate buffer in bicubic filter Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-02-22 21:49:19 +01:00
Marek Olšák	7ce8adad43	gallium: reorder fields in pipe_draw_info sizeof(struct pipe_draw_info) = 104 -> 88 Also, vertices_per_patch is switched to ubyte, because it can't be more than 32. Seemed-reasonable-to: Roland Scheidegger	2017-02-22 20:36:40 +01:00
Marek Olšák	3b04566bba	gallium/hud: handle a thread switch for API-thread-busy monitoring Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-22 20:26:39 +01:00
Marek Olšák	31e7ba7124	gallium/hud: prevent an infinite loop v2: use UINT64_MAX / 11 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-22 20:26:39 +01:00
Marek Olšák	24847dd1b5	gallium/u_queue: isolate util_queue_fence implementation It's cleaner this way. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-22 20:26:39 +01:00
Marek Olšák	4aea8fe7e0	gallium/u_queue: fix random crashes when the app calls exit() This fixes: vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI && "Pass ID not registered!"' failed. v2: use list_head, switch the call order in destroy Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-22 20:26:39 +01:00
Robert Bragg	a96c9564e3	i965: Implement INTEL_performance_query backend This adds a bare-bones backend for the INTEL_performance_query extension that exposes pipeline statistics. Although this could be considered redundant given that the same statistics are already available via query objects, they are a simple starting point for this extension and it's expected to be convenient for tools wanting to have a single go-to API to introspect what performance counters are available, along with names, descriptions and semantic/data types. This code is derived from Kenneth Graunke's work, temporarily removed while the frontend and backend interface were reworked. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-22 19:16:21 +00:00
Robert Bragg	0e7464f0a9	mesa: Model INTEL perf query backend after query obj BE Instead of using the same backend interface as AMD_performance_monitor this defines a dedicated INTEL_performance_query interface that is modelled more on the ARB_query_buffer_object interface (considering the similarity of the extensions) with the addition of vfuncs for initializing and enumerating query and counter info. Compared to the previous backend, some notable differences are: - The backend is free to represent counters using whatever data structures are optimal/convenient since queries and counters are enumerated via an iterator api instead of declaring them using structures directly shared with the frontend. This is also done to help us support the full range of data and semantic types available with INTEL_performance_query which is awkward while using a structure shared with the AMD_performance_monitor backend since neither extension's types are a subset of the other. - The backend must support waiting for a query instead of the frontend simply using glFinish(). - Objects go through 'Active' and 'Ready' states consistent with the query object backend (hopefully making them more familiar). There is no 'Ended' state (which used to show that a query has ended at least once for a given object). There is a new 'Used' state, set when a query is first begun which implies that we are expecting to get results back for the object at some point. There's no equivalent to the 'EverBound' state since the spec doesn't require there to be a limbo state between generating IDs and associating them with an object on query Begin. The INTEL_performance_query and AMD_performance_monitor extensions are now completely orthogonal within Mesa main (though a driver could optionally choose to implement both extensions within a unified backend if that were convenient for the sake of sharing state/code). v2: (Samuel Pitoiset) - init PerfQuery.NumQueries in frontend - s/return_string/output_clipped_string/ - s/backed/backend/ typo - remove redundant *bytesWritten = 0 v3: - Add InitPerfQueryInfo for lazy probing of available queries v4: - Clean up some internal usage of GL typedefs (Ken) Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-22 14:07:09 +00:00
Robert Bragg	d83a33a9de	mesa: Separate INTEL_performance_query frontend To allow the backend interfaces for AMD_performance_monitor and INTEL_performance_query to evolve independently based on the more specific requirements of each extension this starts by separating the frontends of these extensions. Even though there wasn't much tying these frontends together, this separation intentionally copies what few helpers/utilities that were shared between the two extensions, avoiding any re-factoring specific to INTEL_performance_query so that the evolution will be easier to follow later. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-22 12:12:27 +00:00
Thomas Hellstrom	ccc8720cf7	gallium/vl: Simplify the matrix filter fragment shader It looks like it was partly copied from the median filter fragment shader and unnecessarily saved a lot of temporary values. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:22:17 +01:00
Thomas Hellstrom	f1e5dfbe3c	st/vdpau: Fix multithreading The vdpau state tracker allows multiple threads access to the same gallium context simultaneously. We can fix this either by locking the same mutex each time the context is used or by using a different gallium context for each mutex domain. Here we do the latter, although I'm not sure that's really the best option. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:20:37 +01:00
Thomas Hellstrom	bcc9fd378d	gallium/vl: Parameter substitution in the csc matrix computation Makes the code significantly more readable. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:20:07 +01:00
Thomas Hellstrom	4c3fe3257d	gallium/vl: Simplify usage of full range matrices When looking at the full range matrices, it becomes obvious that the difference between the standard matrices and the full range matrices is that the full range matrices are multiplied by 1.164. Together with offsetting the y value with -16/255, this will scale and offset RGB with the desired quantities. However, the standard SMPTE 240M matrix seems to differ a bit since the U and V coefficients are only multiplied with 1.138 to get the full range matrix. This would actually alter the color somewhat so I figure that's an error. The full range matrix is consistent with Nvidia's VDPAU implementation. We can also incorporate the ybias in the brightness simplifying the calculation somewhat. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:19:27 +01:00
Thomas Hellstrom	f01e947cdb	gallium/vl: Fix brightness matrix description The brightness matrix doesn't actually match the procamp matrix and what's calculated in vl_csc_get_matrix. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:18:30 +01:00
Thomas Hellstrom	ec8139e50c	gallium/vl: Don't map vertex buffers on creation It will cause multiple simultaneous maps of the same vertex buffer and flushed-while-mapped warnings. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:17:51 +01:00
Thomas Hellstrom	f2872bf8c3	gallium/vl: Add sampler views to video filter fragment shaders Needed for at least the svga driver. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:17:07 +01:00
Thomas Hellstrom	53b4584555	gallium/vl: declare sampler views in compositor shaders The svga driver relies on the existence of these sampler views. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-22 10:15:16 +01:00
Brian Paul	b87ef9e606	util: fix MSVC build issue in disk_cache.h Windows doesn't have dlfcn.h. Protect the code in question with #if ENABLE_SHADER_CACHE test. And fix indentation. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-02-21 20:54:46 -07:00
Dave Airlie	40e0dbf96c	radv: fix typo in the subpass barrier patch. Fixes: dbb0eaccc radv: handle subpass cache flushes Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-22 02:22:30 +00:00
Rafael Antognolli	d71e1f32c6	i965/gen6+: Enable arb_transform_feedback_overflow_query. This extension adds new query types which can be used to detect overflow of transform feedback buffers. The new query types are also accepted by conditional rendering commands. v3: - s/gen7+/gen6+/ in the relnotes (Jordan Justen) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 16:28:32 -08:00
Rafael Antognolli	924a1b90aa	i965: Add support for xfb overflow query on conditional render. Enable the use of a transform feedback overflow query with glBeginConditionalRender. The render commands will only execute if the query is true (i.e. if there was an overflow). Use ARB_conditional_render_inverted to change this behavior. v4: - reuse MI_MATH calcs from hsw_queryob (Kenneth) - fallback to software conditional rendering when MI_MATH is not available (Kenneth) v5: - check query->Target (Kenneth) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 16:28:32 -08:00
Rafael Antognolli	d03ec496ee	i965: Add support for xfb overflow on query buffer objects. Enable getting the results of a transform feedback overflow query with a buffer object. v4: - make hsw_overflow_result_to_gpr0 a public function, so it can be used by conditional render. (Kenneth) - fix typo grp0/gpr0 (Kenneth) - rename load_gen_written_data_to_regs to load_overflow_data_to_cs_gprs (Kenneth) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 16:28:32 -08:00
Rafael Antognolli	5933ec86fd	i965: add plumbing for ARB_transform_feedback_overflow_query. When querying for transform feedback overflow on one or all of the streams, store information about number of generated and written primitives. Then check whether generated == written. v2: - use only SO_PRIM_STORAGE_NEEDED, do not fallback to CL_INVOCATION_COUNT. (Kenneth) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 16:28:32 -08:00
Rafael Antognolli	a80ebff1b9	mesa: Track transform feedback overflow query objects. Also update checks on conditional rendering. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 16:28:31 -08:00
Rafael Antognolli	273bab26af	mesa: Add types for ARB_transform_feedback_overflow_query. Add some basic types and storage for the queries of this extension. v2: - update date of extension (Kenneth) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 16:28:31 -08:00
Eric Engestrom	89af6bf2cb	gallium/docs: use imgmath instead of pngmath WARNING: sphinx.ext.pngmath has been deprecated. Please use sphinx.ext.imgmath instead. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:01:08 +00:00
Eric Engestrom	d88a0dffe3	gallium/docs: fix section title formatting src/gallium/docs/source/tgsi.rst:3488: WARNING: Title underline too short. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:01:01 +00:00
Eric Engestrom	5aa7fa2bbf	gallium/docs: add missing newlines Without these, mathjax considers these as the continuation of the previous line. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:00:57 +00:00
Eric Engestrom	3ae77c912e	gallium/docs: add missing math formatting Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:00:51 +00:00
Eric Engestrom	3a0d2c54cf	gallium/docs: fix sublist formatting src/gallium/docs/source/context.rst:95: ERROR: Unexpected indentation. Sub lists need to be surrounded by a blank line. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-22 00:00:38 +00:00
Timothy Arceri	0441e6bc8b	util/disk_cache: create timestamp and gpu_id dirs when MESA_GLSL_CACHE_DIR is used The make check test is also updated to make sure these dirs are created. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-22 08:40:14 +11:00
Timothy Arceri	207e3a6e4b	util/radv: move *_get_function_timestamp() to utils Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-22 08:40:00 +11:00
Kenneth Graunke	ed6b47f435	docs: Update features.txt and relnotes for GL_ARB_transform_feedback2	2017-02-21 12:38:13 -08:00
Kenneth Graunke	0a7b252c5b	i965: Enable ARB_transform_feedback2 on Sandybridge. The only feature over and above ES 3.0 is DrawTransformFeedback(). We already have to do the whole SOL_NUM_PRIMS_WRITTEN counter dance in order to compute the SVBI value for ResumeTransformFeedback(), at which point our existing GetTransformFeedbackVertexCount() implementation will do the trick (though with a stall to CPU map the buffer). Someday, we could probably implement DrawTransformFeedback() more efficiently, using the "Load Internal Vertex Count" feature of 3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit. Rumor has it this allows people to use WebGL 2.0 on Sandybridge. Note that we don't need pipelined register writes like Gen7+ because we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Kenneth Graunke	0235757422	i965: Properly reset SVBI counters on ResumeTransformFeedback(). This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused GLES 3.0 test. When resuming the transform feedback object, we need to reset the SVBI counters so we continue writing at the correct point in the buffer. Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the Streamed Vertex Buffer Index (SVBI) counters, which contain a count of vertices emitted. Unfortunately, there's no straightforward way to store the current SVBI counter values to a buffer. They're not available in a register. You can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another internal counter which 3DPRIMITIVE can use...but there's no good way to extract that either. So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex numbers. Thankfully, we can reuse most of the existing Gen7+ code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Kenneth Graunke	eb0331382a	i965: Save max_index in brw_transform_feedback_object. I'm going to need this in a new Resume hook shortly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Kenneth Graunke	8513090cd7	i965: Update brw_save_primitives_written_counters for pre-Gen7. Sandybridge and earlier only have a single counter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Kenneth Graunke	42a4f91820	i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS. This way on Sandybridge we'll only do 1 stream worth of math, since we only have one SO_NUM_PRIMS_WRITTEN counter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Kenneth Graunke	2af5f0caad	i965: Move some code from gen7_sol_state.c to gen6_sol.c. I plan to use these functions on Sandybridge soon. I changed the prefix on a couple of functions to "brw" instead of "gen7" as in theory they should be usable all the way back to G45. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Kenneth Graunke	bf8dd21191	i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks. These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG are supported, which Gen8+ can always do. So this code is dead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-21 12:38:13 -08:00
Marek Olšák	96cbc1ca29	vbo: kill primitive restart lowering in glDrawArrays Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-21 21:28:02 +01:00
Marek Olšák	63c462226e	radeonsi: fix issues with monolithic shaders R600_DEBUG=mono has had no effect since: commit `1fabb29717` Author: Marek Olšák <marek.olsak@amd.com> Date: Tue Feb 14 22:08:32 2017 +0100 radeonsi: have separate LS and ES main shader parts in the shader selector Also, this assertion was failing: si_state_shaders.c:1307: si_shader_select_with_key: Assertion `!shader->is_optimized' failed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-21 21:27:23 +01:00
Marek Olšák	52581606c2	radeonsi: set no-signed-zeros-fp-math Recommended by Matt Arsenault. 46757 shaders in 28742 tests Totals: SGPRS: 2068851 -> 2066907 (-0.09 %) VGPRS: 1604056 -> 1602676 (-0.09 %) Spilled SGPRs: 1402 -> 1382 (-1.43 %) Spilled VGPRs: 113 -> 113 (0.00 %) Private memory VGPRs: 1332 -> 1332 (0.00 %) Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread Code Size: 58815520 -> 58716788 (-0.17 %) bytes LDS: 1162 -> 1162 (0.00 %) blocks Max Waves: 354616 -> 354905 (0.08 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 786452 -> 784508 (-0.25 %) VGPRS: 530000 -> 528620 (-0.26 %) Spilled SGPRs: 958 -> 938 (-2.09 %) Spilled VGPRs: 85 -> 85 (0.00 %) Private memory VGPRs: 636 -> 636 (0.00 %) Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread Code Size: 26349936 -> 26251204 (-0.37 %) bytes LDS: 304 -> 304 (0.00 %) blocks Max Waves: 108962 -> 109251 (0.27 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-21 21:27:23 +01:00
Marek Olšák	fd3e73f54e	gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2) v2: define lp_float_mode Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-21 21:27:23 +01:00
Marek Olšák	84e72f2962	radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them We were unconditionally storing these outputs, sometimes even one component at a time, but apps never read them in TES. Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can easily disable them on demand. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-21 21:27:23 +01:00
Marek Olšák	d633e23192	radeonsi: skip LDS stores in TCS if there are no LDS output reads This removes a lot of useless LDS stores. A few games read TESSINNER/OUTER, but not any other outputs. Most games don't read any outputs. The only app doing LDS output reads is UE4 Lightsroom Interior. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-21 21:27:23 +01:00
Marek Olšák	58af0a5385	tgsi/scan: add basic info about tessellation OUT and IN uses not all of them will be used immediately Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-21 21:27:23 +01:00
Jason Ekstrand	f31ed6d0cd	anv: Take a device parameter in anv_state_flush This allows the helper to check for llc instead of having to do it manually at all the call sites. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	f408971deb	anv: Pull all clflushing into a clflush_range helper All this cache line address calculation stuff is tricky. Let's not duplicate it more places than we have to. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	16b187c8bb	anv: Remove the unused state_pool_emit macro Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	f9d7d27d6d	anv: Rename clflush_range and state_clflush It's a bit shorter and easier to work with. Also, we're about to add a helper called clflush which does the clflush but without any memory fencing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	075ed20614	intel/blorp: Explicitly flush all allocated state Found by inspection. However, I expect it fixes real bugs when using blorp from Vulkan on little-core platforms. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	b6b03329af	anv: Put everything about queries in genX_query.c Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	965fad0e8b	anv/Makefile: alphabetize Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	40087bcb51	anv/query: Perform CmdResetQueryPool on the GPU This fixes some rendering corruption in The Talos Principle Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	dc9abd0e6b	genxml: Make MI_STORE_DATA_IMM more consistent Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	3788cd3239	anv/query: clflush the bo map on non-LLC platforms Found by inspection Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	8582ab2d6e	anv: Add an invalidate_range helper This is similar to clflush_range except that it puts the mfence on the other side to ensure caches are flushed prior to reading. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-21 12:26:35 -08:00
Christian Gmeiner	e8d600710c	etnaviv: remove number of pixel pipes validation This validation was added before the etnaviv drm driver landed in the linux kernel. Due to some pre-merge API changes we had to fix-up this value but with a mainline kernel this is not a problem anymore. Let's remove that validation which also gets rid of a problem caught by Coverity, reported to me by imirkin. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-21 21:14:35 +01:00
Christian Gmeiner	a0b16a0890	etnaviv: move pctx initialisation to avoid a null dereference In case ctx->stream == NULL the fail label gets executed where pctx gets dereferenced - too bad pctx is NULL in that case. Caught by Coverity, reported to me by imirkin. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-21 21:14:27 +01:00
Christian Gmeiner	f709096d0e	etnaviv: add missing fallthrough annotation Caught by Coverity, reported to me by imirkin. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-21 21:14:01 +01:00
Emil Velikov	383e8e2d5d	docs/releasing.html: reword "distro breaking changes" hunk v2: s/rare/rarely/ (Eric) Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-21 18:39:40 +00:00
Emil Velikov	8b79f0ed08	radv: make radv_resolve_entrypoint static Used only within the generated source file. Fixes: `12301c5418` ("radv: drop the RADV_CALL macro.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-02-21 18:31:16 +00:00
Emil Velikov	320561bd83	radv: remove unused radv_dispatch_table dtable Fixes: `12301c5418` ("radv: drop the RADV_CALL macro.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-02-21 18:31:14 +00:00
Emil Velikov	9807e9dea6	anv: remove unused anv_dispatch_table dtable Fixes: `4c9dec80ed` ("anv: Get rid of the ANV_CALL macro") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-02-21 18:31:04 +00:00
Emil Velikov	aa5baf1d50	i915: remove extern "C" guards None of this code is used in C++ context. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:43 +00:00
Emil Velikov	0e74f390d9	i915: remove 'virtual' and extern C workarounds Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:41 +00:00
Emil Velikov	3ea07d2be9	i965: remove 'virtual' and extern C workarounds The headers are properly annotated thus we don't need these. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:38 +00:00
Emil Velikov	8481914681	i965: add extern C notation in headers Otherwise symbols won't be annotated with C linkage and we'll fail at link time. Currently this is worked around by wrapping the header inclusion itself. The latter is in itself fragile and not recommended. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:28 +00:00
Emil Velikov	dafc325f42	gallium: do not #include foo.h within extern C {} Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:25 +00:00
Emil Velikov	e4f971c85f	nir: do not #include util/debug.h within extern C {} It's a problem waiting to happen. Individual headers should be annotated if needed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:17 +00:00
Emil Velikov	7fcbb1a902	glsl: resolve extern C workarounds/hacks Do not wrap header inclusion in extern C since it can cause issues. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:10 +00:00
Emil Velikov	a177a13033	st/mesa: move extern C wrappers where applicable Namely, after the include directives. The headers are properly annotated so keeping things as-is is only asking for trouble. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:07 +00:00
Emil Velikov	94b88c1c75	mesa/tests: remove unneeded extern C { #include foo } hack The header itself (enums.h) is already properly annotated. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:29:01 +00:00
Emil Velikov	d5db27706c	mesa: remove unneeded extern C {} wrapper compiler.h defines a few mesa specific macros which are not C specific. This allows us to avoid buggy extern C { #include $system_header } constructs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:28:59 +00:00
Emil Velikov	1451bcb125	mesa: annotate functions for C linkage i.e. add extern C {} in program/symbol_table.h It will allow us to remove a workaround we have elsewhere in the code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:28:55 +00:00
Emil Velikov	e776e0385c	anv: remove unneeded extern C notation Analogous to previous commit - never used in any C++ code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:28:18 +00:00
Emil Velikov	944620bc0e	radv: remove unneeded extern C notation Header is never #include(d) by a C++ source. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-21 18:28:15 +00:00
Rhys Kidd	4bf9862747	glsl/tests: Add UINT64 and INT64 types glsl/tests/uniform_initializer_utils.cpp:83:14: warning: enumeration value ‘GLSL_TYPE_UINT64’ not handled in switch [-Wswitch] switch (type->base_type) { ^ glsl/tests/uniform_initializer_utils.cpp:83:14: warning: enumeration value ‘GLSL_TYPE_INT64’ not handled in switch [-Wswitch] Fixes: `8ce53d4a2f` ("glsl: Add basic ARB_gpu_shader_int64 types") Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2017-02-21 18:03:14 +00:00
Eric Engestrom	6181ab9d77	docs: fix gamma correction link That link has been dead for 15 years... We could link to Archive.org [1] to get the last time this page existed, but I feel like Wikipedia is a better choice. [1] http://web.archive.org/web/20021211151318/http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-21 14:54:10 +00:00
Eric Engestrom	b347bbb63b	docs: add link to gallium doc Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-21 14:43:29 +00:00
Nicolai Hähnle	066a117be7	radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK The same PS epilog workaround as for 8-bit integer formats is required, since the CB doesn't do clamping. Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-21 10:45:13 +01:00
Nicolai Hähnle	6a1d9684f4	radeonsi: handle MultiDrawIndirect in si_get_draw_start_count Also handle the GL_ARB_indirect_parameters case where the count itself is in a buffer. Use transfers rather than mapping the buffers directly. This anticipates the possibility that the buffers are sparse (once ARB_sparse_buffer is implemented), in which case they cannot be mapped directly. Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type on <= CIK. v2: - unmap the indirect buffer correctly - handle the corner case where we have indirect draws, but all of them have count 0. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-21 10:45:02 +01:00
Nicolai Hähnle	550125e1e7	winsys/amdgpu: reduce max_alloc_size based on GTT limits Allocating huge buffers in VRAM is not a problem, but when those buffers start being migrated, the kernel runs into errors because it cannot split those buffers up for moving through GTT. This should fix intermittent failures of GL45-CTS.texture_buffer.texture_buffer_max_size Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-21 10:43:38 +01:00
Bas Nieuwenhuizen	8cff852ae2	radv: Don't flush at the start of a command buffer. The preamble flushes now and the rest is the responsibility of the app. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-21 09:20:03 +01:00
Bas Nieuwenhuizen	5241fb0ffb	radv: Flush in the initial preamble CS. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-21 09:19:58 +01:00
Bas Nieuwenhuizen	c121739c47	radv: Special case the initial preamble. For flushing we don't want to flush every third IB. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-21 09:19:53 +01:00
Bas Nieuwenhuizen	eac790811b	radv: Split emitting the cache flush out. So that we can use it without a cmd_buffer. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-21 09:19:45 +01:00
Bas Nieuwenhuizen	b6e0df2edd	radv: Free empty_cs on device destruction. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-21 09:18:50 +01:00
Ben Skeggs	8f4483b609	nvc0: use PascalB for most Pascal boards Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2017-02-21 10:01:16 +10:00
Dave Airlie	6dbb0eaccc	radv: handle subpass cache flushes This splits out the cache flush bit setting code dependent on the src/dest access flags. It then calls it from the subpass barrier code. It also marks a TODO to remove the aggressive CS/PS flushes at some point. This fixes a bunch of the dEQP-VK.renderpass.attachment_allocation.input_output.* tests. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-21 09:48:37 +10:00
Grazvydas Ignotas	66d1cb587a	r300g: only allow byteswapped formats on big endian They cause regressions on little endian. Fixes: `172bfdaa9e` ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-02-21 00:37:02 +01:00
Timothy Arceri	87687afb94	mesa: remove unused variable warning in release builds This assert might have made sense before but we no longer use gl_linked_shader here. Unless the caller has really done something crazy this assert is fairly useless. We also do some small tidy ups in this change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-21 08:46:04 +11:00
Emil Velikov	a40ebe73a1	docs/submittingpatches.html: document the Fixes tag Provide information and an example. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-20 18:21:22 +00:00
Emil Velikov	9e4248b206	docs/submittingpatches.html: remove version tag for nominations The version tag used to nominate has bitten even experienced mesa developers. Not to mention that it deviates from the one used in the kernel leading to further confusion. Simplify things and omit it altogether. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-20 18:21:22 +00:00
Emil Velikov	f9cdfa33c2	docs/submittingpatches.html: add #backports section Provide information about merge conflicts resolution and sending backports. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-20 18:21:22 +00:00
Emil Velikov	d7e0ff0e2b	docs/submittingpatches.html: rework the #criteria section Reword the section to focus on what is allowed, using a more brief, yet descriptive wording. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-20 18:21:22 +00:00
Emil Velikov	af9a4d9005	travis: bring the scons build on par with AppVeyor Namely, always build with LLVM and run the check target. Cc: Rhys Kidd <rhyskidd@gmail.com> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 18:21:22 +00:00
Ben Crocker	3f1b6ef2aa	gallivm: Reenable PPC VSX (v3) Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1, now that LLVM bug 26775 and its corollary, 25503, are fixed. Amendment: remove extraneous spaces in macro def & invocations. We would prefer a runtime check, e.g. via an LLVMQueryString (analogous to glGetString, eglQueryString) or LLVMGetVersion API, but no such API exists at this time. Signed-off-by: Ben Crocker <bcrocker@redhat.com> [Emil Velikov: remove LLVM_VERSION macro] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 18:21:22 +00:00
Ben Crocker	b934aae364	gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4) If llvm::sys::getHostCPUName() returns "generic", override it with "pwr8" (on PPC64LE). This is a work-around for a bug in LLVM: a table entry for "POWER8NVL" is missing, resulting in (big-endian) "generic" being returned on little-endian Power8NVL systems. The result is that code that attempts to load the least significant 32 bits of a 64-bit quantity in memory loads the wrong half. This omission should be fixed in the next version of LLVM (4.0), but this work-around should be left in place in case some future version of POWER<n> also ends up unrepresented in LLVM's table. This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion tests on POWER8NVL processors. (V4: add similar comment in the code.) Signed-off-by: Ben Crocker <bcrocker@redhat.com> Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 18:21:22 +00:00
Ben Crocker	a8e9c630f3	gallivm: Improve debug output (V2) Improve debug output from gallivm_compile_module and lp_build_create_jit_compiler_for_module, printing the -mcpu and -mattr options passed to LLC. V2: enclose MAttrs debug_printf block and llc -mcpu debug_printf in "if (gallivm_debug & <flags>)..." Signed-off-by: Ben Crocker <bcrocker@redhat.com> Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v2) [Emil Velikov: rebase] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 18:21:22 +00:00
Marek Olšák	e8c2a05662	gallium/u_suballoc: update comments as requested by Brian. Trivial.	2017-02-20 18:04:27 +01:00
Jonathan Gray	a042465c21	util/build-id: define ElfW and NT_GNU_BUILD_ID if needed Define ElfW() and NT_GNU_BUILD_ID if needed as these defines are not present on at least OpenBSD and FreeBSD. Fixes the build on OpenBSD. Fixes: `d4fa083e11` ("util: Add utility build-id code.") Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 16:39:24 +00:00
Mauro Rossi	41b5620492	android: define HAVE_DL_ITERATE_PHDR for build-id code Required due to `d4fa083` "util: Add utility build-id code." to avoid following build error and warnings: external/mesa/src/intel/vulkan/anv_device.c:60:32: error: incompatible integer to pointer conversion initializing 'const struct build_id_note ' with an expression of type 'int' [-Werror,-Wint-conversion] const struct build_id_note note = build_id_find_nhdr("libvulkan_intel.so"); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ external/mesa/src/intel/vulkan/anv_device.c:64:19: warning: implicit declaration of function 'build_id_length' is invalid in C99 [-Wimplicit-function-declaration] unsigned len = build_id_length(note); ^ external/mesa/src/intel/vulkan/anv_device.c:68:4: warning: implicit declaration of function 'build_id_read' is invalid in C99 [-Wimplicit-function-declaration] build_id_read(note, uuid, VK_UUID_SIZE); ^ 3 warnings and 1 error generated. [ 40% 1438/3588] target C: libmesa_vulkan_common_32 <= external/mesa/src/intel/vulkan/anv_image.c ninja: build stopped: subcommand failed. build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed make: *** [ninja_wrapper] Error 1 Fixes: `d4fa083e11` ("util: Add utility build-id code.") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 16:33:03 +00:00
Mauro Rossi	9e3d66c1e5	android: glsl: build shader cache sources Fixes the following building errors: external/mesa/src/compiler/glsl/linker.cpp:4642: error: undefined reference to 'shader_cache_read_program_metadata(gl_context, gl_shader_program)' external/mesa/src/mesa/program/ir_to_mesa.cpp:3135: error: undefined reference to 'shader_cache_write_program_metadata(gl_context, gl_shader_program)' clang++: error: linker command failed with exit code 1 ... external/mesa/src/mesa/program/ir_to_mesa.cpp:3135: error: undefined reference to 'shader_cache_write_program_metadata(gl_context, gl_shader_program)' external/mesa/src/compiler/glsl/linker.cpp:4642: error: undefined reference to 'shader_cache_read_program_metadata(gl_context, gl_shader_program)' clang++: error: linker command failed with exit code 1 (use -v to see invocation) ninja: build stopped: subcommand failed. build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed make: *** [ninja_wrapper] Error 1 Fixes: `9f8dc3bf03` ("utils: build sha1/disk cache only with Android/Autoconf") Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 16:30:37 +00:00
Mauro Rossi	933988901a	android: radeonsi: fix sid_table.h generated header include path generated-sources-dir-for macro replaces intermediates-dir-for and LOCAL_MODULE_CLASS is defined as required by new macro, in order to avoid the following building error: external/mesa/src/gallium/drivers/radeonsi/si_debug.c:29:10: fatal error: 'sid_tables.h' file not found ^ 1 error generated. Fixes: `730574c58e` ("android: ac/debug: move sid_tables.h generation and IB decode to amd/common") Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 16:23:13 +00:00
Emil Velikov	920b4d537f	docs: add news item and link release notes for 13.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-20 11:56:39 +00:00
Emil Velikov	85acb42522	docs: add sha256 checksums for 13.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `112e75f51b`)	2017-02-20 11:55:10 +00:00
Emil Velikov	2b06e91ded	docs: add release notes for 13.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `71f3ff57fa`)	2017-02-20 11:55:09 +00:00
Dave Airlie	0a44a680ff	vulkan/wsi/x11: add support to detect if we can support rendering (v3) This adds support to radv_GetPhysicalDeviceXlibPresentationSupportKHR and radv_GetPhysicalDeviceXcbPresentationSupportKHR to check if the local device file descriptor is compatible with the descriptor retrieved from the X server via DRI3. This will stop radv binding to an X server until we have prime support in place. Hopefully apps use this API before trying to render things. v2: drop unneeded function, don't leak memory. (jekstrand) v3: also check in surface_get_support callback. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-20 12:53:52 +10:00
Dave Airlie	1f6376935b	Revert "radv: detect command buffers that do no work and drop them (v2)" This just keeps popping up minor problems and regressions we should revisit in a more sustainable manner later. This also reverts: Revert "radv: query cmds should mark a cmd buffer as having draws." Revert "radv: also fixup event emission to not get culled." This reverts commit `d1640e7932`. This reverts commit `8b47b97215`. This reverts commit `b4b19afebe`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-20 09:00:40 +10:00
Bas Nieuwenhuizen	81b2379664	radv: Handle VK_REMAINING_ARRAY_LAYERS in fast clear eliminate. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-19 20:58:06 +01:00
Marek Olšák	c8ef512398	gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally It's OK for r300g (because r300g can't write to buffers via the GPU), but not later hardware. This issue was spotted randomly. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-19 17:16:26 +01:00
Marek Olšák	a264fee624	radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2) start can only be non-zero with MultiDrawElements, which is unlikely to occur with UNSIGNED_BYTE indices. v2: Also fix the util_shorten_ubyte_elts_to_userptr call. Tested with the new piglit. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-19 17:16:26 +01:00
Dave Airlie	9aec76aca3	radv: handle layered fast clears. This iterates the fast clear flush across the layers in the specified range. It also moves the compute resolve flush into the function and builds the range in there. This fixes: dEQP-VK.geometry.layered.* regressions since fast clears. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-02-19 20:30:01 +10:00
Dave Airlie	efc89edf5a	radv: pass subresourceRange by pointer. This struct is 5 dwords, we should really just pass a pointer to it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-19 20:28:22 +10:00
Dave Airlie	2b3c490e23	radv: fix typo in a2b10g10r10 fast clear calculation. This fixes: dEQP-VK.renderpass.formats.a2b10g10r10_unorm_pack32* regressions. Fixes: `f22836dbdd` radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-02-19 20:27:28 +10:00
Bas Nieuwenhuizen	c7fcaf2314	radv: Invert ring SGPR check. I assume this wants to check if all pipelines use the same SGPR for the rings. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-19 10:13:11 +01:00
Bas Nieuwenhuizen	e12cf3f9bf	radv: Clamp framebuffer dimensions to min. attachment dimensions. Even though the preferred stance is not to fix incorrect applications via the driver, this prevents some nasty GPU hangs. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-19 10:13:01 +01:00
Marek Olšák	ad019bf5c6	gallium: remove TGSI_OPCODE_CLAMP Not used and not widely supported. Use MIN+MAX instead. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	675ef9c0c7	ac/llvm: use min+max instead of AMDGPU.clamp on LLVM 5.0 It selects v_med3_f32, which has the same rate & size. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	660b55e6d9	radeonsi: stop using TGSI_OPCODE_CLAMP by moving it amd/common Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	73d1c8c686	tgsi/lowering: stop using TGSI_OPCODE_CLAMP v2: do it correctly Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	1d1b769561	st/mesa: stop using TGSI_OPCODE_CLAMP Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 02:58:43 +01:00
Marek Olšák	45240ce598	radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	a41587433c	gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	9434421213	gallium/radeon: change r600_aligned_buffer_create to take flags, not bind All call sites set bind = 0. The next commit will use this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	ac6007460a	radeonsi: upload constants into VRAM instead of GTT This lowers lgkm wait cycles by 30% on VI and normal conditions. There might be a measurable improvement when CE is disabled (radeon) or under L2 thrashing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	a550fbb510	gallium/radeon: use TCC line size as alignment in other places Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	791e8ce04a	radeonsi: use a clever alignment for index buffer uploads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	d6c8c26851	radeonsi: use a clever alignment for descriptor uploads Non-VBO descriptors won't be smaller than the cache line, so simply use the cache line size. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	6b73aafceb	radeonsi: use a clever alignment for constant buffer uploads This results in a very tiny decrease in lgkm wait cycles. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	620aded541	radeonsi: move index buffer flushing into a non-upload indexed case The other codepaths don't need this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	22b8a773e1	radeonsi: use SI_MAX_ATTRIBS where it should be used for consistency; no change in behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	054f853035	radeonsi: sort members of si_shader_key::part and improve some comments Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	1fabb29717	radeonsi: have separate LS and ES main shader parts in the shader selector This might reduce the on-demand compilation if the initial VS/LS/ES determination is wrong. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	a02117ba6e	radeonsi: don't compile pure monolithic shaders asynchronously there is no point, we have to wait anyway. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	9b91e0b54c	radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI So that we can disable u_vbuf for GL core profiles. This is a v2 of the previous VI-only patch. It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	2fb021b620	radeonsi: remove the fix_size3 workaround not needed with the shader fallback Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	dbd38f2a92	radeonsi: add a workaround for clamping unaligned RGB 8 & 16-bit vertex loads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	41a2157a68	radeonsi: make fix_fetch an array of uint8_t so that we can add 3-component fallbacks. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	f246ae1ee9	vl: fix a buffer leak in the bicubic filter by using an uploader there's no error checking, because the previous code didn't do it either. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	c8d84801b7	gallium/hud: create files after graphs are created to get final names Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	22c34bbc55	gallium/u_suballoc: allow setting pipe_resource::flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	edf6bcf6c6	gallium/u_suballoc: use clear_buffer if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	02cd8b20d1	gallium/util: correctly unref a buffer in u_prim_restart Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	42297c862f	gallium/util: remove unused u_index_modify helpers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	7f0bf00dc9	gallium/util: remove unused helper util_draw_texquad Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	b5b0936677	gallium/docs: remove documentation of non-existent instructions trivial	2017-02-18 01:22:08 +01:00
Jason Ekstrand	5f02c2a054	anv/TODO: Check off Storage Image Without Format The code for this landed a few days ago.	2017-02-17 14:18:34 -08:00
Marek Olšák	edd23e0606	ac/llvm: fix various findMSB bugs sffbh needs to be suffixed with ".i32" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-18 06:24:32 +10:00
Jose Maria Casanova Crespo	429f112a11	glsl: link error if unsized array not-last in ssbo If an unsized declared array is not the last in an SSBO and an implicit size can not be defined on linking time, the linker should raise an error instead of reaching an assertion on GL. This reverts part of commit `3da08e1664` getting back to the behavior of commit `5b2675093e` The original patch was correct for GLES that should produce a compile-time error but the linker error is still necessary in desktop GL. Fixes the following piglit tests: tests/spec/arb_shader_storage_buffer_object/non_integral_size_array_member.shader_test tests/spec/arb_shader_storage_buffer_object/unsized_array_member.shader_test Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2017-02-17 15:49:16 +02:00
Lionel Landwerlin	a0ac118398	i965/fs: fix uninitialized memory access Found while running shader-db under valgrind. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-17 10:07:56 +00:00
Timothy Arceri	62c90492ef	glsl: disable on disk shader cache when running as another user Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 20:21:22 +11:00
Alejandro Piñeiro	966ddd5d3d	mesa/formatquery: use consistent local function names Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-02-17 10:17:54 +01:00
Bas Nieuwenhuizen	d5bf4c7394	radv: Use different allocator for descriptor set vram. This one only keeps allocated memory in the list, and list nodes in the descriptor sets. This doesn't need messing around with max_sets, and we get automatic merging of free regions. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-17 09:28:23 +01:00
Bas Nieuwenhuizen	f448701622	radv: Never try to create more than max_sets descriptor sets. We only use the freed ones after all free space has been used. If the app only allocates small descriptor sets, we might go over max_sets before the memory is full. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org> Fixes: `f4e499ec79`	2017-02-17 09:28:14 +01:00
Samuel Iglesias Gonsálvez	fccbad73ef	i965/fs: fix 32-bit data type to int64 conversion on BSW/BXT The 32-bit to 64-bit conversions need to have the 32-bit data source elements aligned to 64-bit but only with doubles as destination type. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-17 06:50:22 +01:00
Timothy Arceri	172c48cc15	glsl: fix scons builds with shader cache For now it's disabled for scons so wrap glsl cache calls in a define conditional.	2017-02-17 16:31:47 +11:00
Timothy Arceri	a2bf0954fb	util/disk_cache: fix typo in function stub	2017-02-17 15:54:00 +11:00
Jason Ekstrand	b073811617	i965/fs: Remove hand-coded 64-bit packing optimizations The optimization in unpack_64 is clearly subsumed with the opt_algebraic optimizations in the previous commit. The pack optimization may not be quite handled by opt_algebraic but opt_algebraic should get the really bad cases. Also, it's been broken since it was merged and we've never noticed so it must not be doing anything. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-16 17:28:03 -08:00
Jason Ekstrand	70e86a3f2d	nir/algebraic: Optimize 64bit pack/unpack This reduces the instruction count in some fp64 and int64 piglit tests Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-16 17:28:03 -08:00
Jason Ekstrand	e10f522cd7	nir: Rename lower_double_pack to lower_64bit_pack There's nothing "double" about it other than, perhaps, the fact that it packs two 32-bit values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-16 17:28:03 -08:00
Jason Ekstrand	161d3e81be	nir: Combine the int and double [un]pack opcodes NIR is a typeless IR and the two opcodes, when considered bitwise, do exactly the same thing. There's no reason to have two versions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-16 17:28:03 -08:00
Jason Ekstrand	a4393bd97f	i965/fs: Fix the inline nir_op_pack_double optimization We can only do the optimization if the source is SSA. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-16 17:28:03 -08:00
George Kyriazis	e2abe80bee	swr: remove unneeded extern "C" the guards have been added to the header files that needed them. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-16 18:22:27 -06:00
George Kyriazis	d4b4a511f6	gallium: add extern "C" guards Added extern "C" __cplusplus guards on headers that did not have them. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-16 18:22:27 -06:00
Timothy Arceri	a3ab09f90f	util/disk_cache: check cache exists before calling munmap() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	512c046edd	util/disk_cache: add support for removing old versions of the cache Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	3342ce452c	util/disk_cache: allow drivers to pass a directory structure In order to avoid costly fallback recompiles when cache items are created with an old version of Mesa or for a different gpu on the same system we want to create directories that look like this: ./{TIMESTAMP}_{LLVM_TIMESTAMP}/{GPU_ID} Note: The disk cache util will take a single timestamp string, it is up to the backend to concatenate the llvm string with the mesa string if applicable. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	87009681a5	mesa: remove cache creation from _mesa_initialize_context() We will change the way we create the cache directory in the following patches. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	6602d0401c	st/mesa/glsl: build string of dri options and use as input to building sha for shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	ed61530121	glsl: reserve parameter storage on cache restore Since we know how big the list will be we can allocate the storage upfront. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	1183eb487f	glsl: don't try to load/store buffer object values in the cache Also add an assert to catch buffer overflows. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	cad1a9bfde	glsl: don't reprocess or clear UBOs on cache fallback Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	01d1e5a7ad	glsl: skip more uniform initialisation when doing fallback linking We already pull these values from the metadata cache so no need to recreate them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	794f7326bc	glsl: don't lose uniform values when falling back to full compile Here we skip the recreation of uniform storage if we are relinking after a cache miss. This is important because uniform values may have already been set by the application and we don't want to reset them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	0e9991f957	glsl: don't reference shader prog data during cache fallback We already have a reference. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	2f19accc5e	mesa/glsl: add cache_fallback flag to gl_shader_program_data This will allow us to skip certain things when falling back to a full recompile on a cache miss such as avoiding reinitialising uniforms. In this change we use it to avoid reading the program metadata from the cache and skipping linking during a fallback. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	e3adde023b	glsl: add api and glsl version to hash generation for shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	dc0c0c176d	glsl: cache uniform values These may be lowered constant arrays or uniform values that we set before linking so we need to cache the actual uniform values. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	49f3439089	glsl: make uniform values helper available for use elsewhere Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	bb16cf805d	glsl: cache some more image metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	a3ff840d05	glsl: add support for caching atomic buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	3d15d814c0	glsl: add shader cache support for buffer blocks Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	6761259958	glsl: store subroutine remap table in shader cache V2: use new helpers to store/restore table entries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	787535fb11	glsl: add support for caching subroutines Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	0057de58f9	glsl: add support for caching shaders with xfb qualifiers For now this disables the shader cache when transform feedback is enabled via the GL API as we don't currently allow for it when generating the sha for the shader. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	3bbfee3cd3	glsl: add shader cache support for samplers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	c4cff5f402	glsl: add basic support for resource list to shader cache This initially adds support for simple uniforms and varyings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	3c45d8f464	glsl: fix uniform remap table cache when explicit locations used V2: don't store pointers use an enum instead to flag what should be restored. Also do the work in a helper that we will later use for the subroutine remap table. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Carl Worth	a01973a784	glsl: Serialize three additional hash tables with program metadata The three additional tables are AttributeBindings, FragDataBindings, and FragDataIndexBindings. The first table (AttributeBindings) was identified as missing by trying to test the shader cache with a program that called glGetAttribLocation. Many thanks to Tapani Pälli <tapani.palli@intel.com>, as it was review of related work that he had done previously that pointed me to the necessity to also save and restore FragDataBindings and FragDataIndexBindings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	e5bb4a0b0f	glsl: use correct shader source in case of cache fallback The scenario is: glShaderSource glCompileShader <-- deferred due to cache hit of shader glShaderSource <-- with new source code glAttachShader glLinkProgram <-- no cache hit for program At this point we need to compile the original source when we fallback. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	8771940682	glsl: make use of on disk shader cache The hash key for glsl metadata is a hash of the hashes of each GLSL source string. This commit uses the put_key/get_key support in the cache to put the SHA-1 hash of the source string for each successfully compiled shader into the cache. This allows for early, optimistic returns from glCompileShader (if the identical source string had been successfully compiled in the past), in the hope that the final, linked shader will be found in the cache. This is based on the initial patch by Carl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	34ca0fce22	glsl: add initial implementation of shader cache This uses disk_cache.c to write out a serialization of various state that's required in order to successfully load and use a binary written out by a driver's backend; this state is referred to as "metadata" throughout the implementation. This initial version is intended to work with all stages besides compute. This patch is based on the initial work done by Carl. V2: extend the file's doxygen comment to cover some of the design decisions. V3: - skip cache for fixed function shaders - add int64 support - fix glsl IR program parameter caching/restore and cache the parameter values which are used by gallium backends. - use new link status enum V4: - add compute program support Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Dave Airlie	b0232d98e9	radeonsi: use shared emit_umsb helper. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	ebed22ec67	radv/ac: use shared umsb helper. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	0ec66b9969	radeon/ac: add emit umsb shared code. Since we shared imsb, makes sense to share umsb. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	4617ad07e0	radeon/ac: use llvm.amdgcn.sffbh intrinsic instead of AMDGPU.flbit.i32 Use the newer intrinsic. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:16 +00:00
Dave Airlie	e933331cd7	radeonsi: use shared emit imsb code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:15 +00:00
Dave Airlie	fb15a1e9dd	radv/ac: use shader imsb emission code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:15 +00:00
Dave Airlie	cae1ff1a4b	radeon/ac: add ac_emit_imsb helper. We want to use a different intrinsic on newer llvm, so move this code to a shared area. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 22:57:15 +00:00
Emil Velikov	40bf7ba023	egl: _eglFilterArray's filter is always non-null Drop the extra handling and assert() if things change in the future. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:27:20 +00:00
Emil Velikov	b8ae2fe3e6	docs: add hyperlink to the releasing documentation Other files such as xlibdriver.html and versions.html explicitly left out, for now. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:25:02 +00:00
Emil Velikov	cadf174866	util/disk_cache: do not allow space in MESA_GLSL_CACHE_MAX_SIZE No other env var used in mesa allows for space in the variable contents. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-02-16 15:22:17 +00:00
Emil Velikov	350e8e821f	configure.ac: remove unneeded trailing semicolon Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	78c747e820	r100: use correct libdrm_radeon macro Remove local definition of RADEON_INFO_TILE_CONFIG and use the correct macro provided by libdrm_radeon RADEON_INFO_TILING_CONFIG. Latter was present as of libdrm 2.4.22, circa 2010. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	c8f1f2dc2d	winsys/radeon: remove fall-back defines Provided by libdrm as of last commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	f3637b3a1e	configure.ac: bump LIBDRM_RADEON requirement to 2.4.71 Such that we can remove all the local fall-back definitions and use the official UABI ones. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	389478c4e9	bin/get-fixes-pick-list.sh: add new script The script parses the "Fixes" tags and nominates respective commit if applicable. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	f1b0b75099	bin/get-pick-list.sh: remove ancient way of nominating patches The old way of nominating patches [NOTE: .*[Cc]andidate] was deprecated and has been unused for approx. 3 years. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	d6b1d11d4f	bin/get-pick-list.sh: limit `git grep ...' only as needed Analogous to previous commit. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	d292f12d94	bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed The currently used range HEAD..origin/master is far too broad. It looks for nominations within the already_landed list (branchpoint..HEAD). Similarly we look for already_landed within the [possible] nominations range branchpoint..origin/master. Improve things by limiting the look ups to the branch point. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	71e00d62ed	bin/get-extra-pick-list: rework to use already_picked list Currently we loop (git log --grep) to check if the fix has landed. We can simplify and make things faster by storing the already_picked list and grepping through it. Slim down the message while we're here. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:52 +00:00
Emil Velikov	cb1947eac7	bin/get-extra-pick-list: use git merge-base to get the branchpoint Since mesa development history is linear and the only diversion is at the branchpoint. Thus we can drop the ad-hoc parsing and use git merge-base to retrieve it. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:51 +00:00
Emil Velikov	1c0a536a72	docs: provide some tips where to obtain Mesa binaries Mention the generic channels (PPA, Corp, other) as well as give a couple of examples. Even if the latter became out of date the former should be a good guide. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:51 +00:00
Emil Velikov	99266ec3ce	docs/submittingpatches: assorted grammar fixes Cc: Ben Crocker <bcrocker@redhat.com> Suggested-by: Ben Crocker <bcrocker@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-16 15:17:51 +00:00
Emil Velikov	e280a6bc8a	docs/releasing: update the website section Things are automated via git hooks. Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> --- Guys, let me know when things are in place.	2017-02-16 15:17:51 +00:00
Emil Velikov	652e367d5f	docs/releasing: tweak the glxinfo/glxgear/etc. command lines Print only the information needed. Namely: info: the DRI module picked and the vendor/renderer strings gears: everything but the "...configuration file..." line(s) v2: (Eric) Use "2>&1 \|" over "\|&", properly escape &. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-16 15:17:51 +00:00
Emil Velikov	f9b18d5acc	docs/releasing: build test the scons/mingw build We had multiple cases in the past where files used only by the Scons/MinGW/Windows build were missing. Avoid such instances and add a step to catch them early. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-16 15:17:51 +00:00
Dave Airlie	03f4982c68	nir: handle some 64-bit integer conversions These are enough for the spir-v generator to handle UConvert and SConvert operations, and fix the 4 tests in CTS. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:13:21 +10:00
Dave Airlie	adb9555794	nir: handle 64-bit integer types in glsl->nir type conversion. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:13:14 +10:00
Dave Airlie	14167080e2	spirv: handle SpvOpUConvert in proper place. This was falling into the quantizetof16 path. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:11:59 +10:00
Dave Airlie	2d0b145902	spirv: add support for Int64 capability This just adds the support at the spirv->nir level for the Int64 cap. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:11:13 +10:00
Dave Airlie	48ebdbecc5	spirv/nir: add support for int64 This adds the spirv->nir conversion for int64 types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:11:05 +10:00
Dave Airlie	7593f2ac1b	nir/types: add C accessors for 64-bit integer types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:10:45 +10:00
Dave Airlie	b292e662fc	radv: add fast color clear for b10g11r11 This is used in DOOM, so provide the fast clear path for it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:09:15 +10:00
Timothy Arceri	e6506b3cd2	mesa: retain gl_shader_programs after glDeleteProgram if they are in use Fixes regressions from `c505d6d852`. Switching from using gl_shader_program to gl_program for the pipeline objects CurrentProgram array meant we were freeing gl_shader_programs immediately after glDeleteProgram was called, but the spec states the program should only get deleted once it is no longer in use. To work around this we add a new ReferencedPrograms array to track gl_shader_programs in use. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-16 15:01:41 +11:00
Timothy Arceri	300900516d	mesa: remove tabs in dri xmlconfig.c Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-16 14:47:13 +11:00
Timothy Arceri	703b592f7a	mesa: style fixes for dri xmlconfig.c Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-16 14:47:13 +11:00
Chris Wilson	ed442ee39b	i965: Do not use purged bo after calling glObjectUnpurgeable If the buffer has been freed by the kernel under memory pressure, it is invalid to try and access the backing storage for that buffer in the future - the backing storage is not recreated automatically. As such we need to mark the GL object as being freed for unretained buffers and so recreate the object on next use. Furthermore from the GL_APPLE_object_purgeable: "In contrast, by calling ObjectUnpurgeableAPPLE with an <option> of UNDEFINED_APPLE, the application is indicating that it intends to recreate the contents of the storage from scratch. Further, the application is is stating that it would like the GL to do only the minimal amount of work set PURGEABLE_APPLE to FALSE. If ObjectUnpurgeableAPPLE is called with the <option> set to UNDEFINED_APPLE, then ObjectUnpurgeableAPPLE will return the value UNDEFINED_APPLE." we must always report GL_UNDEFINED_APPLE when called with glObjectUnpurgeable(GL_UNDEFINED_APPLE). Testcase: piglit/object_purgeable-api-* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-15 17:00:42 -08:00
Matt Turner	a1891da7c8	Revert "i915: Always enable GL 2.0 support." This partially reverts commit `97217a40f9`. It leaves ES 2.0 support in place per Ian's suggestion, because ES 2.0 is designed to work on hardware like i915. Chrome only uses the GPU if you have GL >= 2.0, and using i915 (and prog_execute) actually hurt performance compared with the software paths.	2017-02-15 14:52:27 -08:00
Matt Turner	656e30b686	anv: Use build-id for pipeline cache UUID. The --build-id=... ld flag has been present since binutils-2.18, released 28 Aug 2007. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-15 13:59:51 -08:00
Matt Turner	d4fa083e11	util: Add utility build-id code. Provides the ability to read the .note.gnu.build-id section of ELF binaries, which is inserted by the --build-id=... flag to ld. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-15 13:59:51 -08:00
Bas Nieuwenhuizen	4e6095ff61	radv: Add support for shaderStorageImageReadWithoutFormat. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-15 21:18:21 +01:00
Bas Nieuwenhuizen	501a4c0d73	spirv: Add support for SpvCapabilityStorageImageReadWithoutFormat. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-15 21:18:18 +01:00
Bas Nieuwenhuizen	53873697e4	radv: Add support for shaderStorageImageWriteWithoutFormat. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-15 21:18:13 +01:00
Eduardo Lima Mitev	633c959fae	getteximage: Return correct error value when texture object is not found glGetTextureSubImage() and glGetCompressedTextureSubImage() are currently returning INVALID_OPERATION error when the passed texture argument does not correspond to an existing texture object. However, the error should be INVALID_VALUE instead. From OpenGL 4.5 spec PDF, section '8.11. Texture Queries', page 236: "An INVALID_VALUE error is generated if texture is not the name of an existing texture object." Same wording applies to the compressed version. The INVALID_OPERATION error is coming from the call to _mesa_lookup_texture_err(). This patch uses _mesa_lookup_texture() instead and emits the correct error in the caller. Fixes: GL45-CTS.get_texture_sub_image.errors_test Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-15 19:37:21 +01:00
Jason Ekstrand	a9a517f530	util: Fix a typo in Makefile.sources Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-15 10:27:42 -08:00
Lionel Landwerlin	569231c55e	i965: define default allow_higher_compat_version value Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `9d16f3903e` ("driconf: add allow_higher_compat_version option")	2017-02-15 17:03:31 +00:00
Samuel Pitoiset	124d9dd57f	drirc: add allow_higher_compat_version for Tropico 5 v2: s/force_compat_profile/allow_higher_compat_version Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-15 16:15:54 +01:00
Samuel Pitoiset	76c6d85cbd	drirc: add allow_higher_compat_version for Crookz - The Big Heist v2: s/force_compat_profile/allow_higher_compat_version Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-15 16:15:54 +01:00
Samuel Pitoiset	34d587abc2	drirc: add allow_higher_compat_version for Worms WMD v2: s/force_compat_profile/allow_higher_compat_version Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-15 16:15:54 +01:00
Samuel Pitoiset	9d16f3903e	driconf: add allow_higher_compat_version option Mesa currently doesn't allow to create 3.1+ compatibility profiles mainly because various features are unimplemented and bugs can happen. However, some buggy apps request a compat profile without using any old features unimplemented in mesa, and they fail to start. This option should help some games to run but it's not enough for all (eg. Dying Light). v2: - s/force_compat_profile/allow_higher_compat_version Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-15 16:15:32 +01:00
Marek Olšák	d1fae627fa	gallium/radeon: add a HUD query for monitoring the CS thread activity Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-15 14:35:52 +01:00
Lionel Landwerlin	0fcb92c17d	anv: wsi: report presentation error per image request vkQueuePresentKHR() takes a VkPresentInfoKHR pointer and includes a pResults field which must hold the results of all the images requested to be presented. Currently we're not filling this field. Also as a side effect we probably want to go through all the images rather than stopping on the first error. This commit also makes the QueuePresentKHR() implementation return the first error encountered. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-15 11:43:05 +00:00
Eric Engestrom	fc9b119013	egl: remove duplicate 0 assignment The memset on the line before already takes care of this. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-15 08:57:05 +00:00
Hans de Goede	4c66f529a8	glx/glvnd: Fix GLXdispatchIndex sorting Commit `8bca8d89ef` ("glx/glvnd: Fix dispatch function names and indices") fixed the sorting of the array initializers in g_glxglvnddispatchfuncs.c because FindGLXFunction's binary search needs these to be sorted alphabetically. That commit also mostly fixed the sorting of the DI_foo defines in g_glxglvnddispatchindices.h, which is what actually matters as the arrays are initialized using "[DI_foo] = glXfoo," but a small error crept in which at least causes glXGetVisualFromFBConfigSGIX to not resolve, breaking games such as "The Binding of Isaac: Rebirth" and "Crypt of the NecroDancer" from Steam and possibly causing other problems too. This commit fixes the last of the sorting errors, fixing these mentioned games not working. Fixes: `8bca8d89ef` ("glx/glvnd: Fix dispatch function names and indices") Cc: "13.0" <mesa-stable@lists.freedesktop.org> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Cc: Adam Jackson <ajax@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-15 09:55:57 +01:00
Dave Airlie	b4b19afebe	radv: also fixup event emission to not get culled. This is possibly a bad idea, I might have to consider a better one. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 00:36:30 +00:00
Jason Ekstrand	bfbb362601	anv: Use vk_foreach_struct for handling extension structs Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-14 16:15:39 -08:00
Jason Ekstrand	f76584e7b7	util: Add helpers for iterating over Vulkan extension structs Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-14 16:15:39 -08:00
Dave Airlie	d1640e7932	radv: query cmds should mark a cmd buffer as having draws. This fixes a regression with the remove non-draw cmd buffers in queries. Fixes: `8b47b97215` radv: detect command buffers that do no work and drop them (v2) Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 00:02:33 +00:00
Kenneth Graunke	a3e4fa5495	glsl: Handle packed_type == ivec4[] in lower_packed_varyings(). For GS input arrays, we may turn a packed_type of ivec4 into an array of ivec4s. We still want flat qualification. Found by inspection. Not known to help anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-14 14:47:40 -08:00
Jason Ekstrand	f434a60a53	anv: Implement the Skylake stencil PMA optimization Unfortunately, this doesn't substantially improve the performance of any known apps. With Dota 2 on my Sky Lake gt4, it seems to help by somewhere between 0% and 1% but there's enough noise that it's hard to get a clear picture. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	d665c51eea	genxml: Add the CACHE_MODE_0 register on gen9 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	028e1137e6	anv/pipeline: Be smarter about depth/stencil state It's a bit hard to measure because it almost gets lost in the noise, but this seemed to help Dota 2 by a percent or two on my Broadwell GT3e desktop. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	215fed7318	anv/pipeline: Make a copy of VkPipelineDepthStencilStateCreateinfo Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	e8d52dab48	anv: Add support for the PMA fix on Broadwell This helps Dota 2 on Broadwell by 8-9%. I also hacked up the driver and used the Sascha "shadowmapping" demo to get some results. Setting uses_kill to true dropped the framerate on the demo by 25-30%. Enabling the PMA fix brought it back up to around 90% of the original framerate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	62bba4ba2d	genxml: Add the CACHE_MODE_1 register on gen8 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	6ce8592836	anv: Disable stencil writes when both write masks are zero Vulkan doesn't have a stencilWriteEnable bit like it does for depth. Instead, you have a stencil mask. Since the stencil mask is handled as dynamic state, we have to handle it later during command buffer construction. This, combined with a later commit, seems to help Dota2 on my Broadwell GT3e desktop by a couple percent because it allows the hardware to move the depth and stencil writes to early in more cases. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-14 14:18:55 -08:00
Jason Ekstrand	114c281e70	anv/entrypoints: Only generate entrypoints for supported features This changes the way anv_entrypoints_gen.py works from generating a table containing every single entrypoint in the XML to just the ones that we actually need. There's no reason for us to burn entrypoint table space on a bunch of NV extensions we never plan to implement. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 14:18:55 -08:00
Connor Abbott	6319bfc2a6	anv: fix Get*MemoryRequirements for !LLC Even though we supported both coherent and non-coherent memory types, we effectively forced apps to use the coherent types by accident. Found by inspection, only compile tested. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-14 13:05:44 -08:00
Marek Olšák	b5eb38f071	radeonsi: implement uploading zero-stride vertex attribs This is the only kind of user buffer we can get with the GL core profile. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 22:04:35 +01:00
Marek Olšák	b8f3b00742	gallium/radeon: include SDMA in the GPU load query Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	579ffe81f1	gallium/hud: add monitoring of API thread busy status Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	626e4ef18f	gallium/u_queue: add util_queue_get_thread_time_nano Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	6c61a8bfc6	gallium/os: add per-thread time clock queries Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	5d19b503af	st/mesa: tell u_vbuf that GL core doesn't have user VBOs I think this only affects radeonsi - VI, because all other drivers using u_vbuf probably don't support GL_DOUBLE, so they won't be affected by this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	e0f95ddd3e	gallium: let state trackers tell u_vbuf whether user VBOs are possible This can affect whether u_vbuf will be enabled or not. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	0561b3c75a	vdpau: skip vlVdpOutputSurfacePutBitsNative with a zero-area rectangle This prevents errors: "EE r600_texture.c:1571 r600_texture_transfer_map - failed to create temporary texture to hold untiled copy" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99542 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	c196efcf03	gallium/radeon: add an assertion to texture_transfer_map for app bugs Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2017-02-14 21:47:51 +01:00
Marek Olšák	4c36553a46	radeonsi: implement legacy GL_DOUBLE vertex formats so that we can disable u_vbuf for GL core profiles. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	2c8ee2e825	radeonsi: clean up si_get_param has_streamout is always true Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:47:51 +01:00
Marek Olšák	4fe1fd4df4	gallium/hud: don't use user vertex buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	00d170a5c3	gallium/hud: call u_upload_alloc only once Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	5699c8a2f7	gallium/u_upload_mgr: remove deprecated function u_upload_buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	2ca3548eb9	gallium/radeon: remove the internal u_upload_mgr pointer also remove the BIND flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	1e20112abd	st/mesa: use the common uploader (v2) v2: use const_uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	d3de8e1096	gallium/vl: use the common uploader Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	b1dc347822	gallium/vbuf: use the common uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	5fe5321633	gallium/blitter: use the common uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	8a84585951	gallium/primconvert: use the common uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	9f78ec39e9	gallium/hud: use the common uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	55ad59d2b7	gallium: set pipe_context uploaders in drivers (v3) Notes: - make sure the default size is large enough to handle all state trackers - pipe wrappers don't receive transfer calls from stream_uploader, because pipe_context::stream_uploader points directly to the underlying driver's stream_uploader (to keep it simple for now) v2: add error handling to nv50, nvc0, noop v3: set const_uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	998396c32e	gallium/u_upload_mgr: add a helper that creates the default uploader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	d71bc0d741	gallium: add common uploaders into pipe_context (v2) For lower memory usage and more efficient updates of the buffer residency list. (e.g. if drivers keep seeing the same buffer for many consecutive "add" calls, the calls can be turned into no-ops trivially) v2: add const_uploader, add documentation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Dave Airlie	3360dbe0c1	radv: fixup IA_MULTI_VGT_PARAM handling. This ports the remains of the workarounds from radeonsi for the non-TESS cases. It should provide equivalent workarounds for hawaii and bonaire. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 20:29:19 +00:00
Dave Airlie	a465eae38f	radv: fix warning since using common gs emit code Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 20:02:13 +00:00
Dave Airlie	09bf5491c4	radv: adopt some init config workarounds from radeonsi. Just one bonaire fix. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 05:02:33 +10:00
Dave Airlie	eea562f875	radv: re-enable init gfx state on CIK. Once the color alignment was fixed this works fine now. Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 05:02:29 +10:00
Dave Airlie	5e988ac61f	radv: align the initial state command buffer. This just adds the padding to align this to an 8 dword boundary. Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 05:02:21 +10:00
Dave Airlie	0f1a4220a6	radv: fix cik macroModeIndex. This is just a CIK fix ported from radeonsi. Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 05:02:13 +10:00
Dave Airlie	06ffd29925	radv: change base alignment for allocated memory. On some CIK (Hawaii) this needs to be at least 64k, I'm not 100% sure it doesn't need to be 128k. This was causing fast clear eliminate to overwrite the previous buffer, which since my gfx init code, was the indirect buffer. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99692 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-15 04:59:57 +10:00
Alex Smith	924a8cbb40	anv: Add support for shaderStorageImageWriteWithoutFormat This allows shaders to write to storage images declared with unknown format if they are decorated with NonReadable ("writeonly" in GLSL). Previously an image view would always use a lowered format for its surface state, however when a shader declares a write-only image, we should use the real format. Since we don't know at view creation time whether it will be used with only write-only images in shaders, create two surface states using both the original format and the lowered format. When emitting the binding table, choose between the states based on whether the image is declared write-only in the shader. Tested on both Sascha Willems' computeshader sample (with the original shaders and ones modified to declare images writeonly and omit their format qualifiers) and on our own shaders for which we need support for this. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-14 08:16:52 -08:00
Alex Smith	94d48b7f9f	spirv: Add support for SpvCapabilityStorageImageWriteWithoutFormat Allow that capability if the driver indicates that it is supported, and flag whether images are read-only/write-only in the nir_variable (based on the NonReadable and NonWritable decorations), which drivers may need to implement this. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 08:16:52 -08:00
Iago Toral Quiroga	5c6eaa1421	nir/spirv: do not require a format with images that are not sampled As soon as we support shaderStorageImageWriteWithoutFormat we can see write-only images (sampled == 2) that don't have a format specified. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-14 08:16:52 -08:00
Jason Ekstrand	2c30918581	anv/apply_pipeline_layout: Set image.write_only to false This makes our driver robust to changes in spirv_to_nir which would set this flag on the variable. Right now, our driver relies on spirv_to_nir not setting var->data.image.write_only for correctness. Any patch which implements the shaderStorageImageWriteWithoutFormat will need to effectively revert this commit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 08:16:45 -08:00
Jason Ekstrand	f8dfe9b826	intel/isl: Add format metadata for typed reads/writes This adds two columns to the format table as well as two helpers for determining whether or not a given format is supported for typed reads and writes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 07:50:13 -08:00
Jason Ekstrand	0ef14cdc98	anv/cmd_buffer: Return a VkResult from verify_cmd_parser This fixes a "statement with no effect" compiler warning Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 07:50:13 -08:00
Ilia Mirkin	956556b3c3	nvc0: disable linked tsc mode in compute launch descriptor Empirically, this makes things work. Presumably this was originally copied from the blob, which does make use of linked tsc mode. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-02-13 20:10:53 -05:00
Anuj Phogat	5e2909e732	mesa: Add EXT_frag_depth bits and enable it on all drivers Passes the newly added piglit test for this extension on i965. V2: Fix comments by Ilia. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-13 16:08:40 -08:00
Dave Airlie	b3b4114a0f	radeonsi: use common sendmsg emission function. This just ports radeonsi to use the sendmsg common code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 00:03:22 +00:00
Dave Airlie	e3324e0c60	radv/ac: use sendmsg emission interface. This uses the common code to emit the correct intrinsic. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 00:03:18 +00:00
Dave Airlie	f32955be43	radeon/ac/llvm: add support for sendmsg emission This lets us use the new intrinsic on the correct version of llvm. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 00:02:50 +00:00
Dave Airlie	f77d2871ac	radv: disable gfx init on CIK for now Luzipher on IRC reports this hangs his Hawaii; disable for now until I get time to debug. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 08:01:39 +10:00
Dave Airlie	69fc7a2c82	tgsi: fix memory leak in tgsi sanity check This just fixes this without repeating the code. Reported-by: Li Qiang Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 08:00:30 +10:00
Dave Airlie	62fef3e159	radv/ac: use common interp code for new intrinsics This uses the common fs interp code to use the new llvm intrinsics so llvm can drop the old ones. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-14 07:48:01 +10:00
Dave Airlie	592069c1fb	radv: use indirect buffer for initial gfx state. This puts the common gfx state for the device into an indirect buffer, and just calls out to it, on CIK and above. This is taken from what radeonsi does. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-13 20:02:45 +00:00
Dave Airlie	b26253b34d	radv: start splitting init config up This is just prep work for the following patch to use a common gfx init indirect buffer. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-13 20:02:34 +00:00
Dave Airlie	604e562e5b	radv: don't pass physical device to si_init_ fns. This is just a trivial cleanup. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-13 20:02:06 +00:00
Dave Airlie	8b47b97215	radv: detect command buffers that do no work and drop them (v2) If a buffer is just full of flushes we flush things on command buffer submission, so don't bother submitting these. This will reduce some CPU overhead on dota2, which submits a fair few command streams that don't end up drawing anything. v2: reorganise loop to count first then malloc, rename some vars (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-13 20:00:28 +00:00
Jason Ekstrand	d49d275c41	anv/blorp: Don't sanitize the swizzle for blorp_clear BLORP is now smart enough to handle any swizzle (even those that contain ZERO or ONE) in a reasonable manner. Just let BLORP handle it. This fixes the following Vulkan CTS tests on Haswell: - dEQP-VK.api.image_clearing.clear_color_image.1d_b4g4r4a4_unorm_pack16 - dEQP-VK.api.image_clearing.clear_color_image.2d_b4g4r4a4_unorm_pack16 - dEQP-VK.api.image_clearing.clear_color_image.3d_b4g4r4a4_unorm_pack16 Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-13 09:24:49 -08:00
Jason Ekstrand	e233db6e93	intel/blorp: Swizzle clear colors on the CPU It's trivial to swizzle clear colors on the CPU, easily deals with the hardware restrictions for render target swizzles, and makes swizzled clears work on all hardware as opposed to just HSW+. Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-13 09:24:43 -08:00
Emil Velikov	bd1c61261f	docs: add news item and link release notes for 17.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-13 12:05:34 +00:00
Emil Velikov	437b6a136e	docs: add sha256 checksums for 17.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `80b41d9899`)	2017-02-13 12:02:58 +00:00
Emil Velikov	2343b8a262	docs: Update 17.0.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `683462e680`)	2017-02-13 12:02:56 +00:00
Emil Velikov	20ccff56a0	st/xlib: remove always true ifdef GLX_EXTENSION guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:15:02 +00:00
Emil Velikov	884fd1262f	xlib: remove always true ifdef GLX_EXTENSION guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:14:40 +00:00
Emil Velikov	261d5e4c6d	glx: remove always true XDAMAGE_1_1_INTERFACE guard Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:14:32 +00:00
Emil Velikov	87f485e957	scons: check for libXdamage 1.1 or later Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:14:23 +00:00
Emil Velikov	43b09ee0b2	configure.ac: check for libXdamage 1.1 or later Released back in 2007 so it should not be an issue for anyone building from git. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:14:06 +00:00
Emil Velikov	bfac8d1749	glx: remove DRI2DriverPrimeShift compile guards DRI2DriverPrimeShift was added in dri2proto-2.8, which we now require as of the previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:13:46 +00:00
Emil Velikov	a1662d0dab	vl: remove DRI2DriverPrimeShift compile guards DRI2DriverPrimeShift was added in dri2proto-2.8, which we now require as of the previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:13:29 +00:00
Emil Velikov	cd1ebd8aba	scons: add missing dri2proto requirement Noticed while skimming through, although admittedly there's many other dependencies that are not tracked by the scons build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:13:24 +00:00
Emil Velikov	6689cc0392	configure.ac: bump dri2proto requirement to 2.8 dri2proto 2.8 was released 4+ years ago, so it must be of no surprise for anyone building mesa from git. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:12:56 +00:00
Emil Velikov	404a5ca088	glx: remove always true ifdef guards The two symbols referenced were introduced with v2.2 and 2.3 of the dri2proto package and we require dri2proto >= 2.6. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-02-13 10:12:36 +00:00
Emil Velikov	4f080b46a8	winsys/intel: remove unused winsys - ilo was its only user Cc: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-13 10:09:52 +00:00
Emil Velikov	6ffddba33b	configure.ac: do not use deprecated macros - AC_HELP_STRING AC_ERROR Replace with AS_HELP_STRING and AC_MSG_ERROR respectively, as spotted by autoupdate. Note that the suggested AC_CANONICAL_SYSTEM > AC_CANONICAL_TARGET change is not addressed here since that requires very extensive testing. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-13 10:09:45 +00:00
Timothy Arceri	0cbde643eb	util/disk_cache: correctly use stat(3) I forgot to error check stat() and also I wasn't using the subdir in is_two_character_sub_directory(). Fixes: `d7b3707c61` "util/disk_cache: use stat() to check if entry is a directory" Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-13 10:01:12 +00:00
Michel Dänzer	0f53404565	configure.ac: Drop LLVM compiler flags more radically Drop all -m, -W, -O, -g and -f* flags, with the exception of -fno-rtti, which must be used if it's part of the llvm-config --cxxflags output. We don't want LLVM to dictate the flags we use, and it can even cause build failures, e.g. if LLVM and Mesa are built with different compilers. While we're at it, eat any whitespace preceding dropped flags as well. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-13 16:07:37 +09:00
Kenneth Graunke	57dc6d80a0	glsl: Drop resize-to-MaxPatchVertices hack. TCS and TES inputs without an array size are implicitly sized to gl_MaxPatchVertices. But TCS outputs are apparently not: "If no size is specified, it will be taken from the output patch size (gl_VerticesOut) declared in the shader." Fixes dEQP-GLES31.functional.program_interface_query.program_output.array_size.separable_tess_ctrl.var. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:25 -08:00
Kenneth Graunke	1fad070f96	mesa: Ignore per-vertex array size in SSO pipeline validation. We were already unwrapping types when the producer was a non-array stage and the consumer was an arrayed-stage...but we ought to unwrap both ends for TCS -> TES matching too. This will allow us to drop the "resize to gl_MaxPatchVertices" check shortly, which breaks some things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:23 -08:00
Kenneth Graunke	e99df398f1	glsl: Update a comment about link errors for TCS && !TES. OpenGL ES actually has spec text to prohibit this. It's just OpenGL that's confusing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:21 -08:00
Kenneth Graunke	365afbdaef	mesa: Do a draw time check for TES && !TCS in ES 3.x. ES 3.x requires both TCS and TES to be present. We already checked the TCS && !TES case above, so we just have to check !TCS && TES here. Note that this is allowed in OpenGL, just not ES. This fixes a subcase of: dEQP-GLES31.functional.debug.negative_coverage.*.tessellation.single_tessellation_stage Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:19 -08:00
Kenneth Graunke	05a56893aa	mesa: Do (TCS && !TES) draw time validation in ES as well. Now that we have OES_tessellation_shader, the same situation can occur in ES too, not just GL core profile. Having a TCS but no TES may confuse drivers - i965 crashes, for example. This prevents regressions in ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage with some SSO pipeline validation changes I'm making. v2: Add an ES spec citation (suggested by Alejandro) Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:14 -08:00
Jason Ekstrand	c59d1ea51b	i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge Fixes two GL ES 3.0 CTS tests on Sandy Bridge: ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-02-12 17:56:32 -08:00
Jason Ekstrand	c4f8f395b2	i965/sampler_state: Pass texObj into update_sampler_state Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-02-12 17:56:32 -08:00
Jason Ekstrand	9df3778016	i965/sampler_state: Clamp min/max LOD to 14 on gen7+ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-02-12 17:56:32 -08:00
Ilia Mirkin	3970257cef	st/mesa: don't pass compare mode for stencil-sampled textures Fixes dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2017-02-12 19:26:25 -05:00
Ilia Mirkin	3f8b886e73	nv50,nvc0: use alternate samplers for stencil The blob uses these, and it fixes a bunch of dEQP stencil sampling tests involving border colors. Probably the Z-based samplers work somehow differently wrt border colors when using the stencil swizzle. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-12 18:22:17 -05:00
Bas Nieuwenhuizen	1811ccf125	radv: Fix radv_GetPhysicalDeviceQueueFamilyProperties2KHR. The structs have different sizes, so the arrays have different strides. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-13 00:18:19 +01:00
Wladimir J. van der Laan	55e00c7cfe	etnaviv: Set shader instruction area correctly for GC3000 - Use the same instruction area on GC3000 as the Vivante driver. This allows the same number of instructions on GC3000 as GC2000 instead of half. - Makes sure that the "PE to FE" stall before updating the shader code or constants is hit (which is conditional on vs_offset > 0x4000). This is necessary on GC3000 too, it increases stability. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-02-12 20:42:37 +01:00
Wladimir J. van der Laan	0fe60e4fcc	etnaviv: Update hw header files Update from etnaviv repository rnndb. This adds some newly discovered state for GC3000 (and some GC2000) features. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-02-12 20:38:56 +01:00
Dave Airlie	f466d4dd6a	radv: reduce CPU overhead merging bo lists. Just noticed we do a fair bit of unneeded searching here. Since we know that the buffers in a CS are unique already, the first time we get any buffers, we can just memcpy those into place, and when we are searching for subsequent CSes, we only have to search up until where the previous unique buffers were. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-12 19:00:19 +00:00
Ilia Mirkin	48f04862c1	nvc0: set the render condition in the compute object Fixes GL45-CTS.compute_shader.conditional-dispatching Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-02-11 21:06:52 -05:00
Ilia Mirkin	7e75f0913a	gm107/ir: fix address offset bitfield for ATOMS Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-02-11 21:06:41 -05:00
Ilia Mirkin	b38aab50a0	nv50/ir: convert an ATOM.EXCH without a destination into a store On SM35 there does not appear to be a way to emit an ATOM.EXCH with a null destination. This should be functionally equivalent to a plain store however, so just do that. Fixes GL45-CTS.compute_shader.atomic-case2 on SM35. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-11 20:25:26 -05:00
Ilia Mirkin	2b0580123e	nvc0: fix 64-bit integer query buffer writes The former logic just plain didn't work at all. We need to write the subsequent dword to the next buffer location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-11 20:25:26 -05:00
Ilia Mirkin	399e267f0e	nv50/ir: return a register when retrieving thread id sysval We have logic to short-circuit such retrievals to zero. However "zero" was an immediate, and some logic expected to get registers (to later be propagated). Fix this by using loadImm. Fixes GL45-CTS.gpu_shader5.images_array_indexing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-11 20:25:26 -05:00
Ilia Mirkin	0d1edb01ec	nv50/ir: add missing break after DSSG Recently broken during int64 addition. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-11 17:21:55 -05:00
Christian Gmeiner	137ad879d5	etnaviv: shader-db traces Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>	2017-02-11 21:22:53 +01:00
Christian Gmeiner	7256ed3c79	etnaviv: keep track of emitted loops Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-02-11 21:22:48 +01:00
Christian Gmeiner	5a3ea68895	etnaviv: wire up core pipe_debug_callback Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-02-11 21:22:42 +01:00
Jose Maria Casanova Crespo	5bc222ebaf	glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1 From GLSL ES 3.10 spec, section 4.1.9 "Arrays": "If an array is declared as the last member of a shader storage block and the size is not specified at compile-time, it is sized at run-time. In all other cases, arrays are sized only at compile-time." In desktop GLSL it is allowed to have unsized-arrays that are not last, as long as we can determine that they are implicitly sized, which is detected at link-time. With this patch Mesa reports a compilation error as glslang does with the following shader: buffer SSBO { vec4 data[]; vec4 moreData;}; void main (void) { } Fixes: dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-10 23:14:12 -08:00
Eric Anholt	0514b0bdc9	vc4: Enable glSampleMask() even when !rasterizer->multisample. gallium's blitter expects that it can set the sample mask even when the rasterizer doesn't have the flag on. Between this and the previous test, 10 new ext_framebuffer_multisample tests start passing.	2017-02-10 14:17:05 -08:00
Eric Anholt	5c86f119b9	vc4: Respect glSampleMask() even when we're not writing color. gallium's quad-based blitter for copying MSAA depth textures expects to be able to do 4 passes updating a sample at a time using glSampleMask, and there's no color buffer bound when it's doing that.	2017-02-10 14:17:04 -08:00
Eric Anholt	30237193f5	vc4: Use the nir_builder helper for loading sample mask.	2017-02-10 14:17:04 -08:00
Eric Anholt	ce538a443d	vc4: Use accurate 1/w in coordinate shader as well as vert shader. We probably shouldn't be emitting different scaled viewport coordinates between vertex and coord.	2017-02-10 14:17:04 -08:00
Eric Anholt	a0b6841838	vc4: Drop VS inputs to 8. In the hardware we only get to declare 8 vertex elements (GLES2's minimum), so we should be exposing that number here. Fixes an assertion failure in piglit texrect-many, at the expense of various GL 2.0-ish minmax tests now complaining that our count is too low.	2017-02-10 14:17:04 -08:00
Eric Anholt	b230939303	vc4: Avoid emitting small immediates for UBO indirect load address guards. The kernel will reject our shader if we emit one here, and having 4, 8, or 12 as the top end of our UBO clamp is rare enough that it's not worth making the kernel let us. Fixes piglit fs-const-array-of-struct and fs-const-array-of-struct-of-array since recent GLSL linking changes made us get this as an indirect load of a uniform, instead of a temporary. Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-10 14:17:04 -08:00
Timothy Arceri	d7b3707c61	util/disk_cache: use stat() to check if entry is a directory d_type is not supported on all systems. Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97967	2017-02-10 23:50:36 +11:00
Emil Velikov	463236bd31	st/nine: update configure options in the README Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-10 11:47:24 +00:00
Emil Velikov	b3b415609d	configure.ac: supersede --enable-gallium-llvm over --enable-llvm Currently we have extra (somewhat questionable) modularity, such that one could build some parts with LLVM while others w/o. That is extremely fragile, error prone and requires quite a noticeable amount of code throughout. Thus let's deprecate the gallium toggle in favour of the generic one. The former will throw a warning when set, and it will be overwritten by the latter. This will allow gradual transition w/o breaking people's scripts. v2: Rebase, document in release notes. Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de> (v1)	2017-02-10 11:47:24 +00:00
Emil Velikov	bdd6147e29	configure.ac: remove dummy radeon_gallium_llvm_check() The extra function brings no added benefit as of earlier commit which made llvm_require_version (as called by radeon_llvm_check) require LLVM (--enable-gallium-llvm). Fixes: 5f966a96af7 "configure.ac: Mandate --enable-gallium-llvm when checking LLVM version" Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:24 +00:00
Emil Velikov	d4840c0c26	configure.ac: correctly manage llvm auto-detection Earlier refactoring commits changed from one, dare I say it, broken behaviour to another. Namely: Before, as you explicitly --enable-gallium-llvm your selection was ignored when llvm-config was not present/detected. Today, the "auto" heuristics enables gallium llvm regardless if you have llvm/llvm-config available or not. Rework the auto-detection to account for llvm's presence. v2: Set enable_gallium_llvm=no when LLVM is not found. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de> Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-02-10 11:47:24 +00:00
Emil Velikov	ce65cc1f1f	configure.ac: disable enable_gallium_llvm in the !x86 case Already implicitly handled throughout, but keep it clear and disable gallium-llvm. This change should be a no-op. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:24 +00:00
Emil Velikov	4d8bb9cf8c	configure.ac: set LLVM_{C, CXX, LD}FLAGS only as needed Earlier refactoring commits started setting the above regardless if LLVM is used or not. Move them to the respective section to restore the original functionality. Since we require the preprocessor flags (includes in particular) for the header version parsing keep those as-is. They are not used outside of configure.ac thus should not cause any side-effects. As-is adding the C/CXXFLAGS can lead to build issues when cross-compiling. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Tomasz Figa <tfiga@chromium.org> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:24 +00:00
Emil Velikov	fc30992a54	Revert "configure.ac: Create correct LLVM_VERSION_INT with minor >= 10" As stated in [1] by the LLVM devs, the new versioning scheme will not deploy any minor version (i.e. it will always be zero). As such the patch should not be needed. This reverts commit `0e9a5be7e7`. [1] http://blog.llvm.org/2016/12/llvms-new-versioning-scheme.html Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:24 +00:00
Emil Velikov	5e9f4a5f3f	configure.ac: don't use == with test Although it works, it's not the correct thing to do. v2: Rebase v3: Rebase Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de> (v1)	2017-02-10 11:47:23 +00:00
Emil Velikov	65ee9dff69	configure.ac: remove unused LLVM variables LLVM_BINDIR is completely unused while others such as LLVM_LIBDIR are used only internally. In the latter case there's no need to AC_SUBST it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:23 +00:00
Tobias Droste	143c566a81	configure.ac: Only define HAVE_LLVM if LLVM is used Make sure that HAVE_LLVM compiler define is only set if LLVM is actually used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010 Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tobias Droste <tdroste@gmx.de> v2 [Emil] fold within the existing conditional Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-10 11:47:23 +00:00
Tobias Droste	04377cbdcf	configure.ac: Rework MESA_LLVM and LLVM detection Set FOUND_LLVM only when LLVM is present (checking for exact version/etc is deferred) and use enable-gallium-llvm to indicate the global LLVM status. Renaming the latter is not appropriate for stable patches, so we'll address it with a later commit. Loosely based on work by Tobias. v2: Check FOUND_LLVM if enable_gallium_llvm is set. Cc: Dave Airlie <airlied@redhat.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:23 +00:00
Emil Velikov	5869a7db75	configure.ac: move enable-gallium-llvm dependency with-gallium-drivers ... to where it's applicable. Since we effectively made --enable-gallium-llvm mean --enable-llvm with earlier commits, we need to move the requirement to guard the components added for the LLVM draw. Otherwise we'll error (as below) when building RADV w/o gallium drivers. configure: error: --enable-gallium-llvm is required when building radv v2: Don't remove but move the dependency (Tobias). Cc: Dave Airlie <airlied@redhat.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:23 +00:00
Emil Velikov	a66ffcd736	configure.ac: Mandate --enable-gallium-llvm when checking LLVM version With this change we effectively require --enable-gallium-llvm when building RADV. This should be perfectly safe since the gallium radeonsi driver already explicitly requires it. The "gallium" part in --enable-gallium-llvm is about to be removed soon (not in stable), but until then make sure that things can build. To reflect the requirement (as opposed to check previously) we rename llvm_check_version_for to llvm_require_version Cc: Dave Airlie <airlied@redhat.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:23 +00:00
Emil Velikov	514a494415	configure.ac: Rename the gallium_require_llvm helper Drop the gallium prefix since we're about to use it throughout the configure. Note we do want to check for enable_gallium_llvm check since (as explicitly requested) the toggle should mean --enable-llvm. Latter of which to be resolved with later patches. Cc: Dave Airlie <airlied@redhat.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:23 +00:00
Tobias Droste	f64d4d82bd	configure.ac: Don't check LLVM version in require_llvm This is actually not needed because the version is checked later. Around line 2380 if test "x$enable_gallium_llvm" == "xyes"; then llvm_check_version_for $LLVM_REQUIRED_GALLIUM "gallium" llvm_add_default_components "gallium" fi Cc: "17.0" <mesa-stable@lists.freedesktop.org> Cc: Tobias Droste <tdroste@gmx.de> Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) v2: [Emil Velikov: rebase/respin series order] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-10 11:47:23 +00:00
Emil Velikov	38abcdba8a	configure.ac: move AC_ARG_ENABLE([gallium-llvm] hunk further up With next commits we'll require --enable-gallium-llvm (en route to a greater good later on) for RADV. The latter is required to ensure that as otherwise we'll fail to build. Cc: Dave Airlie <airlied@redhat.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:23 +00:00
Emil Velikov	3a7973fd15	configure.ac: remove unused AC_SUBST([MESA_LLVM]) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-02-10 11:47:22 +00:00
Nicolai Hähnle	de6e6a347d	loader: unconditionally include unistd.h and stdlib.h Otherwise we would fail with "implicit declaration of function" geteuid and getenv respectively. To trigger (re)move the libdrm.pc file and use the following: $ ./autogen.sh --disable-egl --disable-gbm --disable-dri \ --with-dri-drivers=swrast --with-gallium-drivers=swrast $ make Cc: Vinson Lee <vlee@freedesktop.org> Fixes: `3f462050c` ("loader: Add an environment variable to override driver name choice.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99701 v2: [Emil: handle stdlib.h add commit message] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-10 11:47:12 +00:00
Emil Velikov	a04cb3f8a5	intel/blorp: do not return const data by get_px_size_sa() Not much point in the const qualifier since we provide a copy to the user. Resolves the following -Wignored-qualifiers warning. src/intel/blorp/blorp_blit.c:1857:8: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers] v2: keep const qualifier of local variable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-10 11:47:12 +00:00
Marek Olšák	43a2ba1b7d	gallium/radeon: use staging for texture read mappings from GTT WC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	dc7483f445	gallium/radeon: ignore the level parameter in buffer_transfer_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	d86099df0a	gallium/radeon: fix performance of buffer readbacks We want cached GTT for all non-persistent read mappings. Set level = 0 on purpose. Use dma_copy, because resource_copy_region causes a failure in the PBO read of piglit/getteximage-luminance. If Rocket League used the READ flag, it should get cached GTT. v2: mask out UNSYNCHRONIZED Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	24e3b06408	radeonsi: align vertex buffer descriptor list size for optimal prefetch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	3a534c5c7d	radeonsi: align shader binaries to CP DMA alignment for optimal prefetch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	1a392a4377	radeonsi: move CP_DMA_ALIGNMENT definition Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	4c288c73ea	radeonsi: remove SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER not necessary Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	65df38b191	radeonsi: remove separate CB/DB_META flush flags not used separately Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	8a2ae4153b	radeonsi: reduce the number of FMASK input coordinates Before: image_load v3, v[0:3] ... After: image_load v3, v[0:1] ... Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	28c06b3ceb	radeonsi: write shader asm annotated with wave info into GPU hang reports Note that the disassembly is written twice - first the unmodified compiler output and then the wave-annotated output only if there are waves executing the shader. Sample output from a real GPU hang most likely caused by image_sample: The number of active waves = 28 Pixel Shader - annotated disassembly: s_mov_b64 s[6:7], exec ; BE86017E [PC=0x10f3e3800, off=0, size=4] s_wqm_b64 exec, exec ; BEFE077E [PC=0x10f3e3804, off=4, size=4] ... image_sample v[7:9], v[0:1], s[12:19], s[20:23] dmask:0x7 ; F0800700 00A30700 [PC=0x10f3e3a94, off=660, size=8] s_buffer_load_dword s20, s[0:3], 0x50 ; C0220500 00000050 [PC=0x10f3e3a9c, off=668, size=8] s_load_dwordx4 s[24:27], s[4:5], 0x170 ; C00A0602 00000170 [PC=0x10f3e3aa4, off=676, size=8] s_load_dwordx8 s[12:19], s[4:5], 0x140 ; C00E0302 00000140 [PC=0x10f3e3aac, off=684, size=8] s_buffer_load_dword s11, s[0:3], 0x5c ; C02202C0 0000005C [PC=0x10f3e3ab4, off=692, size=8] s_buffer_load_dword s21, s[0:3], 0x54 ; C0220540 00000054 [PC=0x10f3e3abc, off=700, size=8] s_buffer_load_dword s22, s[0:3], 0x58 ; C0220580 00000058 [PC=0x10f3e3ac4, off=708, size=8] s_waitcnt vmcnt(0) ; BF8C0F70 [PC=0x10f3e3acc, off=716, size=4] ^ SE0 SH0 CU1 SIMD1 WAVE0 EXEC=aaaaaaa555aaaaaa INST32=BF8C0F70 ^ SE0 SH0 CU1 SIMD2 WAVE0 EXEC=aaaa85555555552a INST32=BF8C0F70 ^ SE0 SH0 CU1 SIMD3 WAVE0 EXEC=000000000000000a INST32=BF8C0F70 ^ SE0 SH0 CU6 SIMD1 WAVE0 EXEC=25a5a5aa82aaaaaa INST32=BF8C0F70 ^ SE0 SH0 CU6 SIMD3 WAVE0 EXEC=50aaaa8fffa55555 INST32=BF8C0F70 ^ SE0 SH0 CU7 SIMD0 WAVE0 EXEC=5554aaaaaaa1a555 INST32=BF8C0F70 ^ SE0 SH0 CU7 SIMD0 WAVE1 EXEC=aaaa5555ffffffff INST32=BF8C0F70 ^ SE0 SH0 CU7 SIMD1 WAVE0 EXEC=555557aaaaaaaaa5 INST32=BF8C0F70 ^ SE0 SH0 CU7 SIMD3 WAVE0 EXEC=5555aaaaaaaaaa85 INST32=BF8C0F70 ^ SE1 SH0 CU3 SIMD1 WAVE0 EXEC=aaaaaaaaaaaaaaaa INST32=BF8C0F70 ^ SE1 SH0 CU4 SIMD0 WAVE0 EXEC=aaaaaaaa5a5a5a5a INST32=BF8C0F70 ^ SE1 SH0 CU4 SIMD1 WAVE0 
EXEC=aaaaaaa5a5a5a4a5 INST32=BF8C0F70 ^ SE1 SH0 CU4 SIMD2 WAVE0 EXEC=5555555000000000 INST32=BF8C0F70 ^ SE1 SH0 CU4 SIMD3 WAVE0 EXEC=aa555554155aaaaa INST32=BF8C0F70 ^ SE1 SH0 CU5 SIMD0 WAVE0 EXEC=55ffff55555555aa INST32=BF8C0F70 ^ SE1 SH0 CU5 SIMD1 WAVE0 EXEC=555555555aaaaaaa INST32=BF8C0F70 ^ SE1 SH0 CU5 SIMD2 WAVE0 EXEC=a0aaaaaaa8555555 INST32=BF8C0F70 ^ SE1 SH0 CU5 SIMD3 WAVE0 EXEC=8aaaaaaaaaaaa555 INST32=BF8C0F70 ^ SE1 SH0 CU6 SIMD0 WAVE0 EXEC=000000002aaaaaaa INST32=BF8C0F70 ^ SE2 SH0 CU1 SIMD0 WAVE0 EXEC=5aaaa5400aaaa15a INST32=BF8C0F70 ^ SE2 SH0 CU1 SIMD1 WAVE0 EXEC=00aaaaaaaa5555aa INST32=BF8C0F70 ^ SE2 SH0 CU1 SIMD2 WAVE0 EXEC=aa00005555554555 INST32=BF8C0F70 ^ SE2 SH0 CU1 SIMD3 WAVE0 EXEC=aaaaaaa000000000 INST32=BF8C0F70 ^ SE3 SH0 CU4 SIMD0 WAVE0 EXEC=5555aaaaaaaaaaaa INST32=BF8C0F70 ^ SE3 SH0 CU4 SIMD2 WAVE0 EXEC=ffaaaaaaaaaa5555 INST32=BF8C0F70 ^ SE3 SH0 CU4 SIMD3 WAVE0 EXEC=aaaa55555555aa00 INST32=BF8C0F70 ^ SE3 SH0 CU5 SIMD0 WAVE0 EXEC=00aaaaaaaaaaaa5a INST32=BF8C0F70 ^ SE3 SH0 CU5 SIMD1 WAVE0 EXEC=5a555555005555ff INST32=BF8C0F70 v_mul_f32_e32 v7, s6, v7 ; 0A0E0E06 [PC=0x10f3e3ad0, off=720, size=4] ... Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	3de8c5a3c5	radeonsi: write wave information into GPU hang reports UMR is our new debugging tool. It must have +s set for Mesa to use it without root privileges: sudo chmod +s .../umr Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marc-André Lureau	dc2d9b8da1	tgsi-dump: dump label if instruction has one The instruction has an associated label when Instruction.Label == 1, as can be seen in ureg_emit_label() or tgsi_build_full_instruction(). This fixes dump generating extra :0 labels on conditionals, and virgl parsing more than the expected tokens and eventually reaching "Illegal command buffer" (when parsing more than a safety margin of 10 we currently have). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-10 12:46:33 +10:00
Marc-André Lureau	bd1cab1168	tgsi: remove ureg_label_insn Unused since commit `2897cb3dba`. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-10 12:46:23 +10:00
Dave Airlie	e5a5d17d13	radv: handle queue submission with no cs but semaphores It's legal to submit just semaphores with no command streams, this patch fixes this case by emitting the empty cs, it also handles the fence emission for this case better. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-09 23:45:33 +00:00
Timothy Arceri	a4086bb531	util/disk_cache: error check asprintf() Fixes: `f3d911463e` "util/disk_cache: stop using ralloc_asprintf() unnecessarily" Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-10 09:25:32 +11:00
Timothy Arceri	41ad178b13	docs: add shader cache environment variables Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-10 09:22:52 +11:00
Ilia Mirkin	c95f821cb4	nvc0/ir: fix ubo max clamp, reset file index We just increased the max UBO, so we should also increase the clamp that we do for robustness. Similarly, as we're including the fileIndex in the new indirect value, we should reset fileIndex to 0 so that it is not added in a second time. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-02-09 15:50:58 -05:00
Ilia Mirkin	e4a698cb97	nv50/ir: always return 0 when trying to read thread id along unit dim Many many many compute shaders only define a 1- or 2-dimensional block, but then continue to use system values that take the full 3d into account (like gl_LocalInvocationIndex, etc). So for the special case that a dimension is exactly 1, we know that the thread id along that axis will always be 0, so return it as such and allow constant folding to fix things up. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-02-09 15:15:36 -05:00
Ilia Mirkin	1acdd62847	nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute Kepler and up unfortunately only support up to 8 constbufs. We work around this by loading from constbufs as if they were storage buffers. However we were not consistently applying limits to loads from these buffers. Make sure to do the same thing we do for storage buffers. Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-02-09 15:15:22 -05:00
Ilia Mirkin	59ca352fc5	nvc0: increase number of ubo binding points Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but it's unclear what it refers to). We need to expose an extra binding point for the "program parameters", which means this must be 15. Remove the last vestige of the "use c14 for immediates" idea. Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-02-09 15:15:08 -05:00
Ilia Mirkin	8a2d88e934	configure: add blurb about what the LIBDRM_*_REQUIRED stuff means Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-09 12:57:49 -05:00
Ilia Mirkin	1e4f5988ed	nvc0: expose int64 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:49 -05:00
Ilia Mirkin	ab00a41a6e	nvc0/ir: make it possible to have the flags def in def0 There's all kinds of logic that doesn't like there being holes in defs or srcs lists. Avoid them. This also fixes the sched logic for maxwell. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:48 -05:00
Ilia Mirkin	61d7676df7	nvc0/ir: add support for 64-bit shift lowering on SM20/SM30 Unfortunately there is no SHF.L/SHF.R instruction pre-SM35. So we have to do a bit more work to get the job done. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:48 -05:00
Ilia Mirkin	1aefd6159c	nvc0/ir: add support for all the new int64 tgsi opcodes A few thoughts: - Some of that LegalizeSSA logic should really live much earlier and be subject to the likes of DCE and other useful passes - Some of the "lowering" done in from_tgsi should be done later so that proper optimization might be done. However this all works and the above can be improved upon later. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:48 -05:00
Pierre Moreau	009c54aa7a	nv50/ir: Split 64-bit integer MAD/MUL operations Hardware does not support 64-bit integer MAD and MUL operations, so we need to transform them into 32-bit operations. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-02-09 12:57:48 -05:00
Ilia Mirkin	22c705ea8c	nvc0/ir: add a "high" subop for shifts, emit shf.l/shf.r for 64-bit Note that this is not available for SM20/SM30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:48 -05:00
Ilia Mirkin	2e986fa806	nvc0/ir: fix SET and SLCT emission We were never emitting a .X flag for consuming condition code on SET, and weren't emitting a signed type for SLCT comparison. Discovered while working on int64 logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:48 -05:00
Ilia Mirkin	eac5099c11	nvc0/ir: add support for emitting partial min/max ops for int64 These operations allow you to compute min/max on arbitrary-width integers, 32 bits at a time. Note that the low/med ops implicitly set the condition code, and the med/high ops implicitly consume it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-09 12:57:48 -05:00
Ilia Mirkin	b090033087	gallium: add separate PIPE_CAP_INT64_DIVMOD Nouveau does not currently have logic to implement this as a library function. Even though such a library could be written, there's no big advantage to do it that way for now given that int64 is a very uncommon use-case. Allow a driver to expose INT64 without supporting division and modulo operations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-09 12:57:21 -05:00
Eric Engestrom	6a71a69a12	docs: improve the list of gl implementations Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-09 15:45:08 +00:00
Eric Engestrom	8278f1ec35	docs: improve the list of implemented APIs Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-09 15:44:51 +00:00
Matt Turner	d7a0486a9e	glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=... Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed an OpenGL 3.3 compatibility profile context (with various unimplemented features and bugs), but still refused to compile shaders with #version 330 compatibility This patch simply adds a small bit of plumbing to let that through. Of course the same caveats apply: compatibility profile is still not supported (and will not be supported), so there are no guarantees that anything will work. Tested-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-02-09 15:14:43 +00:00
Eric Engestrom	89b4176eb1	docs: reword sentence that my brain can't parse Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2017-02-09 13:04:16 +00:00
Eric Engestrom	30cf9ffb59	docs: https all the links \o/ Most of them already redirected to https anyway, so we might as well avoid the redirection and the security implications by linking directly to the right protocol. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-09 11:28:15 +00:00
Eric Engestrom	2b0fe3cff7	docs: fix gallium wiki link in relnotes Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-09 11:28:10 +00:00
Eric Engestrom	9f8a6a5b79	docs: update 'thanks' for hosting Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-09 11:26:22 +00:00
Samuel Iglesias Gonsálvez	ca16f0a282	i965/fs: add support for int64 to bool conversion Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez	824e1bb078	nir: add opcode to perform int64 to bool conversions Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez	7ab26613db	i965/fs: Add support for nir_op_[iu]2[iu]32 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez	7b5834ff54	i965/fs: Add support for nir_op_[iu]642f Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez	b115407d75	i965/fs: legalize [u]int64 to 32-bit data conversions in lower_d2x Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-09 10:18:34 +01:00
Jason Ekstrand	8734461c58	i965/fs: Add support for nir_op_[iu]642d Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-02-09 10:18:34 +01:00
Jason Ekstrand	91d2d26f33	i965: Allow int64 conversion operations in channel_expressions This fixes 143 of the new piglit tests added by Nicolai Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-02-09 10:18:34 +01:00
Timothy Arceri	f3d911463e	util/disk_cache: stop using ralloc_asprintf() unnecessarily Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-09 14:11:24 +11:00
Timothy Arceri	0bf21519b7	glsl: add param to force shader recompile This will be used to skip checking the cache and force a recompile. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-09 12:22:56 +11:00
Timothy Arceri	4026b45bbc	util: add a disk_cache_remove() function This will be used to remove cache items created with old versions of Mesa or other invalid cache items from the cache. V2: rename stub function (cache_* functions were renamed disk_cache_*) in master. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-09 12:22:56 +11:00
Timothy Arceri	a3fd8bb8c5	st/mesa/i965: create link status enum For the on-disk shader cache we want to be able to differentiate between a program that was linked and one that was loaded from cache. V2: - don't return the new enum directly to the application when queried, instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome corruptions when using cache. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-09 12:22:56 +11:00
Brian Paul	ac5845453c	docs: update intro.html to mention new APIs, etc Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2017-02-09 00:02:20 +00:00
Brian Paul	b2722a8970	docs: the site is now hosted by freedesktop.org Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2017-02-09 00:02:13 +00:00
Bas Nieuwenhuizen	f22836dbdd	radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32. For allowing fast color clears in the main render targets of dota2. [airlied: fix clear_vals[1] as suggested by Andres. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-08 22:43:11 +00:00
Roland Scheidegger	f64d74aa19	mesa: (trivial) include <inttypes.h> for PRIx64 macros Fixes a compile error with mingw.	2017-02-08 21:56:16 +01:00
Tim Rowley	c1aa444a3e	swr: [rasterizer jitter] Pass LLVM-IR size into jitter Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:58:13 -06:00
Tim Rowley	e0a829d320	swr: [rasterizer core] Frontend SIMD16 WIP Removed temporary scaffolding in PA, widened the PA_STATE interface for SIMD16, and implemented PA_STATE_CUT and PA_TESS for SIMD16. PA_STATE_CUT and PA_TESS now work in SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:58:06 -06:00
Tim Rowley	79174e52b5	swr: [rasterizer jitter] Disable unsafe FP optimizations in the jitter Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:58:00 -06:00
Tim Rowley	db599e316a	swr: [rasterizer core] Frontend SIMD16 WIP Widen simdvertex to SIMD16/simd16vertex in frontend for passing VS attributes from VS to PA. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:52 -06:00
Tim Rowley	09c54cfd2d	swr: [rasterizer jitter] Add DEBUGTRAP jit builder function Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:47 -06:00
Tim Rowley	b01f26e005	swr: [rasterizer jitter] Multisample blend jit fix Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:41 -06:00
Tim Rowley	8780706c62	swr: [rasterizer jitter] Change SimdVector representation to array Make all SimdVectors in LLVM represented as simdscalar[4] rather than a struct. Fixes issues with promotion of values from i32 to i64 to match register width. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:33 -06:00
Tim Rowley	d159b0bf34	swr: [rasterizer jitter] Fix issues with stream-out on llvm>=3.8 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:27 -06:00
Tim Rowley	8423ad437b	swr: [rasterizer jitter] Adjust jitter header includes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:20 -06:00
Tim Rowley	feecd7dcf5	swr: [rasterizer core] Frontend SIMD16 WIP SIMD16 Primitive Assembly (PA) only supports TriList and RectList. CUT_AWARE_PA, TESS, GS, and SO disabled in the SIMD16 front end. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-02-08 13:57:10 -06:00
Eric Engestrom	a618d6c3e9	docs: update package contents Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-08 12:00:28 -07:00
Eric Engestrom	06e40dc671	docs: fix unpacking instructions File names were wrong, file formats were wrong, bunzip command was wrong... I also removed all but the simplest example; people who use pipes already know how to untar, so let's simplify and remove potential confusion for non-tech-savvy users. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-08 12:00:24 -07:00
Eric Engestrom	d7e1a16f1a	docs: remove dead 'beta' link Release candidates haven't been in a 'beta' subdir in a long time, so let's replace the dead link with an explanation instead. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-08 12:00:19 -07:00
Eric Engestrom	5b10c362de	docs: add a note about the new version scheme Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-02-08 12:00:14 -07:00
Bartosz Tomczyk	94262e5f5d	r600/sb: Fix memory leak Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-02-08 17:36:05 +01:00
Timothy Arceri	90014d0766	mesa: use PRId64/PRIu64 when printing 64-bit ints V2: actually use PRIu64 Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-08 13:50:01 +11:00
Dave Airlie	c674f11e42	mesa/st: fix strict aliasing issue in int64 code. This fixes the int64 code same as the double code. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-08 02:13:07 +00:00
Dave Airlie	30cff4f5f7	mesa/uniform: fix strict aliasing issues with int64 code. This fixes these like the double version does. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-08 02:12:31 +00:00
Dave Airlie	6d5d6dad20	radv: handle dcc in explicit image resolve path. (v2) We need to initialize dcc like we do in the subpass path. v2: fix initial/final layouts Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-07 23:31:08 +00:00
Bas Nieuwenhuizen	0d1283850b	radv: Enable fast clears by default. Works for me on dota2 and talos now. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com>	2017-02-07 22:58:06 +01:00
Jason Ekstrand	1de3cd8a34	spirv: Add more asserts in vtn_vector_construct Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99465	2017-02-07 08:08:06 -08:00
Emil Velikov	25aa98c014	configure.ac: remove src/gallium/winsys/intel/drm/Makefile reference Not wired up (not referenced in any SUBDIR), leading to `make distcheck' failure. Fixes: `d77fa310ed` "ilo: EOL drop unmaintained gallium drv from buildsys" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-07 14:18:13 +00:00
Emil Velikov	73bce69938	docs: reword ilo removal note Properly annotate <li> and keep the note analogous to all the previous ones - OpenVG, st/egl, etc. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-07 14:18:12 +00:00
Boyan Ding	97495c428d	configure.ac: Remove redundant libglvnd stanza There were two "libglvnd configuration" section in the squashed commit that added libglvnd support, while only one in the original libglvnd branch. A following commit moves one of them downwards. Now remove the upper "older" one and move GL_LIB name decision downwards after the new libglvnd configuration section. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2017-02-07 14:17:53 +00:00
Emil Velikov	bef4d74047	travis: use both cores for make/make check The instance offers 2 cores, so use them to speed things up. v2: Set MAKEFLAGS instead [Eric] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-02-07 11:14:10 +00:00
Emil Velikov	30267172c7	travis: add nearly all gallium drivers to the list Note: we need the explicit --enable-freedreno for libdrm since the latter is 'smart' and disables it if building on !arm platforms. The radeonsi and swr are explicitly left out since they require 'too-recent' LLVM - 3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-02-07 11:14:10 +00:00
Emil Velikov	96d86b18ee	travis: correct libdrm required regex to also track libdrm itself The current regex was tracking only the libdrm_foo packages, while with recent changes we bumped only (and rightfully so) libdrm. Fix the regex to track any libdrm package. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-02-07 11:14:10 +00:00
Emil Velikov	49f6408940	configure.ac: add swr to the gallium drivers list. v2: Rebase on top of ILO removal. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-02-07 11:14:10 +00:00
Emil Velikov	9d5b681a11	configure.ac: list all the dri-drivers in the help string It's unlikely that any of the additions come as a surprise to anyone: i915, nouveau, radeon, r200. Regardless, state clearly what's available. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-02-07 11:14:09 +00:00
Marc Di Luzio	21efe2528c	glsl: correct compute shader checks for memoryBarrier functions As per the spec - "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types." Conform to this by adding another delegate to check for compute shader support instead of only whether the current stage is compute This allows some fragment shaders in Dirt Rally to compile Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-06 21:12:33 -08:00
Li Qiang	83fb63d31d	gallium/tgsi: fix oob access in parse instruction When parsing texture instruction, it doesn't stop if the 'cur' is ',', the loop variable 'i' will also be increased and be used to index the 'inst.TexOffsets' array. This can lead to an oob access issue. This patch avoids this. Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Li Qiang <liq3ea@gmail.com>	2017-02-07 14:00:04 +10:00
Kenneth Graunke	ce8a63de6d	Revert "i965: Disable guardband clipping in the smaller-than-viewport case." This reverts commit `0bac2551e4`. Now that we position the guardband correctly (applying translations in addition to scaling) and made it as large (or larger) than the render target, this shouldn't be necessary. Now we leave guardband clipping enabled 100% of the time, like the Windows driver does. Fixes GL45-CTS.gtf21.GL2FixedTests.clip.clip. It tries to draw a 16384x64 rectangle, and it appears that some kind of numerical imprecisions in the clipper result in some edge pixels going missing. The Windows driver passes this test because of guardband clipping. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-06 17:40:14 -08:00
Kenneth Graunke	ece0e535a4	i965: Always scissor on Gen6-7.5 instead of disabling guardband. Previously we disabled the guardband when the viewport was smaller than the framebuffer on Gen6-7.5, to prevent portions of primitives from being drawn outside of the viewport. On Gen8+, we relied on the viewport extents test to effectively scissor this away for us. We can simply always enable scissoring instead. We already include the viewport in the scissor rectangle, so this will effectively do the viewport extents test for us. (The only difference is that the scissor rectangle doesn't support sub-pixel values. I think that's okay.) Given that the viewport extents test is essentially a second scissor, and is enabled for basically all 3D drawing on Gen8+, it stands to reason that scissoring is cheap. Enabling the guardband reduces the cost of clipping, which is expensive. The Windows driver appears to never disable guardband clipping, and appears to use scissoring in this case. I don't know if they leave it on universally though. This fixes misrendering in Blender, where the "floor plane" grid lines started rendering at wrong angles after I disabled XY clipping of line primitives. Enabling the guardband seems to solve the issue. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99339 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-06 17:40:14 -08:00
Jason Ekstrand	f3c068c5c8	i965: Use a better guardband calculation. (Patch co-authored by Jason and Ken.) We scaled the guardband based on the viewport size, but failed to take into account the translation portion of the viewport transform. This meant the guardband was always centered around the origin. We want it to be centered around the screen-space drawing area, which is the intersection of the viewport and the render target. At best, getting this wrong would reduce the guardband's effectiveness in some cases. At worst, it might break things - objects outside of the guardband are trivially rejected, so getting the guardband in the wrong place and leaving guardband clipping enabled could cause problems. v2: drop clamping of positive maximums. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-06 17:40:14 -08:00
Kenneth Graunke	89ad7f1be6	i965: Combine the Gen6 SF and Clip viewport atoms. The next patch will make the guardband calculation dependent on the transformation matrix. Instead of computing it in both atoms, just combine them into a single atom. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-02-06 17:40:14 -08:00
Dave Airlie	90ac2285f0	radv: pass FMASK alignment to application As was done for dcc and cmask. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-07 10:42:01 +10:00
Bas Nieuwenhuizen	47ca0f537d	radv: Pass DCC alignment to application. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com>	2017-02-07 01:19:22 +01:00
Bas Nieuwenhuizen	eb01b20cc4	radv: Pass CMASK alignment to application. CMASK alignment can be greater than image data alignment, so pass it to the app so that it knows what alignment to backing memory should have. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-07 01:18:53 +01:00
Dave Airlie	a864ef7f48	radv/ac: avoid the fmask path when doing txs. This fixes the vulkan samples deferredmultisampling test. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-06 22:57:52 +00:00
Bruce Cherniak	11d6f836d0	swr: [rasterizer core] Removed unused clip code. Removed unused Clip() and FRUSTUM_CLIP_MASK define. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-02-06 16:30:50 -06:00
Bruce Cherniak	bf29495dcd	swr: [rasterizer core] Remove dead code Clipper::ClipScalar() Clipper::ClipScalar() is dead code and should be removed. It is causing an error with gcc-7 because it references a now defunct member. v2: includes bugzilla reference, same code change Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633 CC: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-02-06 16:27:53 -06:00
Eric Anholt	72e6d1f00a	gallium: Remove vc4 simulator hack from loader infrastructure. Now that there's MESA_LOADER_DRIVER_OVERRIDE for choosing the driver name we load, we don't need this any more. v2: Get the junk out of pipe_loader_drm.c, too. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)	2017-02-06 12:44:06 -08:00
Eric Anholt	3f462050c2	loader: Add an environment variable to override driver name choice. My vc4 simulator has been implemented so far by having an entrypoint claiming to be i965, which was a bit gross. The simulator would be a lot less special if we entered through the vc4 entrypoint like normal, so add a loader environment variable to allow the i965 fd to probe as vc4. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-06 12:44:06 -08:00
Eric Anholt	61bb1a9795	targets: Use a macro to reduce cut and paste in driver setup. All the replicated prototypes/function bodies obfuscated the interesting logic of the file: the mapping from driver enable macros to entrypoints we expose, and the way that the swrast entrypoints are special compared to the DRM entrypoints. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-06 12:44:06 -08:00
Dave Airlie	13a28ff236	radeon/ac: move common llvm build functions to a separate file. Suggested by Marek. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-07 05:46:35 +10:00
Nicolai Hähnle	8822f4dfb9	eglmesaext: add new enums for EGL_MESA_drm_image_formats Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-02-06 17:41:28 +01:00
Nicolai Hähnle	6b0d390184	docs: add EGL_MESA_drm_image_formats extension proposal	2017-02-06 17:41:10 +01:00
Nicolai Hähnle	7be0e602ed	dri/common: clear the loaderPrivate pointer in driDestroyDrawable The GLX specification says about glXDestroyPixmap: "The storage for the GLX pixmap will be freed when it is not current to any client." We're not really following this language to the letter: some of the storage is freed immediately (in particular, the dri3_drawable, which contains both GLXDRIdrawable and loader_dri3_drawable). So we NULL out the pointers to that freed storage; the previous patches added the corresponding NULL-pointer checks. This fixes memory corruption in piglit ./bin/glx-visuals-depth/stencil -pixmap -auto Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-06 17:39:44 +01:00
Nicolai Hähnle	f446f3fb33	glx: guard swap-interval functions against destroyed drawables The GLX specification says about glXDestroyPixmap: "The storage for the GLX pixmap will be freed when it is not current to any client." So arguably, functions like glXSwapIntervalMESA can be called after glXDestroyPixmap has been called for the currently bound GLXPixmap. In that case, the GLXDRIDrawable no longer exists, and so we just skip those calls. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-06 17:39:30 +01:00
Nicolai Hähnle	21ec35566b	glx/dri3: guard in_current_context against a disappeared drawable Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-06 17:39:10 +01:00
Nicolai Hähnle	40c304fc06	glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion With a subsequent patch, we might see NULL loaderPrivates, e.g. when a DRIdrawable is flushed whose corresponding GLXDRIdrawable was destroyed. This resulted in a crash, since the loader vs. DRI3 drawable structures have a non-zero offset. Fixes glx-visuals-{depth,stencil} -pixmap Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-06 17:39:01 +01:00
Juan A. Suarez Romero	02264bc6f9	anv/pipeline: set ThreadDispatchEnable conditionally Set 3DSTATE_WM/ThreadDispatchEnable bit on/off based on the same conditions as used in the GL version. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-06 10:27:44 +01:00
Alejandro Piñeiro	dfb1b543f3	main/fboject: default_framebuffer allowed for GetFramebufferParameter Before 4.5, the default framebuffer was not allowed for GetFramebufferParameter, so it should return INVALID_OPERATION for any call using the default framebuffer. 4.5 included new pnames, and some of them are allowed for the default framebuffer. For the rest, INVALID_OPERATION. From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries": "An INVALID_OPERATION error is generated by GetFramebufferParameteriv if the default framebuffer is bound to target and pname is not one of the accepted values from table 23.73, other than SAMPLE_POSITION." Fixes: GL45-CTS.direct_state_access.framebuffers_get_parameter_errors Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-06 08:50:21 +01:00
Alejandro Piñeiro	0fb0c57b15	main/fbobject: implement new 4.5 pnames for GetFramebufferParameter 4.5 added new pnames allowed for GetFramebufferParameter, and GetNamedFramebufferParameter. From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries" (quoting the paragraph with only the new pnames, not all the supported): "pname may also be one of DOUBLEBUFFER, IMPLEMENTATION_COLOR_READ_FORMAT, IMPLEMENTATION_COLOR_READ_TYPE, SAMPLES, SAMPLE_BUFFERS, or STEREO, indicating the corresponding framebuffer-dependent state from table 23.73. Values of framebuffer-dependent state are identical to those that would be obtained were the framebuffer object bound and queried using the simple state queries in that table. These values may be queried from either a framebuffer object or a default framebuffer." Fixes: GL45-CTS.direct_state_access.framebuffers_get_parameters Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-06 08:50:21 +01:00
Alejandro Piñeiro	0cd2a4737e	main/framebuffer: refactor _mesa_get_color_read_format/type Current implementation returns the value for the currently bound read framebuffer. GetNamedFramebufferParameteriv allows to get it for any given framebuffer. GetFramebufferParameteriv would also be interested in that method. It was refactored by allowing a given framebuffer to be passed. If NULL is passed, it uses the currently bound framebuffer. It also adds a call to _mesa_update_state. When used only by GetIntegerv, this one was called as part of the extra checks defined at get_hash. But now that the method is used by more methods, and the update is needed, it makes sense (and it is safer) to just call it in the method itself, instead of relying on the caller. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-06 08:50:21 +01:00
Dave Airlie	106a51440d	radv: fix shared memory load/stores. If we have an indirect index here we need to scale it by attribute slots e.g. if this is vec2[256] then we get an indir_index in the 0..255 range but the vec2 are aligned inside vec4 slots. So scale the indir index, then extract the channels. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 19:53:03 +00:00
Dave Airlie	a1a8aef4c9	radv/ac: correctly size shared memory usage. We count the number of slots used, but slots are vec4 sized, so we have to scale by 16 not 4. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 19:52:13 +00:00
Dave Airlie	66463b7f75	radv: fix compute shared memory stores since 64-bit. These regressed and caused doom to stop loading. Fixes: `03724af26` radv/ac: Implement Float64 load/store var. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 19:51:52 +00:00
Brian Paul	023a9e3d92	docs: replace URL in features.txt Replace unmaintained http://dri.freedesktop.org/wiki/MissingFunctionality URL with http://mesamatrix.net/ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95460 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-03 12:02:38 -07:00
Brian Paul	2fac98f865	mesa: whitespace fixes in context.c Remove trailing whitespace, replace tabs with spaces. Trivial.	2017-02-03 11:48:25 -07:00
Nanley Chery	84dbf68378	anv/blorp: Disable resolves for transparent black clears Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-03 09:23:13 -08:00
Nanley Chery	93b819154f	anv/cmd_buffer: Don't temporarily enable CCS_E within a render pass Compressing a render target and decompressing it in the same single-subpass render pass may waste bandwidth. While this may be beneficial in some circumstances, it does not help in all. Reclaims about 1.95% FPS for Dota 2 on some configurations. v2 (Jason Ekstrand): - Provide a more thorough comment - Enable CCS_D for input attachments v3 (Jason Ekstrand): - Provide performance numbers Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-03 09:23:13 -08:00
Kenneth Graunke	3f064e9a40	mesa: Don't crash when destroying contexts created with no visual. dEQP-EGL.functional.create_context.no_config tries to create a context with no config, then immediately destroys it. The drawbuffer is never set up, so we can't dereference it asking if it's double buffered, or we'll crash on a null pointer dereference. Just bail early. Applications using EGL_KHR_no_config_context could hit this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-03 08:55:02 -08:00
Samuel Pitoiset	af303abcdb	winsys/amdgpu: avoid potential segfault in amdgpu_bo_map() cs can be NULL when it comes from r600_buffer_map_sync_with_rings() to avoid doing the same checks. It was checked for write mappings but not for read mappings. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-03 12:07:14 +01:00
Tapani Pälli	0a2dcd3a8a	android: fix droid_create_image_from_prime_fd_yuv for YV12 Earlier changes introduced is_ycrcb flag which checks the component order of u and v components. Condition for setting the flag was incorrect, with ycrcb we are supposed to have cr before cb. This patch (together with a fix in our gralloc) fixes corrupted rendering from 'test-opengl-gl2_yuvtex' native test and corrupted gallery thumbnail in application switcher on Android-IA. Fixes: `51727b1cf5` Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2017-02-03 07:44:33 +02:00
Edward O'Callaghan	3879425917	ilo: EOL unmaintained older gallium intel driver This is no longer actively maintained and is just accumulating bitrot. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Chia-I Wu <olvaffe@gmail.com>	2017-02-03 16:13:46 +11:00
Edward O'Callaghan	d77fa310ed	ilo: EOL drop unmaintained gallium drv from buildsys This is no longer actively maintained and is just accumulating bitrot. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Chia-I Wu <olvaffe@gmail.com>	2017-02-03 16:13:36 +11:00
Edward O'Callaghan	01b625ef1a	ilo: EOL unplumb unmaintained gallium drv from winsys This is no longer actively maintained and is just accumulating bitrot. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Chia-I Wu <olvaffe@gmail.com>	2017-02-03 16:13:32 +11:00
Ilia Mirkin	2b4eaabff0	configure: libdrm is a single package The intent of the libdrm_$driver version limits has always been to not burden the "other" drivers with updating their libdrm unless really necessary. Unfortunately the configure script erroneously only checked the driver-specific bit and not the generic bit of libdrm as well. Fix this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-02 20:36:09 -05:00
Ilia Mirkin	7d3f9ed71c	st/mesa: MAX_VARYING is the max supported number of patch varyings, not min This fixes GL45-CTS.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes on nouveau. We only support 30 patch varyings (as 2 vec4 slots end up being used for tess level settings), but were getting 32 exposed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-02 20:28:58 -05:00
Ilia Mirkin	e73f87fcbd	vbo: process buffer binding state changes on draw when recording The VBO module keeps track of any vbo buffers. It updates this list when receiving an InvalidateState call, however this never happens when recording draws right now. Make sure that we do all the usual state updates when recording draws so that the VBO list may be kept up to date. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99631 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-02 20:28:27 -05:00
Dave Airlie	6cc3c46f58	radv/ac: move to using shared emit_ddxy code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	c9a2fc3679	radeonsi/ac: move most of emit_ddxy to shared code. We can reuse this in radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	278d5ef70a	radv/ac: use shared thread id code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	c5f0a56aeb	radeonsi/ac: move get thread id to shared code. radv will use this. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	1c5c268a8a	radv/ac: migrate to using shared code for some load/store stuff. This migrates to the code shared with radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	b3c28942c7	radeonsi/ac: move tbuffer store and buffer load to shared code. These are all reusable by radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Dave Airlie	a9773311f6	radeonsi/ac: move a bunch of load/store related things to common code. These are all shareable with radv, so start migrating them to the common code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:54:04 +10:00
Eduardo Lima Mitev	e198a64e35	texgetimage: Add check for the effective target to GetTextureSubImage OpenGL 4.5 spec, section "8.11.4 Texture Image Queries", page 233 of the PDF states: "An INVALID_OPERATION error is generated if texture is the name of a buffer or multisample texture." This is currently not being checked and e.g a multisample texture image can be passed down to the driver hook. On i965, it is crashing the driver with an assertion: intel_mipmap_tree.c:3125: intel_miptree_map: Assertion `mt->num_samples <= 1' failed. v2: (Ilia Mirkin) Move the check from gettextimage_error_check() to GetTextureSubImage() and use the texObj target. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-03 00:43:46 +01:00
Marek Olšák	dfe111368d	Revert "radeonsi: decrease the number of texture slots to 24" This reverts commit `bdd860e307`. Requested by a game developer. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-03 00:39:48 +01:00
Dave Airlie	b457f67495	configure.ac: explicitly require libdrm for dri classic drivers. Although this might come from somewhere else require it explicitly. Reviewed-by: Chad Versace <chadversary@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-03 09:38:15 +10:00
Jason Ekstrand	37a6f48ceb	intel/isl: Add a better comment for format_supports_ccs_e Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-02 13:33:43 -08:00
Jason Ekstrand	45b3eb4dfc	anv: Remove the finishme for CCS_E with storage images The data port can't handle CCS at all so replace the finishme with better comments. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-02 13:33:43 -08:00
Jason Ekstrand	fc9f0db8e3	intel/isl: Assert that we don't use CCS for storage images I enabled CCS for storage images in the Vulkan driver and ran it through the CTS. It didn't result in any hangs but it demonstrated that the data port cannot handle CCS. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-02 13:33:43 -08:00
Jason Ekstrand	7e6a9d9c4b	intel/isl: Add a formats_are_ccs_e_compatible helper Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-02 13:33:43 -08:00
Jason Ekstrand	6142e3c07c	intel/isl: Add a format_supports_ccs_d helper Nothing uses this yet but it serves as a nice bit of documentation that's relatively easy to find. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-02 13:33:43 -08:00
Jason Ekstrand	ab06fc6684	intel/isl: Rename supports_lossless_compression to supports_ccs_e The term "lossless compression" could potentially mean multisample color compression, single-sample color compression or HiZ because they are all lossless. The term CCS_E, however, has a very precise meaning in ISL and is only used to refer to single-sample color compression. It's also much shorter which is nice. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-02 13:33:43 -08:00
Nanley Chery	043d92fef9	anv/pass: Store the depth-stencil attachment's last subpass index Commit `968ffd6c86` stored the last subpass index of all the attachments but that of the depth-stencil attachment. This could cause depth buffers used in multiple subpasses not to be in the requested final layout. Fix this error. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-02 10:36:14 -08:00
Nicolai Hähnle	a020cb3a72	gallium: turn PIPE_SHADER_CAP_DOUBLES into a screen capability Make the cap consistent with PIPE_CAP_INT64. Aside from the hypothetical case of using draw for vertex shaders (and actually caring about doubles...), every implementation supports doubles either nowhere or everywhere. Also, st/mesa didn't even check the cap correctly in all supported shader stages. While at it, add a missing LLVM version check for 64-bit integers in radeonsi. This is conservative: judging by the log, LLVM 3.8 might be sufficient, but there are probably bugs that have been fixed since then. v2: fix clover (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-02-02 16:53:42 +01:00
Plamena Manolova	96123dbad9	mesa: Enable EXT_compressed_ETC1_RGB8_sub_texture Since we already have the functionality in place and games like Game of Thrones seem to depend on this extension, I think it makes sense to enable it by making it part of the extension string even though it's still a draft: https://www.khronos.org/registry/gles/extensions/EXT/EXT_compressed_ETC1_RGB8_sub_texture.txt Note: OES_compressed_ETC1_RGB8_sub_texture seems to be listed in gl2ext.h, but there's no documentation for it in the KHR registry Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-02 12:28:31 +00:00
Vinson Lee	6ee4665a77	configure: Only require libdrm 2.4.75 for intel. Fixes: `b8acb6b179` ("configure: Require libdrm >= 2.4.75") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Dave Airlie <airlied@redhat.com>	2017-02-02 13:10:00 +10:00
Lionel Landwerlin	7158255069	anv: enable VK_KHR_shader_draw_parameters Enables 10 tests from: dEQP-VK.draw.shader_draw_parameters.* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:33:16 +00:00
Lionel Landwerlin	9413e11869	anv: emit DrawID if needed v2: use define for buffer ID (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:33:06 +00:00
Lionel Landwerlin	543d5db4e2	anv: always allocate a vertex element with vertexid or instanceid Up to now on Gen8+ we only allocated a vertex element for gl_InstanceIndex or gl_VertexIndex when a vertex shader uses gl_BaseInstanceARB or gl_BaseVertexARB. This is because we would configure the VF_SGVS packet to make the VF unit write the gl_InstanceIndex & gl_VertexIndex values right behind the values computed from the vertex buffers. In the next commit we will also write the gl_DrawIDARB value. Our backend expects to pull the gl_DrawIDARB value from the element following the element containing gl_InstanceIndex, gl_VertexIndex, gl_BaseInstanceARB and gl_BaseVertexARB (see vec4_vs_visitor::setup_attributes). Therefore we need to allocate an element for the SGVS elements as long as at least one of the SGVS elements is read by the shader. Otherwise our shader will use a gl_DrawIDARB value pulled from the URB one element too far (most likely garbage). v2: Fix my english (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:32:58 +00:00
Lionel Landwerlin	289aef771d	anv: move BaseVertexID/BaseInstanceID vertex buffer index to 31 v2: use define for buffer ID (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:32:48 +00:00
Lionel Landwerlin	98cf60a3ce	anv: limit vertex buffers to 31 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:32:39 +00:00
Mauro Rossi	9c45bb731c	android: fix llvm, elf dependencies for M, N releases These changes set the correct llvm version and elf include path which differ for Marshmallow and Nougat Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-02-01 23:01:35 +00:00
Jason Ekstrand	ccdd5b3738	anv: Don't use bogus alpha swizzles For RGB formats in Vulkan, we use the corresponding RGBA format with a swizzle of RGB1. While this swizzle is exactly what we want for texturing, it's not allowed for rendering according to the docs. While we haven't been getting hangs or anything, we should probably obey the docs. This commit just sanitizes all render swizzles so that the alpha channel maps to ALPHA. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 14:41:06 -08:00
Micah Fedke	752ae38a09	Add missing copyright header to wayland-egl-priv.h Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-01 22:33:40 +00:00
Dave Airlie	cda9f3d8ec	radv: handle VK_QUEUE_FAMILY_IGNORED in image transitions (v3) The CTS tests at least are using this, and we were totally ignoring it. This hopefully fixes the bouncing multisample CTS tests. v2: get family mask in ignored case from command buffer. v3: only change things in one place, use logic from Bas. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:25:04 +10:00
Dave Airlie	fa316ed02f	radv/ac: handle clip/cull distance sizing in geometry shader outputs Otherwise we were writing these as 4 components, and things went bad. Fixes (the remaining): dEQP-VK.clipping.user_defined..vert_geom. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:25:04 +10:00
Dave Airlie	230e308ff9	radv/ac: add const_index to fetch index for gs inputs This fixes clip distance fetches as they are single item loads with a const_index like float[1]. Fixes: dEQP-VK.clipping.user_defined.*.vert_geom.[0-6] Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:25:04 +10:00
Dave Airlie	dc68b920df	radeonsi/ac: move frag interp emission code to shared llvm code. This code should be used in radv, so move it to a shared location in advance of doing that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:24:53 +10:00
Timothy Arceri	b940b2fd16	st/mesa: inline get_mesa_program() In the past I've gotten this function confused with the one in ir_to_mesa.cpp of the same name. Now that the affected flag setting has moved into a helper it makes sense just to inline this remaining code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-02 08:31:28 +11:00
Timothy Arceri	a7050ea1f9	st/mesa: create set_prog_affected_state_flags() helper This will be used when restoring tgsi from the on-disk shader cache. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-02 08:31:28 +11:00
Timothy Arceri	8d3d8a6d4e	st/mesa: st_atom_shader.c C99 tidy up Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-02 08:31:28 +11:00
Timothy Arceri	f3e2428a7a	st/mesa: remove pre C99 statement block for variable declaration Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-02-02 08:31:28 +11:00
Jason Ekstrand	0c114f2cf0	isl: Add assertions for render target swizzle restrictions Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 12:07:54 -08:00
Boyuan Zhang	f90ccf48bc	st/va: add h264 constrained baseline profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Boyuan Zhang	d596bd29ec	st/vdpau: add h264 constrained baseline profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Boyuan Zhang	c29191eea8	radeon/uvd: add h264 constrained baseline support Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Boyuan Zhang	22841ec84a	vl: add h264 constrained baseline profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Bas Nieuwenhuizen	f5f8eb2c7c	radv: Enable VK_KHR_shader_draw_parameters. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen	cf8a11c1ba	radv: Pass draw index to shader. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen	80f4331ed1	radv/ac: Add draw index support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Robert Foss	25f2d3c1d3	i965: Prevent coverity warning Add assert checking that num_sources is never larger than 3. This prevents Coverity from concluding that the unhandled cases of num_sources not being 0-3 are relevant. Coverity-Id: 1399480-1399489 Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-01 16:47:05 +00:00
Lionel Landwerlin	875b15eec4	spirv: add SPV_KHR_shader_draw_parameters support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 15:08:33 +00:00
Lionel Landwerlin	bd46040162	compiler: add missing enums for debug Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 15:08:30 +00:00
Emil Velikov	1e8fd790e1	docs: add news item and link release notes for 13.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-01 11:21:59 +00:00
Emil Velikov	f2391e8134	docs: add sha256 checksums for 13.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `6bfc352f5a`)	2017-02-01 11:20:28 +00:00
Emil Velikov	7b6931e7fb	docs: add release notes for 13.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `3255d10da4`)	2017-02-01 11:20:27 +00:00
Michel Dänzer	31136eae3a	winsys/radeon: Allow visible VRAM size > 256MB with kernel driver >= 2.49 The kernel driver reports correct values now. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2017-02-01 16:38:14 +09:00
Tapani Pälli	58828fe4ae	android: add vulkan build for intel fixes to issues spotted by Emil Velikov: - set ANV_TIMESTAMP correctly - fix typo with VULKAN_GEM_FILES v2: update to use Makefile.sources under vulkan instead of having own v3: update to changes to generate from vk.xml (commit `c7fc310`) v4: remove 'hw' relative path cleanups, remove unnecessary cruft review from Emil Velikov: - move to vulkan folder - remove timestamp gen, no longer necessary - more cleanups Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-01 07:58:49 +02:00
Ilia Mirkin	62b8f494fa	mesa: use same is_color_attachment trick to discern error cases All the other calls to retrieve the attachment have been covered except this one - return the proper error for attachment points that are valid enums but out of bound for the driver. Fixes GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_attachment Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-31 22:12:57 -05:00
Jason Ekstrand	92128590bc	anv: Improve flushing around STATE_BASE_ADDRESS It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS actually is. We know from experimentation that we need to flush the render cache prior to emitting STATE_BASE_ADDRESS and invalidate the texture cache afterwards. The only thing the PRM says is that, on gen8+ we're supposed to invalidate the state cache after STATE_BASE_ADDRESS but experimentation has indicated that doing so does nothing whatsoever. Since we don't really know, let's do just a bit more flushing in the hopes that this won't be a problem again. In particular: 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't really know whether or not it's pipelined. 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS is a compute shader. 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS because the state may be getting cached there (we don't really know). Reported-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	f1f9794118	anv: Flush render cache before STATE_BASE_ADDRESS on gen7 We had no good reason for not doing this on gen7 before but we didn't know it was needed. Recently, when trying to update to Vulkan CTS version 1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear to be STATE_BASE_ADDRESS related. This commit fixes them. Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	4871930451	isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell This causes hangs on Broadwell if you try to render to it. I have no idea how we managed to not hit this earlier. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	a0348b5a0b	intel/blorp: Handle clearing of A4B4G4R4 on all platforms Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Tom Stellard	226a2c6d6e	radeonsi: Fix build on LLVM < 3.9 v2 This was broken by: `e0cc0a614c` v2: - Use preprocessor macro Tested-by: Mark Janes <mark.a.janes@intel.com>	2017-02-01 02:10:00 +00:00
Bas Nieuwenhuizen	798ae37cc9	radv: Enable Float64 support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen	441ee1e65b	radv/ac: Implement Float64 SSBO loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen	bb1ce63002	radv/ac: Implement Float64 UBO loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:29 +01:00
Bas Nieuwenhuizen	03724af262	radv/ac: Implement Float64 load/store var. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen	91074bb11b	radv/ac: Implement Float64 SSBO stores. No f16 support as I'm not quite sure about alignment yet. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen	29577b2123	radv/ac: Add core Float64 support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Rob Herring	01e18b21d1	vc4: Enable Neon on arm android builds Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 14:06:21 -08:00
Rob Herring	83107acb7b	vc4: fix arm64 build with Neon The addition of Neon assembly breaks on arm64 builds because the assembly syntax is different. For now, restrict Neon to ARMv7 builds. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 14:06:19 -08:00
Rob Herring	6d92f32852	vc4: Make Neon inline assembly clang compatible clang throws an error on "%r2" and similar. I couldn't find any documentation on what "%r?" is supposed to mean and I've never seen any use like that as far as I remember. The parameter is supposed to be cpu_stride and just %2/%3 should be sufficient. There's no need for trailing ";" either, so remove those, too. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 14:06:09 -08:00
Tom Stellard	e0cc0a614c	radeonsi: Set datalayout on the llvm module This prevents LLVM from using sext instructions for local memory offsets and allows the backend to fold immediate offsets into the instruction. This also prevents some incorrect code generation for ptrtoint and inttoptr instructions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-31 20:39:30 +00:00
Francisco Jerez	11e9ebbf15	nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:33:33 -08:00
Francisco Jerez	013d40d1ce	glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:33:33 -08:00
Francisco Jerez	7215375c44	nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. See "glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity." for the rationale, but note that the instruction count benefit discussed there is somewhat less important for the SPIRV implementation, because the current code already emitted no control flow instructions -- Still this saves us one hardware instruction per scalar component on Intel SKL hardware. Fixes the following Vulkan CTS tests on Intel hardware: dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4 Note that most of the test-cases above expect IEEE-compliant handling of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so except for the last two the test-cases above weren't expected to pass yet. The reason they do is that the i965 back-end implementation of the NIR fmin and fmax instructions is not quite GLSL-compliant (it complies with IEEE 754 recommendations though), because fmin/fmax of a NaN and a non-NaN argument currently always return the non-NaN argument, which causes atan() to flush NaN to one and return the expected value. The front-end should probably not be relying on this behavior for correctness though because other back-ends are likely to behave differently -- A follow-up patch will handle the atan2(±∞, ±∞) corner cases explicitly. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:33:27 -08:00
Francisco Jerez	e9ffd12827	glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above a certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% fewer instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:32:45 -08:00
Francisco Jerez	69042a5be4	i965/fs: Fix nir_op_fsign of absolute value. This does point at the front-end emitting silly code that could have been optimized out, but the current fsign implementation would emit bogus IR if abs was set for the argument (because it would apply the abs modifier on an unsigned integer type), and we shouldn't rely on the upper layer's optimization passes for correctness. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:32:43 -08:00
Francisco Jerez	7ec3af3f8f	glsl/ir_builder: Add rcp builder. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:43 -08:00
Francisco Jerez	6643a97de3	glsl: Fix constant evaluation of the rcp op. Will avoid a regression in a future commit that introduces some additional rcp operations. According to the GLSL 4.10 specification: "Dividing by 0 results in the appropriately signed IEEE Inf." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:43 -08:00
Francisco Jerez	e81130d7a1	mesa/program: Translate csel operation from GLSL IR. This will be used internally by the GLSL front-end in order to implement some built-in functions. Plumb it through MESA IR for back-ends that rely on this translation pass. v2: Add comment. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:42 -08:00
Wladimir J. van der Laan	56314f5baf	etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers This fixes rendering of full-screen quads (and other screen-filling geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op on other hardware. - It looks like SE_CLIP registers were not set at all. I'm amazed that rendering worked without them. Emit them to avoid issues on gc3000. - Define constants ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119) ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111) ETNA_SE_CLIP_MARGIN_RIGHT (0xffff) ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff) These demarcate the margin (fixp16) between the computed sizes and the value sent to the chip. I have set these to the numbers used by the Vivante driver for gc2000. I am not sure whether any old hardware was relying on the old numbers, or whether those were just a guess. But if so, these need to be moved to the _specs structure. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-31 19:29:23 +01:00
Wladimir J. van der Laan	fe3bb8cdb5	etnaviv: Generate new sin/cos instructions on GC3000 Shaders using sin/cos instructions were not working on GC3000. The reason for this turns out to be that these chips implement sin/cos in a different way (but using the same opcodes): - Need their input scaled by 1/pi instead of 2/pi. - Output an x and y component, which need to be multiplied to get the result. - tex_amode needs to be set to 1. Add a new bit to the compiler specs and generate these instructions as necessary. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-31 19:29:16 +01:00
Nanley Chery	33e0c5d003	anv/cmd_buffer: Use the proper depth input attachment surface state Commit `2852efcda4` moved the location of the depth input attachment surface state from the render pass to the image view, but failed to update the surface state location used when emitting the binding table. Fix this by loading the surface state from the correct location. Fixes: dEQP-VK.renderpass.formats.d16_unorm.input.* dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.* dEQP-VK.renderpass.formats.d32_sfloat.input.* dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.* dEQP-VK.renderpass.attachment_allocation.input_output.93 dEQP-VK.renderpass.attachment_allocation.input_output.92 dEQP-VK.renderpass.attachment_allocation.input_output.82 dEQP-VK.renderpass.attachment_allocation.input_output.46 Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-01-31 09:00:50 -08:00
Bartosz Tomczyk	fc27181f9e	glsl: fix heap-buffer-overflow The `end+1` skips the ']', whereas the `strlen+1` includes the final '\0' in the move to terminate the string. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-31 15:58:52 +01:00
Wladimir J. van der Laan	658568941d	etnaviv: Cannot render to rb-swapped formats Exposing rb swapped (or other swizzled) formats for rendering would involve swizzling in the pixel shader. This is not the case at the moment, so reject requests for creating such surfaces. (GPUs that need an extra resolve step anyway due to multiple pixel pipes, such as gc2000, might also do this swap in the resolve operation. But this would be tricky to keep track of) CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-31 09:28:28 +01:00
Christian Gmeiner	82fe240a99	etnaviv: Avoid infinite loop in find_frame() Use of unsigned loop control variable with '>= 0' would lead to infinite loop. Reported by clang: etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression >= 0 is always true [-Wtautological-compare] for (unsigned sp = c->frame_sp; sp >= 0; sp--) ~~ ^ ~ v2: Simply use the same datatype as c->frame_sp is using. CC: <mesa-stable@lists.freedesktop.org> Reported-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2017-01-31 09:19:25 +01:00
Dave Airlie	8477aa71d9	radv/ac: apply slice rounding to 1d arrays as well. Fixes: dEQP-VK.glsl.texture_functions.texture.1darray Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 11:13:15 +10:00
Dave Airlie	3882f3da22	radv/geom: check if esgs and gsvs ring exists before filling geom rings There are some corner cases where you end up with an esgs ring, but no gsvs ring, test for both before dereferencing. Fixes: dEQP-VK.geometry.emit.points_emit_0_end_0 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 11:13:15 +10:00
Dave Airlie	723941bb3d	radv: enable geometryShader and multiViewport capabilities. This enables geometry shader support on radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:53 +10:00
Dave Airlie	ca822e1b7c	radv: handle layer export from vs->fs properly Fixes: dEQP-VK.geometry.layered.1d_array.fragment_layer Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:49 +10:00
Dave Airlie	c9c8ae1fd3	radv: emit esgs itemsize register. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:46 +10:00
Dave Airlie	77ec78669a	radv: handle prim id inputs to fragment shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:41 +10:00
Dave Airlie	105ce24d46	radv: emit geometry shaders to hardware This emits the compiled geometry shader and other state registers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:37 +10:00
Dave Airlie	1fa5b755c2	radv: emit geometry ring size and pointers via preamble (v2) This uses the scratch infrastructure to handle the esgs and gsvs rings. (this replaces the old code that did this with patching). v2: fix correct ring sizes, reset sizes (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:19 +10:00
Dave Airlie	8f41fe4389	radv: add gs ring size calculations to pipeline. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:15 +10:00
Dave Airlie	99936d3606	radv: add pipeline creation support for geometry shaders (v2.1) This adds gs copy shader support to the pipeline cache, and few geometry related changes. v2: rebase for spill changes. v2.1: fix incorrect pipeline destruction. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:10 +10:00
Dave Airlie	fd4ea9e62d	radv/ac: handle primitive id Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:08 +10:00
Dave Airlie	4ec294adce	radv/ac: handle emitting vertex outputs to esgs ring. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:05 +10:00
Dave Airlie	ac642c6195	radv/ac: handle gs inputs This handles geometry shader inputs written by the vertex (es) shader to the esgs ring. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:30:01 +10:00
Dave Airlie	80cdf2c17e	radv/ac: add geom input support to get deref offset. This just adds the API and fixes up the callers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:59 +10:00
Dave Airlie	23999a363b	radv/ac: handle invocation and primitive id intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:55 +10:00
Dave Airlie	63fa6c6eb4	radv/ac: handle geometry emit vertex and end prim intrinsics. This handles emitting things to the gsvs ring, and sending the correct GS msgs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:52 +10:00
Dave Airlie	2a56186d57	radv/ac: handle emitting gs epilogue Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:48 +10:00
Dave Airlie	a615a01942	radv/ac: add copy shader creation This creates the gs copy shader and compiles it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:40 +10:00
Dave Airlie	09cd037ca4	radv/ac: setup function parameters for vs as es and copy shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:33 +10:00
Dave Airlie	e1e9301b2a	radv: pass some necessary gs info back to state handling. We need this info to program some registers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:30 +10:00
Dave Airlie	68a77411e1	radv: emit vertex shader to correct hw block. This emits the shader to the ES block in the correct case. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:27 +10:00
Dave Airlie	2a57bddd4c	radv/ac: propagate as_es flag into shader info from key. This just places the flag into the shader info so we can use it from the driver after we create the shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:23 +10:00
Dave Airlie	b941a88e01	radv: extend shader stage code to cover geometry shaders. This enables the paths for setting up user ptrs to vs/es and gs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:20 +10:00
Dave Airlie	ec7bf863d2	radv/ac: start setting up the geom shader rings (v2) This sets up the rings and adds the variables needed to make them work. v2: rework for sharing ring and scratch Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:17 +10:00
Dave Airlie	ca91db2402	radv/ac: handle geom shader sgpr/vgpr inputs This just sets up the gpr inputs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:13 +10:00
Dave Airlie	374e978438	radv/ac: add geom shader sendmsg defines. This just adds some defines needed for geom shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:29:10 +10:00
Dave Airlie	583cf8efd4	radv/ac: add some geom shader info from nir->ac shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:28:50 +10:00
Dave Airlie	ecb8a34910	radv: move hw vertex shader emit to separate function This is to later allow ES shaders to be emitted. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:28:46 +10:00
Dave Airlie	3b507855cb	radv: fixup ia multi vgt param code to handle geom shaders. This fixes up a few of the commented out blocks. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:28:28 +10:00
Dave Airlie	68c5da7e66	radv: add code to set gs_table_depth. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:28:24 +10:00
Dave Airlie	f26fa879b7	radv: add small helper to denote when a geom shader is in the pipeline. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 09:28:13 +10:00
Robert Foss	0b63f47030	radv: Prevent Coverity warning Prevent Coverity seeing potential errors when src is not initialized in the switch case. Coverity-Id: 1396397 Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-30 23:59:22 +01:00
Timothy Arceri	30aa22dec0	mesa: add new MESA_GLSL flag for printing shader cache debug info Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 09:51:31 +11:00
Carl Worth	ba1eb854bd	glsl: add cache to ctx and add sha1 string fields We also add a flag for detecting shaders written to shader cache. V2: don't leak cache Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 09:51:30 +11:00
Carl Worth	b8cb1a05cd	glsl: add new uniform fields to be used to restore state from cache Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 09:51:30 +11:00
Carl Worth	0f60c6616e	glsl: Switch to disable-by-default for the GLSL shader cache The shader cache is expected to be developed incrementally over a fairly long series of commits. For that period of instability, we require users to opt into the shader cache by setting: MESA_GLSL_CACHE_ENABLE=1 In the future, when the shader cache is complete, we can revert this commit so that the cache will be on by default. The user can always disable the cache with MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this commit, (nor will it be affected by the future revert). Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 09:51:30 +11:00
Dave Airlie	0ecd426490	radv/ac: implement txs for buffer textures. This fixes a bunch of buffer-related dEQP-VK.memory.pipeline_barrier.* tests that were crashing in LLVM due to this being missing. Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 06:26:53 +10:00
Dave Airlie	ecc3fa3ba3	radv/ac: handle nir irem opcode. This fixes: dEQP-VK.spirv_assembly.instruction.compute.opsrem.* Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 05:38:57 +10:00
Dave Airlie	059dd17175	radv/ac: fix multisample subpass image. We weren't adding the fragment position properly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 04:44:59 +10:00
Dave Airlie	a1c1ba7d56	radv: handle transfer_write as a dst flag. It appears we can get image barriers like: srcStageMask: VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT) dstStageMask: VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT) dependencyFlags: VkDependencyFlags = 0 memoryBarrierCount: uint32_t = 0 pMemoryBarriers: const VkMemoryBarrier* = NULL bufferMemoryBarrierCount: uint32_t = 0 pBufferMemoryBarriers: const VkBufferMemoryBarrier* = NULL imageMemoryBarrierCount: uint32_t = 1 pImageMemoryBarriers: const VkImageMemoryBarrier* = 0x7ffc882367b0 pImageMemoryBarriers[0]: const VkImageMemoryBarrier = 0x7ffc882367b0: sType: VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45) pNext: const void* = NULL srcAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT) dstAccessMask: VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT) oldLayout: VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7) newLayout: VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1) srcQueueFamilyIndex: uint32_t = 4294967295 dstQueueFamilyIndex: uint32_t = 4294967295 image: VkImage = 0x2df55e0 subresourceRange: VkImageSubresourceRange = 0x7ffc882367e0: aspectMask: VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT) baseMipLevel: uint32_t = 0 levelCount: uint32_t = 1 baseArrayLayer: uint32_t = 0 layerCount: uint32_t = 1 This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here, not sure if this is too large a hammer. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-31 04:42:21 +10:00
Samuel Pitoiset	af7fef12f7	r600: fix a compilation warning in r600_screen_create() Should be r600_common_screen instead of r600_screen. Fixes: `80157a2c20` ("gallium/radeon: clean up r600_query_init_backend_mask") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 18:13:18 +01:00
Marek Olšák	f8bc628b2c	gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter to simplify things in draw_vbo a little Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:45:29 +01:00
Marek Olšák	75c425e511	winsys/radeon: clamp vram_vis_size to 256MB the value from the kernel is wrong Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:45:29 +01:00
Marek Olšák	eba9e9dd1d	radeonsi: handle count_from_stream_output in a few IA_MULTI_VGT_PARAM cases Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:45:29 +01:00
Marek Olšák	a0740d59aa	radeonsi: don't invoke DCC decompression in update_all_texture_descriptors This fixes a bug uncovered by the 17-part patch series, specifically: "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter" If dirty_tex_counter has been updated and set_shader_image invokes DCC decompression, the DCC decompression itself checks the counter and updates descriptors, which in turn invokes the same DCC decompression. The blitter can't handle the recursion and the driver eventually crashes. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:45:29 +01:00
Marek Olšák	f8dd2f5bac	radeonsi: fold info->indirect conditionals into the last one in draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:29:36 +01:00
Marek Olšák	408f9a1584	radeonsi: atomize the scratch buffer state The update frequency is very low. Difference: Only account for the size when allocating a new one and when starting a new IB, and check for NULL. (v3) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:29:36 +01:00
Bartosz Tomczyk	a41f2527ae	r600: Fix stack overflow Commit `7b5878ee04` increased number of outputs to 64, but left output array intact. This caused stack overflow when number of outputs is bigger than 32. Found by ASAN. Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 15:30:03 +01:00
Samuel Pitoiset	e2c15ea092	gallium/radeon: add new HUD queries for monitoring the CP There are even more counters in the CP_STAT register but I think these ones are enough for now. v2: only read (and expose) CP_STAT on VI+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-30 14:37:00 +01:00
Samuel Pitoiset	0e04a078c5	gallium/radeon: add new GPU-sdma-busy HUD query For simplicity, GPU-sdma-busy will return 0 on previous gens. v2: only read SRBM_STATUS2 on Evergreen+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-30 14:37:00 +01:00
Samuel Pitoiset	b0f7ddef4f	gallium/radeon: rename grbm to mmio in the gpu load path We also want to monitor other MMIO counters like SRBM_STATUS2 in order to know if SDMA is busy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-30 14:37:00 +01:00
Marek Olšák	2fc5fe0e85	winsys/amdgpu: add a fast exit path into amdgpu_cs_add_buffer The time spent in the function dropped by 37% for torcs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:57:09 +01:00
Samuel Pitoiset	86eb52adad	winsys/amdgpu: do not iterate twice when adding fence dependencies The perf difference is very small, 3.25->2.84% in amdgpu_cs_flush() in the DXMD benchmark. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-30 13:44:25 +01:00
Samuel Pitoiset	5a6b1aadea	winsys/amdgpu: add one likely() call in amdgpu_cs_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-30 13:44:19 +01:00
Samuel Pitoiset	db2b0210b1	hud: fix compilation warnings in hud_nic_graph_install() v2: use PRId64 instead of PRIx64 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-30 13:43:30 +01:00
Samuel Pitoiset	0b646ad05e	st/mesa: make st_texture_get_sampler_view() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:42:50 +01:00
Marek Olšák	62732ce263	gallium/radeon: remove r600_common_context::max_db this cleanup is based on the vulkan driver, which seems to do the same thing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	9327780da6	winsys/amdgpu: fix ADDR_REGISTER_VALUE::backendDisables This would be a fix if the value was used anywhere. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	80157a2c20	gallium/radeon: clean up r600_query_init_backend_mask This just needs to be done for r600g in the screen. We don't need an IB submission for every new context created for GCN. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	5f99c49008	radeonsi: precompute IA_MULTI_VGT_PARAM values into a table The perf difference is very small: 0.99% -> 0.40% for the time spent in si_get_ia_multi_vgt_param when si_draw_vbo is 20%. Pretty much nothing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	c78177fc64	radeonsi: move VGT_VERTEX_REUSE_BLOCK_CNTL into shader states for Polaris Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	ccecf79c2b	radeonsi: state atom IDs don't have to be off by one Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	ac059f1c23	radeonsi: use a bitmask for looping over dirty PM4 states also move it to draw_vbo, because it should be 0 in most cases Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	802fcdc0d2	radeonsi: atomize L2 prefetches to move the big conditional statement out of draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	c99ba3eb47	radeonsi: unbind disabled shader stages to prevent useless L2 prefetches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	4a4ff66dbe	radeonsi: also prefetch compute shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	879c73fac8	radeonsi: update dirty_level_mask only after the first draw after FB change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	cecc068774	gallium/radeon: allow VRAM-only placements again on APUs & recent amdgpu Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	0d0f357de6	radeonsi: don't set +fp64-denormals it's the default and the name will change to +fp64-fp16-denormals. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	b177162489	radeonsi: remove si_shader_context::param_tess_offchip we don't use on-chip tess. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Lucas Stach	e158b74971	etnaviv: force vertex buffers through the MMU This fixes a vertex data corruption issue if some of the vertex streams go through the MMU and some don't. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-30 12:40:57 +01:00
Andres Rodriguez	33f418bd67	radv: Expose VK_KHR_maintenance1 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-30 08:44:11 +01:00
Andres Rodriguez	7b890a36df	radv: Fix vkCmdCopyImage for 2d slices into 3d Images Previously the z offset of the destination image was being ignored. It should be taken into account when copying into a 3d target. Also, img_extent_el.depth was being incorrectly clamped to 1 due to the source image being VK_IMAGE_TYPE_2D. This would result in the blit failing to iterate over all the 3d slices. Instead we clamp to the destination image type. Fixes failures in CTS tests: dEQP-VK.api.copy_and_blit.image_to_image.3d_images.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-30 08:44:07 +01:00
Bas Nieuwenhuizen	4eae3597eb	radv: Expose transfer format features. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-01-30 08:42:26 +01:00
Bas Nieuwenhuizen	34bfe4b1bb	radv: Don't allow any operations on non-supported depth/stencil formats. We really use the depth block for the blits. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-30 08:42:26 +01:00
Andres Rodriguez	f8d5e1ab2d	radv: use new error codes for AllocateDescriptorSets There is a new error code in Maintenance1 that is more specific to the situation: VK_ERROR_OUT_OF_POOL_MEMORY_KHR Fixes CTS test case: dEQP-VK.api.descriptor_pool.out_of_pool_memory Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-30 08:42:17 +01:00
Andres Rodriguez	e199a993b2	radv: vkAllocateCommandBuffers should NULL all output handles This is part of the spec and fixes CTS tests: dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-30 08:38:13 +01:00
Andres Rodriguez	ec0f5c005c	radv: add trim command pool stub Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-30 08:37:54 +01:00
Kenneth Graunke	2f7a7ae131	i965: Support the force_glsl_version driconf option. Gallium drivers have had this for a while. It makes sense to support it consistently across drivers, so expose it in i965 as well. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-29 18:20:57 -08:00
Kenneth Graunke	02216a1ddf	i965: Fix check for negative pitch in can_do_fast_copy_blit(). At this point, the pitch is in bytes. We haven't yet divided the pitch by 4 for tiled surfaces, so abs(pitch) may be larger than 32K. This means the bit 15 trick won't work. The caller now has signed integers anyway, so just pass those through and do the obvious check. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-29 18:20:35 -08:00
Bas Nieuwenhuizen	c4d7b9cd29	radv: Handle command buffers that need scratch memory. v2: Create the descriptor BO with CPU access. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-30 02:07:20 +01:00
Bas Nieuwenhuizen	ccff93e138	radv: Track scratch usage across pipelines & command buffers. Based on code written by Dave Airlie. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-30 02:07:16 +01:00
Bas Nieuwenhuizen	29c1f67e9f	radv/ac: Add compiler support for spilling. Based on code written by Dave Airlie. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-30 02:07:12 +01:00
Bas Nieuwenhuizen	d115b67712	radv/amdgpu: Support a preamble CS. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-30 02:07:08 +01:00
Timothy Arceri	2842dea310	i965: add assert to while_jumps_before_offset() jip should always be negative here as it's the result of do instruction - while instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-30 10:17:54 +11:00
Timothy Arceri	77a6597bb7	i965: fix up asserts in brw_inst_set_jip() We are casting from a signed 32bit int to an unsigned 16bit int so shift 15 bits rather than 16. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-30 10:17:46 +11:00
Bas Nieuwenhuizen	b8ee45ebdc	llvmpipe: Use LLVMDumpModule, not DumpModule. Forgot the prefix ... Fixes: `0fca80b3db` Signed-off-by: Bas Nieuwenhuizen <basni@google.com>	2017-01-29 17:03:25 +01:00
Bas Nieuwenhuizen	0fca80b3db	various: Fix missing DumpModule with recent LLVM. Since LLVM revision 293359 DumpModule gets only implemented when either a debug build or LLVM_ENABLE_DUMP is set. This patch adds a direct replacement for the function for radv and radeonsi. However, as I don't know a good place to put common LLVM code for all three, I inlined the implementation for LLVMPipe. v2: Use the new code for LLVM 3.4+ instead of LLVM 5+ & fixed indentation Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-01-29 10:25:00 +01:00
Ilia Mirkin	ce7a045fee	r600g: use ieee variants of multiplication instructions This matches the behavior of most other drivers, including nouveau, radeonsi, and i965. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-29 00:00:07 -05:00
Ilia Mirkin	bacbb01105	r600g: add support for optionally using non-IEEE mul ops Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-28 23:59:43 -05:00
Eric Anholt	5b7e2697dc	vc4: Coalesce into TLB writes as well as VPM/tex. This generally cuts an instruction when blending is enabled and we thus have a single instruction generating the color value. total instructions in shared programs: 91759 -> 91634 (-0.14%) instructions in affected programs: 5338 -> 5213 (-2.34%)	2017-01-28 19:35:20 -08:00
Eric Anholt	c1299615fb	vc4: Avoid an extra temporary and mov in ffloor/ffract/fceil. shader-db results: total instructions in shared programs: 92611 -> 91764 (-0.91%) instructions in affected programs: 27417 -> 26570 (-3.09%) The star is one shader in glmark2's terrain (drops 16% of its instructions), but there are also wins in mupen64plus and glb2.7.	2017-01-28 19:35:20 -08:00
Eric Anholt	0079df0b2d	vc4: Flip the switch to run the GLSL compiler optimization loop once. This has almost no effect on shader-db: total instructions in shared programs: 92572 -> 92611 (0.04%) instructions in affected programs: 4486 -> 4525 (0.87%) Looking at 2 of the 7 different shaders that were hurt (all of which were in mupen64), they all appear to be just differences in order of instructions at the NIR level. The advantage is that this should significantly reduce time in the compiler.	2017-01-28 19:35:20 -08:00
Kenneth Graunke	7c5629a269	i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug. Applications may delete a shader program, create a new one, and bind it before the next draw. With terrible luck, malloc may randomly return a chunk of memory for the new gl_program that happened to be the exact same pointer as our previously bound gl_program. In this case, our logic to detect new programs in brw_upload_pipeline_state() would break: if (brw->vertex_program != ctx->VertexProgram._Current) { brw->vertex_program = ctx->VertexProgram._Current; brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM; } Because the pointer is the same, we'd think it was the same program. But it could be wildly different - a different stage altogether, different sets of resources, and so on. This causes utter chaos. As unlikely as this seems, I believe I hit this when running a subset of the CTS in a loop, in a group of tests that churns through simple programs, deleting and rebuilding them. Presumably malloc uses a bucketing cache of sorts, and so freeing up a gl_program and allocating a new one fairly quickly causes it to reuse that memory. The result was that brw->vertex_program->info.num_ssbos claimed the program had SSBOs, while brw->vs.base.prog_data.binding_table claimed that there were none. This was crazy, because the binding table is calculated from info.num_ssbos - the shader info appeared to change between shader compile time and draw time. Careful use of watchpoints revealed that it was being clobbered by rzalloc's memset when building an entirely different program... Fortunately, our 0xd0d0d0d0 canary for unused binding table entries caused us to crash out of bounds when trying to upload SSBOs, or we may have never discovered this heisenbug. Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5 at 64x64 in a loop with 100 iterations. Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-27 21:52:37 -08:00
Bas Nieuwenhuizen	96c60b7f07	radv/ac: Use base in push constant loads. Apparently the source is not an address but an offset, so we actually need to use the base. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org>	2017-01-28 03:07:39 +01:00
Andres Rodriguez	e8047980d2	radv: drop support for VK_AMD_NEGATIVE_VIEWPORT_HEIGHT This extension was not correctly supported, and it conflicts with the VK_KHR_MAINTENANCE1 spec. Reviewed-by: Fredrik Höglund <fredrik@kde.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-28 11:02:35 +10:00
Dave Airlie	e9b16c74fa	radv: implement VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2 Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-28 10:52:23 +10:00
Dave Airlie	989ec61703	radv: use proper maximum slice for layered view This fixes deferred shadows with geom shaders enabled, but I think this fix is fine by itself. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-28 10:52:20 +10:00
Chad Versace	6403e37651	i965/sync: Implement fences based on Linux sync_file This patch implements a new type of struct brw_fence, one that is based on struct sync_file. This completes support for EGL_ANDROID_native_fence_sync. * Background Linux 4.7 added a new file type, struct sync_file. See commit 460bfc41fd52959311ed0328163f785e023857af Author: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Date: Thu Apr 28 10:46:57 2016 -0300 Subject: dma-buf/sync_file: de-stage sync_file headers A sync file is a cross-driver explicit synchronization primitive. In a sense, sync_file's relation to synchronization is similar to dma_buf's relation to memory: both are primitives that can be imported and exported across drivers (at least in theory). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:07 -08:00
Chad Versace	0b6dd31d68	i965/sync: Rename brw_fence_insert() Rename to brw_fence_insert_locked(). This is correct because the fence's mutex is effectively locked, as all callers are also creators of the fence, and have not yet returned the new fence. This reduces noise in the next patch, which defines and uses brw_fence_insert(), an unlocked variant. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:07 -08:00
Chad Versace	a5c17f5c29	i965/sync: Fail sync creation when batchbuffer flush fails Pre-patch, brw_sync.c ignored the return value of intel_batchbuffer_flush(). When intel_batchbuffer_flush() fails during eglCreateSync (brw_dri_create_fence), we now give up, cleanup, and return NULL. When it fails during glFenceSync, however, we blindly continue and hope for the best because there does not exist yet a way to tell core GL that sync creation failed. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:07 -08:00
Chad Versace	014d0e0f88	i965/sync: Add brw_fence::type This is a refactor patch; no expected change in behavior. Add `enum brw_fence_type` and brw_fence::type. There is only one type currently, BRW_FENCE_TYPE_BO_WAIT. This patch reduces a lot of noise in the next, which adds new type BRW_FENCE_TYPE_SYNC_FD. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:06 -08:00
Chad Versace	d1ce499dae	i965: Add intel_batchbuffer_flush_fence() A variant of intel_batchbuffer_flush() with parameters for in and out fence fds. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:06 -08:00
Chad Versace	358661c794	i965: Add intel_screen::has_fence_fd This bool maps to I915_PARAM_HAS_EXEC_FENCE_FD. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:06 -08:00
Chad Versace	b8acb6b179	configure: Require libdrm >= 2.4.75 Required to implement EGL_ANDROID_native_fence_sync on i965. Specifically, i965 needs drm_intel_gem_bo_exec_fence(), I915_PARAM_HAS_EXEC_FENCE, and libsync.h. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 13:10:06 -08:00
Emil Velikov	cb6be5c8c0	configure.ac: list radeon in --with-vulkan-drivers help string Analogous to what we do for the dri and gallium drivers. Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@colllabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-27 19:25:30 +00:00
Emil Velikov	6f2dec0a23	radv: automake: Don't install vk_platform.h or vulkan.h. These files belong to the vulkan loader. Identical to `045f38a507` vulkan: Don't install vk_platform.h or vulkan.h. Cc: Dave Airlie <airlied@redhat.com> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-27 19:25:26 +00:00
Jason Ekstrand	d96ade1c4c	anv: Advertise API version 1.0.39 I'm pretty sure we've kept up with the bug fixes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-01-27 10:06:14 -08:00
Eric Engestrom	5f301fe2e6	gbm/dri: fix memory leaks in error path Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> [Emil Velikov: make sure it builds] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:58 +00:00
Emil Velikov	1d104f9aa7	docs/releasing: add a note about the relnotes template Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-27 17:56:58 +00:00
Emil Velikov	2e076af067	mesa: remove explicit __STDC_FORMAT_MACROS define Analogous to previous commits. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	1cfe97ff0e	nouveau: remove explicit __STDC_FORMAT_MACROS define Already handled by the build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	027e04932a	scons: swr: remove explicit __STDC_.*_MACROS defines Analogous to previous commits. Cc: George Kyriazis <george.kyriazis@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	e809fadb86	gallium: remove explicit __STDC_.*_MACROS defines Analogous to previous commits. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	01e28c6cf5	gallivm: remove explicit __STDC_.*_MACROS defines Correctly handled by the build systems. Cc: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	74a174e12f	glsl: remove explicit __STDC_FORMAT_MACROS define Correctly handled by all the build systems. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	9e9e917c26	autoconf: set all __STDC_*_MACROS Analogous to previous commit(s), with a minor detail - here we set the macros when building both C and C++ sources. Resolving that is a more challenging task that we'll sort out another day. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	d68ffa9446	scons: always set __STDC_*_MACROS for C++ sources Analogous to previous commit - just set the lot once throughout. Cc: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	13e2928d57	android: always set __STDC_*_MACROS for C++ sources Various parts of the code depend on the macros being defined. Just set those unconditionally, only where needed (c++ sources) so that we can drop the workarounds through the code. Cc: Rob Herring <robh@kernel.org> Cc: Chih-Wei Huang <cwhuang@android-x86.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	c4862fa382	st/xa: automake: remove duplicate -Wall Already handled by configure.ac Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	6a5850b04a	mesa: move variable declaration to where it's used The variable replacement was unused when building w/o ENABLE_SHADER_CACHE. Since we can mix variable declarations and code, move it to where it's used. Fixes: `9f8dc3bf03` "utils: build sha1/disk cache only with Android/Autoconf" Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-27 17:56:57 +00:00
Emil Velikov	01874d5278	st/mesa: use correct return statement for a void function Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	c1960e23ff	mesa: use correct return statement for a void function Using return foo() is incorrect even if foo itself returns void. Spotted by AppVeyor, as below: teximage.c(3653) : warning C4098: 'copyteximage' : 'void' function returning a value Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-27 17:56:56 +00:00
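The warning quoted in the commit above comes from a `return` with an expression inside a `void` function. ISO C disallows that form even when the expression itself has type `void`, although C++ accepts it. A minimal sketch of the portable form follows; `helper`, `wrapper`, and the call counter are hypothetical names used only for illustration.

```c
#include <assert.h>

static int calls; /* counts helper() invocations for the demo */

static void helper(void)
{
    calls++;
}

/* Form that triggers MSVC warning C4098 and is a constraint violation
 * in ISO C (shown in a comment for contrast only):
 *
 *     static void wrapper(void) { return helper(); }
 */

/* Portable form: make the call, then return without a value. */
static void wrapper(void)
{
    helper();
    return;
}

/* Runs wrapper() once and reports how often helper() was reached. */
static int call_count_after_wrapper(void)
{
    calls = 0;
    wrapper();
    assert(calls >= 0);
    return calls;
}
```

Behaviour is unchanged by the rewrite: `wrapper()` still reaches `helper()` exactly once per call.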
Emil Velikov	be3b5e015c	svga: remove const qualifier from SVGA3D_vgpu10_GenMips() prototype Does not match the function definition or how it's used. Triggers the following warning in AppVeyor svga_cmd_vgpu10.c(1301) : warning C4028: formal parameter 2 different from declaration Cc: Charmaine Lee <charmainel@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	cf00cc72e9	nir: add extra const notation in compare_blocks() MSVC warns about different const qualifiers. Add the extra const to silence it. nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-27 17:56:56 +00:00
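A qsort-style comparator receives `const void *` arguments, so the local pointers it derives from them should carry matching `const` qualifiers. The sketch below shows one fully qualified way to write such a comparator. It is an assumption about the general shape, not the actual nir_phi_builder.c change: `struct block` is a hypothetical stand-in for `nir_block`, and `sort_blocks` is an invented helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for nir_block. */
struct block { unsigned index; };

/* The array elements are pointers to blocks, so each argument points at a
 * (const) pointer to a (const) block. */
static int compare_blocks(const void *_a, const void *_b)
{
    const struct block * const *a = _a;
    const struct block * const *b = _b;

    /* Three-way compare without risking overflow from subtraction. */
    return ((*a)->index > (*b)->index) - ((*a)->index < (*b)->index);
}

/* Sorts an array of block pointers by ascending index. */
static void sort_blocks(struct block **blocks, size_t count)
{
    assert(blocks != NULL || count == 0);
    qsort(blocks, count, sizeof(blocks[0]), compare_blocks);
}
```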
Emil Velikov	a2dea3b654	nir: silence implicit conversion to 64bit MSVC warns about implicit conversion as below. Annotate the literal appropriately to silence the warning. nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-27 17:56:56 +00:00
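Warning C4334 in the commit above concerns a shift performed in 32 bits whose result is only afterwards widened to 64 bits. A minimal sketch of the pattern and one common remedy follows; the exact annotation used in nir_gather_info.c is not shown in this log, so `UINT64_C` and the name `bit_mask64` are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The literal 1 has type int, so this shift happens in 32 bits and is only
 * then converted to 64 bits (shown in a comment for contrast only):
 *
 *     uint64_t mask = 1 << bit;
 */

/* Widen the literal first so that the shift itself is a 64-bit operation. */
static uint64_t bit_mask64(unsigned bit)
{
    assert(bit < 64);
    return UINT64_C(1) << bit;
}
```

With the widened literal, bit positions of 32 and above produce the expected mask instead of depending on 32-bit shift behaviour.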
Emil Velikov	01849ae0dc	i915, i965: automake: remove NA include directive The path in question (... dri/intel/server) was removed years ago. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	091f2b8c98	mesa/tests: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	6ba96bdcab	dri/osmesa: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	ede4ff9adc	dri/swrast: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	5a0ba1e5de	radeon, r200: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	ee5de93269	mapi: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	af860850a0	loader: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:56 +00:00
Emil Velikov	912b4f5472	glx/windows: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	5b874cee09	glx/apple: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	d66f9e6d93	glx: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	d221bf9b91	d3dadapter9: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	517f34b4be	st/dri: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	02f991c00d	clover: automake: remove -I$(srcdir) Already implicitly handled by the build system. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Aaron Watry <awatry@gmail.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	65d5a60cac	clover: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Aaron Watry <awatry@gmail.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	c5921ae0d2	egl: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	90ac5c339e	i915: automake: include builddir prior to srcdir Analogous to previous commit. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	4622c75dfb	i965: automake: include builddir prior to srcdir The latter can contain stale generated file, which, as-is, we'll end up using. Fixes: `bfd17c76c1` "i965: Port INTEL_PRECISE_TRIG=1 to NIR." Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-01-27 17:56:55 +00:00
Emil Velikov	a922c82125	freedreno: automake: correctly set MKDIR_GEN Analogous to previous commit. Fixes: `4610e5ef28` "freedreno/ir3: fix sin/cos" Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Rob Clark <robclark@freedesktop.org> Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org> Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>	2017-01-27 17:56:55 +00:00
Emil Velikov	5eed48d237	i965: automake: correctly set MKDIR_GEN Otherwise we might end up w/o the respective folder (depending on autotools version) and fail at build time. Fixes: `bfd17c76c1` "i965: Port INTEL_PRECISE_TRIG=1 to NIR." Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-27 17:56:54 +00:00
Eric Engestrom	1ee2ae8348	anv: add missing extension errors in vk_errorf() Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-27 17:23:32 +00:00
Eric Engestrom	86879bf4ed	anv: add missing core errors in vk_errorf() Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-27 17:23:32 +00:00
Lionel Landwerlin	ba26c79157	anv: don't assert on out of memory descriptor pool in debug mode Fixes: dEQP-VK.api.descriptor_pool.out_of_pool_memory Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-01-27 17:23:32 +00:00
Eric Engestrom	4da0d1c59a	docs/repository: fix name of main branch This is git, not svn :P Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-27 16:52:23 +00:00
Eric Engestrom	87619a1a6a	egl: EGL_PLATFORM_SURFACELESS_MESA is now upstream EGL_PLATFORM_SURFACELESS_MESA is in eglext.h as of last commit. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-01-27 16:52:23 +00:00
Eric Engestrom	a98b3a0872	egl: update headers from registry Khronos introduced a new macro (suggested by Google) to avoid using C-style casts in C++ code, as those generate warnings. Khronos Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16113 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-01-27 16:52:23 +00:00
Eric Engestrom	06842585df	radv: add missing extension errors in vk_errorf() v2(Bas): Remove the extra VK_ERROR_FRAGMENTED_POOL cases. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-27 17:33:05 +01:00
Eric Engestrom	43cf967512	radv: add missing core errors in vk_errorf() Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-27 17:33:05 +01:00
Andreas Boll	1f2a890ace	configure.ac: Require LLVM for r300 only on x86 and x86_64 `b3119a3` introduced a strict LLVM requirement for r300 on all architectures and thus configure fails on architectures where LLVM is not available or buggy. r300 doesn't strictly require LLVM, but for performance reasons we highly recommend LLVM usage. So require it at least on x86 and x86_64 architectures as we have done before `b3119a3`. Fixes: `b3119a3` ("configure.ac: Check gallium LLVM version in gallium_require_llvm") Cc: 17.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-27 12:31:17 +01:00
Nicolai Hähnle	c5e76a262a	gallium: enable int64 on radeonsi, llvmpipe, softpipe All of these have had support for the TGSI opcodes since before most of the glsl compiler work landed. Also update the docs accordingly, including the missing note about i965. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-27 10:19:48 +01:00
Dave Airlie	93dc5c1a06	st/mesa: add support for enabling ARB_gpu_shader_int64. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-27 10:19:43 +01:00
Dave Airlie	278580729a	st/glsl_to_tgsi: add support for 64-bit integers v2: add conversion opcodes. v3 (idr): Rebase on replacement of TGSI_OPCODE_I2U64 with TGSI_OPCODE_I2I64. v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. v5 (nha): add clarifying comment about a subtle assumption Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-27 10:19:39 +01:00
Dave Airlie	f804506d4d	gallium: Add integer 64 capability v1.1: move to using a normal CAP. (Marek) v2: fill in the cap everywhere Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-27 10:19:25 +01:00
Topi Pohjolainen	a283a4ee2f	meta: Refactor texture format translation Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	542bb85049	intel/blorp/dbg: Name blit shaders for easy recognition in dumps Blorp clears already have an equivalent. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	56094cfb9e	i965/hiz/gen6: Stop setting false qpitch which is not applicable for "all slices at each lod". Current logic makes one believe it has some purpose. When miptree layout is calculated brw_miptree_layout_texture_array() sets the qpitch unconditionally but later on ignores it altogether for ALL_SLICES_AT_EACH_LOD. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	b13d30a72b	i965/blorp/gen6: Remove dead code in hiz setup As the comment for intel_miptree_hiz_buffer::mt states, hiz_mt only exists for gen6. In addition, intel_hiz_miptree_buf_create() uses MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD unconditionally. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	b864e3d7ee	i965/gen6: Simplify hiz surface setup In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo is unconditionally initialised to point to the same buffer object as hiz_mt does. The same goes for intel_miptree_aux_buffer::pitch/qpitch. This will make following patches simpler to read. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	40bf622ced	i965/blorp/gen6: Simplify hiz surface setup In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo is unconditionally initialised to point to the same buffer object as hiz_mt does. Also intel_miptree_aux_buffer::offset is initialised to zero (calloc()). This will make following patches significantly simpler to read. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	5201d2991b	i965/gen6: Remove check for stencil format There is no alternative. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	19412abb3f	i965: Remove check for hiz on earlier gens than SNB Only caller, brw_workaround_depthstencil_alignment(), returns early for gen6+. While at it, reduce scope for brw_get_depthstencil_tile_masks() as well. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	26a9e039fd	i965/miptree: Remove redundant check for null texture There is the exact same check earlier in brw_miptree_layout(), which intel_miptree_create_layout() in turn calls unconditionally. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:26 +02:00
Topi Pohjolainen	bcec4113cc	i965/miptree: Tell when brw_miptree_layout() fails In addition, let intel_miptree_create_layout() release the miptree - it is the allocator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:25 +02:00
Topi Pohjolainen	aa9e21a316	i965/meta: Remove unused brw_get_rb_for_slice() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-27 08:57:25 +02:00
Michel Dänzer	d9f8bae616	clover: Fix build against clang SVN >= r293097 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-01-27 09:53:14 +09:00
Eric Anholt	9baf1ff8fc	vc4: Use NEON to speed up utile stores on Pi2+. Improves 1024x1024 TexSubImage2D by 41.2371% +/- 3.52799% (n=10).	2017-01-26 12:50:05 -08:00
Eric Anholt	4d30024238	vc4: Use NEON to speed up utile loads on Pi2. We had a lot of memcpy call overhead because gpu_stride wasn't being inlined. But if you split out the stride==8 and stride==16 cases like this code does while still using memcpy, you'd no longer have glibc's NEON memcpy applied at which point we'd be doing 16 uncached reads instead of 64/(NEON memcpy granularity), for about a 30% performance hit. By hand writing the assembly, we can get a whole cacheline loaded at a time. Unfortunately, NEON intrinsics turned out to be unusable -- they didn't have the vldm instruction available. Note that, for now, the NEON code is only enabled when building for ARMv7 (Pi 2+). We may want to do runtime detection for the Raspbian case, in the future. Improves 1024x1024 GetTexImage by 208.256% +/- 7.07029% (n=10).	2017-01-26 12:48:10 -08:00
Eric Anholt	347b69e7d7	vc4: Move LT tiling code to a separate file. This paves the way for building it twice, with NEON assembly or not.	2017-01-26 12:23:31 -08:00
Eric Anholt	14cf5c60b8	vc4: Use unreachable() in an unreachable codepath for tiling.	2017-01-26 12:23:31 -08:00
Samuel Pitoiset	eca96ea308	gallium/radeon: add VRAM-vis-usage HUD query This new query returns the current visible usage of VRAM accessed by the CPU. It will return 0 on radeon because it's unimplemented. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-26 19:40:52 +01:00
Samuel Pitoiset	9f087e1c7c	gallium/radeon: query the CPU accessible size of VRAM R600_DEBUG="info" can be used to display that size, as well as the total amount of VRAM/GTT. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-26 19:40:14 +01:00
Ian Romanick	13439031c8	mesa: Arrange validate_uniform_parameters parameters to match call sites Saves a measly 20 bytes on IA32 and nothing on x64. Depending on exactly when this is applied, a lot of variation is possible due to function alignment. text data bss dec hex filename 6670131 228340 22552 6921023 699b3f lib/i965_dri.so before 6670111 228340 22552 6921003 699b2b lib/i965_dri.so after 6342932 293872 29880 6666684 65b9bc lib64/i965_dri.so before 6342932 293872 29880 6666684 65b9bc lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-26 09:46:18 -08:00
Ian Romanick	9be5fd3c87	mesa: Arrange _mesa_uniform parameters to match the call sites By putting the parameters first that match the parameters to the call site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64. On IA32, the details of the instructions change, but it is the same count and mix of instructions. Before: 0000000000000830 <_mesa_Uniform4fv>: 830: 48 83 ec 10 sub $0x10,%rsp 834: 49 89 d0 mov %rdx,%r8 837: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 83e <_mesa_Uniform4fv+0xe> 83e: 89 f8 mov %edi,%eax 840: 89 f1 mov %esi,%ecx 842: 41 b9 02 00 00 00 mov $0x2,%r9d 848: 64 48 8b 3a mov %fs:(%rdx),%rdi 84c: 48 8b 97 c8 01 02 00 mov 0x201c8(%rdi),%rdx 853: 48 8b 72 70 mov 0x70(%rdx),%rsi 857: 6a 04 pushq $0x4 859: 89 c2 mov %eax,%edx 85b: e8 00 00 00 00 callq 860 <_mesa_Uniform4fv+0x30> 860: 48 83 c4 18 add $0x18,%rsp 864: c3 retq After: 00000000000007f0 <_mesa_Uniform4fv>: 7f0: 48 83 ec 10 sub $0x10,%rsp 7f4: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 7fb <_mesa_Uniform4fv+0xb> 7fb: 41 b9 02 00 00 00 mov $0x2,%r9d 801: 64 48 8b 08 mov %fs:(%rax),%rcx 805: 48 8b 81 c8 01 02 00 mov 0x201c8(%rcx),%rax 80c: 6a 04 pushq $0x4 80e: 4c 8b 40 70 mov 0x70(%rax),%r8 812: e8 00 00 00 00 callq 817 <_mesa_Uniform4fv+0x27> 817: 48 83 c4 18 add $0x18,%rsp 81b: c3 retq Saves a measly 416 bytes of text on x64. Depending on exactly when this is applied, a lot of variation is possible due to function alignment. text data bss dec hex filename 6670131 228340 22552 6921023 699b3f lib/i965_dri.so before 6670131 228340 22552 6921023 699b3f lib/i965_dri.so after 6343348 293872 29880 6667100 65bb5c lib64/i965_dri.so before 6342932 293872 29880 6666684 65b9bc lib64/i965_dri.so after There is likely to be no performance change with just this patch. _mesa_uniform immediately calls validate_uniform_parameters with parameters in the "wrong" (different from the call site) order. v2: Rebase on GL_ARB_gpu_shader_fp64. v3: Rebase on GL_ARB_gpu_shader_int64. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-26 09:46:14 -08:00
Ian Romanick	9f7ac45ce4	mesa: Arrange _mesa_uniform_matrix parameters to match the call sites By putting the parameters first that match the parameters to the call site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on x64. On IA32, the details of the instructions change, but it is the same count and mix of instructions. Before: 0000000000001380 <_mesa_UniformMatrix4fv>: 1380: 48 83 ec 10 sub $0x10,%rsp 1384: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 138b <_mesa_UniformMatrix4fv+0xb> 138b: 41 89 f8 mov %edi,%r8d 138e: 41 89 f1 mov %esi,%r9d 1391: 0f b6 d2 movzbl %dl,%edx 1394: 64 48 8b 38 mov %fs:(%rax),%rdi 1398: 48 8b b7 c8 01 02 00 mov 0x201c8(%rdi),%rsi 139f: 48 8b 76 70 mov 0x70(%rsi),%rsi 13a3: 68 06 14 00 00 pushq $0x1406 13a8: 51 push %rcx 13a9: 52 push %rdx 13aa: b9 04 00 00 00 mov $0x4,%ecx 13af: ba 04 00 00 00 mov $0x4,%edx 13b4: e8 00 00 00 00 callq 13b9 <_mesa_UniformMatrix4fv+0x39> 13b9: 48 83 c4 28 add $0x28,%rsp 13bd: c3 retq After: 0000000000001360 <_mesa_UniformMatrix4fv>: 1360: 48 83 ec 10 sub $0x10,%rsp 1364: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 136b <_mesa_UniformMatrix4fv+0xb> 136b: 0f b6 d2 movzbl %dl,%edx 136e: 64 4c 8b 00 mov %fs:(%rax),%r8 1372: 49 8b 80 c8 01 02 00 mov 0x201c8(%r8),%rax 1379: 68 06 14 00 00 pushq $0x1406 137e: 6a 04 pushq $0x4 1380: 6a 04 pushq $0x4 1382: 4c 8b 48 70 mov 0x70(%rax),%r9 1386: e8 00 00 00 00 callq 138b <_mesa_UniformMatrix4fv+0x2b> 138b: 48 83 c4 28 add $0x28,%rsp 138f: c3 retq Saves a measly 576 bytes of text on x64. text data bss dec hex filename 6670131 228340 22552 6921023 699b3f lib/i965_dri.so before 6670131 228340 22552 6921023 699b3f lib/i965_dri.so after 6343924 293872 29880 6667676 65bd9c lib64/i965_dri.so before 6343348 293872 29880 6667100 65bb5c lib64/i965_dri.so after v2: Rebase on GL_ARB_gpu_shader_fp64. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-26 09:46:09 -08:00
Ian Romanick	874393186b	mesa: Trivial clean-ups in uniform_query.cpp This is C++, so we can mix code and declarations. Doing so allows constification. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-26 09:46:07 -08:00
Lionel Landwerlin	bbe8705c57	spirv: handle undefined components for OpVectorShuffle Fixes: dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-01-26 17:31:21 +00:00
Lionel Landwerlin	df7063cba3	spirv: handle OpUndef as part of the variable parsing pass Looking at the following bit of SPIRV shader : ... %zero = OpConstant %i32 0 %ivec3_0 = OpConstantComposite %ivec3 %zero %zero %zero %vec3_undef = OpUndef %ivec3 %sc_0 = OpSpecConstant %i32 0 %sc_1 = OpSpecConstant %i32 0 %sc_2 = OpSpecConstant %i32 0 ... Our compiler currently stops parsing variables & types on the OpUndef and switches to instructions, leaving the following sc_[0-2] variables untreated. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-01-26 17:29:29 +00:00
Lionel Landwerlin	c3421106ec	anv: fix descriptor pool internal size allocation The size of the pool is slightly smaller than the size of the structure containing the whole pool. We need to take that into account on when setting up the internals. Fixes a crash due to out of bound memory access in: dEQP-VK.api.descriptor_pool.out_of_pool_memory v2: Drop debug traces (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-01-26 17:24:21 +00:00
Kenneth Graunke	f8f7ea508b	i965: Make intelEmitCopyBlit not truncate large strides. When trying to blit larger tiled surfaces, the pitch can be larger than 32768 bytes, which means it won't fit in a GLshort. Passing it in will truncate the stride to 0, which has...surprising results. The pitch can be up to 32,768 DWords, or 128kB. We measure it in bytes, but divide by 4 when programming it. So we need to handle values up to 131,072. Switch from GLshort to int32_t to avoid the truncation. Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage at widths greater than 8192. v2: Use int32_t as negative values can be used (Jason). Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-26 01:43:20 -08:00
Kenneth Graunke	fcf723b647	i965: Use a UW source type for CS_OPCODE_CS_TERMINATE. SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message, using a source of g127 for the single register. With a UD type, this supposedly could read g128, which doesn't exist, causing the simulator to get cranky. Use a UW type to avoid this. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-26 00:52:52 -08:00
Iago Toral Quiroga	9b25769da6	anv/lower_input_attachments: honor sample index parameter to subpassLoad() According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is: gvec4 subpassLoad(gsubpassInput subpass); gvec4 subpassLoad(gsubpassInputMS subpass, int sample); So the multisampled case always receives an explicit sample index that we should use. The current implementation was ignoring this parameter and using gl_SampleID value instead. Fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-26 08:11:21 +01:00
Kenneth Graunke	5106df85da	i965: Fix fast depth clears for surfaces with a dimension of 16384. I hadn't bothered to set this bit because I figured it would just paper over us getting the rectangle wrong. But it turns out that there is a legitimate reason to use it, so let's do so. The alternative would be to chop up 16k clears to multiple 8k clears, which is pointlessly painful. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-01-25 22:24:08 -08:00
Chad Versace	022e5c7e5a	anv: Implement VK_KHR_get_physical_device_properties2 Reviewed-by: Jason Ekstranad <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 19:18:47 -08:00
Chad Versace	cd03021c83	anv: Refactor anv_GetPhysicalDeviceQueueFamilyProperties() Add a helper function, anv_get_queue_family_properties(), which fills the struct. This patch reduces churn in the following patch that implements vkGetPhysicalDeviceQueueFamilyProperties2KHR. Reviewed-by: Jason Ekstranad <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 19:18:46 -08:00
Chad Versace	5826190095	anv: Refactor anv_GetPhysicalDeviceFormatProperties() Add a helper function, anv_get_image_format_properties(), which does all the work and has a VkPhysicalDeviceImageFormatInfo2KHR parameter. This patch reduces churn in the following patch that implements vkGetPhysicalDeviceImageFormatProperties2KHR. Reviewed-by: Jason Ekstranad <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 19:18:43 -08:00
Chad Versace	b2de77a07d	anv: Revive struct anv_common The struct was deleted by: commit `efe9d1cde3` Author: Edward O'Callaghan <funfunctor@folklore1984.net> Subject: anv: Clean up some unused variables Unlike the original anv_common, the new one has a non-const pNext pointer because we will use it for the output structs of VK_KHR_get_physical_device_properties2. v2: - Retype pNext from void* to struct anv_common*. Reviewed-by: Jason Ekstranad <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 19:18:33 -08:00
Chad Versace	c5d99c9983	anv: Define macro anv_debug() This is a printf-like macro that prints a debug message to stderr when built with DEBUG. If no DEBUG, then do nothing. Reviewed-by: Jason Ekstranad <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 19:17:45 -08:00
Ian Romanick	fd43bee0ea	mesa: Fix copy-and-paste bug in _mesa_(Program\|)Uniform[1234](i\|ui)64vARB functions All of the functions were passing 1 to _mesa_uniform instead of passing count. Fixes 16 unused parameter warnings like: main/uniforms.c: In function ‘_mesa_Uniform1i64vARB’: main/uniforms.c:1692:47: warning: unused parameter ‘count’ [-Wunused-parameter] _mesa_Uniform1i64vARB(GLint location, GLsizei count, const GLint64 *value) ^~~~~ This is why I build with extra warnings enabled. Unfortunately, there are so many unused parameter warnings in Mesa that I didn't notice these added warnings for over 6 months. :( Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-25 09:28:40 -08:00
Lionel Landwerlin	173dd60ced	spirv: bump headers to SPIRV 1.1 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-25 17:22:23 +00:00
Lionel Landwerlin	05e2d99bf2	spirv: add default handler for new enums Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-25 17:22:23 +00:00
Lionel Landwerlin	4fd54d611f	spirv: fix typos Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-25 17:21:15 +00:00
Lionel Landwerlin	25e21cb8d0	anv: set command buffer to NULL when allocations fail The spec section 5.2 says: "vkAllocateCommandBuffers can be used to create multiple command buffers. If the creation of any of those command buffers fails, the implementation must destroy all successfully created command buffer objects from this command, set all entries of the pCommandBuffers array to VK_NULL_HANDLE and return the error." Fixes: dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-25 17:15:30 +00:00
Jason Ekstrand	d6397dd625	vulkan/wsi: Lower the maximum image sizes Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "17.0" <mesa-dev@lists.freedesktop.org>	2017-01-25 09:05:30 -08:00
Jason Ekstrand	659edd9f5c	vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "17.0" <mesa-dev@lists.freedesktop.org>	2017-01-25 09:05:25 -08:00
Jason Ekstrand	dc578ef060	vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "17.0" <mesa-dev@lists.freedesktop.org>	2017-01-25 09:04:56 -08:00
George Kyriazis	e259efd805	swr: Update fs texture & sampler state logic In swr_update_derived() update texture and sampler state on a new fragment shader. GALLIUM_HUD can update fs using a previously bound texture and sampler. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-25 10:02:50 -06:00
Samuel Pitoiset	cff199ceb7	gallium/radeon: add a new HUD query for the number of mapped buffers Useful when debugging applications which map a ton of buffers and also because we used to run into Linux's limit on the number of simultaneous mmap() calls. v2: - update the commit message Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-25 15:19:21 +01:00
Iago Toral Quiroga	56495080ed	spirv: handle gl_SampleMask SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same builtin (SampleMask). The only way to tell which one we are dealing with is to check if it is an input or an output. Fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 08:08:16 +01:00
Iago Toral Quiroga	9467d78d38	spirv: acknowledge multisampled input attachments Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-25 08:07:09 +01:00
Dave Airlie	2ab2be092d	radv: program a default point size. Along the lines of what `3b804819` anv: Default PointSize to 1.0 if not written by the shader does for anv, program a default point size in the hw of 1.0. This preemptively fixes a bunch of geom shader tests. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-25 09:58:38 +10:00
Marek Olšák	eac7df43ca	radeonsi: handle first_non_void correctly in si_create_vertex_elements This fixes R11G11B10_FLOAT, because it's in the category of "OTHER", meaning that it doesn't have any channel description. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-24 23:52:01 +01:00
Marek Olšák	d9ef549238	st/mesa: destroy pipe_context before destroying st_context (v2) If radeonsi starts compiling an optimized shader variant asynchronously with a GL debug callback set and the application destroys the GL context, radeonsi crashes when trying to write shader stats into the debug output of a non-existent context after compilation, because st/mesa was destroyed before pipe_context. Firefox with WebGL2 enabled hits this bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456 v2: protect against a double destroy in st_create_context_priv and callers. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-24 23:52:01 +01:00
Timothy Arceri	dd65f0efc9	nir: bump loop max unroll limit The original number was chosen in an attempt to match the limits applied to GLSL IR. A look at the git history of why these limits were chosen for GLSL IR shows it was more to do with the slow speed of unrolling large loops in GLSL IR than anything else. The speed of loop unrolling in NIR is not a problem so we may wish to bump this even higher in future. No shader-db change, however a future change that disables the GLSL IR optimisation loop in the i965 backend results in 4 loops from The Talos Principle failing to unroll. Bumping the limit allows them to unroll which results in the instruction count matching the previous output from when the GLSL IR opts were still enabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-25 09:43:29 +11:00
Timothy Arceri	34ab9b0947	glsl: lower constant arrays to uniform arrays before optimisation loop Previously the constant array would not get copy propagated until the backend did its GLSL IR opt loop. I plan on removing that from i965 shortly which caused huge regressions in Deus-ex and Tomb Raider which have large constant arrays. Moving lowering before the opt loop in the GLSL linker fixes this and unexpectedly improves some compute shaders also. shader-db results BDW: instructions helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%) instructions helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%) instructions helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%) total instructions in shared programs: 13060313 -> 13059877 (-0.00%) instructions in affected programs: 1756 -> 1320 (-24.83%) helped: 3 HURT: 0 total cycles in shared programs: 256586698 -> 256561910 (-0.01%) cycles in affected programs: 2066104 -> 2041316 (-1.20%) helped: 3 HURT: 0 V3: only call the opt loop if lowering progressed (Suggested by Eric) V2: call opts before and after lowering (Suggested by Ken) Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-25 09:07:30 +11:00
Ian Romanick	c4a0c1efff	mesa: Don't advertise GL_OES_read_format in core profile OpenGL ES implementations are not allowed to ship ARB extensions, and OpenGL implementations are not allowed to ship OES extensions. The functionality is also included in GL_ARB_ES2_compatibility. Every OpenGL core-profile driver currently exposes both extensions. I don't know of any applications that explicitly check for GL_OES_read_format, so removing it seems very unlikely to cause problems. No functionality is removed. I have left this extension in place for compatibility profile. There are still OpenGL 1.x drivers in Mesa, and adding code to check for compatibility profile and not GL_ARB_ES2_compatibility for GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT just feels dumb. Three other alternatives considered: - Remove the string from compatibility profile drivers but leave the functionality in place. - Add a flag to expose the extension string, and set it in every OpenGL driver that does not expose GL_ARB_ES2_compatibility (and those drivers only). I tried this. You can't have two instances of an extension in the extension table (one dummy_true for ES1 and one with a flag for compatibility profile), so the implementation requires a bit of effort. - Only expose the extension in compatibility if the version is less than 2.0. I didn't see an easy way to do this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-01-24 13:39:26 -08:00
Brian Paul	b87eedd405	docs: fix incorrect link to 12.0.6 release notes Trivial.	2017-01-24 14:30:44 -07:00
Jason Ekstrand	a435991d3c	anv: Expose VK_KHR_maintenance1 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-24 12:27:48 -08:00
Jason Ekstrand	756533520e	anv: Return better errors from AllocateDescriptorSets Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-24 12:27:48 -08:00
Jason Ekstrand	99bb4c22a5	anv: Allow selecting the slice of a 3D image As per VK_KHR_maintenance1, clients can render to a slice of a 3D image by creating a VK_IMAGE_VIEW_TYPE_2D view of it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-24 12:27:48 -08:00
Jason Ekstrand	6d79111834	anv: Report FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR As of VK_KHR_maintenance1, these are supposed to be reported for any formats on which we support transfer operations. For us, this is anything that we can texture from. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-24 12:27:48 -08:00
Jason Ekstrand	8a8630486b	anv: Add trivial support for TrimCommandPoolKHR Our command buffers already efficiently use a global pool so trimming doesn't really need to do anything. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-24 12:27:48 -08:00
Jason Ekstrand	5edcc96bf6	anv: Set viewport extents correctly when height is negative As per VK_KHR_maintenance1, setting a negative height in the viewport can be used to get flipped coordinates. This is, apparently, very useful when porting D3D apps to Vulkan. All we need to do to support this is to make sure we actually set the min and max correctly. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-24 12:27:48 -08:00
Matt Turner	045f38a507	vulkan: Don't install vk_platform.h or vulkan.h. These files belong to the vulkan loader. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 11:27:20 -08:00
Roland Scheidegger	aceae09ef0	glsl: fix compile errors with mingw due to missing PRIx64 definitions define __STDC_FORMAT_MACROS and include <inttypes.h> (same as ir_builder_print_visitor.cpp already does). Otherwise, some mingw build errors out (since `8e7e1ae036` and `bbce1c538d` presumably) with: src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before ‘PRIu64’ case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break; (Note even with that fix I get other format specifier warnings: src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: unknown conversion type character ‘a’ in format [-Wformat=] fprintf(f, "%a", ir->value.f[i]); ^ src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: too many arguments for format [-Wformat-extra-args] but it still compiles at least) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-24 19:12:46 +01:00
Roland Scheidegger	f4df21ed95	gallivm: don't try to use fast rcp for fdiv The use of fast rcp instruction is disabled, and will always fall back to use a division instead (1 / x). Hence, if we get a division opcode, it doesn't make much sense trying to split that into rcp/mul. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-24 19:12:46 +01:00
Roland Scheidegger	25208949d7	gallivm: (trivial) fix ddiv cpu implementation we can't use the cpu implementation of fdiv, as this one uses different lp_build_context, which causes assertion failure. Just use default fdiv action (there is no fast rcp for doubles which we could potentially use anyway). Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-24 19:12:46 +01:00
Roland Scheidegger	3b575a955c	tgsi: implement ddiv opcode softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64, so we really need to support that opcode. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-24 19:12:46 +01:00
Jason Ekstrand	4c180f9633	i965/blorp: Use the correct ISL format for combined depth/stencil In brw_blorp_copyteximage, we use the format from the render buffer. This could be a combined depth/stencil format. In this case, we handle stencil properly but we give blorp the wrong ISL format. Specifically, we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong size and was causing GPU hangs. Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-24 10:06:07 -08:00
Samuel Pitoiset	0054dded03	st/glsl_to_tgsi: fix compilation warnings since int64 types state_tracker/st_glsl_to_tgsi.cpp:302:28: warning: ‘glsl_to_tgsi_instruction::tex_type’ is too small to hold all values of ‘enum glsl_base_type’ glsl_base_type tex_type:4; Fixes: `8ce53d4a2f` ("glsl: Add basic ARB_gpu_shader_int64 types") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-24 12:45:39 +01:00
Samuel Pitoiset	d90d37db73	gallium/radeon: undef the very specific UPDATE_COUNTER macro Also, wrap this into a do { ... } while (0). Suggested by Nicolai. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-24 11:17:25 +01:00
Topi Pohjolainen	ba6399df94	i965/blorp: Add also depth and stencil buffers to render cache v2 (Jason, Curro): Add stencil also even though it is not enabled yet. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-24 10:41:58 +02:00
Ben Widawsky	e63ab36d0e	gbm: Fix width height getters return type (trivial) v2: Other way round... to make consistent, make both return type have the fixed width - uint32_t. Cc: Daniel Stone <daniel@fooishbar.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Daniel Stone <daniels@collabora.com>	2017-01-23 21:43:38 -08:00
Ben Widawsky	bb9ff98b4c	gbm: Move getters to match order in header file (trivial) Other things are out of order, but I need to add a getter so I'm just fixing those. This helps people adding to GBM know where the right place to put things is. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Daniel Stone <daniels@collabora.com>	2017-01-23 21:43:34 -08:00
Emil Velikov	530cd248f5	docs: add news item and link release notes for 12.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 02:15:30 +00:00
Emil Velikov	9b16bd8b6c	docs: use correct year for the 12.0.6 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `13953f012d`)	2017-01-24 02:15:30 +00:00
Emil Velikov	c16e7e0a60	docs: add sha256 checksums for 12.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `36e3f2542d`)	2017-01-24 02:15:30 +00:00
Emil Velikov	b1137cb9de	docs: add release notes for 12.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `555885a0bf`)	2017-01-24 02:15:30 +00:00
Emil Velikov	9924cdecd9	docs/releasing: remove stray "cd" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-24 02:15:29 +00:00
Ilia Mirkin	b755f2f233	nv50: add support for MUL_ZERO_WINS property This is simply keyed off the vertex shader, as that's guaranteed to be present in any pipeline. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-23 20:37:14 -05:00
Ilia Mirkin	8c764a2321	nvc0: add support for MUL_ZERO_WINS property This sets the dnz flag on all the relevant multiplication operations. At emission time, this will only be supported by nvc0+, so nv50 will need a different solution. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-23 20:37:14 -05:00
Ilia Mirkin	e1346f25bf	st/nine: set the MUL_ZERO_WINS flag when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-01-23 20:37:10 -05:00
Ilia Mirkin	6e40938fbc	gallium: add PIPE_CAP_TGSI_MUL_ZERO_WINS Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-01-23 20:36:47 -05:00
Ilia Mirkin	a2b2cd81d1	gallium: add TGSI_PROPERTY_MUL_ZERO_WINS This will be useful for proper D3D9 emulation, where this behavior is expected by some shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-01-23 20:35:55 -05:00
Marek Olšák	573bf0940a	radeonsi: always set the TCL1_ACTION_ENA when invalidating L2 Some CIK-VI docs say this is the default behavior on SI. That doesn't answer whether it's also the default behavior on CIK-VI. Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	5d3dd70cab	radeonsi: don't declare LDS in TES not used since we started using the offchip tess ring Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	59c5da40ed	radeonsi: preload PS inputs only if KILL is used so that most shaders can get lower VGPR usage thanks to lazy input loading. I think this is a more accurate constraint that prevents the black transitions in Witcher 2. Affected shaders (7758): Max Waves: 57437 -> 58231 (1.38 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	7b32ae4df5	gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layout Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	e248390e93	winsys/amdgpu: drop all IBs if at least one was rejected within the context The corruption is inevitable and hangs are possible too. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	1840800860	winsys/amdgpu: report a rejected IB as a lost context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Dave Airlie	dcfcb3047c	vulkan: import latest registry for 1.0.39 extensions. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-24 08:13:37 +10:00
Dave Airlie	e38bee34bf	vulkan: bump vulkan.h to 1.0.39 version This introduces a bunch of new extension defines. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-24 08:13:23 +10:00
Grazvydas Ignotas	f65b3641c3	radv: don't resubmit the same cs over and over while tracing Fixes: `97dfff54` ("radv: Dump command buffer on hang.") Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: <mesa-stable@lists.freedesktop.org>	2017-01-23 22:27:05 +01:00
Samuel Pitoiset	aa2ace8e49	gallium/radeon: add HUD queries for monitoring some hw blocks It's also possible to monitor them via performance counters but the hardware can only use two counters simultaneously. It seems easier to re-use the existing code which reads from MMIO instead of writing a multi-pass approach. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-23 21:19:49 +01:00
Samuel Pitoiset	a704f19247	gallium/radeon: refactor the GRBM counters path This will allow exposing more queries in order to know which blocks are busy/idle. v2: - add new lines after ':' Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-23 21:19:49 +01:00
George Kyriazis	00847e4f14	swr: Align query results allocation Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe. Fixes crash when compiling with clang. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-23 14:15:54 -06:00
Bruce Cherniak	b829206b07	swr: Prune empty nodes in CalculateProcessorTopology. CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating an empty "0" node and causing a crash in CreateThreadPool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102 Reviewed-By: George Kyriazis <george.kyriazis@intel.com> CC: <mesa-stable@lists.freedesktop.org>	2017-01-23 13:52:26 -06:00
Matt Turner	d349449a16	i965: Use UNUSED to silence unused variable (used in assert).	2017-01-23 10:50:20 -08:00
Rainer Hochecker	09b140abb5	dri: allow 16bit R/GR images to be exported via drm buffers This allows eglCreateImageKHR to access P010 surfaces created by vaapi Signed-off-by: Rainer Hochecker <fernetmenta@online.de> Acked-by: Ben Widawky <ben@bwidawsk.net>	2017-01-23 08:47:15 -08:00
Christian König	1338d912f5	st/va: make sure that we call begin_frame() only once v2 This fixes "st/va: delay calling begin_frame until we have all parameters". v2: call begin frame after decoder (re)creation as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Tested-by: Andy Furniss <adf.lists@gmail.com>	2017-01-23 17:00:04 +01:00
Eric Engestrom	50141e131a	drirc: remove spurious tabs Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 16:34:58 +01:00
Nicolai Hähnle	cfabbbcfd7	st/glsl_to_tgsi: use DDIV instead of DRCP + DMUL Fixes GL45-CTS.gpu_shader_fp64.built_in_functions. v2: use DDIV unconditionally (Roland) Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Tested-by: Glenn Kennard <glenn.kennard@gmail.com> Tested-by: James Harvey <lothmordor@gmail.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-23 16:17:26 +01:00
Nicolai Hähnle	b71c415c3d	glsl: split DIV_TO_MUL_RCP into single- and double-precision flags Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Glenn Kennard <glenn.kennard@gmail.com> Tested-by: James Harvey <lothmordor@gmail.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-23 16:17:19 +01:00
Nicolai Hähnle	e4f8f9a638	r600: implement DDIV Tested-by: Glenn Kennard <glenn.kennard@gmail.com> Tested-by: James Harvey <lothmordor@gmail.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-23 16:17:15 +01:00
Nicolai Hähnle	488560cfe6	r600: factor out cayman_emit_unary_double_raw We will use it for DDIV. Tested-by: Glenn Kennard <glenn.kennard@gmail.com> Tested-by: James Harvey <lothmordor@gmail.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-23 16:17:12 +01:00
Nicolai Hähnle	76b02d2fe1	r600: double multiply can handle only one multiply at a time It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit. Tested-by: Glenn Kennard <glenn.kennard@gmail.com> Tested-by: James Harvey <lothmordor@gmail.com> Cc: 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-23 16:15:45 +01:00
Timothy Arceri	f3f9207786	glsl: fix tes linking regression Fixes regression caused by `cbeba6bd48`. I accidentally pushed the wrong version of the patch.	2017-01-23 19:07:22 +11:00
Timothy Arceri	38a67f020d	mesa: remove unused gl_shader_info field from gl_linked_shader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	79f07e87c9	mesa/glsl: set and get cs layouts to and from shader_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	b96bddae67	mesa/glsl: set and get gs layouts directly to and from shader_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	cbeba6bd48	mesa/glsl/i965: set and get tes layouts directly to and from shader_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	64e201ab8f	glsl: use last_vert_prog to get last {clip,cull}_distance_array_size Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	fc707f570f	mesa/glsl: set {clip,cull}_distance_array_size directly in gl_program There are some line wrapping violations here but those lines will get deleted in the following patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	f86d15ed94	st/mesa/glsl: change xfb_program field to last_vert_prog Now that the i965 backend doesn't depend on this field we can make it more generic and short circuit a bunch of code paths. The new field will be used in a following patch for another clean-up. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 14:48:04 +11:00
Timothy Arceri	c505d6d852	mesa: use gl_program for CurrentProgram rather than gl_shader_program This makes much more sense and should be more performant in some critical paths such as SSO validation which is called at draw time. Previously the CurrentProgram array could have contained multiple pointers to the same struct which was confusing and we would often need to fish out the information we were really after from the gl_program anyway. Also it was error prone to depend on the _LinkedShader array for programs in current use because a failed linking attempt will lose the information about the current program in use which is still valid. V2: fix validate_io() to compare linked_stages rather than the consumer and producer to decide if we are looking at inward facing shader interfaces which don't need validation. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> To avoid build regressions the following 2 patches were squashed in to this commit: mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use() These are rewritten to do what the function name suggests, that is _mesa_shader_program_use() sets the use of all stages and _mesa_program_use() sets the use of a single stage. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> mesa: update active relinked program This likely fixes a subroutine bug where _mesa_shader_program_init_subroutine_defaults() would never have been called for the relinked program as we previously just set _NEW_PROGRAM as dirty and never called the _mesa_use* functions when linking. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-01-23 14:48:04 +11:00
Rob Clark	31daeb5bf1	freedreno/a5xx: set frag shader threadsize Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:12:05 -05:00
Rob Clark	8d6af93e76	freedreno/a5xx: set fragcoordxy properly What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register resulting in all sorts of hilarity. Fixes stk, vdrift, 0ad, probably a bunch others. Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:11:43 -05:00
Rob Clark	278b97946f	freedreno/ir3: setup var locations in standalone compiler Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-22 14:11:26 -05:00
Rob Clark	6cc93bedc1	freedreno/a5xx: fix psize Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx. Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:11:15 -05:00
Rob Clark	141a4f86d6	freedreno/a5xx: srgb fix Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:11:04 -05:00
Rob Clark	69fbb458cf	freedreno/a5xx: fix int vbos Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:10:54 -05:00
Rob Clark	16671e9704	freedreno/a5xx: fix clear for uint/sint formats Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:10:42 -05:00
Rob Clark	4d9aa4f67d	freedreno/a5xx: fix cull state Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:10:28 -05:00
Rob Clark	4c39458460	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-22 14:09:45 -05:00
Lionel Landwerlin	494b63f525	anv: descriptors: don't update immutables samplers with anything but their immutable value Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-21 19:22:27 +00:00
Jason Ekstrand	bb96b03461	nir/search: Use the correct bit size for integer comparisons The previous code always compared integers as 64-bit. Due to variations in sign-extension in the code generated by nir_opt_algebraic.py, this meant that nir_search doesn't always do what you want. Instead, 32-bit values should be matched as 32-bit and 64-bit values should be matched as 64-bit. While we're here we unify the unsigned and signed paths. Now that we're using the right bit size, they should be the same since the only difference we had before was sign extension. This gets the UE4 bitfield_extract optimization working again. It had stopped working due to the constant 0xff00ff00 getting sign-extended when it shouldn't have. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-01-21 10:34:21 -08:00
Jason Ekstrand	817f9e3b17	intel/blorp/copy: Properly handle clear colors for CCS_E images In order to handle CCS_E, we stomp the image format to a UINT format and then do some bitcasting logic in the shader. This works fine since SKL render compression only considers the channel layout of the format and not the format itself. In order for this to work on images that have been fast-cleared, we need to also convert the clear color so that, when interpreted as UINT, it provides the same bit value as it would have in the original format. This fixes a bunch of OpenGL ES CTS tests for copy_image when we start using CCS more aggressively. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-21 10:34:09 -08:00
Kenneth Graunke	bb5db5564f	glsl: Rename [u]int64_t tokens. basetsd.h on Windows defines INT64 and UINT64 typedefs which conflict with these. Append "_TOK" to avoid conflicts. Should fix the Windows build. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 19:39:20 -08:00
Matt Turner	892781d6c7	Revert "i965: Really don't emit Q or UQ moves on Gen < 8" This reverts commit `c95380c404`. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 19:12:31 -08:00
Matt Turner	d871f8e820	i965: Select DF type for 64-bit integers on Gen < 8. Gen8 adds Q/UQ types. We attempted to change the types back to DF in the generator (commit `c95380c40`), but an assertion added in the FP64 series (commit `e481dcc3`) triggers before that code has a chance to execute. In fact, using Q/UQ in the IR and then changing to DF in the generator would not work in the presence of source modifiers, etc. Fixes: `d6fcede6` ("i965: Return Q and UQ types for int64 and uint64") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 19:12:24 -08:00
Ian Romanick	db6d23cfd2	i965: Enable ARB_gpu_shader_int64 on Gen8+ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	fc16bf125f	i965: Split SIMD16 CMP of Q and UQ instructions This is basically the same as happens for doubles. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	51807c6493	i965: Enable 64-bit integer support for almost all unary and binary operations Integer comparison functions (e.g., nir_op_ilt) are handled in the next commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	821d7cece8	i965: Enable uploading 64-bit integer uniforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	e0579c5017	i965: Add 64-bit integer support for conversions and bitcasts v2 (idr): Make the "from" type in a cast unsized. This reduces the number of required cast operations at the expense of slightly more complex code. However, this will be a dramatic improvement when other sized integer types are added. Suggested by Connor. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	f2fa510594	i965: Enable emitting Q and UQ instructions in the fs backend v2: Fixup assertion in brw_reg_type_to_hw_type to allow BRW_REGISTER_TYPE_{UQ,Q} on Gen8+. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	409e0b2d48	i965: Add support for constant evaluation on Q and UQ types Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	d6fcede60f	i965: Return Q and UQ types for int64 and uint64 It seems like maybe this should return a different type based on Gen. Q and UQ only exist on Gen8+, but, based on the old comment, I believe previous Gens can generate 64-bit moves. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	c95380c404	i965: Really don't emit Q or UQ moves on Gen < 8 It's much easier to do this in the generator rather than while coming out of NIR. brw_type_for_nir_type doesn't know the Gen, so we'd have to add a bunch of plumbing. The alternate fix is to not emit int64 moves for doubles in the first place... but that seems even more difficult. This change won't catch non-MOV instructions that try to use 64-bit integer types on Gen < 8. This may convert certain kinds of bugs in to different kinds of bugs that are more difficult to detect (since the assertions in the function won't catch them). NOTE: I don't think anything can emit mixed-type 64-bit moves until the same platform supports both ARB_gpu_shader_fp64 and ARB_gpu_shader_int64. When we enable int64 on Gen < 8, we can solve this problem other ways. This prevents regressions on HSW in the next patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	30164d501d	nir: Add support for 64-bit integer types to split_var_copies_block Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	3c9b35372b	nir: Enable 64-bit integer support for almost all unary and binary operations v2: Don't up-convert the shift count parameter of shift instructions. Suggested by Connor. Add type_is_singed() function. This will make adding 8- and 16-bit types easier. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: Jason Ekstrand <jason@jlekstrand.net>	2017-01-20 15:41:23 -08:00
Ian Romanick	fda33e09d8	nir: Shift count for shift opcodes is always 32-bits Previously both sources were unsized. This caused problems when the thing being shifted was 64-bit but the shift count was 32-bit. The expectation in NIR is that all unsized sources (and destination) will ultimately have the same size. The changes in nir_opt_algebraic.py are to prevent errors like: Failed to parse transformation: 03:12:25 (('extract_i8', 'a', 'b'), ('ishr', ('ishl', 'a', ('imul', ('isub', 3, 'b'), 8)), 24), 'options->lower_extract_byte') 03:12:25 Traceback (most recent call last): 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 610, in __init__ 03:12:25 xform = SearchAndReplace(xform) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 495, in __init__ 03:12:25 BitSizeValidator(varset).validate(self.search, self.replace) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 311, in validate 03:12:25 validate_dst_class = self._validate_bit_class_up(replace) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 414, in _validate_bit_class_up 03:12:25 src_class = self._validate_bit_class_up(val.sources[i]) 03:12:25 File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 420, in _validate_bit_class_up 03:12:25 assert src_class == src_type_bits 03:12:25 AssertionError Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: Jason Ekstrand <jason@jlekstrand.net>	2017-01-20 15:41:23 -08:00
Ian Romanick	8ad74a2745	nir: Lower packing and unpacking of 64-bit integer types This change makes me wonder whether double packing should be reimplemented as int64BitsToDouble(packInt2x32(v)). I'm a little on the fence since not all platforms that support fp64 natively support int64. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	3460d05a71	nir: Add 64-bit integer support for conversions and bitcasts v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. v3 (idr): Make the "from" type in a cast unsized. This reduces the number of required cast operations at the expense of slightly more complex code. However, this will be a dramatic improvement when other sized integer types are added. Suggested by Connor. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	3ca0029a0d	nir: Add 64-bit integer constant support v2: Rebase on `19a541f` (nir: Get rid of nir_constant_data) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]	2017-01-20 15:41:23 -08:00
Ian Romanick	48e122244b	nir: Add GLSL_TYPE_INT64 and GLSL_TYPE_UINT64 to glsl_get_bit_size Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	81952814a3	glsl: Optimize redundant pack(unpack()) and unpack(pack()) combinations The lowering passes for 64-bit integer operations will generate a lot of these. v2: Modify the HANDLE_PACK_UNPACK_INVERSE so that the breaks apply to the switch instead of the 'do { } while(true)' loop. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	7122d851aa	glsl: Add a lowering pass for 64-bit integer modulus Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	695b04f7eb	glsl: Add "built-in" functions to do 64%64 => 64 modulus These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	82c31f3eb9	glsl: Add a lowering pass for 64-bit integer division Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	012f2995c3	glsl: Add "built-in" functions to do 64/64 => 64 division These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	50d52df278	glsl: Add a lowering pass for 64-bit integer sign() Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	6b03b345eb	glsl: Add "built-in" function for 64-bit integer sign() These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	6c3af04363	glsl: Add a lowering pass for 64-bit integer multiplication v2: Rename lower_64bit.cpp and lower_64bit_test.cpp to lower_int64. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	330fc2413c	glsl: Add "built-in" functions to do 64x64 => 64 multiplication These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	aa38bf1e59	glsl: Move builtin_function related prototypes to a separate file Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	8358e58f25	glsl/standalone: Enable ARB_gpu_shader_int64 v2: Add missing break in GLSL_TYPE_INT64 case. Noticed by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	8dfea5348c	i965: Avoid int64 warnings. Just add operations to the switch statement here. v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	c101cee2ba	i965: Avoid int64 induced warnings Just add types into unsupported or double equivalent spots. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	a53f315ad8	mesa/program: Add unused ir operations. v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	f82ced5af3	glsl: Allow GLSL_TYPE_INT64 for ir_unop_abs and ir_unop_sign Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	8e7e1ae036	glsl: Print GLSL_TYPE_UINT64 and GLSL_TYPE_INT64 values Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Ian Romanick	0d14fec345	glsl: Add interaction between ARB_gpu_shader_int64 and ARB_shader_clock If ARB_gpu_shader_int64 is supported, ARB_shader_clock also adds clockARB() that returns a uint64_t. Rather than add new opcodes and intrinsics for this, just wrap the existing intrinsic with a packUint2x32. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	bfc4080d38	glsl: Add 64-bit integer functions These are all the allowed 64-bit functions from ARB_gpu_shader_int64 spec. v2: restrict int64/double functions better. v3 (idr): Delete spurious blank lines. Suggested by Matt. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	050f38ef0b	glsl/varying_packing: Add 64-bit integer support As for the double code, but using the 64-bit integer conversions. v2 (idr): Remove some spurious u2i() and i2u() operations when packing and unpacking, respectively, int64_t varyings. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	923aebdd46	glsl/ast: Add 64-bit integer support in some places. Just add support in two more places in ast parsing. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	9ba9a7f854	glsl: Add 64-bit integer support to some operations. This adds 64-bit integer support to some AST and IR operations where it is needed. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	25c7a61b28	glsl/ir_builder: Add support for some 64-bit bitcasts. We need builder support to implement some of the builtins. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	78cc44280e	glsl/ast: Add 64-bit integer support to conversion functions This adds support to call the new operations on conversions. v2 (idr): Delete an unnecessary break-statement. Noticed by Matt. Add a missing blank line. Noticed by Ian. v3 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> [v2] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	85faf5082f	glsl: Add 64-bit integer support for constant expressions This just adds the new operations and add 64-bit integer support to all the existing cases where it is needed. v2: fix some issues found in testing. v2.1: add unreachable (Ian), add missing int/uint pack/unpack (Dave). v3 (idr): Rebase on top of idr's series to generate ir_expression_operation_constant.h. In addition, this version: Adds missing support for ir_unop_bit_not, ir_binop_all_equal, ir_binop_any_nequal, ir_binop_vector_extract, ir_triop_vector_insert, and ir_quadop_vector. Removes support for uint64_t from ir_unop_abs and ir_unop_sign. v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v2] Reviewed-by: Matt Turner <mattst88@gmail.com> [v3] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	a68b6ee063	glsl/ir: Add support for 64-bit integer conversions. This adds all the conversions in the world, I'm not 100% sure if all of these are needed, but add all of them and we can cut them down later. v2: fix issue with packing output types. v3 (idr): Rebase on top of idr's series to generate ir_expression_operation_constant.h. Fix transposed ir_validate assertions for ir_unop_u642i64 and ir_unop_i642u64. Add missing automatic type setup for ir_unop_u642i64 and ir_unop_i642u64. v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v2] Reviewed-by: Matt Turner <mattst88@gmail.com> [v3] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	7dd63c10c3	glsl: Add 64-bit integer support to uniform initialiser code Just add support to the double case, same code should work. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	8df5287c23	glsl/varyings: Add 64-bit integer support. This adds 64-bit ints to the link_varyings 64-bit support. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	bbce1c538d	glsl/ast/ir: Add 64-bit integer constant support This adds support for 64-bit integer constants to the parser, ast and ir. v2: fix a few issues found in testing. v3: Add missing ir_constant copy constructor support. v4: Use PRIu64 and PRId64 in printfs in glsl_parser_extras.cpp. Suggested by Nicolai. Rebase on Marek's linalloc changes. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v2] Reviewed-by: Matt Turner <mattst88@gmail.com> [v3] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	249007d13c	mesa: Add support for 64-bit integer uniforms This hooks up the API to the internals for 64-bit integer uniforms. v2: update to use non-strict aliased alternatives Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	8ce53d4a2f	glsl: Add basic ARB_gpu_shader_int64 types This adds the builtins and the lexer support. To avoid too many warnings, it adds basic support to the type in a few other places in mesa, mostly in the trivial places. It also adds a query to be used later for if a type is an integer 32 or 64. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	e90830bb8e	glsl: Add ARB_gpu_shader_int64 boilerplate. This just adds the basic boilerplate support. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	839ce21143	mesa: Add ARB_gpu_shader_int64 extension bits This just adds the usual boilerplate in mesa core. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Dave Airlie	150f2fa789	mapi: Add support for ARB_gpu_shader_int64. Just add the boilerplate xml code. v2 (idr): Update dispatch_sanity. Only add extension functions in core profile. v3 (idr): Remove comment line from gl_API.xml. Suggested by Matt. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-20 15:41:23 -08:00
Lionel Landwerlin	74c23bde5b	anv: don't require render target isl bit for depth/stencil surfaces Blorp can deal with depth/stencil surfaces blits/copies without the render target requirement. Also having both render target and depth/stencil requirement is incompatible from isl's point of view. This fixes an image creation issue in the high level quality settings of the Unity3D player, which requires a depth texture with src/dst transfer & 4x multisampling. v2: Simplify aspect checking condition (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-20 21:39:51 +00:00
Lionel Landwerlin	8a28e764d0	spirv: don't assert with location decorations on non i/o variables Some applications might add location decoration to samplers. Rather than raising an error it seems it would make more sense to just discard these decorations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: 17.0 <mesa-stable@lists.freedesktop.org>	2017-01-20 21:39:46 +00:00
Matt Turner	f57bdd4849	i965: Validate "Special Cases for Byte Operations" Do this in general_restrictions_based_on_operand_types() because the two rules that "Special Cases for Byte Operations" relax are checked there.	2017-01-20 11:40:52 -08:00
Matt Turner	75b7f5a269	i965: Validate "Region Alignment Rules"	2017-01-20 11:40:52 -08:00
Matt Turner	f817d132c1	i965: Validate "General Restrictions Based on Operand Types"	2017-01-20 11:40:52 -08:00
Matt Turner	83696b2234	i965: Validate "General Restrictions on Regioning Parameters"	2017-01-20 11:40:52 -08:00
Matt Turner	df0b7bcdfd	i965: Replace reg_type_size[] with a function. A function is necessary to handle immediate types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	ada891d472	i965: Validate math instruction sources. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	fce0612fc2	i965: Claim that SEND/math has two sources. src1 must be a descriptor (including the information to determine that the SEND is doing an extended math operation), but src0 can actually be null since it serves as the source of the implicit GRF -> MRF move. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	c9724682b5	i965: Simplify num_sources_from_inst(). desc will always be non-NULL, because brw_validate_instructions() does not attempt to validate any instructions that fail the is_unsupported_inst() check. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	9fd12666d0	i965: Factor out send_restrictions() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	7abc65dd7c	i965: Factor out sources_not_null() validation function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	a693305b61	i965: Structure code so unsupported inst will not generate more errors. We want to rely on brw_opcode_desc() always returning non-NULL in other validation functions. Other validation functions will be in the else case of the block added in this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	f0429359cc	i965: Add a test for the EU assembly validator. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	ae9c69e1cf	i965: Add a CHECK macro to call more complicated validation funcs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	25448e4b7e	i965: Make ERROR_IF usable from other functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	f9a4fc9b15	i965: Mark error annotation on correct SIMD16 inst. inst, whose assignment can be seen in the last line of context, pointed to the correct instruction in the SIMD16 program, but src_offset was the offset from the beginning of the SIMD16 program. So if an instruction at offset 0x100 in the SIMD16 program was illegal, we would mark an error on the instruction at offset 0x100 (which is likely in the SIMD8 program). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	59003f3447	i965/vec4: Use UW-typed operands when dest is UW. Using a UD-typed operand makes the execution size D, and if the size of the execution type is greater than the size of the destination type, the destination must be appropriately strided. We actually just want UW-types all around. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	68bcbfa9e4	i965: Use W-typed immediate in brw_F32TO16(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	3eada948a0	gtest: Update to 1.8.0. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-20 11:40:52 -08:00
Matt Turner	cbc39e541f	i965: Don't change F->VF if dest type is DF. We change the immediate source type to VF to allow instruction compaction, but there are no entries in the compaction table for DF, so there's no point in doing this. Additionally, mixing floating-point types is not allowed except for F and VF.	2017-01-20 11:40:52 -08:00
Lionel Landwerlin	a72dea9483	anv: fix comment typo Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-20 16:46:32 +00:00
Lionel Landwerlin	0c3d058723	spirv: fix warn string typo Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-20 16:46:29 +00:00
Lionel Landwerlin	bac6fe5c77	blorp: remove unnecessary struct declaration Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-20 16:46:21 +00:00
Marek Olšák	74f40d1570	Revert "radeonsi: reject invalid vertex element formats" This reverts commit `9e4d1d8a7c`. It broke arb_vertex_type_10f_11f_11f_rev-draw-vertices, which has first_non_void == -1.	2017-01-20 16:02:45 +01:00
Philipp Zabel	a37cf630b4	gallium: add pipe_screen::resource_changed callback wrappers Add resource_changed to the ddebug, rbug, and trace wrappers. Since it is optional, there is no need to add it to noop. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:30 +01:00
Philipp Zabel	97de7e6586	st/mesa: ask pipe driver to recreate derived internal resources when (re-)binding external textures Use the resource_changed callback to invalidate internal resources derived from external textures when they are (re-)bound. This is needed to comply with the requirement from the GL_OES_EGL_image_external extension that a call to glBindTexture guarantees that all further sampling will return values that correspond to the values in the external texture at or after the time that glBindTexture was called. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:30 +01:00
Philipp Zabel	9bab714c61	mesa: update external textures when (re-)binding To comply with the requirement from the GL_OES_EGL_image_external extension that a call to glBindTexture guarantees that all further sampling will return values that correspond to the values in the external texture at or after the time that glBindTexture was called, do not bail out early from mesa_BindTextures if the target is external. This will later allow the state tracker to instruct the pipe driver to invalidate internal resources derived from the external texture. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:30 +01:00
Philipp Zabel	c70ed79e79	etnaviv: implement resource_changed to invalidate internal resources derived from imported buffers Implement the resource_changed pipe callback to invalidate internal resources derived from imported buffers. This is needed to update the texture for re-imported renderables. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:30 +01:00
Philipp Zabel	362edc868c	etnaviv: initialize seqno of imported resources Imported resources already have contents that we want to be copied to texture resources derived from them. Set initial seqno of imported resources to 1, just as if it had already been rendered to. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:29 +01:00
Philipp Zabel	2c95d6dac3	st/dri: ask the driver to update its internal copies on reimport For imported buffers that can't be used directly as a source to the texture samplers, the pipe driver might need to create an internal copy, for example in a different tiling layout. When buffers are reimported they may contain new image data, so the driver internal copies need to be recreated. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:29 +01:00
Philipp Zabel	30853f55a3	gallium: add pipe_screen::resource_changed Add a hook to tell drivers that an imported resource may have changed and they need to update their internal derived resources. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-01-20 15:30:29 +01:00
Emil Velikov	5872850b88	configure.ac: move require_dri_shared_libs_and_glapi() before its users Otherwise we'll get a lovely message as below: "require_dri_shared_libs_and_glapi: command not found" Cc: Steven Newbury <steve@snewbury.org.uk> Reported-by: Steven Newbury <steve@snewbury.org.uk> Fixes: `da410e6afa` "configure: explicitly require shared glapi for enable-dri" Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Steven Newbury <steve@snewbury.org.uk> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-20 14:27:08 +00:00
Samuel Pitoiset	383fc8e9f3	gallium/hud: add missing break in hud_cpufreq_graph_install() Fixes: `e99b9395be` "gallium/hud: Add support for CPU frequency monitoring" Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-01-20 10:33:47 +01:00
Tapani Pälli	4148881513	android: correct typo in build Fixes: `63c58dfc65` Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-20 07:49:10 +02:00
Elie TOURNIER	9fdaeb7776	nir: add min/max optimisation Add the following optimisations: min(x, -x) = -abs(x) min(x, -abs(x)) = -abs(x) min(x, abs(x)) = x max(x, -abs(x)) = x max(x, abs(x)) = abs(x) max(x, -x) = abs(x) shader-db: total instructions in shared programs: 13067779 -> 13067775 (-0.00%) instructions in affected programs: 249 -> 245 (-1.61%) helped: 4 HURT: 0 total cycles in shared programs: 252054838 -> 252054806 (-0.00%) cycles in affected programs: 504 -> 472 (-6.35%) helped: 2 HURT: 0 Signed-off-by: Elie Tournier <tournier.elie@gmail.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-19 21:44:28 -08:00
Jason Ekstrand	f22ee14644	nir/algebraic: Only include nir_search_helpers once We were including it once per value, so probably around 10k times. Let's not cause the compiler any more work than we have to. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-01-19 21:40:30 -08:00
Anuj Phogat	6de293284b	i965: Remove unnecessary mt->compressed checks It's harmless to use ALIGN_NPOT() for uncompressed formats because they have block width/height = 1. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-01-19 14:28:18 -08:00
Anuj Phogat	c7e37a0cb8	i965: Fix indentation in brw_miptree_layout_2d() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-01-19 14:28:18 -08:00
Anuj Phogat	47d9b3a9dd	i965: Fix comment to include 3d textures Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-01-19 14:28:18 -08:00
Chad Versace	de0b0a3a9c	i965: Delete pending CCS and HiZ ops in intel_miptree_make_shareable() Fixes crash in piglit `egl_khr_gl_renderbuffer_image-clear-shared-image GL_DEPTH_COMPONENT24` on Skylake. The crash happened because blorp attempted to execute a pending hiz clear after the hiz buffer was deleted. Deleting the pending hiz ops when the hiz buffer gets deleted fixes the crash. For good measure, this patch also deletes all pending CCS/MCS ops when the CCS/MCS buffer gets deleted. I'm not aware of any bugs caused by the dangling ops, but deleting them is clearly the right thing to do. Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99265	2017-01-19 13:47:57 -08:00
Andres Rodriguez	e0674e740b	vulkan/wsi: clarify the severity of lack of DRI3 v2 The current message sounds like a small warning, clarify that it can result in lack of presentation support and application crashes. v2: add "if they do" (Bas) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98263 Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-19 15:41:42 +00:00
Andres Rodriguez	a3ad6a34c6	radv: fix include order for installed headers v2 In situations where libdrm_amdgpu and mesa are installed to the same location, the mesa installed headers will take precedence over the git source headers. This is due to the AMDGPU_CFLAGS containing the install directory. This situation can cause build errors if the git version of a header is newer than the currently installed version of a header (e.g. git pull updates vulkan.h) Note: using the same install prefix for mesa and libdrm is probably a common occurrence since it is described in the radeonBuildHowTo wiki: https://www.x.org/wiki/radeonBuildHowTo/ v2: added sign-off Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-19 15:41:38 +00:00
Emil Velikov	0f8afde7ba	docs/releasing: document post branch version bump Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-19 15:38:30 +00:00
Emil Velikov	49e4204b12	mesa: Bump version to 17.1.0-devel Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-19 15:38:30 +00:00
Marek Olšák	9e4d1d8a7c	radeonsi: reject invalid vertex element formats This should fix a coverity defect. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-01-19 16:38:37 +01:00
Marek Olšák	e490b7812c	radeonsi: don't forget to add HTILE to the buffer list for texturing This fixes VM faults. Discovered by Samuel Pitoiset. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450 Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-01-19 16:38:37 +01:00
Nayan Deshmukh	31908d6a4a	st/vdpau: only send buffers with B8G8R8A8 format to X PresentPixmap only works if the pixmap depth matches with the window depth, otherwise it returns a BadMatch protocol error. Even if the depths match, the result won't look correct if the VDPAU RGB component order doesn't match the X11 one so we only allow the X11 format. For other buffers we copy them to a buffer which is sent to X. v2: only send buffers with format VDP_RGBA_FORMAT_B8G8R8A8 v3: reword commit message v4: add comment explaining the code Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-19 15:34:02 +01:00
Nicolai Hähnle	3cd092c415	radeonsi: fix texture gather on stencil textures At least on VI, texture gather doesn't work with a 24_8 data format, so use 8_8_8_8 and a modified swizzle instead. A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select the X24S8 pipe format because we don't support stencil-only render targets properly. With mip-mapping this can lead to a setup where the tiling is incompatible with stencil texturing, and a flushed stencil texture is used. For the flushed stencil, a literal X24S8 is used because there were issues with an 8bpp DB->CB copy. Longer term, it would be good if we could get away from these workarounds, i.e. properly support an S8 format for stencil-only rendering and flushed stencil. Since stencil texturing is somewhat rare, it's not a high priority. Fixes GL45-CTS.texture_cube_map_array.sampling. Cc: 17.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-01-19 15:02:57 +01:00
Alejandro Piñeiro	905961452a	mesa/main: Fix FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for NONE attachment type When the attachment type is NONE (att->Type), FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE should be NONE always. Note that technically, the current behaviour follows the spec. From OpenGL 4.5 spec, Section 9.2.3 "Framebuffer Object Queries": "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then either no framebuffer is bound to target; or the default framebuffer is bound, attachment is DEPTH or STENCIL, and the number of depth or stencil bits, respectively, is zero." Reading this paragraph literally, for the default framebuffer, NONE should be only returned if attachment is DEPTH and STENCIL without being allocated. But it doesn't make too much sense to return DEFAULT_FRAMEBUFFER if the attachment type is NONE. For example, this can happen if the attachment is FRONT_RIGHT run on monoscopic mode, as that attachment is only available on stereo mode. With the current behaviour, defensive querying of the object type would not work properly. So you could query the object type checking for NONE, get DEFAULT_FRAMEBUFFER, and then get an INVALID_OPERATION when requesting other pnames (like RED_SIZE), as the real attachment type is NONE. This fixes: GL45-CTS.direct_state_access.framebuffers_get_attachment_parameters v2: don't change the behaviour for att->Type != GL_NONE, as it caused some ES CTS regressions v3: simplify condition (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-01-19 11:55:41 -02:00
Zachary Michaels	d7d32b3bfe	radeonsi: Always leave poly_offset in a valid state This commit makes si_update_poly_offset set poly_offset to NULL if uses_poly_offset is false. This way poly_offset either points into the currently queued rasterizer, or it is NULL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451 Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-19 10:50:16 +01:00
Nicolai Hähnle	a7c635ec65	mesa/main: fix meta caller of _mesa_ClampColor Since _mesa_ClampColor properly checks for support of the API function now, its meta callers need to check support as well. Fixes: `963311b71f` ("mesa/main: fix version/extension checks in _mesa_ClampColor") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99401 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org>	2017-01-19 09:13:25 +01:00
Timothy Arceri	4d65f68a9b	mesa/glsl: move TransformFeedbackBufferStride to gl_shader Here we remove the single use of this field in gl_linked_shader which allows us to move the field out of gl_shader_info. While we are at it we rewrite link_xfb_stride_layout_qualifiers() to be more clear. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	e603cf1841	glsl: exit loop early if we find xfb layout qualifiers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	7983ed5f65	glsl: set InnerCoverage directly in gl_program Also move out of the shared gl_shader_info. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	1f141eaef6	glsl: tidy up PostDepthCoverage shader field There is no reason for this to be in the shared gl_shader_info or to copy it to gl_program at the end of linking (it's already there). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	3d41f4b990	mesa/glsl: move pixel_center_integer to gl_shader This is only used by gl_linked_shader as a temp during linking so use a temp there instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	0a9d102ddc	mesa/glsl: move origin_upper_left to gl_shader This is only used by gl_linked_shader as a temp during linking so use a temp there instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	ceeedb9bb0	mesa/glsl: move uses_gl_fragcoord to gl_shader This is only used by gl_linked_shader as a temp during linking so use a temp there instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	66a6050ad8	mesa/glsl: move redeclares_gl_fragcoord to gl_shader This is never used in gl_linked_shader other than as a temp during linking so just use a temp instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	cc7ecce253	mesa/glsl: move ARB_fragment_coord_conventions_enable field This is only used by gl_shader not gl_linked_shader so move it there. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	ae28c5a60c	st/mesa/glsl: set early_fragment_tests directly in shader_info We also move EarlyFragmentTests out of the gl_shader_info struct as it is now only used by gl_shader. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	5c93d27423	mesa/glsl/i965: set and use tcs vertices_out directly Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Timothy Arceri	4cd709e2bc	i965: get outputs_written from gl_program There is no need to go via the pointer in nir_shader. This change is required for the shader cache as we don't create a nir_shader. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 17:05:26 +11:00
Dave Airlie	ef71b867ee	gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIAN This fixes the build on ppc/s390. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-19 16:00:53 +10:00
Timothy Arceri	3fe8d04a6d	mesa: don't always set _NEW_PROGRAM when linking We only need to set it when linking was successful and the program being linked is currently active. The programs_in_use mask is just used as a flag for now but in a future change we will use it to update the CurrentProgram array. V2: make sure to flush vertices before linking (suggested by Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-19 15:55:02 +11:00
Timothy Arceri	aad93402c0	mesa: change init subroutine defaults helper to work per gl_program A later patch will result in SSO programs calling this helper per gl_program rather than per gl_shader_program. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 15:55:02 +11:00
Timothy Arceri	90d950038f	mesa/glsl: move ProgramResourceList to gl_shader_program_data We also move NumProgramResourceList at the same time. GLES does interface validation on SSO at runtime so we need to move this to be able to switch to storing gl_program pointers in CurrentProgram. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-19 15:55:02 +11:00
Timothy Arceri	62f718bfcb	glsl: store number of explicit uniform locations in gl_shader_program This allows us to clean up the functions that pass this count around, but more importantly we will be able to call the uniform linking functions from the backend's linker without having to pass this information to the backend directly via Driver.LinkShader(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-19 15:55:02 +11:00
Timothy Arceri	c054bbf0d4	glsl: create a new link_and_validate_uniforms() helper Currently this just breaks up the linking code a bit but in the future i965 will call this from the backend via Driver.LinkShader() so that we can do NIR optimisations before assigning uniform locations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-19 15:55:02 +11:00
Timothy Arceri	ce4fb3c8a1	glsl: make a bunch of varying linking functions static Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-19 15:55:02 +11:00
Timothy Arceri	90fffd1770	glsl: move more varying linking code to link_varyings.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-19 15:55:02 +11:00
Topi Pohjolainen	180653c357	i965/blorp: Make post draw flush more explicit Blits do not need any special treatment as the target buffer object is added to render cache just as one does for normal draw. Color clears and resolves in turn require explicit "end of pipe synchronization". It is not clear what this means exactly but the assumption is that render cache flush with command stream stall should be sufficient. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Topi Pohjolainen	46b346899d	i965/gen6: Issue direct depth stall and flush after depth clear instead of calling unconditionally brw_emit_mi_flush() which does: brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_RENDER_TARGET_FLUSH | PIPE_CONTROL_CS_STALL); brw_emit_pipe_control_flush(brw, PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | PIPE_CONTROL_CONST_CACHE_INVALIDATE); Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Topi Pohjolainen	e6da6943fe	i965: Make depth clear flushing more explicit Current blorp logic issues unconditional "flush everything" (see brw_emit_mi_flush()) after each render. For example, all blits issue this unconditionally which shouldn't be needed if they set render cache properly so that subsequent renders do necessary flushing before drawing. In case of piglit: ext_framebuffer_multisample-accuracy all_samples depth_draw small intel_hiz_exec() is always preceded by blorp blit and the unconditional flush looks to hide the lack of stall and flushes in depth clears. By removing the brw_emit_mi_flush() I get gpu hangs. This patch adds the stalls and flushes mandated by the spec and gets rid of those hangs. v2 (Jason, Ken): Document the rationale for separating depth cache flush and stall on Gen7. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Topi Pohjolainen	4840a53e90	i965/blorp: Use the render cache mechanism instead of explicit flushing by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush(). The latter splits the flush in two: brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_RENDER_TARGET_FLUSH | PIPE_CONTROL_CS_STALL); brw_emit_pipe_control_flush(brw, PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | PIPE_CONTROL_CONST_CACHE_INVALIDATE); instead of int flags = PIPE_CONTROL_NO_WRITE | PIPE_CONTROL_RENDER_TARGET_FLUSH; if (brw->gen >= 6) { flags |= PIPE_CONTROL_INSTRUCTION_INVALIDATE | PIPE_CONTROL_CONST_CACHE_INVALIDATE | PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_VF_CACHE_INVALIDATE | PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | PIPE_CONTROL_CS_STALL; } brw_emit_pipe_control_flush(brw, flags); v2 (Jason): Check that destination exists before trying to add to render cache. Depth clears and resolves don't have it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Emil Velikov	ea8b2624c8	utils: really remove the __END_DECLS macro Fixes: `d1efa09d34` "util: import sha1 implementation from OpenBSD" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 20:09:57 +00:00
Emil Velikov	9f8dc3bf03	utils: build sha1/disk cache only with Android/Autoconf Earlier commit imported a SHA1 implementation and relaxed the SHA1 and disk cache handling, breaking the Windows builds. Restrict things for now until we get to a proper fix. Fixes: `d1efa09d34` "util: import sha1 implementation from OpenBSD" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 20:09:01 +00:00
Emil Velikov	d1efa09d34	util: import sha1 implementation from OpenBSD At the moment we support 5+ different implementations each with varying amount of bugs - from thread safety problems [1], to outright broken implementation(s) [2]. In order to accommodate these we have 150+ lines of configure script and extra two configure toggles. Whilst an actual implementation being ~200loc and our current compat wrapping ~250. Let's not forget that different people use different code paths, which effectively makes it harder to test and debug since the default implementation is automatically detected. To minimise all these lovely experiences, import the "100% Public Domain" OpenBSD sha1 implementation. Clearly document any changes needed to get building correctly, since many/most of those can be upstreamed making future syncs easier. As an added bonus this will avoid all the 'fun' experiences trying to integrate it with the Android and SCons builds. v2: Manually expand __BEGIN_DECLS/__END_DECLS and document (Tapani). Furthermore it seems that some games (or surrounding runtime) static link against OpenSSL resulting in conflicts. For more information see the discussion thread [3] Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 [3] https://lists.freedesktop.org/archives/mesa-dev/2017-January/140748.html Cc: Mark Janes <mark.a.janes@intel.com> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jonathan Gray <jsg@jsg.id.au> Tested-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> (v1) Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2017-01-18 19:07:23 +00:00
Kenneth Graunke	5b4a531207	i965: Make brw_cache_item structure private to brw_program_cache.c. struct brw_cache_item is an implementation detail of the program cache. We don't need to make those internals available to the entire driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-18 10:53:14 -08:00
Marek Olšák	c67a2793b3	radeonsi: determine in advance which VBOs should be added to the buffer list v2: now it should be correct Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	1db2bf8d2b	radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	b9b9540a60	radeonsi: reject invalid vertex buffer indices at state creation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	cf248929bf	radeonsi: use a global dirty mask for shader pointers Only vertex buffers use a separate bool flag. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	861d7af1cb	radeonsi: use a bitmask-based loop in si_decompress_textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	4bde7d3d3c	radeonsi: skip an unnecessary mutex lock for L2 prefetches the mutex lock is inside util_range_add. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	d93b0eacb7	radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	395c49849d	radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE the next commit will use it in a clever way, because the CP DMA prefetch doesn't need this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	35cd7551a4	radeonsi: use the correct target machine when building shader variants If the shader selector is created with a different context than the shader variant, we should use the calling context's target machine for the shader variant. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	3ae3be6dd4	radeonsi: move shader pipe context state into a separate structure Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Ben Widawsky	b0cc55f298	i965: Fix SURFACE_STATE to handle non-zero aux offsets Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-01-18 09:38:18 -08:00
Christian Gmeiner	65a44a76fc	Revert "etnaviv: Fake occlusion query capability" This reverts commit `b7ac0f5671`. This is a half-baked solution that needs some rework to fix issues with the reported counter bits (GL_QUERY_COUNTER_BITS_ARB). It also accidentally enables PIPE_CAP_QUERY_TIME_ELAPSED. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 17:39:28 +01:00
Mauro Rossi	730574c58e	android: ac/debug: move sid_tables.h generation and IB decode to amd/common This patch is the porting to android of the following commits: `b838f64` "ac/debug: Move sid_tables.h generation to common code." `0ef1b4d` "ac/debug: Move IB decode to common code." Fixes android building errors due to sid_tables.h and ac_debug.c, ac_debug.h moved to amd/common Tested by building nougat-x86 Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:28:59 +00:00
Mauro Rossi	02185a1c9b	android: gallium/auxiliary: fix building error in Android 7.0 Conditional libLLVMCore static library dependency is added, for the case when MESA_ENABLE_LLVM is true Fixes the following building error with Android 7.0: In file included from external/mesa/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp:62: ... external/llvm/include/llvm/IR/Attributes.h:68:14: fatal error: 'llvm/IR/Attributes.inc' file not found #include "llvm/IR/Attributes.inc" ^ 1 error generated. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:24:19 +00:00
Mauro Rossi	f93f7cae14	android: amd/common: fix LLVMInitializeAMDGPU* functions declaration LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available. A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose, the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes. This avoids the following building errors: external/mesa/src/amd/common/ac_llvm_util.c:43:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa/src/amd/common/ac_llvm_util.c:44:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa/src/amd/common/ac_llvm_util.c:45:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa/src/amd/common/ac_llvm_util.c:46:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:21:40 +00:00
Mauro Rossi	db3aaa3137	android: radeonsi: fix LLVMInitializeAMDGPU* functions declaration LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available. A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose, the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes. This avoids the following building errors: external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:129:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:130:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:131:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:132:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:21:35 +00:00
Mauro Rossi	a2a63ad262	android: radeon: fix LLVMInitializeAMDGPU* functions declaration LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available. A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose, the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes. This avoids the following building errors: external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:121:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:122:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:123:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:124:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:21:28 +00:00
Emil Velikov	9c5003996c	nouveau: remove always false argument in nouveau_fence_new() No point in having the extra argument considering that it's effectively unused since the function was introduced. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-18 16:01:15 +00:00
Emil Velikov	af4a298719	egl/wayland: resolve quirky try_damage_buffer() implementation The implementation was added with commit `d085a5dff5` and effectively provided a hidden dependency. Namely: the codepath used was determined solely at build time. Thus if we build against a new wayland and then run against an older one (yet still within the requirements, as per the configure), we get undefined symbols. As of earlier commit `36b9976e1f` "egl/wayland: Avoid race conditions when on non-main thread" the required version was bumped to one which provides the API, thus we can drop the quirky solution. Cc: Derek Foreman <derekf@osg.samsung.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Derek Foreman <derekf@osg.samsung.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	687cf37bbe	configure: error out when building static XOR shared Current code warns out in such cases and falls-back to either static or shared. That can be easily missed amongst the volume produced by our configure script. Replace the warning with an error such that one gets direct feedback when they're doing something wrong. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	da410e6afa	configure: explicitly require shared glapi for enable-dri We've been using and depending on it for at least a couple of years. Make it obvious and error out, should one opt for it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	b628fdd6e7	configure: factor out common egl/gbm checks Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	e8044dd434	configure: remove HAVE_EGL_DRIVER_DRI[23] We have them for local purposes in configure, where we can use their direct dependency. With the only remaining instance in the makefile(s) being always true, as it can be seen in the configure snippet. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	3b887f122f	configure: forbid static EGL/GBM Both libraries implicitly require shared GLAPI which in itself mandates shared libraries. Stop pretending that one can use it and error out at configure stage. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	d4066216c6	configure: remove unused AC_SUBST variables v2: Rebase. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2017-01-18 16:01:15 +00:00
Emil Velikov	4380a2098b	gallium: correctly manage libsensors link flags We should be using LIBS rather than the LDFLAGS variable. Furthermore, try to keep the linking to the final stage, rather than an intermediate static library. Cc: Steven Toth <stoth@kernellabs.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	cb5e799448	egl/wayland: unify dri2_wl_create_surface implementations Rather than having two almost identical codepaths (one for HW/wl_drm and another for SW/wl_shm), just factorise and reuse in both places. v2: Rebase v3: Rebase Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> (v2)	2017-01-18 16:01:14 +00:00
Emil Velikov	bfd6314350	egl/wayland: use the destroy_window_callback for swrast As described in commit `690ead4a13` ("egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.") if we attempt to destroy a EGL surface attached to already destroyed Wayland window we'll get a segfault. v2: set the correct callback alongside the window->private. (Dan) Cc: Daniel Stone <daniels@collabora.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	3ecd6c6abd	glx: unify GLX_SGIX_pbuffer aliased declarations No point in having an identical code in two places. Not to mention that the Apple one incorrectly uses GLXDrawable as pbuf type. This change is both API and ABI safe since the header uses the correct GLXPbufferSGIX and both types are a typedef of the same primitive XID. Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	9898bcf3f4	glx: use GLX_ALIAS for glXGetProcAddress Use the macro, rather than open-coding it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	dfc84c2296	mesa: make use of HAVE_FUNC_ATTRIBUTE_ALIAS macro We must make sure that xserver has an equivalent one-line change to its configure.ac as the glx/glapi headers get copied over. Then again, xserver does _not_ seem to set HAVE_ALIAS to begin with so one might want to look into that first. Cc: Adam Jackson <ajax@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	63c58dfc65	android: set HAVE_FUNC_ATTRIBUTE_ALIAS Analogous to previous two commits. Strictly speaking it's not applicable for Android since we don't build GLX and related code. Regardless, keep things consistent with the other build systems. Cc: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	52bf10cc4f	scons: set HAVE_FUNC_ATTRIBUTE_ALIAS Analogous to the previous commit, where we did so for autotools. Cc: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	95d9eae427	configure: use standard check for attribute alias Currently we have two macros - HAVE_ALIAS and GLX_ALIAS_UNSUPPORTED. To make it even better, the former is explicitly cleared in some cases while not in others. Clear all that up by using a single macro properly set during configure. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	f121ac68b0	glx: remove always false ifdef GLX_NO_STATIC_EXTENSION_FUNCTIONS Quick search through git history (of both mesa and xserver) shows no instances where this was ever set. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Wladimir J. van der Laan	b7ac0f5671	etnaviv: Fake occlusion query capability This enables the PIPE_CAP_OCCLUSION_QUERY capability without adding an occlusion query type. This is necessary to get Mesa to report desktop GL 2.0 support (to run exciting things such as ioq3's OpenGL 2 renderer), and should be valid because exposing the capability does not guarantee that any counters are actually implemented. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 16:58:18 +01:00
Christian Gmeiner	103c363e0a	etnaviv: add flags parameter to texture barrier Fixes compile warning introduced by commit a1c848. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 16:58:11 +01:00
Christian Gmeiner	3ef916c128	etnaviv: handle PIPE_CAP_TGSI_FS_FBFETCH Fixes compile warning introduced by commit ee3ebe. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 16:58:05 +01:00
Roland Scheidegger	56441708cf	gallivm: (trivial) fix copy/paste bug with big endian code `8bd67a35c5` introduced using undefined variable on big endian archs due to copy/paste bug. (compile hack tested only)	2017-01-18 16:30:50 +01:00
Jose Fonseca	34041968f8	configure.ac: Revert recent HAVE_LLVM changes. This reverts changes 903eb09b5fb78d47d0f8a4bdf826a113ca2aff40..1a0aa468f354f0ee94dd383cd40ae915584624aa: Tobias Droste (5): configure.ac: Rename MESA_LLVM to FOUND_LLVM configure.ac: Only set LLVM_LIBS if LLVM is used configure.ac: Only define HAVE_LLVM if LLVM is used configure.ac: Set and use HAVE_GALLIUM_LLVM define configure.ac: Don't check LLVM version in gallium_require_llvm They break scons build, and I'm not convinced this is the right fix. In particular changing HAVE_LLVM in the C code is something I'd rather avoid no matter what. So it's better to discuss without the pressure of broken builds.	2017-01-18 14:46:54 +00:00
Elie TOURNIER	5034cf4e35	docs: Fix GLSL compiler link The doc wasn't updated since we moved the glsl compiler to src/compiler/glsl. I also updated the description of the standalone compiler. v2: - Mention that just-log argument removes headers/separators. - Mention that version argument is mandatory. Since version argument is mandatory, add --version to the command line example. Signed-off-by: Elie Tournier <tournier.elie@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-01-18 14:15:31 +00:00
Emil Velikov	8d1712a065	vulkan: automake: do not use EXTRA_DIST in a conditional Otherwise the file might not end up in the tarball. Fixes: `dbd677efb4` "vulkan: add API registry" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 13:41:32 +00:00
Tomasz Figa	2d14ae6bea	configure.ac: Respect LLVM_CFLAGS in LLVM version detection When compiling LLVM headers, including llvm-config.h, we need to respect LLVM_CFLAGS. This is especially crucial if LLVM is located in a non-standard location and it happens that llvm-config.h includes another header. In such case the detection would fail due to missing header, because the path is provided in LLVM_CFLAGS. Let's add LLVM_CFLAGS to global CFLAGS for the time of detection and then restore the original flags, as done in other places of the script. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 13:25:17 +00:00
Tobias Droste	1a0aa468f3	configure.ac: Don't check LLVM version in gallium_require_llvm This is actually not needed because the version is checked later. Line 2609: if test "x$enable_gallium_llvm" == "xyes"; then llvm_require_version $LLVM_REQUIRED_GALLIUM "gallium" llvm_add_default_components "gallium" HAVE_GALLIUM_LLVM=xyes DEFINES="${DEFINES} -DHAVE_GALLIUM_LLVM" fi Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 13:23:01 +00:00
Tobias Droste	4d0efb9683	configure.ac: Set and use HAVE_GALLIUM_LLVM define Gallium code used HAVE_LLVM to check if it needs to compile code for LLVM in header and source files. With the new logic HAVE_LLVM is always set. Use extra define to figure out if LLVM is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010 Signed-off-by: Tobias Droste <tdroste@gmx.de>	2017-01-18 13:23:01 +00:00
Tobias Droste	b045d23c0b	configure.ac: Only define HAVE_LLVM if LLVM is used Make sure that HAVE_LLVM compiler define is only set if LLVM is actually used. Signed-off-by: Tobias Droste <tdroste@gmx.de> v2 [Emil] fold within the existing conditional Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 13:23:01 +00:00
Tobias Droste	38e81293b0	configure.ac: Only set LLVM_LIBS if LLVM is used This renames llvm_check_version_for to llvm_require_version and lets it set a variable to mark that LLVM will be used. Use this to make a useful configure output and to only check if the libs are found in LLVM if it is actually used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010 Signed-off-by: Tobias Droste <tdroste@gmx.de>	2017-01-18 13:23:01 +00:00
Tobias Droste	add9066eb0	configure.ac: Rename MESA_LLVM to FOUND_LLVM This renames MESA_LLVM to FOUND_LLVM and updates the config.log report to say if LLVM is found or not, to make clear that this does not mean that it is used. There are no MESA_LLVM users so drop the AC_SUBST. v2 [Emil] - Polish test: -a over && test, = over ==, unquote xyes - other ? Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 13:23:00 +00:00
Jose Fonseca	903eb09b5f	gallivm: Cleanup USE_MCJIT. Split USE_MCJIT macro dual nature into a separate constant time define and a run-time variable. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 12:35:01 +00:00
Kenneth Graunke	aa291c3ba9	i965: Don't map/unmap in brw_print_program_cache on LLC platforms. We have a persistent mapping. Don't map it a second time or try to unmap it. Just use the pointer. This most likely would wreak havoc except that this code is unused (it's only called from an if (0) debug block). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-17 21:47:38 -08:00
Kenneth Graunke	ce89239294	i965: Move program cache printing to brw_program_cache.c. It makes sense to put a function which prints out the entire contents of the program cache in the file that implements the program cache. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-17 21:47:36 -08:00
Kenneth Graunke	f9edc550b2	i965: Make a helper for finding an existing shader variant. We had five copies of the same "walk the cache and look for an existing shader variant for this program" code. Now we have one helper function that returns the key. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-17 21:47:10 -08:00
Kenneth Graunke	e7d4008ebf	glsl: Make copy propagation not panic when it sees an intrinsic. A number of games have large arrays of constants, which we promote to uniforms. This introduces copies from the uniform array to the original temporary array. Normally, copy propagation eliminates those copies, making everything refer to the uniform array directly. A number of shaders in "Deus Ex: Mankind Divided" recently exposed a limitation of copy propagation - if we had any intrinsics (i.e. image access in a compute shader), we weren't able to get rid of these copies. That meant that any variable indexing remained on the temporary array rather than being moved to the uniform array. i965's scalar backend currently doesn't support indirect addressing of temporary arrays, which meant lowering it to if-ladders. This was horrible. According to Marek, on radeonsi/GCN, "F1 2015" uses 64% less spilled-temp-array memory. On i965/Skylake: total instructions in shared programs: 13362954 -> 13329878 (-0.25%) instructions in affected programs: 43745 -> 10669 (-75.61%) helped: 12 HURT: 0 total cycles in shared programs: 248081010 -> 245949178 (-0.86%) cycles in affected programs: 4597930 -> 2466098 (-46.37%) helped: 12 HURT: 0 total spills in shared programs: 9493 -> 9507 (0.15%) spills in affected programs: 25 -> 39 (56.00%) helped: 0 HURT: 1 total fills in shared programs: 12127 -> 12197 (0.58%) fills in affected programs: 110 -> 180 (63.64%) helped: 0 HURT: 1 Helps Deus Ex: Mankind Divided. The one shader with hurt spills/fills is from Tomb Raider at Ultra settings, but that same shader has a -39.55% reduction in instructions and -14.09% reduction in cycle counts, so it seems like a win there as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:45:22 -08:00
Kenneth Graunke	9919542f1c	i965: Make DCE set null destinations on messages with side effects. (Co-authored by Matt Turner.) Image atomics, for example, return a value - but the shader may not want to use it. We assigned a useless VGRF destination. This seemed harmless, but it can actually be quite harmful. The register allocator has to assign that VGRF to a real register. It may assign the same actual GRF to the destination of an instruction that follows soon after. This results in a write-after-write (WAW) dependency, and stall. A number of "Deus Ex: Mankind Divided" shaders use image atomics, but don't use the return value. Several of these were hitting WAW stalls for nearly 14,000 (poorly estimated) cycles a pop. Making dead code elimination null out the destination avoids this issue. This patch cuts one shader's estimated cycles by -98.39%! Removing the message response should also help with data cluster bandwidth. On Skylake: (instruction counts remain identical) total cycles in shared programs: 255413890 -> 248081010 (-2.87%) cycles in affected programs: 12019948 -> 4687068 (-61.01%) helped: 24 HURT: 10 v2: Make can_omit_write independent of can_eliminate (Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:45:04 -08:00
Kenneth Graunke	90bf39cd2b	i965: Combine some dead code elimination NOP'ing code. In theory we might have incorrectly NOP'd instructions that write the flag, but where that flag value isn't used, and yet the instruction either writes the accumulator or has side effects. I don't believe any such instructions exist, so this is mostly a code cleanup. Curro pointed out that FS_OPCODE_FB_WRITE has a null destination and actually writes the flag on Gen4-5 to dynamically decide whether to write some payload data. The hunk removed in this patch might have NOP'd it, except that we don't actually mark flags_written() in the IR, so it doesn't think the flag is touched at all. That's sketchy, but it means it wouldn't hit this today (though there are likely other problems!). v2: Properly replace the inst->regs_written() check in the second hunk with the flag being live (mistake caught by Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:45:00 -08:00
Kenneth Graunke	be5f53e769	i965: Make DCE explicitly not eliminate any control flow instructions. According to Matt, the dead code pass explicitly avoided IF and WHILE because on Sandybridge, these could have conditional modifiers and null destination registers. Normally, those instructions use BAD_FILE for the destination register. Nowadays, we don't do that anymore, so we could technically drop these checks. However, it's clearer to explicitly leave control flow instructions alone, so change it to the more generic !inst->is_control_flow(). This should have no actual change. [This patch implements review feedback from Curro and Matt.] Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:44:29 -08:00
Dave Airlie	aac562f112	radv: disable vertex reuse when writing viewport index This fixes some issues we'd hit later if using viewport indexes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 08:04:11 +10:00
Dave Airlie	7e0382fb35	radv: add support for layered clears (v2) Just always use the layer clear pipelines, the overhead of emitting the layer shouldn't be too large. v2: Bas suggested we always use it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:21:22 +10:00
Dave Airlie	7886100811	radv/ac: split part of llvm compile into a separate function This is needed to have common code for gs copy shader emission. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:21:05 +10:00
Dave Airlie	5dadd7ca27	radv/ac: switch an if to switch makes it easier to add other shader stages. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:20:48 +10:00
Dave Airlie	6b635bbe16	radv: add support for writing layer/viewport index (v2) This just adds the infrastructure to allow writing layer and viewport index. It's just a first patch out of the geom shader tree, and doesn't do much on its own. v2: add missing if statement change (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 06:20:44 +10:00
Bas Nieuwenhuizen	3b4bf8aa63	ac/debug: Decrease num_dw for type 2 NOP's. Otherwise we read past the end of the buffer. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-17 20:54:57 +01:00
Marek Olšák	57f18623fb	radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+ Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-17 16:13:25 +01:00
Nayan Deshmukh	3a8f316e7b	st/vdpau: remove the delayed rendering hack(v1.1) the hack was introduced to avoid an extra copying but now with dri3 we don't need it anymore v1.1: rebasing Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-01-17 11:52:03 +01:00
Nayan Deshmukh	15bfdea99c	st/vdpau: use dri3 to directly send the buffer to X(v2) this avoids an extra copy which occurs in case of dri2 v1.1: fallback to dri2 if dri3 fails to initialize v2: add PIPE_BIND_SCANOUT to output buffers as they will be sent to X server directly (Michel) Suggested-by: Christian König <christian.koenig@amd.com> Tested-by: Andy Furniss <adf.lists@gmail.com> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>	2017-01-17 11:51:56 +01:00
Nayan Deshmukh	0ef17d76bb	vl/dri3: use external texture as back buffers(v4) dri3 allows us to send handle of a texture directly to X so this patch allows a state tracker to directly send its texture to X to be used as back buffer and avoids extra copying v2: use clip width/height to display a portion of the surface v3: remove redundant variables, fix wrapping, rename variables handle vaapi path v3.1: we need clip_width/height for every frame so we don't need to maintain it for each buffer instead use a global variable v4: In case of single gpu we can cache the buffers as applications use constant number of buffer and we can avoid calls to present extension for every frame Reviewed and Suggested-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Andy Furniss <adf.lists@gmail.com> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>	2017-01-17 11:51:50 +01:00
Iago Toral Quiroga	9fe9db8031	anv: set UAV coherence required bit when needed The same we do in the OpenGL driver (comment copied from there). This is required to ensure that we execute the fragment shader stage when side-effects (such as image or ssbo stores) are present but there are no color writes. I found this while writing a test to check rendering to a framebuffer without attachments where the fragment shader does not produce any color outputs but writes to an image via imageStore(). Without this patch the fragment shader does not execute and the image is not written, which is not correct. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-17 07:57:04 +01:00
Samuel Iglesias Gonsálvez	ff0dd67d2f	anv: increase ANV_MAX_STATE_SIZE_LOG2 limit to 1 MB Fixes crash in dEQP-VK.ubo.random.all_shared_buffer.48 due to a fragment shader code bigger than 128 kB. This patch increases the allocation size limit to 1 MB. v2: - Increase it to 1 MB (Jason) - Increase device->instruction_block_pool allocation size in anv_device.c (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-17 06:42:42 +01:00
Ilia Mirkin	19963231a3	nv50/ir: optimize shl + and Address loading can often end up as shl + shr + shl combinations. The latter two are equal shifts, which get converted into an and mask. However if the previous shl is more than the mask is trying to remove (in terms of low bits), we can just remove the and entirely. This reduces some large shaders by as many as 3% of instructions (out of 2K). total instructions in shared programs : 6495509 -> 6491076 (-0.07%) total gprs used in shared programs : 954621 -> 954623 (0.00%) local gpr inst bytes helped 0 0 1014 1014 hurt 0 2 0 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	5ba380c226	nvc0: enable FBFETCH with a special slot for color buffer 0 We don't need to support all the color buffers for advanced blend, just cb0. For Fermi, we use the special binding slots so that we don't overlap with user textures, while Kepler+ gets a dedicated position for the fb handle in the driver constbuf. This logic is only triggered when a FBFETCH is actually present so it should be a no-op most of the time. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	6b7511c2f1	st/mesa: add support for advanced blend when fb can be fetched from This implements support for emitting FBFETCH ops, using the existing lowering pass for advanced blend logic, and disabling hw blend when advanced blending is enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	a1c8484271	gallium: add flags parameter to texture barrier This is so that we can differentiate between flushing any framebuffer reading caches from regular sampler caches. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	ee3ebe68f9	gallium: add PIPE_CAP_TGSI_FS_FBFETCH Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	1393999541	gallium: add FBFETCH opcode to retrieve the current sample value Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:08 -05:00
Ilia Mirkin	376316e963	mesa: allow BlendBarrier to be used without support for full fb fetch The extension spec is not currently published, so it's a bit premature to require it for BlendBarrier usage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:08 -05:00
Ilia Mirkin	2dd4cdeb4e	glsl: avoid treating fb fetches as output reads to be lowered Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:08 -05:00
Dave Airlie	75f858cc33	radv/meta: split color renderpass creation out. This is just prep work for layered clears, it doesn't change anything. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-17 08:22:48 +10:00
Bas Nieuwenhuizen	5ae4de18d9	radv: Support multiple devices. Pretty straightforward. Also deleted the big comment block as it is a pretty standard pattern for filling in arrays. Also removed the error message on non-existent devices, as getting 7 errors printed to the console each time you enumerate the devices is pretty confusing. v2: Add constant for number of DRM devices. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-16 22:15:22 +01:00
Bas Nieuwenhuizen	8406f79d6a	radv: Get physical device from radv_device instead of the instance. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-16 22:15:22 +01:00
Ilia Mirkin	0baa639f76	nvc0: true up exposing of the HW_METRIC_QUERY_GROUP for maxwell This had been updated in one place but not the other. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-01-16 16:04:55 -05:00
Dave Airlie	d4392a877c	radv/ac: use ctx->voidt in more places. (v2) Just noticed this while in the area. v2: one replacement was incorrect. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-17 06:55:51 +10:00
Dave Airlie	3634dfd9e7	radv/meta: consolidate the depth stencil clear renderpasses We only need one per samples (maybe not even that), reduce all the unneeded ones. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-17 06:51:25 +10:00
Ilia Mirkin	5eeebca12f	nv50/ir: handle new DDIV op which will be used for double divisions The existing lowering is in place to lower that to RCP + MUL, or fancier things down the line if necessary. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-16 14:45:46 -05:00
Nicolai Hähnle	6be4a40430	tgsi: add DDIV instruction Double-precision division, to allow more precision than a DRCP + DMUL sequence. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-16 20:17:22 +01:00
Nicolai Hähnle	5e94e5bb9b	radeonsi: fix R600_DEBUG=nooptvariant Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2017-01-16 20:16:18 +01:00
Kenneth Graunke	7a2b65a1d7	i965: Make BLORP disable the NP Z PMA stall fix. This may fix GPU hangs on Gen8. I don't know if it does though. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-16 10:15:06 -08:00
Kenneth Graunke	d2590eb65f	i965: Enable OpenGL 4.5 on Haswell. Everything is in place and the test results look solid. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-01-16 10:13:23 -08:00
Marek Olšák	d523415609	radeonsi: implement GL_FIXED vertex format Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 18:07:08 +01:00
Marek Olšák	018fb2ecb3	radeonsi: implement 32-bit SNORM/UNORM/SSCALED/USCALED vertex formats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 18:07:08 +01:00
Marek Olšák	44e9b67229	radeonsi: make fix_fetch 64-bit v2: add u_bit_consecutive64 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 18:07:08 +01:00
Thomas Hindoe Paaboel Andersen	8daf6de3de	gallium/hud: avoid buffer overrun Renaming data sources was added in `e8bb97ce30`. It was possible to use a new name longer than the 128-element name array in hud_graph. This patch truncates the name to fit the array. CC: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-16 18:07:08 +01:00
Marek Olšák	0d9a4efce9	gallium/radeon: add GPU-shaders-busy HUD query It should be close to the GPU load, but it can be much lower if something is stalling shader execution (e.g. CP DMA). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	aa0de724c7	gallium/radeon: make the GPU load / GRBM_STATUS monitoring extensible The next patch will add SPI_BUSY monitoring. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	935d58ac73	radeonsi: show average results per frame for perf counters in HUD so that the graphs are independent from FPS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	1fe7c8d3c9	gallium/hud: disable queries during HUD draw calls Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	5b2eddc40f	gallium/hud: increase the vertex buffer size for background quads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Nayan Deshmukh	4b0e9babc6	st/va: delay calling begin_frame until we have all parameters If begin_frame is called before setting intra_matrix and non_intra_matrix it leads to segmentation faults when vl_mpeg12_decoder.c is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92634 Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-16 15:09:01 +01:00
Kenneth Graunke	5597b2b243	i965: Use align1 mode for barrier messages. In commit `7428e6f86a` we switched the barrier SEND message's destination type to UW to avoid problems in SIMD16 compute shaders. Tessellation control shaders also use barriers, and in vec4 mode, we were emitting them in align16 mode. The simulator warns that only UD, D, F, and DF are valid destination types - UW is technically illegal. So, switch to align1 mode. Either mode should work fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-15 16:49:58 -08:00
Ilia Mirkin	dd39e48726	nvc0/ir: emit FMZ flag when requested on FFMA Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-15 13:13:58 -05:00
Connor Abbott	c9b74f3f03	nir/gcm: fix a bug with metadata handling We were using impl->num_blocks, but that isn't guaranteed to be up-to-date until after the block_index metadata is required. If we were unlucky, this could lead to overwriting memory. Noticed by inspection. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-14 18:18:17 -05:00
Lionel Landwerlin	bf8e1f9e7b	radv: generate entrypoints from vk.xml v2: rework entry point iteration (Jason) cleanup unused imports v3: don't drop header installation (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-14 19:29:44 +00:00
Lionel Landwerlin	c7fc310cd1	anv: generate entry points from vk.xml v2: rework entry point iteration (Jason) cleanup unused imports v3: don't drop header installation (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-14 19:29:44 +00:00
Lionel Landwerlin	dbd677efb4	vulkan: add API registry Signed-off: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-14 19:29:44 +00:00
Lionel Landwerlin	60bc90cea8	include: update Vulkan headers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-14 19:29:44 +00:00
Andres Rodriguez	1e1bddf15a	radv: make device extension setup dynamic Each physical device may have different extensions than one another. Furthermore, depending on the software stack, some extensions may not be accessible. If an extension is conditional, it can be registered only when necessary. v2: removed unused function and fixed indentation Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-14 14:20:17 +01:00
Andres Rodriguez	5323efb685	radv: rename global extension properties structs All extension arrays are global, but only one of them refers to instance extensions. The device extension array refers to extensions that are common across all physical devices. This distinction will be more important once we have dynamic extension support for devices. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-14 14:20:12 +01:00
Andres Rodriguez	0eb8b6a3e1	radv: use a winsys context per-queue, instead of per device v2 Queues are independent execution streams. The vulkan spec provides no ordering guarantees for different queues. By using a single context for all queues, we are forcing all commands into an unnecessary FIFO ordering. This change is a preparation step to allow out-of-order scheduling of certain work tasks. v2: Fix a rebase error with radv_QueueSubmit() and trace_bo Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-14 14:19:41 +01:00
Timothy Arceri	772cd31048	nir: optimise min/max fadd combos shader-db results BDW: total instructions in shared programs: 13060410 -> 13060313 (-0.00%) instructions in affected programs: 24533 -> 24436 (-0.40%) helped: 88 HURT: 0 total cycles in shared programs: 256585692 -> 256586698 (0.00%) cycles in affected programs: 647290 -> 648296 (0.16%) helped: 35 HURT: 30 Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-14 23:26:22 +11:00
Kenneth Graunke	0d5071db5e	i965: Move Gen4-5 interpolation stuff to brw_wm_prog_data. This fixes glxgears rendering, which had surprisingly been broken since late October! Specifically, commit `91d61fbf7c`. glxgears uses glShadeModel(GL_FLAT) when drawing the main portion of the gears, then uses glShadeModel(GL_SMOOTH) for drawing the Gouraud-shaded inner portion of the gears. This results in the same fragment program having two different state-dependent interpolation maps: one where gl_Color is flat, and another where it's smooth. The problem is that there's only one gen4_fragment_program, so it can't store both. Each FS compile would trash the last one. But, the FS compiles are cached, so the first one would store FLAT, and the second would see a matching program in the cache and never bother to compile one with SMOOTH. (Clearing the program cache on every draw made it render correctly.) Instead, move it to brw_wm_prog_data, where we can keep a copy for every specialization of the program. The only downside is bloating the structure a bit, but we can tighten that up a bit if we need to. This also lets us kill gen4_fragment_program entirely! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-13 17:25:48 -08:00
Grazvydas Ignotas	40a8f9e6f2	anv: remove some unused macros and functions VK_ICD_WSI_PLATFORM_MAX is used, but a duplicate from wsi_common.h . Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-13 16:52:27 -08:00
Jason Ekstrand	3b80481965	anv: Default PointSize to 1.0 if not written by the shader The Vulkan rules for point size are a bit whacky. If you only have a vertex shader and you use points, then you must write PointSize in your vertex shader. If you have a geometry or tessellation shader, then it's dependent on the shaderTessellationAndGeometryPointSize device feature. From the Vulkan 1.0.38 specification: "shaderTessellationAndGeometryPointSize indicates whether the PointSize built-in decoration is available in the tessellation control, tessellation evaluation, and geometry shader stages. If this feature is not enabled, members decorated with the PointSize built-in decoration must not be read from or written to and all points written from a tessellation or geometry shader will have a size of 1.0. This also indicates whether shader modules can declare the TessellationPointSize capability for tessellation control and evaluation shaders, or if the shader modules can declare the GeometryPointSize capability for geometry shaders. An implementation supporting this feature must also support one or both of the tessellationShader or geometryShader features." In other words, if the feature is disabled (the client can disable features!) then they don't write PointSize and we provide a 1.0 default but if the feature is enabled, they do write PointSize and we use the one they wrote in the shader. There are at least two valid ways we can implement this: 1) Track whether or not shaderTessellationAndGeometryPointSize is enabled and set the 3DSTATE_SF bits based on that and what stages are enabled, ignoring the shader source. 2) Just look at the last geometry stage VUE map and see if they wrote PointSize and set the 3DSTATE_SF accordingly. The second solution is the easiest and the most robust against invalid usage of the Vulkan API, so we choose to go with that one. This fixes all of the dEQP-VK.tessellation.primitive_discard.*point_mode tests. The tests are also broken because they unconditionally enable shaderTessellationAndGeometryPointSize if it's supported by the implementation and then don't write PointSize in the evaluation shader. However, since this is the "robust against invalid API usage" solution, the tests happily pass. :-) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-13 16:31:17 -08:00
Jason Ekstrand	99d497c5b6	anv/pipeline: Replace get_fs_input_map with get_last_vue_prog_data This lets us delete a helper from genX_pipeline.c Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-13 16:31:17 -08:00
Juan A. Suarez Romero	56ee2df4bf	i965/vec4: Fix mapping attributes This patch reverts `57bab6708f`, which was causing issues with ILK and earlier VS programs. 1. brw_nir.c: Revert "i965/vec4/nir: vec4 also needs to remap vs attributes" Do not perform a remap in vec4 backend. Rather, do it later when setting up attributes 2. brw_vec4.cpp: This fixes mapping ATTRx to proper GRFn. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99391 [jordan.l.justen@intel.com: merge Juan's two patches from bugzilla] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-13 16:23:32 -08:00
Kenneth Graunke	fed4afc5bb	anv: Move nir_lower_wpos_center after dead variable elimination. When multiple shader stages exist in the same SPIR-V module, we compile all entry points and their inputs/outputs, then dead code eliminate the ones not related to the specific entry point later. nir_lower_wpos_center was being run prior to eliminating those random other variables, which made it trip up, thinking it found gl_FragCoord when it actually found something else like gl_PerVertex[3]. Fixes dEQP-VK.spirv_assembly.instruction.graphics.module.same_module. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-13 15:00:38 -08:00
Kenneth Graunke	99c019e1d4	i965: Fix textureGather with RG32I/UI on Gen7. According to the "Gather4 R32G32_FLOAT Bug" internal documentation page, the R32G32_UINT and R32G32_SINT formats are affected by the same bug as R32G32_FLOAT. Applying the same workarounds should be viable - apparently the R32G32_FLOAT_LD format shouldn't corrupt integer data which is NaN or other sketchy floating point values. One irritating caveat is that, because it's a FLOAT format, the alpha channel or any set to SCS_ONE return 0x3f8 (1.0) rather than integer 1. So we need shader code to whack those channels to 1. Fixes GL45-CTS.texture_gather.plain-gather-int-cube-rg on Haswell. v2: Fix swizzle component zeroing (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-13 11:57:06 -08:00
Bas Nieuwenhuizen	6d2fb04f09	radv: Support loader interface version 3. Port of `1e41d7f7b0`: "anv: Support loader interface version 3 (patch v2)" Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 19:11:17 +01:00
Boyan Ding	ddd27ef462	mesa/get: Remove unused extra_ARB_viewport_array Unused since `0a7691ee` (mesa: Enable enums for OES_viewport_array). Silence a warning of unused variable. Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 17:09:22 +00:00
Boyan Ding	dc18ec8b24	xlib: Unify the style of function pointer calls in structs Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> [Emil Velikov: handle the final case in glXCreateContextAttribsARB] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 16:24:40 +00:00
Boyan Ding	2d05425d3e	radeon: Unify the style of function pointer calls in structs Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> [Emil Velikov: handle the all cases] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 16:24:32 +00:00
Boyan Ding	056cfa558c	nouveau: Unify the style of function pointer calls in structs Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2017-01-13 16:24:32 +00:00
Boyan Ding	0ee4c4a732	glX_proto_send.py: Unify the style of function pointer calls in structs Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2017-01-13 16:24:32 +00:00
Boyan Ding	1411fbd50d	loader/dri3: Unify the style of function pointer calls in structs Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2017-01-13 16:24:32 +00:00
Boyan Ding	868ae3e31b	egl/dri2: Unify the style of function pointer calls in structs Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> [Emil Velikov: address platform_surfaceless] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 16:24:22 +00:00
Derek Foreman	3698d71124	i915: Add XRGB8888 format to intel_screen_make_configs This is a copy of commit `536003c11e` except for i915. Original log for the i965 commit follows: Some application, such as drm backend of weston, uses XRGB8888 config as default. i965 doesn't provide this format, but before commit `65c8965d`, the drm platform of EGL takes ARGB8888 as XRGB8888. Now that commit `65c8965d` makes EGL recognize format correctly so weston won't start because it can't find XRGB8888. Add XRGB8888 format to i965 just as other drivers do. Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Acked-by: Boyan Ding <boyan.j.ding@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2017-01-13 15:53:42 +00:00
Derek Foreman	4f1d27a406	gbm/drm: Pick the oldest available buffer in get_back_bo Applications may query the back buffer age to efficiently perform partial updates. Generally the application will keep a fixed length damage history, and use this to calculate what needs to be redrawn based on the age of the back buffer it's about to render to. If presented with a buffer that has an age greater than the length of the damage history, the application will likely have to completely repaint the buffer. Our current buffer selection strategy is to pick the first available buffer without considering its age. If an application frequently manages to fit within two buffers but occasionally requires a third, this extra buffer will almost always be old enough to fall outside of a reasonably long damage history, and require a full repaint. This patch changes the buffer selection behaviour to prefer the oldest available buffer. By selecting the oldest available buffer, the application will likely always be able to use its damage history, at a cost of having to perform slightly more work every frame. This is an improvement if the cost of a full repaint is heavy, and the surface damage between frames is relatively small. It should be noted that since we don't currently trim our queue in any way, an application that briefly needs a large number of buffers will continue to receive older buffers than it would if it only ever needed two buffers. Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>	2017-01-13 15:52:11 +00:00
Jonas Ådahl	36b9976e1f	egl/wayland: Avoid race conditions when on non-main thread When EGL is used on some other thread than the thread that drives the main wl_display queue, the Wayland EGL dri2 implementation is vulnerable to a race condition related to display round trips and global object advertisements. The race that may happen is that after a proxy is created, but before the queue is set, events meant to be emitted via the yet to be set queue may already have been queued on the wrong queue. In order to make it possible to avoid this race, wayland 1.11 introduced new API that allows creating a proxy wrapper that may be used as the factory proxy when creating new proxies via Wayland requests. The queue of a proxy wrapper can be changed without affecting what queue events emitted by the actual proxy will be queued on, while still affecting what default queue proxies created from it will have. By introducing a wl_display proxy wrapper and using this when performing round trips (via wl_display_sync()) and retrieving the global objects (via wl_display_get_registry()), the mentioned race condition is avoided. Signed-off-by: Jonas Ådahl <jadahl@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-01-13 15:50:37 +00:00
Jonas Ådahl	361796651c	egl/wayland: Cleanup private display connection when init fails When failing to initialize the Wayland EGL driver, don't leak the display server connection if it was us who created it. Signed-off-by: Jonas Ådahl <jadahl@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-01-13 15:50:04 +00:00
Rhys Kidd	cba8086951	travis: Add the new drivers etnaviv and imx Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 15:46:37 +00:00
sguttula	9b14a828db	st/va: flush pipeline after post processing This will flush the pipeline, which allows sharing dma-buf based buffers. Signed-off-by: Suresh Guttula <Suresh.Guttula@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-13 14:21:29 +01:00
Alejandro Piñeiro	84e3e12b25	main/fbobject: throw invalid operation when get_attachment fails if needed In most cases, a call to get_attachment fails because the attachment is an INVALID_ENUM. But for some specific cases, if COLOR_ATTACHMENTm (where m >= MAX_COLOR_ATTACHMENTS) is used, it should raise an INVALID_OPERATION exception instead. Fixes: GL45-CTS.direct_state_access.framebuffers_get_attachment_parameter_errors GL45-CTS.direct_state_access.framebuffers_renderbuffer_attachment_errors v2: extra new line before quote block. Include "color attachment" on both new message errors (Nicolai). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-13 08:52:14 -02:00
Alejandro Piñeiro	c6eb3aeba5	main/fboject: return if it is color_attachment on get_attachment Some callers would need that info to know if they should raise INVALID_ENUM or INVALID_OPERATION. An alternative would be for the caller to check if the attachment is a GL_COLOR_ATTACHMENTm, but that seems redundant as get_attachment is already doing that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-13 08:52:07 -02:00
Nicolai Hähnle	963311b71f	mesa/main: fix version/extension checks in _mesa_ClampColor Add a proper check for feature support, and raise an invalid enum for GL_CLAMP_VERTEX/FRAGMENT_COLOR unconditionally in core profiles, since those enums were explicitly removed after the extension was promoted to core functionality (not in the profile sense) with OpenGL 3.0. This matches the behavior of the AMD closed source driver and fixes GL45-CTS.gtf30.GL3Tests.half_float.half_float_textures. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-13 11:03:11 +01:00
Samuel Pitoiset	e1ea70d9f3	radeonsi: replace si_shader_context::soa by bld_base We no longer need to use lp_build_tgsi_soa_context. No regressions found with full piglit run. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:41:08 +01:00
Samuel Pitoiset	ecf04b84e5	radeonsi: replace ctx->soa.outputs by ctx->outputs The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:41:06 +01:00
Samuel Pitoiset	f04088a7ba	radeonsi: move si_shader_context::soa::addr to si_shader_context The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:41:02 +01:00
Samuel Pitoiset	6f0d955b6d	radeonsi: allocate the array of immediates dynamically Currently, we can store up to 256 immediates in a static array, but this is not always enough. Instead, allocate a dynamic array like what we currently do for temps. This fixes a segfault with dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 No regressions found with full piglit run. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:40:57 +01:00
Grazvydas Ignotas	cb89d19dbb	radv: remove some unused macros and functions These seem unlikely to be used. Also remove irrelevant comment about SKL. v2: forgot to rebase on master Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>	2017-01-13 08:42:33 +01:00
Nanley Chery	64272d4f1b	anv: Avoid some resolves for samplable HiZ buffers v2: Simplify nested ifs (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:21 -08:00
Nanley Chery	71334f494a	anv: Enable sampling from HiZ v2: Restrict ISL_AUX_USAGE_HIZ to depth aspects Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:21 -08:00
Nanley Chery	5e0902cd2a	anv/blorp: Don't fast depth clear samplable HiZ buffers on BDW Avoid the resolves that would be required if fast depth clears were allowed for such buffers. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:21 -08:00
Nanley Chery	3ac01ad2ac	anv: Add a helper to determine sampling with HiZ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	bcf880a9c8	isl/surface_state: Handle ISL_AUX_USAGE_HIZ v2: Remove redundant x/y offset asserts (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	58af615636	anv: Perform HiZ resolves only on layout transitions This is a better mapping to the Vulkan API and improves performance in all tested workloads. v2: Remove unnecessary image view aspect checks (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	2852efcda4	anv: Disable HiZ for input attachments v2 (Jason Ekstrand): - Add spec citation - Drop conditional Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	b62d8ad2ae	anv: Avoid resolves incurred by fast depth clears Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	968ffd6c86	anv: Prepare for transitioning to the requested final layout Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	104ce1dbab	anv: Store depth stencil layouts Store the current and requested depth stencil layouts so that we can perform the appropriate HiZ resolves for a given transition while recording a render pass. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	2e2cf78a51	anv: Add helpers to handle depth buffer layout transitions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	0ce8b37a8e	anv: Delete anv's HiZ op emit function This is no longer used. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	462a4c9648	anv: Use the gen8 BLORP HiZ resolving function Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	d16871d958	anv/blorp: Add a gen8 HiZ op resolve function Add an entry point for resolving using BLORP's gen8 HiZ op function. v2: Manually add the aux info Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	3b7106c181	anv: Use gen8 BLORP HiZ clearing functions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	f357af0c90	intel/blorp_clear: Add gen8 HiZ clearing functions Add an entry point for the optimized gen8 BLORP HiZ sequence. commit `c9eaf12de2` fixed a bug that was unknowingly worked around by forcing additional clear rectangle alignment restrictions not specified in the PRMs. Now that the bug is no longer present, omit the additional alignment restrictions. v2: Adjust code comment about padding Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	64fb5b0d51	anv: Enable HiZ support for multiple subpasses We'll be using layout transitions later on in the series which can occur within and between subpasses. Turn this on now to simplify the change later. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	168985fca1	anv: Use ::anv_attachment_state for toggling HiZ per subpass We're about to enable HiZ support for multiple subpasses. Use this field to keep track of whether or not subpass operations should treat the depth buffer as having an auxiliary HiZ buffer. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	055ff2ec52	anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ The helper doesn't provide additional functionality over the current infrastructure. v2: Add comment to anv_image::aux_usage (Jason Ekstrand) v3: Clarify comment for aux_usage (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	160a54810e	anv/blorp: Handle ISL_AUX_USAGE_HIZ Prevent assert failures that would occur in the next patch. v2: Don't remove asserts from blorp/blit (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	09948151ab	intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP We'll be switching to layout-transition based resolves which can occur outside of a render pass. Add this sequence to BLORP, as using BLORP will enable emitting depth stencil state outside of a render pass (among other benefits). The depth buffer extent is ignored to enable eventual usage in VkCmdClearAttachments(). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Emil Velikov	f0bdd13fdb	get-typod-pick-list.sh: add new script Typos do happen as people nominate patches for stable. This script aims to catch most of those. Due to the subtle nature of things, one has to pay special attention to the output, similar to get-extra-pick-list.sh. At the moment only the following is handled: grep -i "CC:.*mesa-dev" Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 03:07:48 +00:00
Emil Velikov	5abd0a7583	ac: automake: ensure that ./common is generated Depending on the autoconf (or friends) version one may or may not have the ./common folder created. Thus in the latter case we'll fail to generate the file. Reviewed-by: Thierry Reding <treding@nvidia.com> Tested-by: Darren Salt <devspam@moreofthesa.me.uk> Reported-by: Darren Salt <devspam@moreofthesa.me.uk> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-13 03:07:48 +00:00
Ilia Mirkin	f897036978	nvc0/ir: only try to check for zero LOD if we aren't already forcing it There's a levelZero flag which forces texturing to pick level zero (and not consume an explicit LOD argument). This is set for MS targets, but could also be set for any other incoming instruction. As that is what determines whether a LOD argument is present, check that rather than the more indirect isMS logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-12 21:08:42 -05:00
Ilia Mirkin	eb60a89bc3	nouveau: take extra push space into account for pushbuf_space calls Ever since a long time ago when I messed around with fences, I ensure that after a PUSH_SPACE call there is enough space to write a fence out into the pushbuf. However the PUSH_SPACE macro is not all-knowing, and so sometimes we have to invoke nouveau_pushbuf_space manually with the relocs/pushes args set. If we don't take the extra allocation from PUSH_SPACE into account, then we will end up accidentally flushing when the code was not expecting a flush. This can lead to various runtime and rendering failures. The amount of extra allocation isn't that important - it has to be at least 8 based on the current nouveau_winsys.h setting, but even more won't hurt. I just rounded up to powers of 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Ben Skeggs <bskeggs@redhat.com>	2017-01-12 20:39:19 -05:00
Grazvydas Ignotas	8945836658	mapi: update the asm code to support x32 Fixes crashes when both glx-tls and asm are enabled on x32. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94512 Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=575458 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-01-13 00:59:32 +01:00
Nicolai Hähnle	1007047ca1	ac/nir: use ac_emit_fdiv throughout ... and eliminate emit_fdiv and nir_to_llvm_context::fpmath_md_*, which are now unused. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:22 +01:00
Nicolai Hähnle	38c67f77ed	ac/nir: use ac_build_gather_values[_extended] throughout ... and eliminate the non-ac copies. Mostly straight-forward search & replace. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:20 +01:00
Nicolai Hähnle	2c9d26a356	ac/nir: use ac_emit_llvm_intrinsic throughout ... by straight-forward search & replace, and eliminate emit_llvm_intrinsic. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:17 +01:00
Nicolai Hähnle	fccf29373d	radeonsi: remove unused si_prepare_cube_coords Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:13 +01:00
Nicolai Hähnle	a0ce09b4b2	amd/common: unify cube map coordinate handling between radeonsi and radv Code is taken from a combination of radv (for the more basic functions, to avoid gallivm dependencies) and radeonsi (for the new and improved derivative calculations). v2: add 0.5 offset to tex coords only after derivative calculation v3: - really only touch the first three coordinates - rebase on the removal of the 1.5 --> 0.5 offset change Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:10 +01:00
Nicolai Hähnle	0ee1ee5fbb	radeonsi: only touch first three coordinates in si_prepare_cube_coords Sourcing coords_arg[4] is actually never correct, since bias is handled differently in tex_fetch_args anyway. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:07 +01:00
Nicolai Hähnle	9f590ee9d9	radeonsi: remove unused si_llvm_cube_to_2d_coords Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:03 +01:00
Nicolai Hähnle	205ad5234a	radeonsi: restrict cube map derivative computations to the correct plane As remarked by the comment in the original code, the old algorithm fails when (tc + deriv) points at a different cube face. Instead, simply project the derivative directly to the plane of the selected cube face. The new code is based on exactly differentiating (using the chain rule) the projection onto a plane corresponding to a fixed cube map face (which is still selected in the usual way based on the texture coordinate itself). The computations end up fairly involved, but we do save two reciprocal computations. Fixes GL45-CTS.texture_cube_map_array.sampling. v2: add 0.5 offset to tex coords only after derivative calculation v3: go back to 1.5 offset Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:38:59 +01:00
Nicolai Hähnle	e01deee42f	radeonsi: communicate cube map coordinates more explicitly v2: fix compile error that snuck in during rebase Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:38:34 +01:00
Grazvydas Ignotas	c728051131	ac/debug: move .gitignore for sid_tables.h too `b838f642` "ac/debug: Move sid_tables.h generation to common code." moved sid_tables.h but forgot the corresponding .gitignore. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-13 00:37:52 +01:00
Jason Ekstrand	08eced3cfd	nir/gcm: Fix a typo in a comment Reported-by: Matt Turner <mattst88@gmail.com>	2017-01-12 14:56:55 -08:00
Jason Ekstrand	087e172179	nir/gcm: Rework the schedule late loop This fixes a bug in code motion that occurred when the best block is the same as the schedule early block. In this case, because we're checking (lca != def->parent_instr->block) at the top of the loop, we never get to the check for loop depth so we wouldn't move it out of the loop. This commit reworks the loop to be a simple for loop up the dominator chain and we place the (lca != def->parent_instr->block) check at the end of the loop. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-12 14:56:55 -08:00
Chuck Atkins	e9a4ec4bd8	glx: Add missing glproto dependency for gallium-xlib glx Cc: mesa-stable@lists.freedesktop.org Cc: Bruce Cherniak <bruce.cherniak@intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 22:01:55 +00:00
Emil Velikov	c90f921273	ac, radeonsi: automake: add missing builddir include The generated file is correctly stored in the builddir as of earlier commit. Yet the commit forgot to add the respective include flag thus the compiler would error out failing to find sid_tables.h Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99389 Fixes: `d1dc22eb46` "ac: automake: rework sid_tables.h generation" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 22:01:55 +00:00
Bas Nieuwenhuizen	8aaca3820c	radv: Call NIR passes using NIR_PASS_V. Port of `faa1edeeb7` "anv/pipeline: Call NIR passes using NIR_PASS_V" Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-12 21:39:52 +01:00
Bas Nieuwenhuizen	65cbb993d3	radv: Call nir_lower_constant_initializers. Port of `c5d664f9dc` "anv/pipeline: Call nir_lower_constant_initializers" Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-12 21:39:46 +01:00
Bas Nieuwenhuizen	18e70edd8c	radv: Only call remove_dead_variables once. Port of `43e0b0d4b2` "anv/pipeline: Only call remove_dead_variables once" Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-12 21:39:41 +01:00
Axel Davy	970556292b	st/nine: Protect dtors with mutex When the flag D3DCREATE_MULTITHREAD is set, a global mutex is used to protect nine calls. However for performance reasons, AddRef and Release didn't hold the mutex, and instead used atomics. Unfortunately at item release, the item can be destroyed, and that destruction path should be protected by a mutex (at least for some objects). Without this patch, it is possible an app thread is in a dtor while another thread is making gallium nine calls. It is possible that two threads are using the same gallium pipe, which is forbidden. The problem has been made worse with csmt, because it can cause a hang, since nine_csmt_process is not threadsafe. Fixes Hitman hang, and possibly others. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	5f4359ea0e	st/nine: Flush the queue at device dtor Flush the queue to get refcounts right, and properly release the items, instead of throwing away all pending commands. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	4e922c81f6	st/nine: Process pending commands on Reset Some nine_state_* and nine_context_* functions used for Reset() require all pending commands are flushed. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	6b87a2a77a	st/nine: Flush pending commands if needed for surface9 changes nine_context uses NineSurface9 fields, thus we need to flush pending commands using the surface before changing the fields. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	f895ab8e22	st/nine: Rework CreatePipeSurface Create both surfaces in one call. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	d43bc05e8b	st/nine: Remove duplicated checks There is no need to check on csmt_active before calling nine_csmt_process, because the function checks already. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Masanori Kakura	9b5f5de9e9	st/nine: Don't call u_box_union_* when dirty region is empty When dirty region is empty, u_box_union_* incorrectly expands the new region. This fixes broken font rendering issue in WOLF RPG Editor v2.10 games. Signed-off-by: Masanori Kakura <kakurasan@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Emil Velikov	a5f0cdb36f	winsys/etnaviv: automake: introduce Makefile.sources ... and list the public header within it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:30:15 +00:00
Emil Velikov	0467700536	etnaviv: automake: include all files in the sources lists Note: the currently mentioned etnaviv_utils.h is a typo. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:30:09 +00:00
Emil Velikov	d1dc22eb46	ac: automake: rework sid_tables.h generation Drop $(srcdir)/ prefix analogous to before the file (and rule) movement and move it outside of the NEED_RADEON_LLVM conditional. Otherwise the build may fail as below. make[3]: *** No rule to make target 'common/sid_tables.h', needed by 'distdir'. Stop. Fixes: `b838f64237` "ac/debug: Move sid_tables.h generation to common code." Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:29:28 +00:00
Emil Velikov	23dcce0c03	automake: use shared llvm libs for make distcheck Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:29:22 +00:00
Emil Velikov	024b4c35bc	automake: add the new drivers etnaviv and imx to make distcheck Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:29:20 +00:00
Christian Gmeiner	e8626e3b31	imx: gallium driver for imx-drm scanout driver Changes from V1 -> V2: - updated Copyright - added $(top_srcdir)/src/gallium/winsys to include path (suggested by Emil) - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:27:11 +00:00
The etnaviv authors	c9e8b49b88	etnaviv: gallium driver for Vivante GPUs This driver supports a wide range of Vivante IP cores like GC880, GC1000, GC2000 and GC3000. Changes from V1 -> V2: - added missing files to actually integrate the driver into build system. - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:27:11 +00:00
Christian Gmeiner	848b49b288	gallium: add renderonly library This is a very lightweight library to add basic support for renderonly GPUs. A kms gallium driver must specify how a renderonly_scanout object gets created. Also it must provide file handles to the used kms device and the used gpu device. This could look like: struct renderonly ro = { .create_for_resource = renderonly_create_gpu_import_for_resource, .kms_fd = fd, .gpu_fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC) }; The renderonly_scanout object exists for two reasons: - Do any special treatment for a scanout resource like importing the GPU resource into the scanout hw. - Make it easier for a gallium driver to detect if anything special needs to be done in flush_resource(..) like a resolve to linear. A GPU gallium driver which gets used as renderonly GPU needs to be aware of the renderonly library. This library will likely break android support and hopefully will get replaced with a better solution based on gbm2. Changes from V1 -> V2: - reworked the lifecycle of renderonly object (suggested by Nicolai Hähnle) - killed the midlayer (suggested by Thierry Reding) - made the API more explicit regarding gpu and kms fd's - added some docs Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Alexandre Courbot <acourbot@nvidia.com>	2017-01-12 19:27:11 +00:00
Jason Ekstrand	27a1c7ffbd	spirv: Handle patch decorations up-front Once again, SPIR-V is insane... It allows you to place "patch" decorations on structure members. Presumably, this is so that you can do something such as out struct S { layout(location = 0) patch vec4 thing1; layout(location = 0) vec4 thing2; } str; And have your I/O "nicely" organized. While this is a bit silly, it's allowed and well-defined so whatever. Where it really gets interesting is when you have an array of struct. SPIR-V says nothing about not allowing you to have those qualifiers on the members of a struct that's inside an array and GLSLang does this. Specifically, if you have layout(location = 0) out patch struct S { vec4 thing1; vec4 thing2; } str[2]; then GLSLang will place the "patch" decorations on the struct members. This is ridiculous; there is no way that having some of them be patch and some not would be well-defined given that patch and non-patch outputs are in effectively different storage classes. This commit moves around the way we handle the "patch" decoration so that we can detect even the crazy cases and handle them. Fixes: dEQP-VK.tessellation.user_defined_io.per_patch_block_array.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-12 10:41:34 -08:00
Chad Versace	1e41d7f7b0	anv: Support loader interface version 3 (patch v2) This patch implements vk_icdNegotiateLoaderICDInterfaceVersion(), which brings us to loader interface v3. v2: - Drop the pragmas. [emil] - Advertise v3 instead of v2. Anvil supported more than I thought. [jason] - s/Surface/SurfaceKHR/ in comments. [emil] Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org Cc: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:42:32 -08:00
Chad Versace	98cf089849	vulkan: Update vk_icd.h to interface version 3 Import from commit f2aeefec on branch 'master' of https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2017-01-12 09:42:32 -08:00
Chad Versace	c085bfcec9	vulkan: Add new cast macros for VkIcd types We can't import the latest vk_icd.h because the new header breaks the Mesa build. This patch defines new casting macros, ICD_DEFINE_NONDISP_HANDLE_CASTS() and ICD_FROM_HANDLE(), which can handle both the old and new vk_icd.h, and will prevent the build from breaking when we update the header. In the old vk_icd.h, types were defined as: typedef struct _VkIcdFoo { ... } VkIcdFoo; Commit 6ebba1f6 in the Vulkan loader changed the above to typedef { ... } VkIcdFoo; because the old definitions violated the C and C++ specs. According to the specs, identifiers that begin with an underscore followed by an uppercase letter are reserved. (It's pedantic, I know.) See the Github issue referenced below. References: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/7 References: `6ebba1f630` Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2017-01-12 09:42:32 -08:00
George Kyriazis	a61528fa33	Always defer memory free in swr_resource_destroy Defer delete on regular resources. This ensures that any work being done on the resource is completed before freeing up the resource's memory. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-12 09:10:15 -06:00
Juan A. Suarez Romero	ce44501ea8	nir/i965: assert first is always less than 64 This fixes a defect detected by Coverity Scan. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-01-12 15:08:05 +00:00
Samuel Pitoiset	f0997e2aa8	nvc0: enable GL 4.3 on gm107+ Although, arb_shader_image_load_store-atomicity will most likely hang your box, I think it's now quite reasonable to enable GL 4.3 on Maxwell/Pascal GPUs. I suspect that test to be wrong because it doesn't even work on the NVIDIA blob. I have tested a bunch of benchmarks (UE4 demos) and real games like Shadow of Mordor and they all work fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-01-12 15:22:21 +01:00
Samuel Pitoiset	38ff9980d7	nvc0: use sched control codes for gm107 MP counters code Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2017-01-12 15:22:15 +01:00
Samuel Pitoiset	75e6992379	nvc0: use sched control codes for gm107 blitter shader Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-12 15:22:07 +01:00
Samuel Pitoiset	90537d6a89	nv50/ir: use sched control codes for gm107 builtins Yes, IMUL/IMAD require dependency barriers and we should definitely replace these instructions by XMAD but the different flags need to be figured out. Note that XMAD only supports 16-bits integers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2017-01-12 15:22:01 +01:00
Samuel Pitoiset	f519c47f7d	nv50/ir: improve instruction pipelining on gm107 This makes use of scheduling control codes which are very useful for improving the instruction pipelining. This patch will increase performance on Maxwell GPUs by, at least, x1.5 up to x3.5 for some benchmarks. Although this has been fairly well tested, I would not be surprised if someone hit a corner case somewhere. That way, the scheduler is enabled by default but it can be deactivated by using NV50_PROG_SCHED=0. Thanks to Scott Gray for the reverse engineering work available from https://github.com/NervanaSystems/maxas/wiki/Control-Codes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Alexandre Courbot <acourbot@nvidia.com> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2017-01-12 15:21:54 +01:00
Samuel Pitoiset	1b3b4196f0	nv50/ir: do not insert texture barriers on gm107 It's actually useless to insert those texture barriers post RA because the current control code (ie. st 0x0) will wait for all dependencies before issuing a new instruction. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2017-01-12 15:21:47 +01:00
Juan A. Suarez Romero	75968a668e	i965/gen7: expose OpenGL 4.2 on Haswell when supported GL_ARB_vertex_attrib_64bit was the last piece missing. v2: update docs (Jordan) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:57 +01:00
Samuel Iglesias Gonsálvez	77077986eb	i965: enable ARB_shader_precision to HSW+ v2: update docs (Jordan) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:57 +01:00
Samuel Iglesias Gonsálvez	1d1ddbaa56	i965: unify the code to enable of ARB_gpu_shader_fp64 and ARB_vertex_attrib_64bit for HSW+ Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:56 +01:00
Alejandro Piñeiro	485955be9c	i965: Enable ARB_vertex_attrib_64bit for Haswell v2: update docs (Jordan) Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:56 +01:00
Juan A. Suarez Romero	6bb4255f8e	i965: check for dual slot attributes on any gen Those not supporting 64 bit input vertex attributes will have the dual_slot value as false. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:56 +01:00
Juan A. Suarez Romero	f51a5b51ab	i965/vec4: emit correctly load_inputs for 64bit data For dvec3 and dvec4 types, a single GRF does not have enough space to allocate two inputs from two different vertices (SIMD4x2). So the GRF only contains first two components for the two vertices, and the next GRF has the remaining components. We want to put all the components for the same vertex in the same register. Thus, we do a shuffle to reorder the data. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:56 +01:00
Alejandro Piñeiro	58fdb85f0f	i965/vec4: take into account doubles when creating attribute mapping Doubles need more than one slot per attribute. So when filling the attribute_map we check if it is a double in order to allocate one extra register. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:56 +01:00
Alejandro Piñeiro	57bab6708f	i965/vec4/nir: vec4 also needs to remap vs attributes Doubles need extra space, so we would need to do a remapping for vec4 too in order to take that into account. We reuse the already existing remap_vs_attrs, but passing is_scalar, so they could remap accordingly. v2: code-format remap_vs_attrs_params initialization (Matt) Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:56 +01:00
Alejandro Piñeiro	f8310189f4	i965/vec4: use attribute slots for first non payload GRF As part of the payload setup, setup_attributes is called with the first GRF that can be used for the attributes (first ones are used for uniforms for example) and returns the first GRF that is not part of the payload. Before this patch, it adds directly the number of attributes. But since 64-bit attributes can consume more than one slot, that is not valid anymore. This patch changes the addition to use the number of slots consumed. gen >= 8 would not be affected, as they use the scalar mode. For that case, the vs configuration is done at fs_visitor::assign_vs_urb_setup. v2: add explanation in commit log (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:35 +01:00
Alejandro Piñeiro	329cbe363d	i965: downsize 64PASSTHRU formats to equivalent 32FLOAT formats on gen < 8 gen < 8 doesn't support 64PASSTHRU formats when emitting vertices. So in order to provide the equivalent functionality, we need to downsize the format to equivalent 32FLOAT, and in some cases (R64G64B64 and R64G64B64A64) submit two 3DSTATE_VERTEX_ELEMENTS for each vertex element. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:12 +01:00
Alejandro Piñeiro	717f99b34a	i965: return PASSTHRU surface types also on gen7 Although gen7 doesn't include surface types as a valid conversion format, we return it, as it reflects what we want to achieve, even if we need to workaround it on gen < 8. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 12:56:12 +01:00
Alejandro Piñeiro	f354cd5c69	main/buffers: take into account FRONT_AND_BACK on ReadBuffer From OpenGL 3.1 spec, section 4.3.1 "Reading Pixels", page 190 (203 PDF) "When READ FRAMEBUFFER BINDING is zero, i.e. the default framebuffer, src must be one of the values listed in table 4.4, including NONE . FRONT_AND_BACK , FRONT , and LEFT refer to the front left buffer." There is an equivalent text on OpenGL 4.5 spec, section 18.2.1 "Selecting Buffers for Reading", page 502 (524 PDF), so the behaviour is still the same. Part of the fix for: GL45-CTS.direct_state_access.framebuffers_draw_read_buffers_errors Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-01-12 08:21:03 -02:00
Alejandro Piñeiro	d54bc7e01f	main/buffers: update error handling on DrawBuffers for 4.5 Before 4.5, GL_BACK was not allowed as a value of bufs. Since 4.5 it is allowed under some circumstances: From the OpenGL 4.5 specification, Section 17.4.1 "Selecting Buffers for Writing", page 493 (page 515 of the PDF): "An INVALID_ENUM error is generated if any value in bufs is FRONT, LEFT, RIGHT, or FRONT_AND_BACK . This restriction applies to both the default framebuffer and framebuffer objects, and exists because these constants may themselves refer to multiple buffers, as shown in table 17.4." And on page 492 (page 514 of the PDF): "If the default framebuffer is affected, then each of the constants must be one of the values listed in table 17.6 or the special value BACK . When BACK is used, n must be 1 and color values are written into the left buffer for single-buffered contexts, or into the back left buffer for double-buffered contexts." This patch keeps the same behaviour if OpenGL version is < 4. We assume that for 4.x this is the intended behaviour, so a fix, but for 3.x the intended behaviour is the one already in place. Part of the fix for: GL45-CTS.direct_state_access.framebuffers_draw_read_buffers_errors v2: remove forgotten printf v3: remove spaces before commas on spec quote, split line too long (Anuj) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-01-12 08:21:03 -02:00
Nicolai Hähnle	e33910b0d9	radeonsi: num_records is in units of stride for swizzled buffers even on VI The old setting didn't hurt, but this is cleaner. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-12 11:02:56 +01:00
Juan A. Suarez Romero	883ca597df	docs: document INTEL_PRECISE_TRIG envvar v2: use more generic description (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-12 09:58:23 +00:00
Iago Toral Quiroga	5bcafc933c	spirv: fix typo in warning message Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-01-12 08:18:00 +01:00
Rafael Antognolli	ea7e4b1e51	i965: Enable predicate support on gen >= 8. Predication needs cmd parser only on gen7. For newer platforms, it should be available without it. v2 (Ken): rebase on recent changes. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-11 21:44:59 -08:00
Timothy Arceri	0252ba26c5	util: fix list_is_singular() Currently it's dependent on the user calling and checking the result of list_empty() before using the result of list_is_singular(). Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 13:58:32 +11:00
Nanley Chery	5857858aa6	anv/image: Disable HiZ for depth buffer arrays We currently don't perform clears or resolves on multiple array layers with HiZ. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-11 17:35:59 -08:00
Nanley Chery	9f1d3a0c97	anv/cmd_buffer: Fix programmed HiZ qpitch Match the comment above the field by using units of pixels and not HiZ blocks. Cc: mesa-stable@lists.freedesktop.org Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-11 17:35:59 -08:00
Nanley Chery	61992e0afe	anv/cmd_buffer: Fix arrayed depth/stencil attachments Enable multiple layers of the depth/stencil buffers to be accessible. Fixes the crucible test, func.depthstencil.arrayed_clear. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-11 17:35:59 -08:00
Pierre Moreau	4e0d171d7e	clover: Check for executables before enqueueing a kernel Without this check, the kernel::bind() method would fail with a std::out_of_range exception, letting an exception escape from the library into the client, rather than returning the corresponding error code CL_INVALID_PROGRAM_EXECUTABLE. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-01-11 16:50:56 -08:00
Kenneth Graunke	c17b2f5724	spirv: Shut up unhandled enumeration value warnings. We don't want to do anything for the other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-11 15:16:27 -08:00
Timothy Arceri	de8b03f5fb	nir: don't turn ieq/ine into inot if used by an if Otherwise we will end up with an extra instruction to compare the result of the inot. On BDW: total instructions in shared programs: 13060620 -> 13060481 (-0.00%) instructions in affected programs: 103379 -> 103240 (-0.13%) helped: 127 HURT: 0 total cycles in shared programs: 256590950 -> 256587408 (-0.00%) cycles in affected programs: 11324730 -> 11321188 (-0.03%) helped: 114 HURT: 21 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:47:29 +11:00
Timothy Arceri	7acc865226	nir: add late opt to turn inot/b2f combos back to bcsel We turn these from bcsel into inot/b2f combos in order for other optimisation passes to get further. Once we have finished, turn the ones that remain and are used in more than a single expression back into a bcsel. On BDW: total instructions in shared programs: 13060965 -> 13060297 (-0.01%) instructions in affected programs: 835701 -> 835033 (-0.08%) helped: 670 HURT: 2 total cycles in shared programs: 256599536 -> 256598006 (-0.00%) cycles in affected programs: 114655488 -> 114653958 (-0.00%) helped: 419 HURT: 240 LOST: 0 GAINED: 1 The 2 HURT is because inserting bcsel creates the only use of const 1.0 in two shaders from tri-of-friendship-and-madness. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:47:29 +11:00
Timothy Arceri	8f37fc7066	nir: add imprecise flrp optimisation On BDW: total instructions in shared programs: 13061890 -> 13061877 (-0.00%) instructions in affected programs: 2441 -> 2428 (-0.53%) helped: 13 HURT: 0 total cycles in shared programs: 256612254 -> 256611784 (-0.00%) cycles in affected programs: 16418 -> 15948 (-2.86%) helped: 10 HURT: 2 V2: don't use ffma directly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:47:29 +11:00
Kenneth Graunke	b4c44ff08c	i965: Use the nir_move_comparisons pass. While the below stats are encouraging, this pass will also become very useful for avoiding regressions once brw_do_channel_expressions() and brw_do_vector_splitting() are disabled. On Broadwell: total instructions in shared programs: 13078787 -> 13060898 (-0.14%) instructions in affected programs: 1809827 -> 1791938 (-0.99%) helped: 4527 HURT: 157 total cycles in shared programs: 256562762 -> 256590424 (0.01%) cycles in affected programs: 159749392 -> 159777054 (0.02%) helped: 5583 HURT: 2289 total spills in shared programs: 14929 -> 14923 (-0.04%) spills in affected programs: 62 -> 56 (-9.68%) helped: 1 HURT: 0 total fills in shared programs: 20144 -> 20141 (-0.01%) fills in affected programs: 253 -> 250 (-1.19%) helped: 1 HURT: 3 LOST: 0 GAINED: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:47:29 +11:00
Kenneth Graunke	b5e682a1ef	i965: Move nir_lower_locals_to_regs a bit later. I'm going to add a boolean scheduling pass that I want run late, but after copy propagation and dead code elimination. Yet, I don't want to have to think about registers. So, move the register conversion a little later. No impact on shader-db. Suggested by Jason Ekstrand. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-12 09:47:29 +11:00
Kenneth Graunke	fd957b1751	nir: Introduce a nir_opt_move_comparisons() pass. This tries to move comparisons (a common source of boolean values) closer to their first use. For GPUs which use condition codes, this can eliminate a lot of temporary booleans and comparisons which reload the condition code register based on a boolean. V2: (Timothy Arceri) - fix move comparison for phis so we don't end up with: vec1 32 ssa_227 = phi block_34: ssa_1, block_38: ssa_240 vec1 32 ssa_235 = feq ssa_227, ssa_1 vec1 32 ssa_230 = phi block_34: ssa_221, block_38: ssa_235 - add nir_op_i2b/nir_op_f2b to the list of comparisons. V3: (Timothy Arceri) - tidy up suggested by Jason. - add inot/fnot to move comparison list V4: (Jason Ekstrand) - clean up move_comparison_source - get rid of the tuple - rework phi handling Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:47:29 +11:00
Timothy Arceri	e8328e55e7	nir/algebraic: add support for conditional helper functions to expressions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 09:47:29 +11:00
Jason Ekstrand	a7e399de59	anv/TODO: Check off a bunch of stuff	2017-01-11 10:28:18 -08:00
Jason Ekstrand	c472568b4e	nir/search: Only allow matching SSA values This is more correct and should also be a tiny bit faster since we're just comparing pointers instead of calling nir_src_equal. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2017-01-11 10:28:18 -08:00
Derek Foreman	534ea2b5ba	egl/dri2: add image_loader_extension back into loader extensions for wayland before commit `f871946594` image_loader_extension was always present in dri2_dpy->extensions, after that commit it is only present for render nodes. Its removal broke partial render based on buffer age on (at least) raspberry pi. Fixes: `f871946594` "egl/dri2: rework dri2_egl_display::extensions storage" Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-11 15:58:14 +00:00
Li Qiang	6205c53303	gallium/tgsi: fix overflow in parse property In parse_identifier, it doesn't stop copying 'pcur' until it encounters the NULL. As 'ret' is a fixed-size buffer, if 'pcur' holds a long string, there will be a buffer overflow. This patch avoids this. Signed-off-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-01-11 12:40:38 +01:00
Mauro Rossi	2c0d849e2d	st/dri: remove trailing whitespace Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-11 10:16:19 +02:00
Mauro Rossi	eca79e84b9	android: st/mesa: fix building error in libmesa_st_mesa Fixes building error due to dependency on nir generated headers Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-11 10:16:19 +02:00
Dave Airlie	e9d3cbca31	radv: fix multi-viewport emission This set context req seq was in the wrong place. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-11 09:08:51 +01:00
Tapani Pälli	f97f938650	nir: change asserts to unreachable in nir_type_conversion_op this is to avoid following compilation error on Android: error: control may reach end of non-void function [-Werror,-Wreturn-type] Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-01-11 10:08:13 +02:00
Iago Toral Quiroga	a9f497c678	spirv: gl_PrimitiveID in the fragment shader is handled as an input Geometry and Tessellation stages do handle this as a system value instead. Fixes: dEQP-VK.geometry.basic.primitive_id Reviewed-by: Dave Airlie <ailried@redhat.com>	2017-01-11 08:59:28 +01:00
Rob Clark	99e9dca149	freedreno: add "nogrow" debug param Sometimes it is useful to disable the "growable" cmdstream buffers for debugging. (See 419a154d in libdrm) Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Rob Clark	a43f3b895c	freedreno/a5xx: remove hack for glamor Now that the issue glamor was hitting w/ glsl>=130 (aka missing INSTANCED bit in vertex attribute state) is fixed, remove the hack. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Rob Clark	3c71853c9a	freedreno/a5xx: fixed instanced Add missing bit, now that we know where it is. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Rob Clark	b48fde1576	freedreno/a5xx: use the non-_ZERO_BASE for vertexid Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Rob Clark	730c3047f0	freedreno/a5xx: add texture MIPLVLS Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Rob Clark	1a5d0818df	freedreno/a5xx: fix fragcoord related hangs Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Rob Clark	ff81c3c9fd	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-10 19:40:00 -05:00
Kenneth Graunke	23a36c2811	anv: Enable tessellation shaders. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:31 -08:00
Kenneth Graunke	ebd88b5aa3	anv: Initialize physical device limits for tessellation Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:31 -08:00
Kenneth Graunke	dcca706b4e	anv: Clamp depth buffer dimensions to be at least 1. When there are no framebuffer attachments, fb->width and fb->height will be 0. Subtracting 1 results in 4294967295 which is too large for the field, causing genxml assertions when trying to create the packet. In this case, we can just program it to 1. Caught by dEQP-VK.tessellation.tesscoord.triangles_equal_spacing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:31 -08:00
Kenneth Graunke	e50d4807a3	anv: Compile TCS/TES shaders. v2: Merge more TCS/TES info. v3: Fix caching keys. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:31 -08:00
Kenneth Graunke	de05ecba9f	anv: Emit 3DSTATE_HS/TE/DS packets. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:31 -08:00
Kenneth Graunke	08b5713068	anv: Handle patch primitives. v2: Use anv_pipeline_has_stage rather than tess_info != NULL. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:10 -08:00
Kenneth Graunke	5297267a1c	nir: Add a pass to lower TES patch_vertices intrinsics to a constant. In Vulkan, we always have both the TCS and TES available in the same pipeline, so we can simply use the TCS OutputVertices execution mode value as the TES PatchVertices built-in. For GLSL, we handle this in the linker. But we could use this pass in the case when both TCS and TES are linked together, if we wanted. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:21:53 -08:00
Kenneth Graunke	944e8b08cd	spirv: Silence unsupported tessellation capability warnings. ...when the capability bit is set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:21:38 -08:00
Kenneth Graunke	1e5b09f42f	spirv: Tidy some repeated if checks by using a switch statement. Iago suggested tidying this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:21:31 -08:00
Kenneth Graunke	bb04b84114	spirv: Add tessellation varying and built-in support. We need to: - handle the extra array level for per-vertex varyings - handle the patch qualifier correctly - assign varying locations Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:21:28 -08:00
Kenneth Graunke	23710e17f8	spirv: Handle tessellation execution modes. v2: Use info->tess. v3: Handle more things in either TCS/TES. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> [v1] Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:21:24 -08:00
Kenneth Graunke	5edc338162	compiler: Merge shader_info's tcs and tes structs. Annoyingly, SPIR-V lets you specify all of these fields in either the TCS or TES, which means that we need to be able to store all of them for either shader stage. Putting them in a union won't work. Combining both is an easy solution, and given that the TCS struct only had a single field, it's pretty inexpensive. This patch renames the combined struct to "tess" to indicate that it's for tessellation in general, not one of the two stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:21:21 -08:00
Kenneth Graunke	195bf8f027	genxml: Rename 3DSTATE_HS::Enable to "Function Enable". "Function Enable" is what the other stages use. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:20:33 -08:00
Lionel Landwerlin	860d91ec5b	anv: set input_slots_valid on brw_wm_prog_key With shaders using a lot of inputs/outputs, like this (from Gtk+) : layout(location = 0) in vec2 inPos; layout(location = 1) in float inGradientPos; layout(location = 2) in flat int inRepeating; layout(location = 3) in flat int inStopCount; layout(location = 4) in flat vec4 inClipBounds; layout(location = 5) in flat vec4 inClipWidths; layout(location = 6) in flat ColorStop inStops[8]; layout(location = 0) out vec4 outColor; we're missing the programming of the input_slots_valid field leading to an assert further down the backend code. v2: Use valid slots of the geometry or vertex stage (Jason) v3: Use helper to find correct vue map (Jason) v4: Set the valid slots off the previous stages (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 18:16:45 +00:00
Lionel Landwerlin	4b44ca7225	anv: add helper to get vue map for fragment shader Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 18:14:36 +00:00
Lionel Landwerlin	59fe3796a8	anv: add get_.*_prog_data for tessellation stages Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 18:14:33 +00:00
Lionel Landwerlin	6122b4ee96	anv: make get_.*_prog_data take a const pipeline Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 18:14:09 +00:00
Vinson Lee	01d80bed1f	nir: Fix anonymous union initialization with older GCC. Fix this build error with GCC 4.4.7. CC nir/nir_opt_copy_prop_vars.lo nir/nir_opt_copy_prop_vars.c: In function ‘copy_prop_vars_block’: nir/nir_opt_copy_prop_vars.c:765: error: unknown field ‘deref’ specified in initializer nir/nir_opt_copy_prop_vars.c:765: warning: missing braces around initializer nir/nir_opt_copy_prop_vars.c:765: warning: (near initialization for ‘(anonymous).<anonymous>’) nir/nir_opt_copy_prop_vars.c:765: warning: initialization from incompatible pointer type Fixes: `62332d139c` ("nir: Add a local variable-based copy propagation pass") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 23:25:32 -08:00
Samuel Iglesias Gonsálvez	17eac30e90	docs: add Vulkan Float64 capability support for anv driver Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 06:42:44 +01:00
Dave Airlie	ada66480b2	radv/ac: add support for multi sample image coords This just adds the nir->llvm support, enabling the extension causes some failures on llvm 3.9 at least, but this code seems fine. NIR passes the sampler in src[1].x, and LLVM/SI requires it as the last parameter in the coords (coord[2] for 2D, coord[3] for 2DArray). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-10 12:59:31 +10:00
Boyan Ding	41b1d9a558	glsl: Do not allow scalar types in vector relational functions According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector Relational Functions", functions of this type do not operate on scalar types, so remove scalar types from signature definitions to make the behavior consistent with glslangValidator and other drivers. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2017-01-09 17:58:33 -08:00
Thomas Hindoe Paaboel Andersen	5b4fa21d53	nir: remove duplicated foreach loop The foreach loop was called both in the else case and right after. The indentation seems to indicate that the extra call was from a previous version with an else section without curly brackets. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 17:04:47 -08:00
Kenneth Graunke	2bae2fa094	i965: Fix number of slots in SSO mode when there are no user varyings. We want vue_map->num_slots to be one more than the final slot. When assigning fixed slots, built-in slots, and non-SSO user varyings, we do slot++. This leaves "slot" as one past the most recently assigned slot. But for SSO user varyings, we computed slot based on the varying location value...and left it at that slot value. To work around this inconsistency, I made num_slots be "slot + 1" if separate and "slot" otherwise. The problem is...if there are no user varyings in SSO mode...then we would have done slot++ when assigning built-ins, so it would be off by one. This resulted in loops from 0 to vue_map->num_slots hitting a bonus BRW_VARYING_SLOT_PAD at the end. This used to break the SIMD8 VS/TES backends, but I fixed that in commit `480d6c1653`. It's probably safe at this point, but we should fix it anyway. To fix this, do slot++ in all cases. For SSO mode, we overwrite slot for every varying, so this increment only matters on the last varying. Because we process varyings in order, this will set slot to 1 more than the highest assigned slot. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-01-09 16:52:16 -08:00
Kenneth Graunke	203c128781	spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass. vtn_ssa_value() can produce variable loads, and the cursor might be after a return statement, causing nir_builder assert failures about not inserting instructions after a jump. This fixes: dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_if dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_switch Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 16:52:02 -08:00
Marek Olšák	230b756f86	mesa: set GLSL 1.20 for the fixed-function fragment shader This fixes broken depth texturing after: commit `22639a6e19` Author: Timothy Arceri <timothy.arceri@collabora.com> Date: Mon Nov 21 00:29:29 2016 +1100 st/mesa: get Version from gl_program rather than gl_shader_program Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-01-10 01:03:32 +01:00
Bas Nieuwenhuizen	8bc39e251b	radv: Create single RADV_DEBUG env var. Also changed RADV_SHOW_QUEUES to a no compute queue option. That would make more sense later when the compute queue is established, but the transfer queue is still experimental. v2: Don't include the trace flag. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:44:14 +01:00
Bas Nieuwenhuizen	8cb60c7dd3	ac/debug: Dump indirect buffers. This is for handling chained command buffers and secondary command buffers. It doesn't handle the trace id for secondary command buffers yet, but I don't think that is possible in general with just writes, as we could call a secondary command buffer multiple times. I think this is good enough for now, as the most useful case is the chaining when we grow an IB. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:44:08 +01:00
Bas Nieuwenhuizen	97dfff5410	radv: Dump command buffer on hang. v2: - Now use the filename specified by RADV_TRACE_FILE env var. - Use the same var to enable tracing. I thought we could as well always set the filename explicitly instead of having some arbitrary defaults, and at that point we don't need a separate feature enable. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:44:03 +01:00
Bas Nieuwenhuizen	0ef1b4d5b1	ac/debug: Move IB decode to common code. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:43:59 +01:00
Bas Nieuwenhuizen	b838f64237	ac/debug: Move sid_tables.h generation to common code. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:43:54 +01:00
Jason Ekstrand	1c5dcecd51	relnotes: Claim OpenGL 4.5 rather than 4.4 Acked-by: Matt Turner <mattst88@gmail.com>	2017-01-09 10:55:57 -08:00
Jason Ekstrand	5b4aeb331a	mesa: Bump the version to 17.0 Acked-by: Matt Turner <mattst88@gmail.com>	2017-01-09 10:55:39 -08:00
Marek Olšák	cac74a9bcc	radeonsi: fix the Witcher 2 black transitions v2: do it properly Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98238 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 12:01:30 +01:00
Marek Olšák	5b85a6b3f7	radeonsi: set si_shader_context::input_decls for ranged decls correctly This has no effect because no code uses those members with ranged decls. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 12:01:30 +01:00
Marek Olšák	6f356d15be	radeonsi: cleanly communicate whether si_shader_dump should check R600_DEBUG Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 12:01:30 +01:00
Iago Toral Quiroga	030e5f07a5	isl: render target cube maps should be handled as 2D images, not cubes This fixes layered rendering Vulkan CTS tests with cube (arrays). We also do this in the GL driver, see this code from gen8_depth_state.c for example: case GL_TEXTURE_CUBE_MAP_ARRAY: case GL_TEXTURE_CUBE_MAP: /* The PRM claims that we should use BRW_SURFACE_CUBE for this * situation, but experiments show that gl_Layer doesn't work when we do * this. So we use BRW_SURFACE_2D, since for rendering purposes this is * equivalent. */ surftype = BRW_SURFACE_2D; depth = 6; break; So I guess we simply forgot to port this workaround to Vulkan. v2: tweak the conditions so the special case is cube texture sampling rather than anything else (Jason) Fixes: dEQP-VK.geometry.layered.cube* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 11:43:07 +01:00
Iago Toral Quiroga	566a0c43f0	anv: don't skip the VUE header if we are reading gl_Layer in a fragment shader This is the same we do in the GL driver: the hardware provides gl_Layer in the VUE header, so when the fragment shader reads it we can't skip it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 11:43:07 +01:00
Samuel Iglesias Gonsálvez	0449c93638	anv: enable shaderFloat64 feature Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 10:44:07 +01:00
Samuel Iglesias Gonsálvez	465204695f	anv: enable float64 feature on supported platforms v2: - Remove image_ms_array initialization (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 10:44:07 +01:00
Samuel Iglesias Gonsálvez	88c8121ec9	spirv: enable SpvCapabilityFloat64 only to supported platforms v2 (Jason): - Use nir_spirv_supported_extensions to check if the feature is enabled. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 10:44:07 +01:00
Juan A. Suarez Romero	c2acf97fcc	nir/i965: use two slots from inputs_read for dvec3/dvec4 vertex input attributes So far, inputs_read was a bitmap tracking which vertex input locations were being used. In OpenGL, an attribute bigger than a vec4 (like a dvec3 or dvec4) consumes just one location, like any other small attribute. So we mark the proper bit in inputs_read, and also the same bit in double_inputs_read if the attribute is a dvec3/dvec4. But in Vulkan, this is slightly different: a dvec3/dvec4 attribute consumes two locations, not just one. And hence two bits would be marked in inputs_read for the same vertex input attribute. To avoid handling two different situations in NIR, we just choose the latter one: in OpenGL, when creating NIR from GLSL/IR, any dvec3/dvec4 vertex input attribute is marked with two bits in the inputs_read bitmap (and also in the double_inputs_read), and following attributes are adjusted accordingly. As an example, if in our GLSL/IR shader we have three attributes: layout(location = 0) vec3 attr0; layout(location = 1) dvec4 attr1; layout(location = 2) dvec3 attr2; then in our NIR shader we put attr0 in location 0, attr1 in locations 1 and 2, and attr2 in locations 3 and 4. Checking carefully, basically we are using slots rather than locations in NIR. When emitting the vertices, we do an inverse map to know the corresponding location for each slot. v2 (Jason): - use two slots from inputs_read for dvec3/dvec4 NIR from GLSL/IR. v3 (Jason): - Fix commit log error. - Use ladder ifs and fix braces. - elements_double is divisible by 2, don't need DIV_ROUND_UP(). - Use if ladder instead of a switch. - Add comment about hardware restriction in 64bit vertex attributes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 10:42:22 +01:00
Samuel Iglesias Gonsálvez	3551a2d3ad	isl: fix VA64 support for double and dvecN vertex attributes We use 64_PASSTHRU formats to upload vertex attributes of 64 bits to avoid conversions. From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE): "When SourceElementFormat is set to one of the 64_PASSTHRU formats, 64-bit components are stored in the URB without any conversion. In this case, vertex elements must be written as 128 or 256 bits, with VFCOMP_STORE_0 being used to pad the output as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component into the URB, Component 1 must be specified as VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element." v2,v3 (Jason): - Don't delete unused formats. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Juan A. Suarez Romero	1c9483f48e	anv/pipeline: get map for double input attributes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	cc4ff6c2a0	spirv: add support for doubles to OpSpecConstant v2 (Jason): - Fix indent in radv change - Add vtn_u64_literal() helper to take 64 bits (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	fc1708948b	spirv/nir: add (un)packDouble2x32() translation Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	c332432bae	spirv/nir: implement DF conversions SPIR-V does not have special opcodes for DF conversions. We need to identify them by checking the bit size of the operand and the result. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	27cf6a369f	nir: add nir_type_conversion_op() This function returns the nir_op corresponding to the conversion between the given nir_alu_type arguments. This function lacks support for integer-based types with bit_size != 32 and for float16 conversion ops. v2: - Improve readability of the code and delete cases that don't happen now (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	3a571fcc43	nir: add nir_get_nir_type_for_glsl_type() v2 (Jason): - Refactor nir_get_nir_type_for_glsl_type() to avoid using unneeded helpers (Jason) v3: - Use return directly (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	59944a77ae	spirv: add support for doubles on OpComposite{Insert,Extract} Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	e6bebb9982	spirv: Enable double floating points when copying variables in _vtn_variable_copy() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	9d71cfeff8	spirv: add double support to _vtn_block_load_store() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	0cd0c32c06	spirv: add double support to _vtn_variable_load_store Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	8076c8b59f	spirv: add double support to SpvOpCompositeExtract v2 (Jason): - Add asserts. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	a966387883	spirv: fix SpvOpSpecConstantOp with SpvOpVectorShuffle working with double-based vecs We need to pick two 32-bit values per component to perform the right shuffle operation. v2 (Jason): - Add assert to check matching bit sizes (Jason) - Simplify the code to pick components (Jason) v3: - Switch on bit_size once (Jason) - Add comment to explain the constant value for unused components (Erik) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	ec686ff62c	spirv: add DF support to SpvOp*ConstantComposite v2 (Jason): - Add assert. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	2bf4d0ba7a	spirv: add DF support to vtn_const_ssa_value() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	d77ffc3d87	spirv: add support for loading DF constants v2 (Jason): - Add assert. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	9602c7c02f	spirv: add definition of double based data types Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Samuel Iglesias Gonsálvez	d1bbe2c94e	spirv: fix typo in spec_constant_decoration_cb() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 09:10:13 +01:00
Dave Airlie	41969f0d06	radv: drop unused fields in physical device. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-09 16:48:14 +10:00
Tapani Pälli	8b43f42011	i965: call intel_prepare_render always when reading pixels Currently we do this only in the fallback code (when tiled memcpy version failed) but it needs to be done always so that we have correct read and write buffer in place. No regressions seen in CI. Fixes: dEQP-EGL.functional.buffer_age.* Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98330 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-01-09 07:44:53 +02:00
Timothy Arceri	953e4e4417	st/mesa: pass gl_program to st_bind_ubos() We no longer need anything from gl_linked_shader. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-09 15:27:35 +11:00
Timothy Arceri	270e584a86	st/mesa: pass gl_program to st_bind_images() We no longer need anything from gl_linked_shader. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 15:27:35 +11:00
Timothy Arceri	59ac77b410	st/mesa: stop passing gl_linked_shader to set_affected_state_flags() We now get everything we need from the gl_program param. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 15:27:35 +11:00
Timothy Arceri	ae632afe4f	st/mesa/glsl: set num_images directly in shader_info This change also removes the now duplicate NumImages field. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 15:27:35 +11:00
Timothy Arceri	4b30011d34	st/mesa: pass gl_program to st_bind_ssbos() We no longer need to pass gl_shader_program. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-09 15:27:34 +11:00
Timothy Arceri	1130f82a88	nir: add another comparison simplification On BDW: total instructions in shared programs: 13061877 -> 13060965 (-0.01%) instructions in affected programs: 133569 -> 132657 (-0.68%) helped: 566 HURT: 0 total cycles in shared programs: 256611784 -> 256599536 (-0.00%) cycles in affected programs: 861016 -> 848768 (-1.42%) helped: 379 HURT: 73 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 12:32:16 +11:00
Kenneth Graunke	3371de38f2	nir: Turn bcsel of +/- 1.0 and 0.0 into b2f sequences. On BDW: total instructions in shared programs: 13074882 -> 13068703 (-0.05%) instructions in affected programs: 1823116 -> 1816937 (-0.34%) helped: 4187 HURT: 537 total cycles in shared programs: 256622718 -> 256425382 (-0.08%) cycles in affected programs: 123790120 -> 123592784 (-0.16%) helped: 3823 HURT: 2037 total spills in shared programs: 15276 -> 14929 (-2.27%) spills in affected programs: 9446 -> 9099 (-3.67%) helped: 352 HURT: 1 total fills in shared programs: 20496 -> 20144 (-1.72%) fills in affected programs: 13040 -> 12688 (-2.70%) helped: 352 HURT: 1 LOST: 2 GAINED: 21 v2: Rely on 'a' being a well-formed boolean (Connor, Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-09 12:32:16 +11:00
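The rewrite described in 3371de38f2 can be illustrated outside of NIR. The Python below is only a sketch: `bcsel` and `b2f` are stand-in functions written for this example, and the four patterns checked are an assumption based on the commit title, not a copy of the rules the commit adds. Negative zero is used in the negated cases so that the two sides agree in sign as well as value.

```python
import math

def bcsel(cond: bool, x: float, y: float) -> float:
    """Stand-in for a boolean select: x when cond is true, else y."""
    return x if cond else y

def b2f(cond: bool) -> float:
    """Stand-in for bool-to-float: true -> 1.0, false -> 0.0."""
    return 1.0 if cond else 0.0

def same_float(x: float, y: float) -> bool:
    """Equal in value and in sign, so 0.0 and -0.0 are told apart."""
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)

def check_rewrites() -> bool:
    """Check each select-of-constants form against its b2f form."""
    for a in (False, True):
        assert same_float(bcsel(a, 1.0, 0.0), b2f(a))
        assert same_float(bcsel(a, 0.0, 1.0), b2f(not a))
        assert same_float(bcsel(a, -1.0, -0.0), -b2f(a))
        assert same_float(bcsel(a, -0.0, -1.0), -b2f(not a))
    return True
```

Each select between two constants becomes a conversion (plus an optional negate), which is where the instruction savings quoted in the commit would come from.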
Kenneth Graunke	1c50d31c26	nir: Convert ineg(b2i(a)) to a if it's a boolean. On BDW: total instructions in shared programs: 13071119 -> 13070371 (-0.01%) instructions in affected programs: 83424 -> 82676 (-0.90%) helped: 505 HURT: 45 (all TCS, all hurt by a single instruction) total cycles in shared programs: 256601322 -> 256588932 (-0.00%) cycles in affected programs: 819410 -> 807020 (-1.51%) helped: 450 HURT: 57 total loops in shared programs: 2950 -> 2942 (-0.27%) loops in affected programs: 8 -> 0 helped: 7 HURT: 0 v2: Drop unnecessary 'a@bool' annotation (Connor, Eric). Add a comment explaining the rule (Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-09 12:32:16 +11:00
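Why `ineg(b2i(a))` can collapse to `a` is easier to see with concrete bit patterns. This sketch assumes a 32-bit boolean representation in which false is 0 and true is all ones; the constant and function names are hypothetical stand-ins for this example, not NIR's own definitions.

```python
MASK32 = 0xFFFFFFFF
BOOL_FALSE = 0x00000000
BOOL_TRUE = 0xFFFFFFFF  # assumption: true is stored as all ones

def b2i(a: int) -> int:
    """Bool-to-int: any true value becomes integer 1, false becomes 0."""
    return 1 if a != 0 else 0

def ineg(x: int) -> int:
    """Two's-complement negate, truncated to 32 bits."""
    return (-x) & MASK32

def original(a: int) -> int:
    return ineg(b2i(a))

def simplified(a: int) -> int:
    # Under the representation assumed above, the boolean itself is
    # already the negated integer: -1 for true, 0 for false.
    return a
```

With these definitions, `original(BOOL_TRUE)` is `0xFFFFFFFF` and `original(BOOL_FALSE)` is `0`, matching `simplified` for both well-formed boolean inputs.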
Kenneth Graunke	86b9be777f	i965: Move TES input VUE map calculation out a layer. In Vulkan, we'll compile the TCS and TES at the same time, so I can just pass the TCS output VUE map to brw_compile_tes as the TES input VUE map. So, we only need to do this in GL. Move it to the GL-specific layer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-07 22:24:10 -08:00
Kenneth Graunke	6e8ac0641f	i965: Pass NULL for gl_program when compiling TES. This isn't needed, and Vulkan doesn't have one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-07 22:24:10 -08:00
Kenneth Graunke	08f8f1bcd5	i965: Move TES spacing/domain/topology setup to brw_compile_tes(). Moving this down a layer lets us share code between Vulkan and GL. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-07 22:24:10 -08:00
Kenneth Graunke	cc2df4bb81	i965: Access TES shader info via NIR. NIR exists in both GL and Vulkan, but gl_program is GL specific. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-07 22:24:10 -08:00
Kenneth Graunke	a4fd84ef5f	mesa: Introduce a compiler enum for tessellation spacing. It feels weird using GL_* enums in a Vulkan driver. v2: Fix the TESS_SPACING -> PIPE_TESS_SPACING conversion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-07 22:22:28 -08:00
Kenneth Graunke	9bb89175e6	compiler: Change shader_info->tes.vertex_order into a ccw boolean. The vertex order is either clockwise or counterclockwise. We can just store a "ccw" boolean rather than GLenum values. I don't want to use GLenums in a Vulkan driver, and even in GL a simple boolean works fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-07 20:42:32 -08:00
Jason Ekstrand	faa1edeeb7	anv/pipeline: Call NIR passes using NIR_PASS_V This lets us get validation without having to do it manually. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-07 15:45:09 -08:00
Jason Ekstrand	43e0b0d4b2	anv/pipeline: Only call remove_dead_variables once It can handle multiple modes at a time now so there's no reason to call it repeatedly. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-07 15:45:09 -08:00
Kenneth Graunke	957ec00243	Revert recent GLSL slot counting fiasco. I apparently broke mark_whole_variable in ir_set_program_inouts. It was passing a type that wasn't var->type, so the wrapper didn't work out. It's all broken, revert it and start over. Fixes all kinds of things on other drivers. Revert "glsl: Make is_fixed_function_array actually check for varyings." This reverts commit `42699e1271`. Revert "glsl: Mark whole variable used for ClipDistance and TessLevel." This reverts commit `5c580e64cc`. Revert "glsl: Override the # of varying slots for ClipDistance and TessLevel." This reverts commit `8b5749f65a`. Revert "glsl: Create and use a new ir_variable::count_attribute_slots() wrapper." This reverts commit `6aa5cb34d0`.	2017-01-07 15:15:08 -08:00
Kenneth Graunke	42699e1271	glsl: Make is_fixed_function_array actually check for varyings. We can't check VARYING_SLOT_* locations until we've determined that the variable is actually a varying. Fixes assert failures in drivers which actually use this path, such as radeonsi and i915. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99314 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-07 13:05:37 -08:00
Kai Wasserbäch	5a165b4086	drirc: Allow extension midshader for Divinity: Original Sin (EE) See also <https://bugs.freedesktop.org/show_bug.cgi?id=93551#c27> where this was first observed as a requirement. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-07 15:47:35 +01:00
Timothy Arceri	1edc53a66b	glsl: fix opt_minmax redundancy checks against baserange Marking operations as redundant if they are equal to the base range is fine when the tree structure is something like this: max(max(3, max(3, a)), b). But the opt falls apart with a tree like this: max(max(3, a), max(b, 3)). The problem is that both branches are treated the same: descending in the left branch will prune the constant, and then descending the right branch will prune the constant there as well, because limits[0] wasn't updated to take the change on the left branch into account, and so we still get [3,\infty) as baserange. In order to fix the bug we just disable the marking of redundant expressions when they match the baserange. NIR algebraic opt will clean up the first tree for us anyway, hopefully other backends are smart enough to do this also. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-07 21:46:36 +11:00
Jason Ekstrand	45912fb908	i965/compiler: Use the new nir_opt_copy_prop_vars pass We run this after nir_lower_vars_to_ssa so that as many load/store_var intrinsics as possible are eliminated before copy_prop_vars executes. This is because the pass isn't particularly efficient (it does a lot of linear walks of a linked list) so we'd like as much of the work as possible to be done before copy_prop_vars runs. Shader DB results on Sky Lake: total instructions in shared programs: 12020290 -> 12013627 (-0.06%) instructions in affected programs: 26033 -> 19370 (-25.59%) helped: 16 HURT: 13 total cycles in shared programs: 137772848 -> 137549012 (-0.16%) cycles in affected programs: 6955660 -> 6731824 (-3.22%) helped: 217 HURT: 237 total loops in shared programs: 3208 -> 3208 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 4112 -> 4057 (-1.34%) spills in affected programs: 483 -> 428 (-11.39%) helped: 2 HURT: 0 total fills in shared programs: 5519 -> 5102 (-7.56%) fills in affected programs: 993 -> 576 (-41.99%) helped: 2 HURT: 0 LOST: 0 GAINED: 0 Broadwell had similar results. On older hardware, the impact isn't as large because they don't advertise GL 4.5. Of the hurt programs, all but one are hurt by a single instruction and the one is hurt by 3 instructions. All of the helped programs, on the other hand, are helped by at least 3 instructions and one Kerbal Space Program shader is helped by 44.59%. The real star of the show, however, is the Gl43CSDof synmark2 benchmark which has two shaders which are cut by 28% and 40% and the overall runtime performance of the benchmark on my Sky Lake laptop is improved by around 25-30% (it's a bit hard to be exact due to thermal throttling). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 16:44:29 -08:00
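The failure mode described in 1edc53a66b can be reproduced with plain integers. The sketch below is not the opt_minmax code; the function names are made up for this example, and it only compares the values of the second tree from the message before and after the two kinds of pruning.

```python
def original(a: int, b: int) -> int:
    """The problematic tree: max(max(3, a), max(b, 3))."""
    return max(max(3, a), max(b, 3))

def over_pruned(a: int, b: int) -> int:
    """Both constants removed because each 'matched the base range'.
    The lower bound of 3 is lost entirely."""
    return max(a, b)

def safely_pruned(a: int, b: int) -> int:
    """Only one of the two constants removed; the bound survives."""
    return max(max(3, a), b)
```

For `a = 1, b = 2` the original tree evaluates to 3, `over_pruned` returns 2, and `safely_pruned` still returns 3, which is why pruning is only valid if the base range is updated after the first constant is dropped.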
Jason Ekstrand	62332d139c	nir: Add a local variable-based copy propagation pass Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 16:44:28 -08:00
Jason Ekstrand	830dca74fe	nir/builder: Add a helper for getting the most recently added instruction Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 16:44:28 -08:00
Jason Ekstrand	75a6707984	nir/builder: Add a load_deref_var helper Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 16:44:28 -08:00
Jason Ekstrand	13a2f20740	nir/dead_variables: Remove shader-local variables that are only written Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 16:44:28 -08:00
Jason Ekstrand	58fe5c57cd	nir/dead_variables: Remove shared variables when requested Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 16:44:28 -08:00
Jason Ekstrand	2d7bed6158	anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8 Because border color is handled pre-swizzle, when we move the alpha channel around in the format, the OPAQUE_BLACK border colors don't work correctly on B4G4R4A4_UNORM_PACK16 with the hack. This fixes the following Vulkan CTS tests on Broadwell: dEQP-VK.pipeline.sampler.view_type.2d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black dEQP-VK.pipeline.sampler.view_type.1d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black dEQP-VK.pipeline.sampler.view_type.2d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black dEQP-VK.pipeline.sampler.view_type.1d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black dEQP-VK.pipeline.sampler.view_type.3d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2017-01-06 16:44:15 -08:00
Jason Ekstrand	4e7958fb13	isl: Mark A4B4G4R4_UNORM as supported on gen8 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2017-01-06 16:44:15 -08:00
Pierre-Loup A. Griffais	f6d3af2af6	radv: fix depth transitions with layerCount = VK_REMAINING_ARRAY_LAYERS Interpreting layerCount literally would try to create billions of image views in radv_process_depth_image_inplace(). Signed-off-by: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-01-07 01:26:08 +01:00
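The "billions of image views" remark in f6d3af2af6 follows from the value of the sentinel: `VK_REMAINING_ARRAY_LAYERS` is defined as `~0U`. The sketch below shows one way a layer count could be resolved before looping; the helper name and its parameters are hypothetical and are not taken from radv.

```python
VK_REMAINING_ARRAY_LAYERS = 0xFFFFFFFF  # ~0U in the Vulkan headers

def effective_layer_count(array_size: int, base_layer: int, layer_count: int) -> int:
    """Resolve the sentinel to 'all layers from base_layer to the end'.

    Hypothetical helper for illustration only.
    """
    if layer_count == VK_REMAINING_ARRAY_LAYERS:
        return array_size - base_layer
    return layer_count
```

Taken literally, the sentinel would mean a loop of 4294967295 iterations; resolved against an image with 6 layers and a base layer of 2, it means 4.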
Kenneth Graunke	e6ae19944d	i965: Rework gl_TessLevel[] handling to use NIR compact arrays. Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets(). We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling. This regresses one Piglit test: arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit The compiler starts promoting the constant arrays assigned to gl_TessLevel to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-06 15:55:48 -08:00
Kenneth Graunke	31d9de58ab	i965: Inline store_output helper in quads workaround code. It's only used in one place, it ignores the offset parameter currently, and I want to add more parameters...at which point, passing in a bunch of integers seems less obvious than writing it out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-06 15:55:47 -08:00
Kenneth Graunke	311b1f0a98	nir: Make glsl_to_nir compact scalar TessLevel arrays. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-06 15:55:46 -08:00
Kenneth Graunke	496693d466	i965: Make unify_interfaces not spread VARYING_BIT_TESS_LEVEL_*. This is harmless today because gl_TessLevelInner/Outer in the TES is currently treated as system values. However, when we move to treating them as inputs, this would cause a bug: with no TCS present, it would propagate TES reads of VARYING_SLOT_TESS_LEVEL into the VS output VUE map slots. This is totally bogus - those don't even exist in the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-06 15:55:42 -08:00
Kenneth Graunke	a46bd79ee1	glsl: Support gl_TessLevelInner/Outer[] as TES input variables. Upcoming reworks in i965 are going to make it easy to handle this like any other input. Having it as a system value will just require additional code for no benefit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 15:55:41 -08:00
Kenneth Graunke	5c580e64cc	glsl: Mark whole variable used for ClipDistance and TessLevel*. There's no point in trying to mark partial array access for gl_ClipDistance, gl_TessLevelOuter, or gl_TessLevelInner - they're special built-in variables that control fixed function hardware, and will likely be used in an all-or-nothing fashion. Since these arrays only occupy 1-2 varying slots, we have to avoid our normal processing which increments the slot value by the array index. (I wrote this code before i965 switched from ir_set_program_inouts to nir_shader_gather_info. It's not used by anyone today, and I'm not sure how valuable it is...the alternative to GLSL IR lowering is NIR compact arrays, at which point you should use nir_gather_info.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 15:55:39 -08:00
Kenneth Graunke	8b5749f65a	glsl: Override the # of varying slots for ClipDistance and TessLevel*. Right now, this shouldn't have any effect, as all drivers use LowerClipDist and LowerTessFactors to turn the float[] arrays into vectors. However, it should help make it possible for drivers to avoid that lowering. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 15:55:37 -08:00
Kenneth Graunke	6aa5cb34d0	glsl: Create and use a new ir_variable::count_attribute_slots() wrapper. This wraps glsl_type::count_attribute_slots(), but will soon contain a couple of overrides for a couple of GLSL built-in variables. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-06 15:55:25 -08:00
Marek Olšák	aead6a1e94	gallium/radeon: use the internal clear_buffer callback to fix r600g r600g doesn't set pipe_context::clear_buffer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99303 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2017-01-06 23:32:25 +01:00
Roland Scheidegger	f4821daed1	llvmpipe: do transpose/untwiddle after conversion for 8bit formats Generally we should do transpose after conversion, if the format has fewer than 32 bits per channel (if it has 32 bits, conversion is going to be a no-op anyway...). This is obviously because there are fewer vectors to deal with. Though the advantage for 16 bit formats isn't that big, and in fact with AVX there isn't really any (as the 32bit unpacks can be done with 256bit, but the smaller ones cannot, although that would change again with proper AVX2 support). Only makes sense for 2d and not 1d cases. And to keep things easy, only handle 1,2 and 4 channels (rgbx is just fine). For rgba unorm8 format the backend conversion sums up to these instruction totals (not counting the movs for SSE2 due to 2-op syntax - generally every 2 unpacks need an additional mov), given as SSE2 / AVX: transpose: 32 unpack / 16 unpack; untwiddle: 0 / 8 (128bit low/high permutes); convert: 16 mul + 16 cvt / 8 mul + 8 cvt; 32->8bit: 12 pack / 8 (128bit extract) + 12 pack. When doing transpose/untwiddle afterwards we get (SSE2 / AVX): convert: 16 mul + 16 cvt / 8 mul + 8 cvt; 32->8bit: 12 pack / 8 (128bit extract) + 12 pack; transpose/untwiddle: 12 unpack / 12 unpack. So for SSE2, this drops 20 unpacks (total instruction count 76->56) whereas for AVX it replaces the 16 256bit unpacks with 8 128bit ones and drops the 8 lo/hi permutes (in total 60->48). (Albeit to be fair, the permutes could be dropped even when doing the transpose first, they are extremely pointless but we'd need to be able to tell lp_build_conv to reorder the vectors, for AVX2 we're going to need to be able to tell lp_build_conv about ordering in any case.) (With different ordering going into conversion, it would be possible to do 4 unpacks + 4 pshufbs instead of 12 unpacks, but that might not be better, and not all cpus can do it.
Proper AVX2 support should eliminate the 8 128bit extracts, reduce these 12 packs to 6 and the 12 unpacks to 2 pshufb + 2 permq ideally (+ 2 final 128bit extracts).) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-06 23:13:34 +01:00
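The instruction totals quoted in f4821daed1 (76->56 for SSE2, 60->48 for AVX) can be re-derived from the per-step counts in the message. The snippet below is just that arithmetic; the dictionary layout and key names are chosen for this example.

```python
# Per-step instruction counts copied from the commit message.
BEFORE = {
    "SSE2": {"transpose unpack": 32, "untwiddle permute": 0,
             "mul": 16, "cvt": 16, "extract": 0, "pack": 12},
    "AVX":  {"transpose unpack": 16, "untwiddle permute": 8,
             "mul": 8, "cvt": 8, "extract": 8, "pack": 12},
}
AFTER = {
    "SSE2": {"mul": 16, "cvt": 16, "extract": 0, "pack": 12,
             "transpose/untwiddle unpack": 12},
    "AVX":  {"mul": 8, "cvt": 8, "extract": 8, "pack": 12,
             "transpose/untwiddle unpack": 12},
}

def total(counts: dict) -> int:
    """Sum the instruction counts of all steps for one ISA."""
    return sum(counts.values())

def summary() -> dict:
    """Map each ISA to its (before, after) instruction totals."""
    return {isa: (total(BEFORE[isa]), total(AFTER[isa])) for isa in BEFORE}
```

`summary()` gives `(76, 56)` for SSE2 and `(60, 48)` for AVX, in agreement with the totals stated in the message.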
Roland Scheidegger	6e7ce1ef55	gallivm: generalize 4x4f->1x16ub special case conversion This special packing path can be easily extended to handle not just float->unorm8 but also float->snorm8 and uint32->uint8 and int32->int8 (i.e. all interesting cases for llvmpipe fs backend code). The packing parts all stay the same (only the last step packing will be signed->signed instead of signed->unsigned but luckily even sse2 can do both). While here also note some bugs with that (we keep the bugs identical to what we did before on x86, albeit other archs may differ). In particular float->unorm8 too large values will still get clamped to 0, not 255, and for float->snorm8 NaNs will end up as -1, not 0 (but we do the clamp against 1.0 there to prevent too large values ending up as -1.0 - this is inconsistent with unorm8 handling but is what we ended up with before, I'm not sure we can get away without it). This is quite fishy in any case as we depend on arch-dependent behavior of the iround (my understanding is in fact with altivec the conversion would actually saturate although I've no idea about NaNs, so probably wouldn't need to do anything for snorm). (There are only minimal piglit tests for unorm clamping behavior AFAICT, in particular nothing seems to test values which are too large to be handled by the float->int conversion.) For uint32->uint8 we also do a min against MAX_INT, since the source for the packs is always signed (again, on x86 - should probably be able to express these arch-dependent bits better some day). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-06 23:13:34 +01:00
Roland Scheidegger	04480a04b1	llvmpipe: use alpha from already converted color if possible For rgbx formats, there is no point in doing alpha conversion again (and with different transpose even, so llvm can't eliminate it). Albeit it looks like there are some minimal changes needed in the blend code (found by code inspection, no test seemed to complain) if we do this - the blend factors are already sanitized if we have no destination alpha, however for src_alpha_saturate it looks like it still might make a difference (note that we forced has_alpha to true before for some formats and nothing complained, but this seems safer). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-06 23:13:34 +01:00
Roland Scheidegger	53c2d24a24	llvmpipe: use scalar load instead of vectors for small vectors in fs backend llvm has _huge_ problems trying to load things like <4 x i8> vectors and stitching such loads together to form 128bit vectors. My understanding of the problem is that the type legalizer tries to extend that to really a <4 x i32> vector and not a <16 x i8> vector with the 4 elements first then followed by padding, so the shuffles for then combining things together are more or less impossible - you can in fact see the pmovzxd llvm generates. Pre-4.0 llvm just gives up on it completely and does a 30+ pextrb/pinsrb sequence instead. It looks like current llvm has fixed this behavior (my guess would be due to better shuffle combination and load/shuffle folds), but we can avoid this by just loading as <1 x i32> values, combine that and only cast at the end. (I suspect it might also work if we'd pad the loaded vectors immediately before shuffling them together, instead of directly stitching 2 such vectors together pairwise before combining the pair. But this _might_ lose the ability to load the values directly into their right place in the vector with pinsrd.). But using 32bit values is probably easier for llvm as it will never give it funny ideas about what the vector should look like. (This is possibly only a problem for 1x8bit formats, since 2x8bit will end up fetching 64bit hence only two vectors are stitched together, not 4, but we use the same strategy anyway.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-06 23:13:34 +01:00
Ian Romanick	1472ff3591	i965: Enable several GLES 3.1 extensions on HSW+ The only reason we didn't previously enable this was the dependency on OpenGL ES 3.1. These should have been enabled as soon as HSW got stencil texturing. We also needed to fix up setting MaxViewports. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-06 12:42:43 -08:00
Ian Romanick	90c51ccf82	i965: Always set MaxViewports and related limits Since `9d6ca7c3`, there should be no performance hit for having MaxViewports > 1. Always set this context state. This eliminates the need to update this conditional as we add support for OES_viewport_array on older GPUs. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-06 12:42:43 -08:00
Marek Olšák	b7699ce07c	winsys/amdgpu: fix a race condition between fence updates and IB submissions The CS thread is needed to ensure proper ordering of operations and can't be disabled (without complicating the code). Discovered by Nine CSMT, which ended up in a deadlock. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	ece6e1f658	radeonsi: add TC L2 prefetch for shaders and VBO descriptors Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	a131dacb14	radeonsi: add CP DMA flags for greater control over synchronization for L2 prefetch Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	8ac1715d02	radeonsi: cleanly communicate which CP DMA packet is first Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	2b621c47aa	gallium/radeon: add new HUD query num-SDMA-IBs Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	6b8a371e00	gallium/radeon: rename the num-ctx-flushes query to num-GFX-IBs Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	5871ebd7f1	radeonsi: add HUD queries for cache flush stats Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	aac07bb79c	radeonsi: don't count fast clears and prefetches into CP DMA stats Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	3b98a5dc47	radeonsi: don't wait for compute shaders in texture_barrier it doesn't interact with compute shaders in any way Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	4b93ba542c	radeonsi: assume that a TES without POSITION precedes GS Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	53648050a5	radeonsi: unduplicate VS color export code it's exactly the same as the other ones Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	42920c0fb9	radeonsi: clean up more HAVE_LLVM #ifdefs Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	a8374c3d22	gallium/radeon: clean up HAVE_LLVM #ifdefs in r600_get_llvm_processor_name Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Kenneth Graunke	2138347a45	i965: Properly flush in hsw_pause_transform_feedback(). Fixes a number of transform feedback tests when run with Linux 4.8, which allows us to use the MI_LOAD_REGISTER_REG command, at which point we started using this new broken path. ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.* and Piglit's arb_transform_feedback2/draw-auto are both fixed by this patch, for example. Thanks to Chris Wilson for catching this mistake! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-01-06 12:01:53 -08:00
Kenneth Graunke	4295af646f	i965: Fix texturing in the vec4 TCS and GS backends. We were failing to zero m0.2 of the sampler message header for TCS and GS messages in the simple case. fs_generator has done this for about a year now, but we missed it in vec4_generator. Fixes ES31-CTS.core.texture_cube_map_array.sampling, GL45-CTS.texture_cube_map_array.sampling, and many dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests: - dynamically_uniform.tessellation_control.isampler3d - dynamically_uniform.tessellation_control.isamplercube - dynamically_uniform.tessellation_control.sampler2d - dynamically_uniform.tessellation_control.usamplercube - dynamically_uniform.tessellation_control.sampler2darray - dynamically_uniform.tessellation_control.isampler2darray - dynamically_uniform.tessellation_control.usampler3d - dynamically_uniform.tessellation_control.usampler2darray - dynamically_uniform.tessellation_control.usampler2d - dynamically_uniform.tessellation_control.sampler3d - dynamically_uniform.tessellation_control.samplercube - dynamically_uniform.tessellation_control.isampler2d - uniform.tessellation_control.isampler3d - uniform.tessellation_control.isamplercube - uniform.tessellation_control.usampler2d - uniform.tessellation_control.usampler3d - uniform.tessellation_control.sampler2darray - uniform.tessellation_control.isampler2darray - uniform.tessellation_control.usampler2darray - uniform.tessellation_control.sampler2d - uniform.tessellation_control.usamplercube - uniform.tessellation_control.sampler3d - uniform.tessellation_control.samplercube - uniform.tessellation_control.isampler2d Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-06 11:49:53 -08:00
Tim Rowley	c93efb0a4f	swr: [rasterizer core] rename OutputMerger functions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-06 10:05:08 -06:00
Tim Rowley	fa7c5e242f	swr: [rasterizer core] fix SIMD16 Transpose_16_16 Fix incorrect swizzling in SIMD16 Transpose_16_16 breaking the two-channel 16-bpc formats like R16G16_FLOAT. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-06 10:05:02 -06:00
Tim Rowley	e62b6d2f0f	swr: [rasterizer core] fix SIMD16 output merger Honor the colorHottileEnable mask when accessing colorBuffer pointers. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-06 10:04:56 -06:00
Tim Rowley	1a77e0c48d	swr: [rasterizer core] fix SIMD16 PackTraits pack() and unpack() Fix routines for 8-bit and 16-bit formats used by optimized tile store. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-06 10:04:50 -06:00
Tim Rowley	bd22c3d411	swr: [rasterizer core] fix SIMD16 transpose functions Fixed Transpose_16 methods of following formats: Transpose8_8_8_8 Transpose8_8 Transpose32_32 Transpose16_16_16_16 Transpose16_16_16 Transpose16_16 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-06 10:04:41 -06:00
Tim Rowley	e6eede81af	swr: [rasterizer core] whitespace adjustments Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-06 10:04:28 -06:00
Kenneth Graunke	a4d6f4d954	i965: Don't set EmitNoMainReturn. A while ago, we stopped using Luca's GLSL IR lower_jumps pass in favor of nir_lower_returns(). Marek's commit `d3cb79e043` put it in do_common_optimization, which resulted in us calling it again. Dropping the EmitNoMainReturn setting makes us skip that pass again. Apparently that pass doesn't work properly, because this fixes Piglit's tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99287 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-05 23:15:39 -08:00
Eric Anholt	69da8c32c7	vc4: Rewrite T image handling based on calling the LT handler. The T images are composed of effectively swizzled-around blocks of LT (4x4 utile) images, so we can reduce the t_utile_address() calls by 16x by calling into the simpler LT loop. This also adds support for calling down with non-utile-aligned coordinates, which will be part of lifting the utile alignment requirement on our callers and avoiding the RMW on non-utile-aligned stores. Improves 1024x1024 TexSubImage by 2.55014% +/- 1.18584% (n=46) Improves 1024x1024 GetTexImage by 2.242% +/- 0.880954% (n=32)	2017-01-05 17:19:54 -08:00
Eric Anholt	3a3a0d2d6c	vc4: Move the utile_width/height functions to header inlines. I want these inlined in the callers, particularly with the tiling changes coming up, but we're not building with lto so some caller would suffer.	2017-01-05 17:19:54 -08:00
Eric Anholt	6cf9ff8a6c	vc4: Make the load/store utile functions static. They don't have any other callers outside of this file, and I'm hoping they get inlined soon.	2017-01-05 17:19:54 -08:00
Eric Anholt	e64b1169d3	vc4: Simplify the load/store utile functions. They now have less of a dependency on the cpp, and don't have to do a divide. Hacking up mesa-demos teximage to do only one subtest and not draw points, I saw 1024x1024 glTexSubImage2D() improve by 4.86939% +/- 1.40408% (n=30) and glGetTexImage() by 2.18978% +/- 0.140268% (n=5).	2017-01-05 17:19:48 -08:00
Eric Anholt	7b8c67b3cc	vc4: Reuse a list function to simplify bufmgr code.	2017-01-05 16:23:32 -08:00
Eric Anholt	ebf33e577a	vc4: Flush the job early if we're referencing too many BOs. If we get up toward 256MB (or whatever the CMA area size is), VC4_GEM_CREATE will start throwing errors. Even if we don't trigger that, when we flush the kernel's BO allocation for the CLs or bin memory may end up throwing an error, at which point our job won't get rendered at all. Just flush early (half of maximum CMA size) so that hopefully we never get to that point.	2017-01-05 16:23:32 -08:00
Timothy Arceri	076ab157ff	st/mesa/glsl: move SamplerTargets to gl_program This will help allow us to simplify the handling of samplers by storing them in a single location rather than duplicating them in both gl_linked_shader and gl_program. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	937523971f	st/mesa/glsl: set SamplersUsed directly in gl_program Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	53a509723f	mesa/glsl: set sampler units directly in gl_program Now that we create gl_program earlier there is no need to mess about copying things to gl_linked_shader then to gl_program. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	7cc61cf706	mesa: simplify sampler setting code There is no need to loop over active samplers; the code above this would have already exited if the sampler was inactive, or errored if the count was larger than the uniforms array size. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	4807a83da0	mesa/glsl: set num_textures per stage directly in shader_info Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	c46a630000	mesa: make _CurrentFragmentProgram a gl_program struct pointer Making this point to a gl_program struct rather than a gl_shader_program struct will allow us to later also make the CurrentProgram array hold gl_program structs which in turn will allow for code simplification. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	6e3f6097c9	i965: stop passing gl_shader_program to the precompile and codegen functions We no longer need it. While we are at it we mark the vs, gs, and wm codegen functions as static. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	5ceedefd6c	mesa/glsl: remove hack to reset sampler units to zero Now that we have the is_arb_asm flag we can just skip the initialisation. V2: remove hack from standalone compiler where it was never needed since it only compiles glsl shaders. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	238486884e	i965: make use of new is_arb_asm flag Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	f584f38214	st/mesa/glsl: add new is_arb_asm flag in gl_program Set the flag via the _mesa_init_gl_program() and NewProgram() helpers. In i965 we currently check for the existence of gl_shader_program to decide if this is an ARB assembly style program or not. Adding a flag makes the code clearer and will help remove a dependency on gl_shader_program in the i965 codegen functions. Also this will allow us to skip initialising sampler units for linked shaders, we currently memset it to zero again during linking. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-06 11:21:42 +11:00
Timothy Arceri	2784128398	i965: pass gl_program directly to brw_compile_tes() This is the only thing we use from gl_shader_program so pass it directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	2a4d169735	i965: stop passing gl_shader_program to brw_nir_setup_glsl_uniforms() We can now just get the data needed from the gl_shader_program_data pointer in gl_program. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	d3b2ee6b49	i965: pass gl_program to brw_upload_ubo_surfaces() There is no need to pass gl_linked_shader anymore. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	9ca14f583c	i965: stop passing gl_shader_program to brw_assign_common_binding_table_offsets() We now get everything we need directly from gl_program so there is no need for this. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	f5bc127b2f	st/mesa/glsl/i965: move ShaderStorageBlocks to gl_program Having it here rather than in gl_linked_shader allows us to simplify the code. Also it is error prone to depend on the gl_linked_shader for programs in current use because a failed linking attempt will free information about the current program. In i965 we could be trying to recompile a shader variant but may have lost some required fields. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	f62eb6c7eb	st/mesa/glsl/i965: set num_ssbos directly in shader_info Here we also remove the duplicate field in gl_linked_shader and always get the value from shader_info instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	0e7eec1ab5	st/mesa/glsl/i965: move per stage UniformBlocks to gl_program This will help allow us to store pointers to gl_program structs in the CurrentProgram array resulting in a bunch of code simplifications. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	b792c38979	st/mesa/glsl/i965: set num_ubos directly in shader_info This also removes the duplicate field in gl_linked_shader, and gets num_ubos from shader_info instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:41 +11:00
Timothy Arceri	a1da57c19c	st/mesa/glsl/i965: move ImageUnits and ImageAccess fields to gl_program Having it here rather than in gl_linked_shader allows us to simplify the code. Also it is error prone to depend on the gl_linked_shader for programs in current use because a failed linking attempt will free information about the current program. In i965 we could be trying to recompile a shader variant but may have lost some required fields. We drop the memset on ImageUnits because gl_program is already created using rzalloc(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:40 +11:00
Timothy Arceri	3d2485f011	i965: get InfoLog and LinkStatus via the pointer in gl_program Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:40 +11:00
Timothy Arceri	be9a6a7eb7	i965: get shared_size from shader_info rather than gl_shader_program Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:40 +11:00
Timothy Arceri	234211ec8d	i965: stop depending on gl_shader_program for brw_compute_vue_map() params This removes another dependency on gl_shader_program from the codegen functions, this will help allow us to use gl_program for the CurrentProgram array rather than gl_shader_program. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:40 +11:00
Timothy Arceri	6f76ca300b	i965: pass gl_program to the brw_*_debug_recompile() functions Rather than passing gl_shader_program. The only field used was Name which is the same as the Id field in gl_program. For wm and vs we also make the functions static and move them before the codegen functions. This change reduces the codegen functions dependency on gl_shader_program. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-06 11:21:40 +11:00
Roland Scheidegger	caf18a8434	gallivm: (trivial) fix typo bug with small AoS format unpacking Fix typo using wrong (uninitialized) build context introduced by `4634cb5921`. (This only affects very rare small packed formats which have a PIPE_SWIZZLE_0 channel, such as r4a4, which is never used by mesa/st. Nevertheless it broke lp_test_format.)	2017-01-06 00:46:15 +01:00
Roland Scheidegger	4634cb5921	gallivm: implement aos unpack (to unorm8) for small unorm formats Using bit replication. This path now resembles something which might make sense. (The logic was mostly copied from llvmpipe fs backend.) I am not convinced though it is actually faster than SoA sampling (actually I'm quite certain it's always a loss with AVX). With SoA it's just shift/mask/cvt/mul for getting the colors, whereas there's still roughly 3 shifts, 3 or/and per channel for AoS (i.e. for SoA it's exactly the same as it would be for a rgba8 format, whereas the extra effort for AoS is significant). The filtering might still be faster (albeit with FMA the instruction count gets down quite a bit there on the SoA float filtering path on new cpus). And those small unorm formats often don't have an alpha channel (which makes things worse relatively for AoS path). (This also fixes a trivial bug in the llvmpipe fs code this was derived from, albeit it was only relevant for 4-bit channels.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-05 23:59:38 +01:00
Roland Scheidegger	bc86e829a5	gallivm: optimize lp_build_unpack_arith_rgba_aos slightly This code uses a vector shift which has to be emulated on x86 unless there's AVX2. Luckily in some cases we can actually avoid the shift altogether, so do that. Also make sure we hit the fast lp_build_conv() path when applicable, albeit that's quite the hack... That said, this path is taken for AoS sampling for small unorm (smaller than rgba8) formats, and it is completely hopeless even with those changes, with or without AVX. (Probably should have some code similar to the one in the llvmpipe fs backend code, using bit replication to extend to rgba8888 - rounding is not quite 100% accurate but if it's good enough there it should be here as well.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-05 23:59:38 +01:00
Roland Scheidegger	a03a2ac6fd	gallivm: use 2 srcs for 32->16bit conversions in lp_bld_conv_auto If we only feed one source vector at a time, we cannot use pack intrinsics (as we only have a 64bit destination dst vector). lp_bld_conv_auto is specifically designed to alter the length and number of destination vectors, so this works just fine (if we use single source vectors at a time, afterwards we immediately reassemble the vectors). For AVX though this isn't really possible, since we expect 128bit output already for a single 256bit input. (One day we should handle AVX2 which again would need multiple inputs, however there's the problem that we get different ordered output there and we don't want to reorder, so would need to be able to tell build_conv to handle upper and lower halves independently.) A similar strategy would probably work for 32->8bit too (if it doesn't hit the special case) but I'm going to try something different for that... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-05 23:59:38 +01:00
Roland Scheidegger	db7e786a25	llvmpipe: (trivial) minimally simplify mask construction simd instruction sets usually have comparisons for equal, not unequal. So use a different comparison against the mask itself - which also means we don't need an all-zero as well as an all-one (for the pxor) reg. Also add code to avoid scalar expansion of i1 values which we definitely shouldn't do. There are problems with this though with llvm select interaction, so it's disabled (basically using llvm select instead of intrinsics may still produce atrocious code, even in cases where we figured it should not, albeit I think this could probably be fixed with some better selection of optimization passes, but I have zero idea there really). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-01-05 23:59:38 +01:00
Lionel Landwerlin	a8eeb089c0	anv: fix multiple creation with internal failure The specification section 9.4 says : When an application attempts to create many pipelines in a single command, it is possible that some subset may fail creation. In that case, the corresponding entries in the pPipelines output array will be filled with VK_NULL_HANDLE values. If any pipeline fails creation (for example, due to out of memory errors), the vkCreate*Pipelines commands will return an error code. The implementation will attempt to create all pipelines, and only return VK_NULL_HANDLE values for those that actually failed. Fixes : dEQP-VK.api.object_management.alloc_callback_fail_multiple.graphics_pipeline dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline v2: C is hard let's go shopping (Lionel) v3: Remove unnecessary condition in for loops (Lionel) v4: Document why we return on first failure (Eduardo) Move i declaration inside for() (Eduardo) v5: Move array cleanup out of loop (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-01-05 21:09:09 +00:00
Tim Rowley	33fa4c99f7	swr: [rasterizer core/common/jitter] gl_double support Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99214 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-05 14:10:36 -06:00
Fredrik Höglund	b6670157d7	dri3: Fix MakeCurrent without a default framebuffer In OpenGL 3.0 and later it is legal to make a context current without a default framebuffer. This has been broken since DRI3 support was introduced. Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-05 20:52:01 +01:00
Marek Olšák	e16245b339	radeonsi: turn SDMA IBs into de-facto preambles of GFX IBs Draw calls no longer flush SDMA IBs. r600_need_dma_space is responsible for synchronizing execution between both IBs. Initial buffer clears and fast clears will stay unflushed in the SDMA IB (up to 64 MB) as long as the GFX IB isn't flushed either. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:24 +01:00
Marek Olšák	cba9d59362	radeonsi: implement SDMA-based buffer clearing for SI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:24 +01:00
Marek Olšák	29d6a367a6	radeonsi: do all math in bytes in SI DMA code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:24 +01:00
Marek Olšák	9e1aa81dfe	gallium/radeon: prevent SDMA stalls by detecting RAW hazards in need_dma_space Call r600_dma_emit_wait_idle only when there is a possibility of a read-after-write hazard. Buffers not yet used by the SDMA IB don't have to wait. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:24 +01:00
Marek Olšák	3be8336440	gallium/radeon: move unrelated code from dma_emit_wait_idle to need_dma_space r600_dma_emit_wait_idle is going away in its current form. The only difference is that the moved code is executed before DMA calls instead of after them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:24 +01:00
Marek Olšák	973d7cd90a	radeonsi: inline cik_sdma_do_copy_buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	067a3237b9	radeonsi: also wait for SDMA in the clear_buffer CPU fallback Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	f6a1c2d883	radeonsi: simplify r600_resource typecasts in si_clear_buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	a31a92e7ef	radeonsi: always use SDMA for big buffer clears and first buffer uses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	69f489dfa1	radeonsi: use SDMA in rvid_buffer_clear on CIK-VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	9a3296bf1c	radeonsi: use SDMA for initial clearing of DCC/CMASK/HTILE on CIK-VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	d4c0ad4de8	radeonsi: implement SDMA-based buffer clearing for CIK-VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:43:23 +01:00
Marek Olšák	431742dbba	gallium/hud: increase the vertex buffer size for text Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Marek Olšák	6d54cd75a8	gallium/hud: add an option to sort items below graphs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Marek Olšák	80b8b9c8a4	gallium/hud: add an option to reset the color counter Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Marek Olšák	a57e071e9e	gallium/hud: allow more data sources per pane Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Marek Olšák	e8bb97ce30	gallium/hud: add an option to rename each data source useful for radeonsi performance counters v2: allow specifying both : and = Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Marek Olšák	d995115b17	gallium: remove TGSI_OPCODE_SUB It's redundant with the source modifier. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Marek Olšák	a4ace98a97	gallium: remove TGSI_OPCODE_ABS It's redundant with the source modifier. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-05 18:30:00 +01:00
Axel Davy	09d09b219e	st/nine: Remove all usage of ureg_SUB in nine_shader This is required to drop gallium SUB. Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-05 18:30:00 +01:00
Axel Davy	67cda68bba	st/nine: Remove all usage of ureg_SUB in nine_ff This is required to remove gallium SUB. Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-05 18:30:00 +01:00
Axel Davy	caf93f5311	st/nine: Do not map SUB and ABS to their gallium equivalent. This is required for gallium SUB and ABS to be removed. Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-05 18:30:00 +01:00
Eric Anholt	dbe0dd11b9	configure: Fix another bashism. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-05 09:24:28 -08:00
Marek Olšák	3477f67057	st/mesa: fix a segfault when prog->sh.data is NULL Broken by: st/mesa: get Version from gl_program rather than gl_shader_program Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-01-05 17:11:03 +01:00
Emil Velikov	37f9262064	docs: add news item and link release notes for 13.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-05 16:07:53 +00:00
Emil Velikov	934792b846	docs: add sha256 checksums for 13.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c8ece92ded`)	2017-01-05 16:07:53 +00:00
Emil Velikov	5cd9660302	docs: add release notes for 13.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `bec04114d2`)	2017-01-05 16:07:53 +00:00
Nayan Deshmukh	ee4b4791ab	st/va: fix incorrect argument in vl_compositor_cleanup This fixes the mistake introduced in commit `b6737a8bcd` Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-05 16:40:06 +01:00
Tim Rowley	68ddcc6c28	swr: remove unneeded llvm version check Old test caused breakage with llvm-svn (4.0.0svn), and not needed as the minimum required llvm version for swr is 3.6. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2017-01-05 07:31:19 -06:00
George Kyriazis	36ad826548	swr: fix windows build break wrap lp_bld_type.h around extern "C". Windows decorates global variables, so when used from .cpp files, need to use an undecorated version. Also, removed related and unneeded code from swr_screen.cpp Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-05 07:30:18 -06:00
Marek Olšák	3753dc896d	radeonsi: update clip_regs if clip_disable changes to fix a hang This seems to fix the GPU hangs caused by: commit `ed3190b3f3` Author: Marek Olšák <marek.olsak@amd.com> Date: Sun Nov 13 18:41:43 2016 +0100 radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99219 Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-01-05 14:01:18 +01:00
Marek Olšák	c7affbf687	st/mesa: enable GLSLOptimizeConservatively for drivers that want it GLSL compilation now takes 24% less time with the Gallium noop driver. I used my shader-db for the measurement. The difference for the whole radeonsi driver can be ~10%. The generated TGSI is mostly the same. For example, the compilation success rate with a TGSI->GCN bytecode converter without any optimizations is the same. Note that glsl_to_tgsi does its own copy propagation and simple register allocation. shader-db GCN report: - Talos spills fewer SGPRs. - DOTA 2 spills more SGPRs. - The average shader-db score is better, but it's just due to randomness. 29045 shaders in 17564 tests Totals: SGPRS: 1325929 -> 1325017 (-0.07 %) VGPRS: 1010808 -> 1010172 (-0.06 %) Spilled SGPRs: 1432 -> 1399 (-2.30 %) Spilled VGPRs: 93 -> 92 (-1.08 %) Private memory VGPRs: 688 -> 688 (0.00 %) Scratch size: 2540 -> 2484 (-2.20 %) dwords per thread Code Size: 39336732 -> 39342936 (0.02 %) bytes Max Waves: 217937 -> 217969 (0.01 %) Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-05 13:07:12 +01:00
Marek Olšák	96fe8834f5	glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-05 13:07:12 +01:00
Marek Olšák	0a5018c1a4	mesa: add gl_constants::GLSLOptimizeConservatively to reduce the amount of GLSL optimizations for drivers that can do better. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-05 13:07:12 +01:00
Marek Olšák	e51baeb6c1	gallium: add PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY Drivers with good compilers don't need aggressive optimizations before TGSI. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-05 13:07:12 +01:00
Marek Olšák	d3cb79e043	glsl: run do_lower_jumps properly in do_common_optimizations so that backends don't have to run it manually Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-05 13:07:12 +01:00
Kenneth Graunke	7c6b714cd0	i965: Print VS output VUE map in Vulkan too. We need to move this to the shared layer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-05 01:55:27 -08:00
Kenneth Graunke	480d6c1653	i965: Fix last slot calculations If the VUE map has slots at the end which the shader does not write, then we'd "flush" (constructing an URB write) on the last output it actually wrote. Then, we'd construct another SEND with EOT, but with no actual payload data. That's not legal. For example, SSO programs have clip distance slots allocated no matter what, but the shader may not write them. If it doesn't write any user defined varyings, then the clip distance slots will be the last ones. Found while debugging dEQP-VK.tessellation.shader_input_output.gl_position_vs_to_tcs_to_tes Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-05 01:54:52 -08:00
Iago Toral Quiroga	8dc92a5613	docs: Mark GL_ARB_gpu_shader_fp64 and OpenGL 4.0 as done for i965/hsw+ Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-05 09:34:36 +01:00
Iago Toral Quiroga	580c503ca2	docs: add GL_ARB_gpu_shader_fp64 and OpenGL 4.0 support for Intel Haswell. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-01-05 09:34:14 +01:00
Iago Toral Quiroga	a98f2e53e1	i965: add a kernel_features bitfield to intel screen We can use this to track various features that may or may not be supported by the hw / kernel. Currently, we usually do this by checking the generation and supported command parser versions in various places throughout the driver code. With this patch, we centralize all these checks in just one place at screen creation time, then we just query the bitfield wherever we need to check if a particular feature is supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-05 08:43:46 +01:00
Iago Toral Quiroga	e3123c8ca2	i965/gen7: Enable OpenGL 4.0 in Haswell when supported Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-05 08:43:46 +01:00
Iago Toral Quiroga	1f1b8def48	i965: get rid of brw->can_do_pipelined_register_writes Instead, check the screen field directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-05 08:43:46 +01:00
Chris Wilson	02a44484f0	i965: Move the pipelined test for SO register access to the screen Moving the test to the screen places it alongside the other global HW feature tests that want to be shared between contexts. Also, we need to know if we support pipelined register writes at screen creation time so that we can tell if we can expose OpenGL 4.0 in gen7. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-05 08:43:46 +01:00
Samuel Iglesias Gonsálvez	ab1ec7de93	i965/disasm: remove printing hstride and width in align16 DF source regions Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-05 07:29:23 +01:00
Samuel Iglesias Gonsálvez	301fdfd838	vec4: use DIM instruction when loading DF immediates in HSW Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-05 07:29:13 +01:00
Carl Worth	3fbdac28d5	glcpp: Remove illegal characters from tests Some of the existing tests were using '@' and '"' incidentally within the test body. Neither of these characters are actually legal for GLSL. And since we are planning to start generating errors for illegal characters, we need to first make the test suite clean. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 14:40:48 -08:00
Carl Worth	5363518705	glcpp: Exhaustively test all legal characters in GLSL Here, each legal character (as defined by GLSL Language Specification version 4.30.6, section 3.1) appears at least once in the input file. Obviously, characters with special meaning (like '#' and '\') aren't treated exhaustively with respect to all their possible uses. We have many other tests for that. Here, we're simply ensuring that the test suite sees every legal character at least once. v2 (by Ken): Fix expectations, move to src/compiler, renumber tests. Carl's .expected: Updated .expected: .. .. . . . . . . . . . . . . . . . . . .. . . . . . (For some reason, the original test expected ".." to produce two lines. glcpp, cpp, and mcpp all follow my updated behavior, so I believe it to be correct.) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 14:40:48 -08:00
Carl Worth	16b480547f	glcpp: Allow vertical tab and form feed characters in GLSL Of course, these aren't really useful for anything, but the GLSL language specification does allow them: The source character set used for the OpenGL shading languages, outside of comments, is a subset of UTF-8. It includes the following characters: ... White space: the space character, horizontal tab, vertical tab, form feed, carriage-return, and line-feed. [GLSL Language Specification 4.30.6, section 3.1] So treat vertical tab ('\v' or ^K) and form-feed ('\f' or ^L) as horizontal space characters. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 14:40:48 -08:00
Carl Worth	6c8762400d	glcpp: Add testing for no space between macro name and replacement list GCC's preprocessor accepts a macro definition where there is no space between the macro's identifier name and the replacement list. (GCC does emit a "missing space" warning that we don't, but that's fine.) This is an exhaustive test that verifies that all legal GLSL characters that could possibly be interpreted as separating the macro name from the replacement list are interpreted as such. So the testing here includes all valid GLSL symbols except for: * Characters that can be part of an identifier (a-z, A-Z, 0-9, _) * Backslash, (allowed only as line continuation) * Hash, (allowed only to introduce pre-processor directive, or as part of a paste operator in a replacement list---but not as first token of replacement list) * Space characters (since the point of the testing is to have missing space) * Left parenthesis (which would indicate a function-like macro) v2 (Ken): Move to src/compiler, renumber tests. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 14:40:48 -08:00
Lionel Landwerlin	36b5f1d200	spirv: compute push constant access offset & range v2: Move relative push constant relative offset computation down to _vtn_load_store_tail() (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-04 21:14:17 +00:00
Lionel Landwerlin	0089085038	spirv: move block_size() definition Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-04 21:08:44 +00:00
Marek Olšák	89975e29d3	va: call texture_get_handle while the mutex is being held The context may be used by texture_get_handle. Reviewed-by: Christian König <christian.koenig@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2017-01-04 17:27:41 +01:00
Marek Olšák	dbba4e03b1	vdpau: call texture_get_handle while the mutex is being held The context may be used by texture_get_handle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99158 Reviewed-by: Christian König <christian.koenig@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2017-01-04 17:27:41 +01:00
Samuel Pitoiset	7d48a84b16	radeonsi: capitalize VM hex addr when dumping buffer list Useful when debugging with R600_DEBUG=vm,check_vm to match addr in both outputs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-04 10:14:22 +01:00
Tapani Pälli	0f991e8434	i965: remove unused brwInitVtbl declaration function was removed by `b3360d23ac` Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2017-01-04 09:46:34 +02:00
Iago Toral Quiroga	1a8f2629e6	i965: remove brw_context dependency from intel_batchbuffer_init() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 08:14:30 +01:00
Iago Toral Quiroga	ba30e0ca20	i965: make intel_batchbuffer_free() take a batchbuffer as argument Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 08:14:26 +01:00
Iago Toral Quiroga	1daa31d8a8	i965: make intel_batchbuffer_emit_dword() take a batchbuffer as argument Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 08:14:21 +01:00
Iago Toral Quiroga	f03bac1fc7	i965: Make intel_batchbuffer_reloc() take a batchbuffer argument Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-04 08:14:04 +01:00
Timothy Arceri	4b7dfd8812	nir: fix loop iteration count calculation for floats Fixes performance regression in SynMark PSPom caused by loops with float counters not always unrolling. For example: for (float i = 0.02; i < 0.9; i += 0.11) ... Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-04 14:48:36 +11:00
Edmondo Tommasina	abcaba497d	gallium/hud: add a path separator between dump directory and filename It's more user friendly and it avoids writing files in unexpected places. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-03 22:08:41 +01:00
Heiko Przybyl	e933246013	r600/sb: Fix loop optimization related hangs on eg Make sure unused ops and their references are removed, prior to entering the GCM (global code motion) pass, to stop GCM from breaking the loop logic and thus hanging the GPU. It turns out that sb has problems with loops and node optimizations regarding associative folding: - the global code motion (gcm) pass moves ops up a loop level/basic block until they've fulfilled their total usage count - if there are ops folded into others, the usage count won't be fulfilled and thus the op is moved way up to the top - within GCM the op would be visited and its deps would be moved alongside it, to fulfill the src constraints - in a loop, an unused op is moved out of the loop and GCM would move the src value ops up as well - now here arises the problem: if the loop counter is one of the src values it would get moved up as well, the loop break condition would never get hit and the shader would turn into an endless loop, resulting in the GPU hanging and being reset A reduced (albeit nonsense) piglit example would be: [require] GLSL >= 1.20 [fragment shader] uniform int SIZE; uniform vec4 lights[512]; void main() { float x = 0; for(int i = 0; i < SIZE; i++) x += lights[2i+1].x; } [test] uniform int SIZE 1 draw rect -1 -1 2 2 Which gets optimized to: ===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN ===== ===== 42 dw ===== 1 gprs ===== 2 stack ========================================= ALU 3 @24 1 y: MOV R0.y, 0 t: MULLO_UINT R0.w, [0x00000002 2.8026e-45].x, R0.z LOOP_START_DX10 @22 PUSH @6 ALU 1 @30 KC0[CB0:0-15] 2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x JUMP @14 POP:1 LOOP_BREAK @20 POP @14 POP:1 ALU 2 @32 3 x: ADD_INT R0.x, R0.w, [0x00000002 2.8026e-45].x TEX 1 @36 VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 1 @40 4 y: ADD R0.y, R0.y, R0.x LOOP_END @4 EXPORT_DONE PIXEL 0 R0.____ EOP ===== SHADER_END =============================================================== Notice R0.z being the loop counter/break condition relevant register and being never incremented at all. Also some of the loop content has been moved out of it, to fulfill the requirements for the one unused op. With a debug build of mesa this would produce an error like error at : PRED_SETGE_INT __, __, EM.2, R1.x.2\|\|FP@R0.z, C0.x : operand value R1.x.2\|\|FP@R0.z was not previously written to its gpr and the compilation would fail due to this. On a release build it gets passed to the GPU. When using this patch, the loop remains intact: ===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN ===== ===== 48 dw ===== 1 gprs ===== 2 stack ========================================= ALU 2 @24 1 y: MOV R0.y, 0 z: MOV R0.z, 0 LOOP_START_DX10 @22 PUSH @6 ALU 1 @28 KC0[CB0:0-15] 2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x JUMP @14 POP:1 LOOP_BREAK @20 POP @14 POP:1 ALU 4 @30 3 t: MULLO_UINT T0.x, [0x00000002 2.8026e-45].x, R0.z 4 x: ADD_INT R0.x, T0.x, [0x00000002 2.8026e-45].x TEX 1 @40 VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..] ALU 2 @44 5 y: ADD R0.y, R0.y, R0.x z: ADD_INT R0.z, R0.z, 1 LOOP_END @4 EXPORT_DONE PIXEL 0 R0.____ EOP ===== SHADER_END =============================================================== Piglit: ./piglit summary console -d results/_gpu_noglx name: unpatched_gpu_noglx patched_gpu_noglx ---- ------------------- ----------------- pass: 18016 18021 fail: 748 743 crash: 7 7 skip: 1124 1124 timeout: 0 0 warn: 13 13 incomplete: 0 0 dmesg-warn: 0 0 dmesg-fail: 0 0 changes: 0 5 fixes: 0 5 regressions: 0 0 total: 19908 19908 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94900 Tested-by: Heiko Przybyl <lil_tux@web.de> Tested-on: Barts PRO HD6850 Signed-off-by: Heiko Przybyl <lil_tux@web.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-03 21:58:52 +01:00
Eric Anholt	dd12119706	editorconfig: Fix up the tab rendering width. Our editorconfig file looked sensible, saying that we wanted to indent with spaces and use 3/4/whatever space indentation. However, the spec has this little surprise: "tab_width: a whole number defining the number of columns used to represent a tab character. This defaults to the value of indent_size and doesn't usually need to be specified." so once my editor started respecting editorconfig, the files that have tabs left in them started getting rendered wrong, showing up like this in brw_program.c: case GL_COMPUTE_PROGRAM_NV: { struct brw_program *prog = rzalloc(NULL, struct brw_program); if (prog) { prog->id = get_new_program_id(brw->screen); return _mesa_init_gl_program(&prog->program, target, id); } else return NULL; } Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-03 10:38:53 -08:00
Chad Versace	c4b87f129e	meta: Disable dithering during glGenerateMipmap Fixes tests 'dEQP-GLES3.functional.texture.mipmap..generate.rgba5551' on Intel Broadwell 0x1616. The GL 4.5 spec describes the algorithm of glGenerateMipmap as: The contents of the derived images are computed by repeated, filtered reduction of the level base image. [...] No particular filter algorithm is required, though a box filter is recommended as the default filter. Consider a texture for which all pixels are identical at level 0. From the spec's description above, one may reasonably assume that the "filtered reduction" of level 0 produces a new miplevel for which again all pixels are identical. For any 2x2 subspan of identical pixels, it is difficult to see how the "filtered reduction" of that subspan can produce a pixel that differs from the source pixels. Dithering during _mesa_meta_GenerateMipmap() violated that reasonable assumption. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99210 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-01-03 08:22:23 -08:00
Romain Failliot	8d8ed437a5	doc/features.txt: update for freedreno I lost track of who created initial patch (Ilia?).. Romain rebased it. I pushed it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95460 Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-01-03 10:46:13 -05:00
Robert Bragg	96c9ec9c27	i965: Remove perf monitor/query backend In its current state the unified i965 backend for AMD_performance_monitor and INTEL_performance_query isn't able to report meaningful Observation Architecture metrics since we haven't so far had the necessary kernel support to fully configure the OA unit, nor the corresponding support for normalizing the counters into a form that can be usefully interpreted by application developers (as opposed to raw values that may, for example, scale by the number of EUs there are). So that we can focus on implementing just one of these extensions fully and since we anticipate some significant backend changes as we look to use a new kernel interface to configure the OA unit, this patch removes the current backend. This will simplify our ability to update the frontend infrastructure and backend interface before updating our support for performance counters. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-01-03 07:27:07 -08:00
Christian König	ac57bcda1e	vl/zscan: fix "Fix trivial sign compare warnings" The variable actually needs to be signed, otherwise converting it to a float doesn't work as expected. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98914 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Fixes: `1fb4179f92` ("vl: Fix trivial sign compare warnings")	2017-01-03 12:18:14 +01:00
Nayan Deshmukh	b6737a8bcd	st/va: error handling handle the cases when vl_compositor_set_csc_matrix(), vl_compositor_init_state() and vl_compositor_init() fail Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-03 12:02:15 +01:00
Nayan Deshmukh	29aad4e8bd	st/vdpau: error handling handle the cases when vl_compositor_set_csc_matrix(), vl_compositor_init_state() and vl_compositor_init() fail Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-03 12:02:15 +01:00
Nayan Deshmukh	cee5af93ee	vl/compositor: implement error handling pipe_buffer_map and pipe_buffer_create may return NULL Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-03 12:02:15 +01:00
Iago Toral Quiroga	1a83e9892d	i965/vec4: enable ARB_gpu_shader_fp64 for Haswell Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	6c350e34ee	i965/vec4: adjust spilling costs for 64-bit registers. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	3cd38b6898	i965/vec4: prevent spilling of DOUBLE_TO_SINGLE destination FROM_DOUBLE opcodes are set up so that they use a dst register with a size of 2 even if they only produce a single-precision result (this is so that the opcode can use the larger register to produce a 64-bit aligned intermediary result as required by the hardware during the conversion process). This creates a problem for spilling though, because when we attempt to emit a spill for the dst we see a 32-bit destination and emit a scratch write that allocates a single spill register, making the intermediary writes go beyond the size of the allocation. Prevent this by not spilling the destination register of these opcodes. Alternatively, we can avoid this by splitting the opcode in two: one that produces a 64-bit aligned result and one that takes the 64-bit aligned result as input and produces a 32-bit result from it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	8843c43f7e	i965/vec4: avoid spilling of registers that mix 32-bit and 64-bit access When 64-bit registers are (un)spilled, we need to execute data shuffling code before writing to or after reading from memory. If we have instructions that operate on 64-bit data via 32-bit instructions, (un)spills for the register produced by 32-bit instructions will not do data shuffling at all (because we only see a normal 32-bit instruction seemingly operating on 32-bit data). This means that subsequent reads with that register using DF access will unshuffle data read from memory that was never adequately shuffled when it was written. Fixing this would require identifying which 32-bit instructions write 64-bit data, emitting spill instructions only when the full 64-bit data has been written (by multiple 32-bit instructions writing to different offsets of the same register), and always emitting 64-bit unspills whenever 64-bit data is read, even when the instruction uses a 32-bit type to read from them. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	82c69426a5	i965/vec4: support basic spilling of 64-bit registers The current spilling code can't spill vgrf allocations larger than 1 but SIMD4x2 doubles require 2 vgrfs, so we need to permit this case (which is handled properly for DF data types by emitting 2 scratch messages and doing data shuffling). We accomplish this by not auto-disabling spilling for vgrf allocations with a size of 2, and then disable spilling on any register with an offset != 0B (which indicates array access). Disable spilling of partial DF reads/writes because these don't read/write data for both logical threads and our scratch messages for 64-bit data need data for both threads to be present. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	c762809e49	i965/vec4: run scalarize_df() after spilling Spilling of 64-bit data requires data shuffling for the corresponding scratch read/write messages. This produces unsupported swizzle regions and writemasks that we need to scalarize. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	73610384a8	i965/vec4: prevent src/dst hazards during 64-bit register allocation 8-wide compressed DF operations are executed as two separate 4-wide DF operations. In that scenario, we have to be careful when we allocate register space for their operands to prevent the case where the first half of the instruction overwrites the source of the second half. To do this we mark compressed instructions as having hazards to make sure that the register allocator assigns a register region for the destination that does not overlap with the region assigned for any of its source operands. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	2b57adad00	i965/vec4/scalarize_df: support more swizzles via vstride=0 By exploiting gen7's hardware decompression bug with vstride=0 we gain the capacity to support additional swizzle combinations. This also fixes ZW writes from X/Y channels like in: mov r2.z:df r0.xxxx:df Because DF regions use 2-wide rows with a vstride of 2, the region generated for the source would be r0<2,2,1>.xyxy:DF, which is equivalent to r0.xxzz, so we end up writing r0.z in r2.z instead of r0.x. Using a vertical stride of 0 in these cases we get to replicate the XX swizzle and write what we want. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	c3edacaa28	i965/vec4/scalarize_df: do not scalarize swizzles that we can support natively Certain swizzles like XYZW can be supported by translating only the first two 64-bit swizzle channels to 32-bit channels. This happens with swizzles such that the first two logical components, when translated to 32-bit channels and replicated across the second dvec2 row, select the same channels specified by the 3rd and 4th logical swizzle components. Notice that this opens up the possibility that some instructions are not scalarized and can end up with XY or ZW 32-bit writemasks. Make sure we always scalarize in such cases. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	2f0bc54e2b	i965/vec4: split instructions that read 64-bit interleaved attributes Stages that use interleaved attributes generate regions with a vstride=0 that can hit the gen7 hardware decompression bug. v2: - Make static the function and fix indent (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	0579c85e5c	i965/vec4: dump subnr for FIXED_GRF This came in handy when debugging the payload setup for Tess Eval, since it prints correct subnr for attributes that can be loaded in the second half of a register. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	8e92b40203	i965/vec4/tes: consider register offsets during attribute setup Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	49d4d0268b	i965/vec4/tes: fix setup_payload() for 64bit data types Use a width of 2 with 64-bit attributes. Also, if we have a dvec3/4 attribute that gets split across two registers such that components XY are stored in the second half of a register and components ZW are stored in the first half of the next, we need to fix regioning for any instruction that reads components Z/W of the attribute. Notice this also means that we can't support sources that read cross-dvec2 swizzles (like XZ for example). v2: don't assert that we have a single channel swizzle in the case that we have to fix up Z/W access on the first half of the next register. We can handle any swizzle that does not cross dvec2 boundaries, which the double scalarization pass should have prevented anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	183cd8ab94	i965/vec4/tes: fix input loading for 64bit data types v2: use byte_offset() instead of offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	3e294ab893	i965/vec4/tcs: fix outputs for 64-bit data v2: use byte_offset() instead of offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	639e92ea3c	i965/vec4/tcs: fix input loading for 64-bit data v2: use byte_offset() instead of offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Samuel Iglesias Gonsálvez	74fd0c590b	i965/vec4/gs: fix input loading for 64bit data v2 (Iago): - Adapt 64-bit path to component packing changes. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	b76f2206f5	i965/vec4: fix store output for 64-bit types We need to shuffle the data before it is written to the URB. Also, dvec3/4 need two vec4 slots. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	5fe8d567d8	i965/vec4: fix attribute setup for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	6a01259d8a	i965/vec4: fix indentation in lower_attributes_to_hw_regs() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	ae400e38d9	i965/vec4: make emit_pull_constant_load support 64-bit loads This way callers don't need to know about 64-bit particularities and we reuse some code. v2: - use byte_offset() instead of offset() - only mark the surface as used once Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	df6e3aa6ae	i965/vec4: fix move_push_constants_to_pull_constants() for 64-bit data v2: adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	eee2c0d785	i965/vec4: fix indentation in move_push_constants_to_pull_constants() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	10694be522	i965/vec4: fix move_uniform_array_access_to_pull_constant() for 64-bit data v2: adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	52fb22b646	i965/vec4: fix scratch writes for 64bit data Mostly the same stuff as usual: we need to shuffle the data before we write and we need to emit two 32-bit write messages (with appropriate 32-bit writemask channels set) for a full dvec4 scratch write. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	dcc36f8b29	i965/vec4: fix scratch reads for 64bit data v2: Setup for a 64-bit scratch read by checking the type size of the correct register v3: Use byte_offset() instead of offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	e4d9ab609f	i965/vec4: fix scratch offset for 64bit data A vec4 is 16 bytes and a dvec4 is 32 bytes so for doubles we have to multiply the reladdr by 2. The reg_offset part is in units of 16 bytes and is used to select the low/high 16-byte chunk of a full dvec4, so we don't want to multiply that part of the address. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	07bc6a35d3	i965/vec4: do not split scratch read/write opcodes 64-bit scratch reads/writes require shuffling data around, so we need to have access to the full 64-bit data. We will do the right thing for these when we emit the messages. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	2a857104e4	i965/vec4: Do not use DepCtrl with 64-bit instructions The BDW PRM says that it is not supported, but it seems that gen7 is also affected, since doing DepCtrl on double-float instructions leads to GPU hangs in some cases, which is probably not surprising knowing that this is not supported in new hardware iterations. The SKL PRMs do not mention this restriction, so it is probably fine. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	506154f704	i965/vec4: extend the DWORD multiply DepCtrl restriction to all gen8 platforms v2: - Add Broxton as Intel's internal PRMs say that it is needed (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Samuel Iglesias Gonsálvez	b9cd3f5b49	i965/vec4: don't copy propagate misaligned registers This means we would copy propagate partial reads or writes and that can affect the result. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	93eae0d2a4	i965/vec4: don't propagate single-precision uniforms into 4-wide instructions Otherwise we end up producing code that violates the register region restriction that says that when execsize == width and hstride != 0 the vstride can't be 0. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	6637312847	i965/vec4: Prevent copy propagation from violating pre-gen8 restrictions In gen < 8 instructions that write more than one register need to read more than one register too. Make sure we don't break that restriction by copy propagating from a uniform. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	70cc6b0a02	i965/vec4: prevent copy-propagation from values with a different type size Because the meaning of the swizzles and writemasks involved is different, so replacing the source would lead to different semantics. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Connor Abbott	0fec5e9867	i965/vec4: don't constant propagate 64-bit immediates v2: Also check if the instruction source target is 64-bit. (Samuel) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	8eea41e75d	i965/vec4: Fix SSBO stores for 64-bit data In this case we need to shuffle the 64-bit data before we write it to memory, source from reg_offset + 1 to write components Z and W and consider that each DF channel is twice as big. v2: use byte_offset() instead of offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	9998d55afd	i965/vec4: Fix SSBO loads for 64-bit data Same requirements as for UBO loads. v2: - use byte_offset() instead of offset() (Iago) - keep the const. offset as an immediate like the original code did (Juan) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	4486c90aae	i965/vec4: Fix UBO loads for 64-bit data We need to emit 2 32-bit load messages to load a full dvec4. If only 1 or 2 double components are needed dead-code-elimination will remove the second one. We also need to shuffle the result of the 32-bit messages to form valid 64-bit SIMD4x2 data. v2: - use byte_offset() instead of offset() (Iago) - keep the const. offset as an immediate like the original code did (Juan) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	d8e123cc5d	i965/vec4: Add a shuffle_64bit_data helper SIMD4x2 64bit data is stored in register space like this: r0.0:DF x0 y0 z0 w0 r1.0:DF x1 y1 z1 w1 When we need to write data such as this to memory using 32-bit write messages we need to shuffle it in this fashion: r0.0:DF x0 y0 x1 y1 r0.1:DF z0 w0 z1 w1 and emit two 32-bit write messages, one for r0.0 at base_offset and another one for r0.1 at base_offset+16. We also need to do the inverse operation when we read using 32-bit messages to produce valid SIMD4x2 64bit data from the data read. We can achieve this by applying the exact same shuffling to the data read, although we need to apply different channel enables since the layout of the data is reversed. This helper implements the data shuffling logic and we will use it in various places where we read and write 64bit data from/to memory. v2 (Curro): - Use the writemask helper and don't assert on the original writemask being XYZW. - Use the Vec4 IR builder to simplify the implementation. v3 (Iago): - Use byte_offset() instead of offset(). v3: - Fix typo (Matt) - Clarify the example and fix indentation (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	017c8df35b	i965/vec4: support multiple dispatch widths and groups in the IR builder. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	b3a7d0ee9d	i965/vec4: Lower 64-bit MAD The previous patch made sure that we do not generate MAD instructions for any NIR's 64-bit ffma, but there is nothing preventing i965 from producing MAD instructions as a result of lowerings or optimization passes. This patch makes sure that any 64-bit MAD produced inside the driver after translating from NIR is also converted to MUL+ADD before we generate code. v2: - Use a copy constructor to copy all relevant instruction fields from the original mad into the add and mul instructions v3: - Rename the lowering and fix commit log (Matt) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	82e9dda8bf	i965/vec4/nir: do not emit 64-bit MAD RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0. In that situation, the regioning generated for the sources seems to be equivalent to <4,4,1>:DF, so it will only work for components XY, which means that we have to move any other swizzle to a temporary so that we can source from channel X (or Y) in MAD and we also need to split the instruction (we are already scalarizing DF instructions but there is room for improvement and with MAD we would be more restricted in that area). Also, it seems that MAD operations like this only write proper output for channels X and Y, so writes to Z and W also need to be done to a temporary using channels X/Y and then moved to channels Z or W of the actual dst. As a result the code we produce for native 64-bit MAD instructions is rather bad, and much worse than just emitting MUL+ADD. For reference, a simple case of a fully scalarized dvec4 MAD operation requires 15 instructions if we use native MAD and 8 instructions if we emit ADD+MUL instead. There are some improvements that we can make to the emission of MAD that might bring the instruction count down in some cases, but they come at the expense of a more complex implementation so it does not seem worth it, at least initially. This patch makes translation of NIR's 64-bit ffma instructions produce MUL+ADD instead of MAD. Currently, there is nothing else in the vec4 backend that emits MAD instructions, so this is sufficient and it helps optimization passes see MUL+ADD from the get go. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	83dcd14602	i965/vec4: Skip swizzle to subnr in 3src instructions with DF operands We make scalar sources in 3src instructions use subnr instead of swizzles because they don't really use swizzles. With doubles it is more complicated because we use vstride=0 in more scenarios in which they don't produce scalar regions. Also RepCtrl=1 is not allowed with 64-bit operands, so we should avoid this. v2: Fix typo (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	49be3abbe7	i965/vec4: fix indentation in pack_uniform_registers Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	bdf5498c6b	i965/vec4: fix pack_uniform_registers for doubles We need to consider the fact that dvec3/4 require two vec4 slots. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	23278a75ce	i965/vec4: teach register coalescing about 64-bit Specifically, at least for now, we don't want to deal with the fact that channel sizes for fp64 instructions are twice the size, so prevent coalescing from instructions with a different type size. Also, we should check that if we are coalescing a register from another MOV we are writing the same amount of data in both operations, otherwise we end up writing more or less than the original instruction. This can happen, for example, when we have split fp64 MOVs with an exec size of 4 that only write one register each and then a MOV with exec size of 8 that reads both. We want to prevent the pass from thinking that it can coalesce from the first split MOV alone. Ideally we would like the pass to see that it can coalesce from both split MOVs instead, but for now we keep it simple. Finally, the pass doesn't support coalescing of multiple registers but in the case of normal SIMD4x2 double-precision instructions they naturally write two registers (one per vertex) and there is no reason why we should not allow coalescing in this case. Change the restriction to bail if we see instructions that write more than 8 channels, where the channels can be 32-bit or 64-bit. v2: - Make sure that scan_inst and inst write the same amount of data. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	7c5bf597ef	i965/disasm: fix subreg for dst in Align16 mode There is a single bit for this, so it is a binary 0 or 1 meaning offset 0B or 16B respectively. v2: - Since brw_inst_dst_da16_subreg_nr() is known to be 1, remove it from the expression (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	ac5a06ff83	i965/vec4: implement access to DF source components Z/W The general idea is that with 32-bit swizzles we cannot address DF components Z/W directly, so instead we select the region that starts at the 16B offset into the register and use X/Y swizzles. The above, however, has the caveat that we can't do that without violating register region restrictions unless we probably do some sort of SIMD splitting. Alternatively, we can accomplish what we need without SIMD splitting by exploiting the gen7 hardware decompression bug for instructions with a vstride=0. For example, an instruction like this: mov(8) r2.x:DF r0.2<0>xyzw:DF Activates the hardware bug and produces this region: Component: x0 y0 z0 w0 x1 y1 z1 w1 Register: r0.2 r0.3 r0.2 r0.3 r1.2 r1.3 r1.2 r1.3 Where r0.2 and r0.3 are r0.z:DF for the first vertex of the SIMD4x2 execution and r1.2 and r1.3 are the same for the second vertex. Using this to our advantage we can select r0.z:DF by doing r0.2<0,2,1>.xyxy and r0.w by doing r0.2<0,2,1>.zwzw without needing to split the instruction. Of course, this only works for gen7, but that is the only hardware platform where we implement align16/fp64 at the moment. v2: Adapted to the fact that we now do this after converting to hardware registers (Iago) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	e238601a2d	i965/vec4: translate 64-bit swizzles to 32-bit The hardware can only operate with 32-bit swizzles, which is a rather limiting restriction. However, the idea is not to expose this to the optimization passes, which would be a mess to deal with. Instead, we let the bulk of the vec4 backend ignore this fact and we fix the swizzles right at codegen time. At the moment the pass only needs to handle single value swizzles thanks to the scalarization pass that runs before it. Notice that this only works for X/Y swizzles. We will add support for Z/W swizzles in the next patch, since they need a bit more work. v2 (Sam): - Do not expand swizzle of 64-bit immediate values. v3: - Do this after translation to hardware registers instead of doing it right before so we don't need the force_vstride0 flag (Curro). - Squashed patch that included FIXED_GRF in the list of register files that need this translation (Iago). - Remove swizzle assignments for VGRF and UNIFORM files in convert_to_hw_regs(), they will be set by apply_logical_swizzle() (Iago). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	fb7cb853c9	i965/vec4: add a scalarization pass for double-precision instructions The hardware only supports 32-bit swizzles, which means that we can only access directly channels XY of a DF making access to channels ZW more difficult, especially considering the various regioning restrictions imposed by the hardware. The combination of both things makes handling random swizzles on DF operands rather difficult, as there are many combinations that can't be represented at all, at least not without some work and some level of instruction splitting depending on the case. Writemasks are 64-bit in general, however XY and ZW writemasks also work in 32-bit, which means these writemasks can't be represented natively, adding to the complexity. For now, we decided to try and simplify things as much as possible to avoid dealing with all this from the get go by adding a scalarization pass that runs after the main optimization loop. By fully scalarizing DF instructions in align16 we avoid most of the complexity introduced by the aforementioned hardware restrictions and we have an easier path to an initial fully functional version for the vector backend in Haswell and IvyBridge. Later, we can improve the implementation so we don't necessarily scalarize everything, iteratively adding more complexity and building on top of a framework that is already working. Curro drafted some ideas for how this could be done here: https://bugs.freedesktop.org/show_bug.cgi?id=92760#c82 v2: - Use a copy constructor for the scalar instructions so we copy all relevant instructions fields from the original instruction. v3: Fix indentation in one switch (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	f4b8649233	i965/vec4: split double-precision SEL There is a hardware bug affecting compressed double-precision SEL instructions in align16 mode by which they won't read predication mask properly. The bug does not affect other predicated instructions and it does not affect SEL in Align1 mode either. This was found empirically and verified by Curro in the simulator. Fix this by splitting double-precision SEL in Align16 mode to use an execution size of 4. v2: Check that the dst type is 64-bit, since we can have 16-wide single precision bcsel instructions that also write 2 registers. v3: Replace bcsel by SEL in all the comments as bcsel is the nir opcode but SEL is the actual assembly instruction (Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	5356d52f31	i965/vec4: teach cmod propagation about different execution sizes We can't propagate the conditional modifier from one instruction to another of a different execution size / group, since that would change the channels affected by the conditional. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	8f39b3668a	i965/vec4: teach CSE about exec_size, group and doubles v2: adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	ca63a3ce51	i965/disasm: print NibCtrl for instructions with execsize < 8 v2 (Curro): - Print it also for execsize < 4. - QtrCtrl is still in effect, so print 2 * qtr_ctl + nib_ctl + 1 - Do not read the nib ctl from the instruction in gen < 7, the field only exists in gen7+. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	a83608f504	i965/vec4: dump NibCtrl for instructions with execsize != 8 v2: do it in the same fashion as the FS backend for consistency (Curro) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	e481dcc35e	i965/vec4: make the generator set correct NibCtrl for SIMD4 DF instructions From the HSW PRM, Command Reference, QtrCtrl: "NibCtrl is only allowed for SIMD4 instructions with a DF (Double Float) source or destination type." v2: Assert that the type is DF (Samuel) v3: Don't set the default group to 0 and then set it only for 4-wide instructions. Instead, assert that exec size and group are always a correct match and then always set the default group from the instruction. (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	58767f0fec	i965/vec4: add a SIMD lowering pass Generally, instructions in Align16 mode only ever write to a single register and don't need any form of SIMD splitting, that's why we have never had a SIMD splitting pass in the vec4 backend. However, double-precision instructions typically write 2 registers and in some cases they run into certain hardware bugs and limitations that we need to work around by splitting the instructions so we only write to 1 register at a time. This patch implements a SIMD splitting pass similar to the one in the scalar backend. Because we only use double-precision instructions in Align16 mode in gen7 (gen8+ is fully scalar and gens < 7 do not implement fp64) the pass should be a no-op on any other generation. For now the pass only handles the gen7 restriction where any instruction that writes 2 registers also needs to read 2 registers. This affects double-precision instructions reading uniforms, for example. Later patches will extend the lowering pass adding a few more cases. v2: - Move the simd lowering pass after the main optimization loop and run copy-propagation and dce if it reports progress (Curro) - Compute number of registers written instead of fixing it to 1 (Iago) - Use group from backend_instruction (Iago) - Drop assertion that checked that we only split 8-wide instructions into 4-wide. (Curro) - Don't assume that instructions can only be 8-wide, we might want to use 16-wide instructions in the future too (Curro) - Wrap gen7 workarounds in a conditional to ease adding workarounds for other gens in the future (Curro) - Handle dst/src overlap hazard (Curro) - Use the horiz_offset() helper to simplify the implementation (Curro) - Drop the assertion that checks that each split instruction writes exactly one register (Curro) - Use the copy constructor to generate split instructions with all the relevant fields initialized to the values in the original instruction instead of copying only a handful of them manually (Curro) v3 (Iago): - When copying to a temporary, allocate the number of registers required for the copy based on the size written of the lowered instruction instead of assuming that all lowered instructions produce single-register writes - Adapt to changes in offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	945269ab72	i965: move the group field from fs_inst to backend_instruction. Just like the exec_size, we are going to need this in the vec4 backend when we implement a simd splitting pass. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	07cadc306e	i965/vec4: add a horiz_offset() helper This will come in handy when we implement a simd lowering pass in a follow-up patch. v2: use byte_offset() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Juan A. Suarez Romero	4ea3bf8ebb	i965/vec4: handle 32 and 64 bit channels in liveness analysis Our current data flow analysis does not take into account that channels on 64-bit operands are 64-bit. This is a problem when the same register is accessed using both 64-bit and 32-bit channels. This is very common in operations where we need to access 64-bit data in 32-bit chunks, such as the double packing and unpacking operations. This patch changes the analysis by checking the bits that each source or destination datatype needs. Actually, rather than bits, we use blocks of 32bits, which is the minimum channel size. Because a vgrf can contain a dvec4 (256 bits), we reserve 8 32-bit blocks to map the channels. v2 (Curro): - Simplify code by making the var_from_reg helpers take an extra argument with the register component we want. - Fix a couple of cases where we had to update the code to the new way of representing live variables. v3: - Fix indent in multiline expressions (Matt) - Fix comment's closing tag (Matt) - Use DIV_ROUND_UP(inst->size_written, 16) instead of 2 * regs_written(inst) to avoid rounding issues. The same for regs_read(i). (Curro). - Add asserts in var_from_reg() to avoid exceeding the allocated registers (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	29dd5cf9d6	i965/vec4: dump the instruction execution size Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	486fd5422c	i965/vec4: use the IR's execution size In the vec4 backend the generator sets to 8 the execution size for all instructions by default, however, to implement 64-bit floating-point we will need to split certain instruction into smaller sizes so we need the IR to convey this information like we do in the scalar backend. This patch uses the execution size from the vec4 IR. We will use this feature in a later patch when we implement a SIMD splitting pass. v2: - Drop the assertion on the execution size being 8 or 4 (Curro) - Use exec_size from backend_instruction (Curro) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	f79547840a	i965/vec4: fix regs_read() for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	7c6fba5e7c	i965/vec4: fix size_written for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:51 +01:00
Iago Toral Quiroga	9527a50da0	i965: move exec_size from fs_instruction to backend_instruction We are going to need this in the vec4 backend too. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Samuel Iglesias Gonsálvez	b58026b31e	i965/vec4: use the new helper function to create double immediates Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	98da3623d5	i965/vec4: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2 (Curro): - Use swizzle() and writemask() helpers and make tmp const. v3 (Iago): - Adapt to changes in offset() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	8f9ce5fa22	i965/vec4: fix optimize predicate for doubles Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	1816ae8f68	i965/vec4: implement fsign() for doubles v2: use a MOV with a conditional_mod instead of a CMP, like we do in d2b, to skip loading a double immediate. v3: Fix comment (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	6e570619e0	i965/vec4: implement d2b v2 (Curro): - Generate the flag register with a MOV with conditional_mod instead of a CMP instruction, which has the benefit that we can skip loading a DF 0.0 constant. - Avoid the PICK_LOW_32BIT + MOV by using the flag result and a SEL to set the boolean result. v3: - Fix comment (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	c1fb525016	i965/vec4: implement d2i, d2u, i2d and u2d Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	4b22576234	i965/vec4: implement HW workaround for align16 double to float conversion From the BDW PRM, Workarounds chapter: "DF->f format conversion for Align16 has wrong emask calculation when source is immediate." Notice that Broadwell and later are strictly scalar at the moment though, so this is not really necessary. v2: Instead of moving the immediate to a vgrf and converting from there, just convert the double immediate to float in the compiler and move the result to the destination (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	bfc1f0f017	i965/vec4: add helpers for conversions to/from doubles Use these helpers to implement d2f and f2d. We will reuse these helpers when we implement things like d2i or i2d as well. v2: - Rename the helpers (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	c722a8e61e	i965/vec4: Rename DF to/from F generator opcodes The opcodes are not specific for conversions to/from float since we need the same for conversions to/from other 32-bit types. Rename the opcodes accordingly and change the asserts to check the size of the types involved instead. v2: - Rename to VEC4_OPCODE_TO_DOUBLE and VEC4_OPCODE_FROM_DOUBLE (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	619271ec87	i965/vec4: fix register allocation for 64-bit undef sources Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	21cf6f14d5	i965/vec4: make opt_vector_float ignore doubles The pass does not support doubles in its current form. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	a8318b120e	i965/vec4: fix get_nir_dest() to use DF type for 64-bit destinations v2: Make dst_reg_for_nir_reg() handle this for nir_register since we want to have the correct type set before we call offset(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	bb0e67d55d	i965/vec4: fix indentation in get_nir_src() Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	8cdbbbd2cf	i965/vec4/nir: implement double comparisons v2: - Added newline before if() (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	8a3ba03339	i965/vec4: implement double packing Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	94cfdf586a	i965/vec4: implement double unpacking Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	7ec57e91d6	i965/vec4: don't copy propagate vector opcodes that operate in align1 mode Basically, ALIGN1 mode will ignore swizzles on the input vectors so we don't want the copy propagation pass to mess with them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	553700cf55	i965/vec4: Fix DCE for VEC4_OPCODE_SET_{LOW,HIGH}_32BIT These align1 opcodes do partial writes of 64-bit data. The problem is that we want to use them to write on the same register to implement packDouble2x32 and from the point of view of DCE, since both opcodes write to the same register, only the last one stands and decides to eliminate the first, which is not correct, so prevent this from happening. v2: Make a helper in vec4_instruction to know if the instruction is an align1 partial write. This will come in handy when we implement a simd splitting pass in a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	54b998e0e4	i965/vec4: add VEC4_OPCODE_SET_{LOW,HIGH}_32BIT opcodes These opcodes will set the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this to implement packDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in terms of 64-bit logical swizzles for DF operands all the way up to codegen. v2: - use suboffset() instead of get_element_ud() - no need to set the width on the dst Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	6979e5a412	i965/vec4: add VEC4_OPCODE_PICK_{LOW,HIGH}_32BIT opcodes These opcodes will pick the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this, for example, to do things like unpackDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in terms of 64-bit logical swizzles for DF operands all the way up to codegen. v2: - use suboffset() instead of get_element_ud() - no need to set the width on the dst Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	9b6174dffa	i965/vec4: add dst_null_df() Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	4c040332f5	i965/vec4: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	611fe6b32f	i965/disasm: align16 DF source regions have a width of 2 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	c35fa7ac55	i965/vec4: set correct register regions for 32-bit and 64-bit For 32-bit instructions we want to use <4,4,1> regions for VGRF sources so we should really set a width of 4 (we were setting 8). For 64-bit instructions we want to use a width of 2 because the hardware uses 32-bit swizzles, meaning that we can only address 2 consecutive 64-bit components in a row. Also, Curro suggested that the hardware is probably fixing the width to 2 for 64-bit instructions anyway, so just go with that and use <2,2,1>. v2: - No need to explicitly set the vertical stride of 64-bit regions to 2, brw_vecn_grf with a width of 2 will do that for us. - No need to adjust the width of dst registers. v3 (Ian): - Make type_size and width const. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Connor Abbott	ed74b19ab4	i965: add brw_vecn_grf() Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	e09a6be3b6	i965/vec4: translate d2f/f2d Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	558f279531	i965/vec4: add double/float conversion pseudo-opcodes These need to be emitted as align1 MOV's, since they need to have a stride of 2 on the float register (whether src or dest) so that data from another thread doesn't cross the middle of a SIMD8 register. v2 (Iago): - The float-to-double needs to align 32-bit data to 64-bit before doing the conversion. This was doable in align16 when we tried to use an execsize of 4, but with an execsize of 8 we would need another align1 opcode to do that (since we need data to cross the middle of a SIMD register). Just making the opcode handle this internally seems more practical than adding another opcode just for this purpose and having the caller know about this before converting. - The double-to-float conversion produces 32-bit elements aligned to 64-bit so we make the opcode re-pack the result to 32-bit and fit in one register, as expected by SIMD4x2 operation. This still requires that callers reserve two registers for the float data destination because we need to produce 64-bit aligned data first, and repack it later on the same destination register, but it saves the need for a re-pack opcode only to achieve this making the operation complete in a single opcode. Hopefully that is worth the weirdness of the double register allocation... Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Connor Abbott	2d6eee3144	i965/vec4: add support for printing DF immediates Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	9ce4b20bde	i965/vec4/nir: fix emitting 64-bit immediates Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Connor Abbott	3457252b74	i965/vec4/nir: set the right type for 64-bit registers Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	fef06f6356	i965/vec4/nir: support doubles in ALU operations Basically, this involves considering the bit-size information to set the appropriate type on both operands and destination. v2 (Curro) - Don't use two temporaries (and write one of them twice) to obtain the nir_alu_type. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Iago Toral Quiroga	0f096b1e5a	i965/vec4/nir: Add bit-size information to types Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Connor Abbott	2d81a29203	i965/vec4/nir: allocate two registers for dvec3/dvec4 v2 (Curro): - Do not special-case for a bit-size of 64, divide the bit_size by 32 instead. - Use DIV_ROUND_UP so we can handle sub-32-bit types. v3 (Ian): - Make num_regs const. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Connor Abbott	54913850aa	i965/vec4/nir: simplify glsl_type_for_nir_alu_type() Less duplication, one less case to handle for doubles and support for sized NIR types. v2: Fix call to get_instance by swapping rows and columns params (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Samuel Iglesias Gonsálvez	9fa24632f3	i965/nir: double/dvec2 uniforms only need to be padded to a single vec4 slot max_vector_size is used in the vec4 backend to pad out the uniform components to match a size that is a multiple of a vec4. Double and dvec2 uniforms only require a single vec4 slot, not two. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 11:26:50 +01:00
Samuel Iglesias Gonsálvez	c5ae6e78fc	i965/fs: fix exec_size when emitting DIM instruction Otherwise, DIM instructions will be emitted with the default exec size which could be 16 in some cases, that is not legal. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-03 06:48:39 +01:00
Timothy Arceri	22639a6e19	st/mesa: get Version from gl_program rather than gl_shader_program Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-03 12:57:24 +11:00
Timothy Arceri	2c0d267717	i965: stop passing gl_shader_program to brw_compile_gs() and gen6_gs_visitor() Instead we can just use gl_program. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-03 12:20:10 +11:00
Timothy Arceri	b880281f0b	i965: get InfoLog and LinkStatus via the shader program data pointer in gl_program This removes another dependency on gl_shader_program in the codegen functions. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-03 12:20:10 +11:00
Timothy Arceri	340b22c217	i965: eliminate gen6_xfb_enabled field in brw_gs_prog_data We can just get this information from shader_info instead. Note that passing gen6_gs_visitor() gl_program via _LinkedShaders will go away in a later patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-03 12:20:10 +11:00
Timothy Arceri	6643da6d7f	i965: update brw_get_shader_time_index() not to take gl_shader_program This removes another dependency on gl_shader_program in the codegen functions which will help allow us to use gl_program in the CurrentProgram array rather than gl_shader_program. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-03 12:20:10 +11:00
Marek Olšák	cb6f49a902	gallium/hud: fix the windows build by disabling file dumping	2017-01-02 23:18:28 +01:00
Kenneth Graunke	bc7f1eddbd	glsl: Update ES 3.2 shader output restrictions. This disallows fancy varyings in tessellation and geometry shaders, as required by ES 3.2. Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.negative.per_patch_array_of_structs dEQP-GLES31.functional.tessellation.user_defined_io.negative.per_patch_structs_containing_arrays (Not a candidate for stable branches as it only disallows things which should be working as desktop GL allows them.) v2: Update error messages to not say "vertex shader" (caught by Iago). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-01-02 14:10:50 -08:00
Ben Widawsky	fc78ee5da0	i965/miptree: Create a disable CCS flag Cc: Chad Versace <chadversary@chromium.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-02 10:35:17 -08:00
Ben Widawsky	d0b6a949f8	i965: Replace bool aux disable with enum As CCS buffers are passed to KMS, it becomes useful to be able to determine exactly what type of aux buffers are disabled. This was previously not entirely needed (though the code was a little more confusing), however it becomes very desirable after a recent patch from Chad: commit `1c8be049be` Author: Chad Versace <chadversary@chromium.org> Date: Fri Dec 9 16:18:11 2016 -0800 i965/mt: Disable aux surfaces after making miptree shareable The next patch will handle CCS and get rid of no_ccs. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-01-02 10:35:13 -08:00
Edmondo Tommasina	3f5fba8a7b	docs: document GALLIUM_HUD_DUMP_DIR envvar Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-01 00:03:39 +01:00
Edmondo Tommasina	5b9d76296f	gallium/hud: set filedescriptor for fps graph Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-01 00:03:38 +01:00
Edmondo Tommasina	94c9916710	gallium/hud: set filedescriptor for cpu graph Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-01 00:03:38 +01:00
Edmondo Tommasina	57f86fb3a8	gallium/hud: move file initialization to a function The function will be used later to create the filedescriptor for other metrics. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-01 00:03:38 +01:00
Edmondo Tommasina	22cd9040da	gallium/hud: dump hud_driver_query values to files Dump values for every selected data source in GALLIUM_HUD. Every data source has its own file and the filename is equal to the data source identifier. Set GALLIUM_HUD_DUMP_DIR to dump values to files in this directory. No values are dumped if the environment variable is not set, the directory doesn't exist or the user doesn't have write access. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-01 00:03:06 +01:00
Ilia Mirkin	1f13cb8b15	anv,radv: disable StorageImageWriteWithoutFormat for now The SPIR-V capability isn't even marked as enabled, and there are no tests in Vulkan-CTS. Per Jason Ekstrand, this won't work in anv as such write-only surfaces require additional setup which is currently not performed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-12-31 16:38:00 -05:00
Kenneth Graunke	62a8191841	i965: Avoid NULL pointer dereference when transform feedback is off. upload_3dstate_streamout can be called when there's no currently bound transform feedback object. In this case, we get the default object, which has a NULL shader (previously gl_shader_program, now gl_program). The old code did something sketchy, but which worked: const struct gl_transform_feedback_info *linked_xfb_info = &xfb_obj->shader_program->LinkedTransformFeedback; Here, if shader_program is NULL, this would be a bogus pointer of 0x60. But we never actually dereferenced it, so it worked out. With Timothy's recent reworks, we actually end up dereferencing xfb_obj->program along the way, which crashes since it's NULL. The solution is to move this pointer initialization into the "active" block, where we know it actually exists and won't be bogus. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99231 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-30 15:46:22 -08:00
Timothy Arceri	68245aa6f5	glsl/mesa: add reference to gl_shader_program_data from gl_program We also add the stubs for the standalone compiler in this change. By adding a reference here we can now refactor some code to use gl_program where we were previously awkwardly using gl_shader_program. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-31 09:48:51 +11:00
Timothy Arceri	9d99dc4bc1	mesa: make union in gl_program a struct and add FIXME i915 is mixing the use of these fields, for now change this to a struct and add a FIXME. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99229	2016-12-31 09:00:05 +11:00
Jason Ekstrand	c2799a80c5	i965/peephole_ffma: Use nir_builder Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	8495ece52e	nir/split_var_copies: Use a nir_shader rather than a void *mem_ctx Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	ffa4ba71d9	nir/opt_peephole_select: Pass around the actual nir_shader Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	cd6f736c07	nir/conditional_if: Properly use the builder We were passing around a void *mem_ctx and using that to initialize the builder which was wrong since that pointed to ralloc_parent(impl) which is the shader but the builder is supposed to be initialized with the nir_function_impl. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	47b54a6f74	nir/lower_var_copies: Use a shader rather than a void *mem_ctx Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	c4ccdfa513	nir/lower_io: Use the builder instead of carrying a mem_ctx Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	c8e0612165	nir/from_ssa: Use nir_builder for emit_copy This lets us get rid of the void *mem_ctx parameter and make things a bit more type safe. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Jason Ekstrand	134a5ad31c	nir: Make nir_copy_deref follow the "clone" pattern We rename it to nir_deref_clone, re-order the sources to match the other clone functions, and expose nir_deref_var_clone. This last part, in particular, lets us get rid of quite a few lines since we no longer have to call nir_copy_deref and wrap it in deref_as_var. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-30 12:38:04 -08:00
Rob Clark	832dddcf91	freedreno/ir3: rework varying slots (maybe??) See: dEQP-GLES2.functional.shaders.swizzles.vector_swizzles.mediump_vec2_yyyy_fragment if we only access (in FS) varying.y then it ends up in slot zero.. I'm not sure the hw likes that.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-30 13:49:57 -05:00
Ilia Mirkin	36c648b894	spirv: always expose SpvCapabilityStorageImageExtendedFormats I forgot to do this in commit `76b97d544e` ("anv: enable storage image extended formats"). Since both drivers support this now, no need for the conditional enable. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-29 22:09:58 -05:00
Ilia Mirkin	c633f228b4	anv: add support for extended texture gather Now that the SPIR-V -> NIR translation is in place, no additional logic is required. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-12-29 20:43:33 -05:00
Dave Airlie	80bafc0c11	radv: only allow cmask/dcc in color optimal. I had this on transfers due to the clear color cmd, but it seems like that path shouldn't get fast clears. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-30 00:04:16 +00:00
Dave Airlie	1814df7ea7	radv: only allow cmask/dcc on exclusive or concurrent with graphics queue. Otherwise we don't get the barriers to flush dcc etc. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-30 00:04:01 +00:00
Jason Ekstrand	a4d1eb443e	nir: Rewrite lower_regs_to_ssa to use the phi builder This keeps some of Connor's original code. However, while I was at it, I updated this very old pass to a bit more modern NIR.	2016-12-29 16:02:44 -08:00
Jason Ekstrand	67a70889f6	nir/phi-builder: Set the value in the block when creating a phi After we figure out the value that we are going to return, we have a loop that walks up the dominance tree and sets the value in each of the blocks that doesn't have one yet. In the case of the phi, the def is set to NEEDS_PHI not NULL, so the last one where the phi node actually goes never gets filled out. This can lead to duplicating the phi node unnecessarily.	2016-12-29 16:02:44 -08:00
Jason Ekstrand	baf1aa1334	nir: Add foreach_register helper macros	2016-12-29 16:02:44 -08:00
Jason Ekstrand	fb181196de	nir: Rename convert_to_ssa lower_regs_to_ssa This matches the naming of nir_lower_vars_to_ssa, the other to-SSA pass.	2016-12-29 16:02:44 -08:00
Timothy Arceri	194537ebe4	mesa/glsl/i965: remove Driver.NewShader() After removing brw_shader in the previous commit this is no longer needed. V2: remove use in src/compiler/glsl/test_optpass.cpp Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:17 +11:00
Timothy Arceri	718a0cf49f	i965: move compiled_once flag to brw_program This allows us to delete brw_shader and removes the last use of gl_linked_shader in the codegen paths. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	8417bf528e	mesa/glsl: move BlendSupport bitfield to gl_program This will let us make _CurrentFragmentProgram a gl_program pointer, allowing for simplifications to be made. We also need to add a field to gl_shader to hold it during parsing. In gl_program we put it inside a union in anticipation of moving more fields here that can be only fs or vertex stage fields. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	3177eef392	mesa: store gl_program in gl_transform_feedback_object rather than gl_shader_program This will allow us to make the CurrentProgram array store gl_program which allows us to do a bunch of simplifications. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	700bc94dce	mesa/glsl: move LinkedTransformFeedback from gl_shader_program to gl_program This will help allow us to store gl_program in the CurrentProgram array rather than gl_shader_program which will allow a bunch of simplifications. Note that we make LinkedTransformFeedback a pointer so we don't waste memory creating a struct for each stage. We also store a pointer to the gl_program that will contain the pointer in gl_shader_program so we can get easy access to the correct stage. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	31c04e4e22	i965: get LinkedTransformFeedback from gl_transform_feedback_object We have already set the gl_shader_program pointer to the correct shader program in _mesa_BeginTransformFeedback() so use it. This is more consistent with how we do it for gen7. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	29d70f5de9	mesa: move _Used to gl_program We no longer need to initialise it because gl_program is never reused. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	8a69ae5345	mesa/compiler: add local_size_variable to shader_info This will be used in api_validate.c in a following patch when we switch to using gl_program pointers for the pipelines CurrentProgram array. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	9ea513e226	mesa: pass gl_program to _mesa_append_uniforms_to_file() This now contains everything we need. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	b51bfbdd85	glsl/mesa: set separate_shader directly in shader_info Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	41dd6c3539	mesa/glsl: move subroutine metadata to gl_program This will allow us to store gl_program rather than gl_shader_program as the current program per stage, which allows us to simplify code that makes use of the CurrentProgram list. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:16 +11:00
Timothy Arceri	0de6f6223a	mesa/compiler: add stage to shader_info This will allow us to simplify the current program logic for SSO. Also since we aim to detach shader_info from nir_shader this will come in handy avoiding passing nir_shader around just to keep track of the stage we are dealing with. V2: set stage for arb asm programs also. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-30 10:57:15 +11:00
Eric Anholt	88b41239f9	vc4: Rework scheduling of thread switch to cut one more NOP. Jonas's patch got us most of the benefit of scheduling instructions into the delay slots of thread switch, but if there had been nothing to pair the thrsw with, it would move the thrsw up and leave a NOP where the thrsw was. Instead, don't pair anything with thrsw through the normal scheduling path, and have a separate helper function that inserts the thrsw earlier if possible and inserts any necessary NOPs. total instructions in shared programs: 93027 -> 92643 (-0.41%) instructions in affected programs: 14952 -> 14568 (-2.57%)	2016-12-29 15:22:54 -08:00
Jonas Pfeil	d82dbc4cde	vc4: Fill thread switching delay slots Scan for instructions without a signal set in front of the switching instruction and move the signal up there. shader-db results: total instructions in shared programs: 94494 -> 93027 (-1.55%) instructions in affected programs: 23545 -> 22078 (-6.23%) v2: Fix re-emitting of the instruction in the loop trying to emit NOPs, drop a scheduling change from branch delay slots. (by anholt) Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>	2016-12-29 14:41:09 -08:00
Eric Anholt	63e7671c7e	vc4: Enable NIR-based loop unrolling. This successfully unrolls a new shader in GLB2.7, which also gets that shader to successfully compile in multithreaded mode.	2016-12-29 14:41:09 -08:00
Timothy Arceri	5f323198ea	nir: stop gcc warning about uninitialised variables Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-29 13:47:11 +11:00
Dave Airlie	44f833ab18	radv: denote support for extended storage image formats. I'm sure anv has support for these as well, but this is just a first use of the interface to allow different supported spir-v features. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-28 22:44:40 +00:00
Dave Airlie	de7dd4d621	spirv: add interface for drivers to define support extensions. I expect over time the struct contents will change as all drivers support stuff etc, but for now this should be a good starting point. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-28 22:43:17 +00:00
Chad Versace	464b23b1f2	mesa/shaderobj: Fix races on refcounts Use atomic ops when updating gl_shader::RefCount. Fixes intermittent failures and crashes in 'dEQP-EGL.functional.sharing.gles2.multithread.*'. All tests in that group now pass except 'dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.textures.copyteximage2d_texsubimage2d_render'. Tested with: mesa: branch 'master' at `d6545f2` deqp: branch 'nougat-cts-dev' at 4acf725 with additional local fixes DEQP_TARGET: x11_egl hw: Intel Broadwell 0x1616 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99085 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Cc: Mark Janes <mark.a.janes@intel.com> Cc: Haixia Shi <hshi@chromium.org>	2016-12-28 11:10:43 -08:00
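The race fixed above is the classic one where two threads update a plain integer refcount at the same time and one update is lost. Here is a minimal sketch of atomic reference counting, assuming C11 `<stdatomic.h>` rather than Mesa's own atomic helpers; `struct refcounted` and the `ref_*` functions are hypothetical names, not the `gl_shader` code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not Mesa's gl_shader. */
struct refcounted {
    atomic_int ref_count;
};

static void ref_init(struct refcounted *obj)
{
    atomic_init(&obj->ref_count, 1);
}

static void ref_acquire(struct refcounted *obj)
{
    /* fetch_add is a single read-modify-write, so concurrent acquires
     * cannot lose an increment. */
    int old = atomic_fetch_add(&obj->ref_count, 1);
    assert(old > 0);
    (void)old;
}

/* Returns true when the caller dropped the last reference and is
 * therefore the only thread allowed to free the object. */
static bool ref_release(struct refcounted *obj)
{
    int old = atomic_fetch_sub(&obj->ref_count, 1);
    assert(old > 0);
    return old == 1;
}
```

Because `atomic_fetch_sub` returns the previous value, exactly one releasing thread observes 1 and takes responsibility for destruction.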
Rob Clark	ec01ef2db1	freedreno/ir3: fix linkage::var size It should actually be 32 for a4xx/a5xx.. we still only advertise 16 but for a5xx the linkage map includes position/psize. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	c416ea31cf	freedreno/ir3: treat clipvertex like a normal varying We need this in case it is streamed out. Not sure why we were treating it specially before. Having it as a VS out is harmless if FS doesn't have a matching input. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	d10c5a2481	freedreno/a5xx: transform-feedback support We'll need to revisit when adding hw binning pass support, whether we can still do this in main draw step, as we do w/ a3xx/a4xx, or if we needed to move it to the binning stage. Still some failing piglits but most tests pass and the common cases seem to work. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	928e9bd602	freedreno: update generated headers Pull in a5xx streamout related regs. Also fixes a couple incorrect register definitions. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	6d77ceb701	freedreno/ir3: UBO support for 64b GPUs (a5xx) Update address calculation to support 64b addresses. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	fc10dc9fde	freedreno/ir3: rework location of driver constants Rework how we lay out driver constants (driver-params, UBO/TFBO buffer addresses, immediates) for more flexibility. For a5xx+ we need to deal with the fact that gpu ptrs are 64b instead of 32b, which makes the fixed offset scheme not work so well. While we are dealing with that we might also make the layout more dynamic to account for varying # of UBOs, etc. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	09202cde7e	freedreno/a5xx: fix emit for bo addresses Reloc for the buffer address is two dwords on 64b devices (a5xx+) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	f043904080	freedreno/a5xx: texture layout Seems to be similar to a4xx, and sampler state "array-pitch" needs to be aligned to page size. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-27 16:54:01 -05:00
Rob Clark	859cb24d94	ttn: set ->info->num_ubos For dealing w/ 32b vs 64b gpu addresses, I need to rework how we pass UBO buffer addresses to shader, and knowing up front the # of UBOs is useful. But I noticed ttn wasn't setting this. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-27 16:54:01 -05:00
Chad Versace	d6545f2345	anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0 The spec implicitly allows the incoming count to be 0. From the Vulkan 1.0.38 spec, Section 4.1 Physical Devices: If the value referenced by pQueueFamilyPropertyCount is not 0 [then do stuff]. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-27 12:31:34 -08:00
Chad Versace	b85c0b569f	egl: Emit correct error when robust context creation fails Fixes dEQP-EGL.functional.create_context_ext.robust_* on Intel with GBM. If the user sets the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR in EGL_CONTEXT_FLAGS_KHR when creating an OpenGL ES context, then EGL_KHR_create_context spec requires that we unconditionally emit EGL_BAD_ATTRIBUTE because that flag does not exist for OpenGL ES. When creating an OpenGL context, the spec requires that we emit EGL_BAD_MATCH if we can't support the request; that error is generated in the egl_dri2 layer where the driver capability is actually checked. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99188 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-12-27 10:21:29 -08:00
Damien Grassart	75252826e8	anv: return count of queue families written The Vulkan spec indicates that vkGetPhysicalDeviceQueueFamilyProperties() should overwrite pQueueFamilyPropertyCount with the number of structures actually written to pQueueFamilyProperties. Signed-off-by: Damien Grassart <damien@grassart.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Cc: mesa-stable@lists.freedesktop.org	2016-12-27 10:15:47 -08:00
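The count handling described in the two anv queue-family entries above follows the usual Vulkan enumerate pattern: a NULL array means "tell me how many", otherwise the count is an in/out value. Here is a rough sketch of that contract for a driver with one queue family. `get_queue_family_props`, `struct queue_family_props` and `NUM_QUEUE_FAMILIES` are hypothetical names for illustration, not the anv implementation or the real Vulkan types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical property record; the real Vulkan struct has more fields. */
struct queue_family_props {
    uint32_t queue_count;
};

/* Hypothetical driver exposing a single queue family. */
#define NUM_QUEUE_FAMILIES 1u

static void get_queue_family_props(uint32_t *count,
                                   struct queue_family_props *props)
{
    if (props == NULL) {
        /* Query mode: report how many entries are available. */
        *count = NUM_QUEUE_FAMILIES;
        return;
    }

    /* Write at most *count entries; an incoming count of 0 writes nothing. */
    uint32_t written = 0;
    while (written < *count && written < NUM_QUEUE_FAMILIES) {
        props[written].queue_count = 1;
        written++;
    }

    /* Overwrite the count with the number of entries actually written. */
    *count = written;
}
```

A caller typically calls once with `props == NULL` to size its array, then again with the array and its capacity.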
Chad Versace	e2d69d5e2d	i965: Allow import/export of ARGB1555 images To my knowledge, this fixes no tests. I simply wrote the patch for completeness as a follow-up to the previous two patches. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-12-27 09:14:04 -08:00
Chad Versace	f3739810e3	mesa/texformat: Handle GL_RGBA + GL_UNSIGNED_SHORT_5_5_5_1 _mesa_choose_tex_format() already handles GL_RGBA + GL_UNSIGNED_SHORT_1_5_5_5_REV by converting it to MESA_FORMAT_B5G5R5A1_UNORM. Teach it to do the same for the non-reversed type. Otherwise, the switch's fallthrough converts it to an 8888 format, which has incompatible precision in the alpha channel. Patch 2/2 to fix dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8 on Intel. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99185 Cc: Haixia Shi <hshi@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-27 09:14:00 -08:00
Chad Versace	9aa6ab0748	dri: Add __DRI_IMAGE_FORMAT_ARGB1555 This allows eglCreateImage() to accept textures of said format. Patch 1/2 to fix dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8 on Intel. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99185 Cc: Haixia Shi <hshi@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-27 09:13:43 -08:00
Tapani Pälli	4d6d4f939e	egl/dri2: implement query surface hook This better guarantees that the values we return are in sync with what the underlying drawable currently has. Together with the dEQP change in bug #98327, this fixes the following test: dEQP-EGL.functional.resize.surface_size.grow v2: avoid unnecessary x11 roundtrips (Chad Versace) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98327	2016-12-27 08:01:08 +02:00
Dave Airlie	d8423772ca	radv: add some asserts for operations on general queue These might be useful in the future, or not. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-27 03:27:14 +00:00
Bas Nieuwenhuizen	059af2515a	radv: Also skip DCC clear flushes for compute. (airlied: fixes DOOM hang with compute queue enabled) Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Bas Nieuwenhuizen <basni@google.com>	2016-12-27 03:27:13 +00:00
Dave Airlie	3fd306b423	radv: handle queue present directly to winsys Don't call the QueueSubmit interface, just call direct to the winsys, so we can pass the wait semaphores. Noticed while debugging doom, doesn't fix anything. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-26 22:20:35 +00:00
Jordan Justen	097c9dc2d4	intel/blorp_blit: Fix max blit size for gen6 Fixes ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_stencil_blit Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-26 08:50:21 -08:00
Dave Airlie	b5bb8b54cf	radv: fix rendering to b10g11r11_ufloat_pack32 doom was causing a printf about an illegal color; it was due to the non-void returning -1, and the other function checking for 4, align these. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-26 10:31:20 +10:00
Dave Airlie	4813c9ade7	radv: handle multi-component shared load/stores. This was seen in doom shaders, so handle it properly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-26 10:31:20 +10:00
Vedran Miletić	d9fef848a6	clover: Use Clang's diagnostics Presently errors from frontend are handled only if they occur in clang::CompilerInvocation::CreateFromArgs(). This patch uses clang::DiagnosticsEngine to detect errors such as invalid values for Clang frontend arguments. Fixes Piglit's cl/program/build/fail/invalid-version-declaration.cl test. v2: fix inconsistent code formatting Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Aaron Watry <awatry@gmail.com>	2016-12-24 18:35:09 -08:00
Damien Grassart	3a30b1a556	radv: return count of queue families written The Vulkan spec indicates that vkGetPhysicalDeviceQueueFamilyProperties() should overwrite pQueueFamilyPropertyCount with the number of structures actually written to pQueueFamilyProperties. Signed-off-by: Damien Grassart <damien@grassart.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-25 02:25:02 +01:00
Jason Ekstrand	88b5acfa09	i965/generator/tex: Handle an immediate sampler with an indirect texture In this case we were dying when we tried to do SHL addr sampler imm(8) because that puts an immediate in src0 of a two source instruction. This fixes 2704 of the new separate sampler Vulkan CTS tests on Sky Lake. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-23 07:27:13 -08:00
Bruce Cherniak	9e35426731	swr: fix icc compile error ICC doesn't like the use of nullptr (std::nullptr_t) argument in p_atomic_set. GCC and clang don't complain. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99119 Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-23 08:36:21 -06:00
Dave Airlie	e7279f16a0	radv: set some proper values for interp offset limits. These are taken from the amdgpu-pro driver, and cause no CTS change. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-23 14:36:54 +10:00
Dave Airlie	14737bcdd5	radv: bump texel offsets to align with radeonsi it appears from the amdgpu-pro results the hw can do more, but let's just align with radeonsi for now. No CTS regressions. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-23 14:36:50 +10:00
Jason Ekstrand	d55835b8bd	nir/algebraic: Add optimizations for "a == a && a CMP b" This sequence shows up in The Talos Principle, at least under Vulkan, and prevents loop analysis from properly computing trip counts in a few loops. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-12-22 16:27:19 -08:00
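Why the `a == a` guard is redundant can be checked directly, assuming IEEE 754 floats and that CMP is an ordered comparison such as `<`. `a == a` is false only when `a` is NaN, and an ordered comparison with a NaN operand is already false. The sketch below covers only `<`; the function names are hypothetical, and it illustrates the identity rather than the NIR rule itself. The argument does not carry over to `!=`, which is true for NaN operands.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Original form: explicit NaN check followed by an ordered comparison. */
static bool guarded_less_than(float a, float b)
{
    return a == a && a < b;
}

/* Simplified form: the ordered comparison alone. */
static bool plain_less_than(float a, float b)
{
    return a < b;
}
```

Both functions agree for every pair of inputs, including NaN and infinities, which is what lets an optimizer drop the guard and leave a comparison that loop analysis can reason about.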
Jason Ekstrand	8962cc96ec	i965: Use nir_opt_trivial_continues and nir_opt_if Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-22 16:27:19 -08:00
Jason Ekstrand	6d9f576b56	nir: Add a pass for moving SPIR-V continue blocks to the ends of loops When shaders come in from SPIR-V, we handle continue blocks by placing the contents of the continue inside of a "if (!first_iteration)". We do this so that we can properly handle the fact that continues in SPIR-V jump to the continue block at the end of the loop rather than jumping directly to the top of the loop like they do in NIR. In particular, the increment step of a simple for loop ends up in the continue block. This pass looks for this case in loops that don't actually have any continues and moves the continue contents to the end of the loop instead. We need this because loop unrolling doesn't work if the increment is inside of a condition. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-22 16:27:19 -08:00
Jason Ekstrand	1111a05f90	nir: Add an optimization pass to remove trivial continues Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-22 16:27:19 -08:00
Jason Ekstrand	993e9195d4	nir: Correctly handle blocks in cf_node_cf_tree_next Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-22 16:27:19 -08:00
Timothy Arceri	3321eb4c36	i965: make use of nir_lower_returns() for GL Fixes two new piglit tests: spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test spec/glsl-1.10/execution/vs-nested-return-sibling-loop2.shader_test shader-db results for BDW: total instructions in shared programs: 12903158 -> 12903134 (-0.00%) instructions in affected programs: 27100 -> 27076 (-0.09%) helped: 32 HURT: 6 total cycles in shared programs: 294922518 -> 294922804 (0.00%) cycles in affected programs: 4372828 -> 4373114 (0.01%) helped: 31 HURT: 8 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-23 10:59:32 +11:00
Timothy Arceri	f20ba7ad44	nir: update nir_lower_returns to only predicate instructions when needed Unless an if statement contains nested returns we can simply add any following instructions to the branch without the return. V2: fix handling if_nested_return value when there is a sibling if/loop that doesn't contain a return. (Spotted by Ken) V3: - add a better comment to the new variable - remove instructions after if when both branches return Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:59:32 +11:00
Timothy Arceri	40e9f2f138	i965: disable loop unrolling in GLSL IR There is a single regression in loop unrolling which is: loops HURT: shaders/orbital_explorer.shader_test GS SIMD8: 0 -> 1 However the loop is huge so it seems reasonable not to unroll it. It's surprising that GLSL IR does unroll it. shader-db results BDW: total instructions in shared programs: 13037455 -> 13036947 (-0.00%) instructions in affected programs: 17982 -> 17474 (-2.83%) helped: 63 HURT: 25 total cycles in shared programs: 262217870 -> 262227990 (0.00%) cycles in affected programs: 2287046 -> 2297166 (0.44%) helped: 969 HURT: 844 total loops in shared programs: 2951 -> 2952 (0.03%) loops in affected programs: 0 -> 1 helped: 0 HURT: 1 LOST: 0 GAINED: 1 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	715f0d06d1	i965: use nir loop unrolling pass shader-db results for BDW: total instructions in shared programs: 12589614 -> 12590119 (0.00%) instructions in affected programs: 50525 -> 51030 (1.00%) helped: 7 HURT: 145 total cycles in shared programs: 241524604 -> 241490502 (-0.01%) cycles in affected programs: 1941404 -> 1907302 (-1.76%) helped: 302 HURT: 449 total loops in shared programs: 4245 -> 2947 (-30.58%) loops in affected programs: 1535 -> 237 (-84.56%) helped: 1142 HURT: 0 total spills in shared programs: 14453 -> 14453 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 18984 -> 18984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 26 GAINED: 15 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	e729504fb1	nir: pass compiler rather than devinfo to functions that call nir_optimize Later we will pass compiler to nir_optimize to be used by the loop unroll pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	51daccb289	nir: add a loop unrolling pass V2: - tidy ups suggested by Connor. - tidy up cloning logic and handle copy propagation based on a suggestion by Connor. - use nir_ssa_def_rewrite_uses to fix up lcssa phis suggested by Connor. - add support for complex loop unrolling (two terminators) - handle case where the ssa defs use outside the loop is already a phi - support unrolling loops with multiple terminators when trip count is known for each terminator V3: - set correct num_components when creating phi in complex unroll - rewrite update remap table based on Jason's suggestions. - remove unrequired extract_loop_body() helper as suggested by Jason. - simplify the lcssa phi fix up code for simple loops as per Jason's suggestions. - use mem context to keep track of hash table memory as suggested by Jason. - move is_{complex,simple}_loop helpers to the unroll code - require nir_metadata_block_index - partially rewrote complex unroll to be simpler and easier to follow. V4: - use rzalloc() when creating nir_phi_src but not setting pred right away fixes regression caused by ralloc() no longer zeroing memory. V5: - simplify calling of complex_unroll() - use new loop terminator fields to get the break/continue from blocks and simplify loop unrolling code - handle slightly less trivial loop terminators. if branches can now have instructions but can only contain a single block. - use nir print type IR snippets in unroll function descriptions - add better explanation and variable for why we need to clone additional times when the second terminator is the limiting terminator. - partially convert out of ssa before unrolling loops (suggested by Jason) v6: - remove unused nir_builder - use Jason's new from ssa helper - tidy/fixup cursor use - unroll terminators that contain control flow correctly - unroll complex loops with control flow before the terminators correctly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	f8407a5398	nir: add helper for cloning nir_cf_list V2: - updated to create a generic list clone helper nir_cf_list_clone() - continue to assert on clone when fallback flag not set as suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	b84dfa0f62	nir: update fixup_phi_srcs() to handle registers We need to do this because we partially get out of SSA when unrolling and cloning loops. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	d781320974	nir: create helper for fixing phi srcs when cloning This will be useful for fixing phi srcs when cloning a loop body during loop unrolling. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Thomas Helland	ec8423a4b1	nir: Add an LCSSA pass V2: Do a "depth first search" to convert to LCSSA V3: Small comment fixup V4: Rebase, adapt to removal of function overloads V5: Rebase, adapt to relocation of nir to compiler/nir Still need to adapt to potential if-uses Work around nir_validate issue V6 (Timothy): - tidy lcssa and stop leaking memory - don't rewrite the src for the lcssa phi node - validate lcssa phi srcs to avoid postvalidate assert - don't add new phi if one already exists - more lcssa phi validation fixes - Rather than marking ssa defs inside a loop just mark blocks inside a loop. This is simpler and fixes lcssa for intrinsics which do not have a destination. - don't create LCSSA phis for loops we won't unroll - require loop metadata for lcssa pass - handle case where the ssa defs use outside the loop is already a phi V7: (Timothy) - pass indirect mask to metadata call v8: (Timothy) - make convert to lcssa a helper function rather than a nir pass - replace inside loop bitset with on the fly block index logic. - remove lcssa phi validation special cases - inline code from useless helpers, suggested by Jason. - always do lcssa on loops, suggested by Jason. - stop making lcssa phis special. Add as many sources as the block has predecessors, suggested by Jason. V9: (Timothy) - fix regression with the is_lcssa_phi field not being initialised to false now that ralloc() doesn't zero out memory. V10: (Timothy) - remove extra braces in SSA example, pointed out by Topi V11: (Timothy) - add missing support for LCSSA phis in if conditions. V12: (Timothy) - small tidy up suggested by Jason. - always create lcssa phi even if it just points to an lcssa phi from an inner loop Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Thomas Helland	6772a17acc	nir: Add a loop analysis pass This pass detects induction variables and calculates the trip count of loops to be used for loop unrolling. V2: Rebase, adapt to removal of function overloads V3: (Timothy Arceri) - don't try to find trip count if loop terminator conditional is a phi - fix trip count for do-while loops - replace conditional type != alu assert with return - disable unrolling of loops with continues - multiple fixes to memory allocation, stop leaking and don't destroy structs we want to use for unrolling. - fix iteration count bugs when induction var not on RHS of condition - add FIXME for && conditions - calculate trip count for unsigned induction/limit vars V4: (Timothy Arceri) - count instructions in a loop - set the limiting_terminator even if we can't find the trip count for all terminators. This is needed for complex unrolling where we handle 2 terminators and the trip count is unknown for one of them. - restructure structs so we don't keep information not required after analysis and remove dead fields. - force unrolling in some cases as per the rules in the GLSL IR pass V5: (Timothy Arceri) - fix metadata mask value 0x10 vs 0x16 V6: (Timothy Arceri) - merge loop_variable and nir_loop_variable structs and lists suggested by Jason - remove induction var hash table and store pointer to induction information in the loop_variable suggested by Jason. - use lowercase list_addtail() suggested by Jason. - tidy up init_loop_block() as per Jason's suggestions. - replace switch with nir_op_infos[alu->op].num_inputs == 2 in is_var_basic_induction_var() as suggested by Jason. - use nir_block_last_instr() in and rename foreach_cf_node_ex_loop() as suggested by Jason. - fix else check for is_trivial_loop_terminator() as per Connor's suggestions. - simplify offset for induction variables incremented before the exit condition is checked. - replace nir_op_isub check with assert() as it should have been lowered away.
V7: (Timothy Arceri) - use rzalloc() on nir_loop struct creation. Worked previously because ralloc() was broken and always zeroed the struct. - fix cf_node_find_loop_jumps() to find jumps when loops contain nested if statements. Code is tidier as a result. V8: (Timothy Arceri) - move is_trivial_loop_terminator() to nir.h so we can use it to assert in the loop unroll pass - fix analysis to not bail when looking for terminator when the break is in the else rather than the if - added new loop terminator fields: break_block, continue_from_block and continue_from_then so we don't have to gather these when doing unrolling. - get correct array length when forcing unrolling of variable indexed arrays that are the same size as the iteration count - add support for induction variables of type float - update trivial loop terminator check to allow an if containing instructions as long as both branches contain only a single block. V9: (Timothy) - bunch of tidy ups and simplifications suggested by Jason. - rewrote trivial terminator detection, now the only restriction is there must be no nested jumps, anything else goes. - rewrote the iteration test to use nir_eval_const_opcode(). - count instruction properly even when forcing an unroll. - bunch of other tidy ups and simplifications. V10: (Timothy) - some trivial tidy ups suggested by Jason. - conditional fix for break inside continue branch by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
Timothy Arceri	eda3ec7957	i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so that it is called by both OpenGL and Vulkan, and removes the call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is an increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space Program shaders that increase by 32 instructions. The changes seem to be caused by the difference in the GLSL IR vs NIR variable index lowering passes. The GLSL IR pass creates a simple if ladder for arrays of size 4 or less, while the NIR pass implements a binary search for all arrays regardless of size. Shader-db results BDW: total instructions in shared programs: 13021176 -> 13021819 (0.00%) instructions in affected programs: 57693 -> 58336 (1.11%) helped: 20 HURT: 190 total cycles in shared programs: 299805580 -> 299750826 (-0.02%) cycles in affected programs: 2290024 -> 2235270 (-2.39%) helped: 337 HURT: 442 total fills in shared programs: 19984 -> 19984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 4 GAINED: 0 V2: remove the do_copy_propagation() call from the i965 GLSL IR linking code. This call was added in `f7741c5211` but since we are moving the variable index lowering to NIR we no longer need it and can just rely on the nir copy propagation pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:36 +11:00
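The difference between the two lowering shapes mentioned in eda3ec7957 can be sketched roughly as follows. This is only an illustration written as ordinary runtime C, not the code of either pass (the real passes emit shader IR); the function names are hypothetical.

```c
#include <assert.h>

/* If ladder: compare the index against each element in turn.
 * Cost grows linearly with the array length. */
static int
example_select_ladder(const int *values, unsigned length, unsigned index)
{
   assert(index < length);
   for (unsigned i = 0; i < length; i++) {
      if (index == i)
         return values[i];
   }
   return 0; /* not reached for a valid index */
}

/* Binary search: halve the candidate range [start, end) with each
 * comparison until a single element remains. */
static int
example_select_binary(const int *values, unsigned start, unsigned end,
                      unsigned index)
{
   assert(start < end);
   if (end - start == 1)
      return values[start];

   unsigned mid = start + (end - start) / 2;
   if (index < mid)
      return example_select_binary(values, start, mid, index);
   else
      return example_select_binary(values, mid, end, index);
}
```

For very small arrays the ladder needs few comparisons and little nesting, which may be one reason the two approaches produce slightly different instruction counts.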
Timothy Arceri	976859ce57	i965: allow sampler indirects on all gens Without this we will regress the max-samplers piglit test on Gen6 and lower when loop unrolling is done in NIR. There is a check in the GLSL IR linker that errors when it finds indirects and EmitNoIndirectSampler is set. As far as I can tell there is no reason for not enabling this for all gens regardless of whether they fully support ARB_gpu_shader5 or not. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-23 10:15:35 +11:00
Jason Ekstrand	a620f66872	nir: Add a couple quick-and-dirty out-of-SSA helpers These are designed for use within an optimization pass when SSA becomes more pain than it's worth. They're very naive and don't generate anything close to optimal register-based NIR. Also, they may result in shaders which do not validate because of, for instance, registers in phi sources. However, the register-based into-SSA pass should be pretty efficient at cleaning up the mess. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-23 10:15:35 +11:00
Arda Coskunses	99de7b7525	vulkan/wsi/x11: don't crash on null wsi x11 connection Without this check, the driver crashes when the application window is closed unexpectedly. Acked-by: Edward O'Callaghan <funfunctor@folklore194.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-22 14:09:46 -08:00
Arda Coskunses	01dd363e67	vulkan/wsi/x11: don't crash on null visual When the application window is closed unexpectedly, visualtypes gets invalid parameters because the window is lost, which causes a crash. The necessary check is added to prevent the crash. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-22 14:09:34 -08:00
Christian Inci	7a4ea95f1c	radeonsi: Bugfix needed for hashcat Hashcat needs MAX_GLOBAL_BUFFERS to be 21 or even 22 for some modes. It'll crash otherwise. I'm adding an assert to see if programs need it to be even higher. Signed-off-by: Christian Inci <chris.bugsfd@broke-the-inter.net> [Handle first properly; should be NFC, since clover always uses first == 0.] Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-22 17:11:43 +01:00
Nicolai Hähnle	eca57f85ee	radeonsi: fix gl_ClipDistance and gl_ClipVertex for points The clipper hardware doesn't consider points as primitives that can be clipped. Simply setting the corresponding cull bits works, and should not have an adverse effect on other primitive types according to the hardware team. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-12-22 16:59:58 +01:00
Nicolai Hähnle	3778a10d37	radeonsi: only set VS_OUT_MISC_SIDE_BUS_ENA when the misc vector is used Should have no effect (other than perhaps on power consumption), but Vulkan does this. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-12-22 16:58:53 +01:00
Vinson Lee	ede8c02ab0	llvmpipe: Link tests with CLOCK_LIB. Fix linking error with 'make check'. CXXLD lp_test_format ../../../../src/gallium/auxiliary/.libs/libgallium.a(os_time.o): In function `os_time_get_nano': src/gallium/auxiliary/os/os_time.c:59: undefined reference to `clock_gettime' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2016-12-21 17:23:05 -08:00
Fredrik Höglund	27a8aab882	radv: fix dual source blending Add the index to the location when assigning driver locations for output variables. Otherwise two fragment shader outputs declared as: layout (location = 0, index = 0) out vec4 output1; layout (location = 0, index = 1) out vec4 output2; will end up aliasing one another. Note that this patch will make the second output variable in the above example alias a possible third output variable with location = 1 and index = 0. But this shouldn't be a problem in practice since only one color attachment is supported when dual-source blending is used. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-22 02:07:17 +01:00
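A minimal sketch of the location assignment described in 27a8aab882, assuming a simple "location + index" rule. The struct and function names are hypothetical and do not come from radv.

```c
#include <assert.h>

/* Hypothetical stand-in for a fragment shader output's layout qualifiers. */
struct example_output {
   unsigned location; /* layout(location = N) */
   unsigned index;    /* layout(index = N), 0 or 1 for dual-source blending */
};

/* Add the index to the location so that two outputs sharing location 0
 * no longer map to the same driver slot. */
static unsigned
example_driver_location(struct example_output out)
{
   assert(out.index <= 1);
   return out.location + out.index;
}
```

With this rule, (location 0, index 0) maps to slot 0 and (location 0, index 1) maps to slot 1. An output at (location 1, index 0) would also map to slot 1, which is the aliasing the commit message says is harmless because only one color attachment is supported with dual-source blending.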
Dave Airlie	877202b6dc	radv: enable shaderStorageImageExtendedFormats This passes all the CTS tests that get enabled for this. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-22 10:29:15 +10:00
Dave Airlie	a3ca2a9b7b	radv: enable shaderGatherImageExtended Thanks to Ilia's patch this works fine on radv. No regressions in CTS, all enabled tests pass. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-22 09:48:18 +10:00
Dave Airlie	56020c7a7c	radv/image: only touch queue family info for concurrent images. The spec says to ignore these fields for exclusive images. Fixes crashes in: dEQP-VK.clipping.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-21 23:33:04 +00:00
Dave Airlie	9d23b8a18e	radv: flush smem for uniform buffer bit. (cc'ing stable as I'd like to backport the ubo speedup as well) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-21 22:31:14 +00:00
Junwei Zhang	13ae47234a	radeonsi: add Polaris12 PCI ID Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-21 15:10:54 -05:00
Junwei Zhang	018ead4266	radeonsi: add Polaris12 support (v3) v2: use gfxip names for llvm 4.0+ v3: use tonga for llvm <= 3.8, drop gfxip name, we can just change that we change the other asics. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-12-21 15:10:03 -05:00
Ian Romanick	15c8f322ca	glsl: Eliminate the open-coded version of process_block_array_leaf Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-21 10:24:45 -08:00
Juan A. Suarez Romero	415f5f09e3	ttn: handle GLSL_SAMPLER_DIM_SUBPASS_MS case Fixes a warning. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-12-21 12:44:25 +01:00
Juan A. Suarez Romero	c32a9ec5f5	i965: allow unsourced enabled VAO The GL 4.5 spec says: "If any enabled array’s buffer binding is zero when DrawArrays or one of the other drawing commands defined in section 10.4 is called, the result is undefined." This commit avoids crashing, which is not a very good "undefined result". This fixes spec/!opengl 3.1/vao-broken-attrib piglit test.	2016-12-21 12:37:22 +01:00
Edward O'Callaghan	8801734da7	svga: Fix a strict-aliasing violation in shader dumper As per the C spec, it is illegal to alias pointers to different types. This results in undefined behaviour after optimization passes, resulting in very subtle bugs that happen only on a full moon.. Use a memcpy() as a well defined coercion between the isomorphic bit-field interpretations of memory. V.2: Use C99 compat STATIC_ASSERT() over C11 static_assert(). Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-12-21 15:00:21 +11:00
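The memcpy()-based coercion described in 8801734da7 might look roughly like the sketch below. The token layout and all names are hypothetical and are not the svga types; the size check uses a C99-compatible negative-array-size trick in place of Mesa's STATIC_ASSERT().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit-field view of a 32-bit shader token. */
struct example_token {
   uint32_t opcode : 16;
   uint32_t flags  : 16;
};

/* C99-compatible compile-time check: the array size is negative, and
 * compilation fails, if the two representations differ in size. */
typedef char example_token_size_check[
   sizeof(struct example_token) == sizeof(uint32_t) ? 1 : -1];

/* Copy the bytes instead of casting a uint32_t pointer to a struct
 * pointer, which would violate the strict-aliasing rule. */
static struct example_token
example_token_from_dword(uint32_t dword)
{
   struct example_token tok;
   memcpy(&tok, &dword, sizeof(tok));
   return tok;
}

static uint32_t
example_token_to_dword(struct example_token tok)
{
   uint32_t dword;
   memcpy(&dword, &tok, sizeof(dword));
   return dword;
}
```

Compilers typically turn a fixed-size memcpy() like this into a plain register move, so the well-defined version usually costs nothing at run time.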
Roland Scheidegger	e827d91756	draw: use SoA fetch, not AoS one Now that there's some SoA fetch which never falls back, we should always get results which are better or at least not worse (something like rgba32f will stay the same). For cases which get way better, think something like R16_UNORM with 8-wide vectors: this was 8 sign-extend fetches, 8 cvt, 8 muls, followed by a couple of shuffles to stitch things together (if it is smart enough, 6 unpacks) and then a (8-wide) transpose (not sure if llvm could even optimize the shuffles + transpose, since the 16bit values were actually sign-extended to 128bit before being cast to a float vec, so that would be another 8 unpacks). Now that is just 8 fetches (directly inserted into vector, albeit there's one 128bit insert needed), 1 cvt, 1 mul. v2: ditch the old AoS code instead of just disabling it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-21 04:48:24 +01:00
Roland Scheidegger	cb81460dcc	gallivm: generalize the compressed format soa fetch a bit This can now handle rgtc (unorm) too - this path no longer handles plain formats, but that's unnecessary they now all have their proper SoA unpack (this will still be dog-slow though due to the actual fetch being per-pixel util fallbacks). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-21 04:48:24 +01:00
Roland Scheidegger	3c98e3cd63	gallivm: provide soa fetch path handling formats with more than 32bit This previously always fell back to AoS conversion. Even for 4-float formats (which is the optimal case by far for that fallback case) this was suboptimal, since it meant the conversion couldn't be done with 256bit vectors. While this may still only be partly possible for some formats, (unless there's AVX2 support) at least the transpose can be done with half the unpacks (and before using the transpose for AoS fallbacks, it was worse still). With less than 4 channels, things got way worse with the AoS fallback quickly even with 128bit vectors. The strategy is pretty much the same as the existing one for formats which fit into 32 bits, except there's now multiple vectors to be fetched (2 or 4 to be exact), which need to be shuffled first (if it's 4 vectors, this amounts to a transpose, for 2 it's a bit different), then the unpack is done the same (with the exception that the shift of the channels is now modulo 32, and we need to select the right vector). In fact the most complex part about it is to get the shuffles right for separating into lo/hi parts for AVX/AVX2... This also makes use of the new ability of gather to use provided type information, which we abuse to outsmart llvm so we get decent shuffles, and to fetch 3x32bit vectors without having to ZExt the scalar. And just because we can, we handle double formats too, albeit they are a bit different (draw sometimes needs to handle that). v2: fix typo float/int bug (generating inefficient code). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-21 04:48:24 +01:00
Roland Scheidegger	8bd67a35c5	gallivm: optimize gather a bit, by using supplied destination type By using a dst_type in the gather interface, gather has some more knowledge about how values should be fetched. E.g. if this is a 3x32bit fetch and dst_type is 4x32bit vector gather will no longer do a ZExt with a 96bit scalar value to 128bit, but just fetch the 96bit as 3x32bit vector (this is still going to be 2 loads of course, but the loads can be done directly to simd vector that way). Also, we can now try to use the right int/float type. This should make no difference really since there's typically no domain transition penalties for such simd loads, however it actually makes a difference since llvm will use different shuffle lowering afterwards so the caller can use this to trick llvm into using sane shuffle afterwards (and yes llvm is really stupid there - nothing against using the shuffle instruction from the correct domain, but not at the cost of doing 3 times more shuffles, the case which actually matters is refusal to use shufps for integer values). Also do some attempt to avoid things which look great on paper but llvm doesn't really handle (e.g. fetching 3-element 8 bit and 16 bit vectors which is simply disastrous - I suspect type legalizer is to blame trying to extend these vectors to 128bit types somehow, so fetching these with scalars like before which is suboptimal due to the ZExt). Remove the ability for truncation (no point, this is gather, not conversion) as it is complex enough already. While here also implement not just the float, but also the 64bit avx2 gathers (disabled though since based on the theoretical numbers the benefit just isn't there at all until Skylake at least). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-21 04:48:24 +01:00
Roland Scheidegger	5b950319ce	gallivm: optimize SoA AoS fallback fetch path a little We should do transpose, not extract/insert, at least with "sufficient" amount of channels (for 4 channels, extract/insert shuffles generated otherwise look truly terrifying). Albeit we shouldn't fallback to that so often in any case. v2: ditch the extract/insert path, not worth keeping (we're going to avoid hitting the fallback that often with future patches). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-21 04:48:24 +01:00
Roland Scheidegger	d7d23aee4b	gallivm: (trivial) handle non-aligned fetch for lp_build_fetch_rgba_soa soa fetch so far always assumed that data was aligned. However, we want to use this for vertex fetch, and data might not be aligned there, so handle it in this path too (basically just pass alignment through to other functions). (It looks like it wouldn't work for cached s3tc but this is no different than with AoS fetch.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-21 04:48:24 +01:00
Axel Davy	123e947228	st/nine: Upload on secondary context for DrawUp Avoid synchronization by using the secondary context for uploading the vertex data for DrawUp. v2: Rely on u_upload_mgr to use persistent coherent buffers. Do not flush. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	0ec4e5f630	st/nine: Dirty MANAGED buffers at Lock time Tests suggest MANAGED buffers are made dirty at Lock time, not at Unlock time. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	bad7f7cc63	st/nine: Implement new buffer upload path This new buffer upload path enables to lock faster than the normal path when using DISCARD/NOOVERWRITE. v2: Diverse cleanups and fixes. v3: Fix allocation size for 'lone' buffers and add more debug info. v4: Rewrite of the path to handle when DISCARD/NOOVERWRITE is not used anymore. The resource content is copied to the new resource used. v5: flush for safety after unmap (not sure it is really required here, but safer to flush). v6: Do not use the path if persistent coherent mapping is unavailable. Fix buffer creation flags. v7: Do not flush since it is not needed. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	8960be0e93	st/nine: Allow non-zero resource offset for vertex buffers Next patches will introduce an offset. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	1e64be6f91	st/nine: Do not wait for DEFAULT lock for volumes when we can If the volumes (and the texture container) are not referenced, then there are no pending operations on them. We can lock directly. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	b4f16615ef	st/nine: Do not wait for DEFAULT lock for surfaces when we can If the surfaces (and the texture container) are not referenced, then there are no pending operations on them. We can lock directly. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	525a1b292a	st/nine: Add arguments to context's blit and copy_region The new arguments enable to reference the objects while the function hasn't run. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	325324c749	st/nine: Idem for nine_context_gen_mipmap Will enable to use the bind count as an information for whether the surface/volume is used in the worker thread. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	7089d88199	st/nine: Bind destination for surface/volume uploads Will enable to use the bind count as an information for whether the surface/volume is used in the worker thread. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	d4a9b21feb	st/nine: Use nine_context_box_upload for volumes Use nine_context_box_upload for uploads: . systemmem volume to default volume . managed volume internal content to its resource. Check the uploads are executed before any action that can alter the data, that is LockBox and volume destruction. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	f042639231	st/nine: Fix leak with volume dtor The last level was not released. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	76e392d852	st/nine: Fix leak with cubetexture dtor The last level was not released. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	fec0b7f067	st/nine: Use nine_context_box_upload for surfaces Use nine_context_box_upload for uploads: . systemmem surface to default surface . managed surface internal content to its resource. Check the uploads are executed before any action that can alter the data, that is LockRect, NineSurface9_CopyDefaultToMem and surface destruction. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	c873a2bd0c	st/nine: Implement nine_context_box_upload This function will be used for surface and volume uploads Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	cadc7a5d94	st/nine: Use nine_context_gen_mipmap in BaseTexture9 Generate mipmaps in the worker thread. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	8d3e0f2187	st/nine: Implement nine_context_gen_mipmap To offload mipmap generation as well. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	16b6fb65ae	st/nine: Optimize managed buffer upload Do the upload in the other thread. Usually managed buffers are used once per frame. It is then very likely pending_upload is 0 at Lock time. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	a78b5f4378	st/nine: Implement nine_context_range_upload Will be used to upload buffers. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	1843e36b03	st/nine: Do not bind the container if forward is false This doesn't make sense to bind the container in that specific case. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	2fc8ef1401	st/nine: Comment and simplify iunknown The behaviour is a bit less obscure now. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	098ba64c4c	st/nine: Detach buffers in swapchain dtor. BackBuffers can survive swapchain dtor if the user has a reference on them. The swapchain itself has no reference on the buffer. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	14875ebd83	st/nine: Fix NineUnknown_Detach We don't bind the container in AddRef. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	930f479acf	st/nine: Simplify ARG_BIND_REF Remove some noop operations. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	9c4b4e8809	st/nine: Avoid flushing the queue for queries GetData Use the newly introduced counter to know when we don't need synchronization. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Patrick Rudolph	8a69343f1e	st/nine: Add CSMT_NO_WAIT_WITH_COUNTER Similar to the other macros, but introduces a counter, which makes it possible to know when the instructions have been executed. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:47:08 +01:00
Axel Davy	884166a251	st/nine: Use nine_context_clear_render_target Enables to not wait for the worker thread for ColorFill. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:47:08 +01:00
Axel Davy	7b154ac04d	st/nine: Optimize ColorFill When we lock the whole surface to overwrite it, we can use DISCARD. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	9bf1da05d9	st/nine: Simplify ColorFill For render targets, NineSurface9_GetSurface is not expected to fail. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	31262bbce0	st/nine: use get_pipe_acquire/release when possible Use the acquire/release semantic when we don't need to wait for any pending command. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	22f6d6fbd2	st/nine: Implement Fast path for dynamic buffers and csmt Use the secondary pipe for DISCARD/NOOVERWRITE, which avoids stalling to get the pipe from the worker thread. v2: flush at unmap. This is required for example if the driver does hidden draw calls or copies. In the case of unsynchronized it is probably not required, but it is safer. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	3e8234fff4	st/nine: Add secondary pipe for device The secondary pipe will be used for operations that don't need synchronization. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	7a7eeefd7d	st/nine: Add nine_context_get_pipe_acquire/release See commit for description. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	ddb6f1d2d1	st/nine: SYSTEMMEM ignores DISCARD. Tests show SYSTEMMEM should ignore DISCARD. Prevents game bugs with following patches reimplementing DISCARD. Halo is affected. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	4f344db8b0	st/nine: Upload Managed buffers just before draw call using them Previously we were uploading Managed buffers at the next draw call after they were set dirty. This is not the expected behaviour. Instead upload just before draw call needing the content. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	e52aded87f	st/nine: Track bindings for buffers Similar to the code for textures. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	62068c9d90	st/nine: Fix BASETEX_REGISTER_UPDATE BASETEX_REGISTER_UPDATE was adding the texture to the list of textures to upload in too many cases. tex->base.base.bind will be set to true if the texture is in a stateblock, whereas we want to upload only if bound to the device, which is what bind_count is for. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	804b28cdc4	st/nine: Simplify the logic to bind textures This makes the code more readable. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Patrick Rudolph	fef23f6712	st/nine: Use nine_context for resource_copy_region Use nine_context wrapper for resource_copy_region. Enables to offload it with CSMT. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:44:23 +01:00
Patrick Rudolph	c8913a06b4	st/nine: Use nine_context for blit Enables to offload it with CSMT. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:44:23 +01:00
Patrick Rudolph	0fd5730613	st/nine: Add NINE_DEBUG=tid to turn threadid on or off To ease debugging. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:44:23 +01:00
Patrick Rudolph	3098bf03a6	st/nine: Print threadid in debug log To ease debugging. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:44:23 +01:00
Patrick Rudolph	ac2927335b	st/nine: Implement gallium nine CSMT Use an offloading thread for all nine_context functions. Macros are used to ease the reading of the code. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Axel Davy	2c371a25a8	st/nine: Call GetPipe for implicit pipe usages With csmt, every usage of the pipe in the main thread has to be protected by calling GetPipe. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:23 +01:00
Patrick Rudolph	1277ceefd1	st/nine: Add struct nine_clipplane Required to know the exact size of the plane. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:44:22 +01:00
Patrick Rudolph	3af17a671d	st/nine: Add nine_queue This queue mechanism will be used for CSMT. Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-12-20 23:44:22 +01:00
Axel Davy	e068d3afe1	st/nine: Create pipe_surfaces on resource creation. Create the pipe_surfaces on renderable resources creation. This enables to avoid creating them on the fly. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	bb666b0297	st/nine: Back swvp in nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	f5f881fd3e	st/nine: Change the way nine_shader gets the pipe The change is required with csmt, where depending on the thread you don't access the pipe the same way. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	97e4b65e7f	st/nine: Reimplement nine_context_apply_stateblock The new version uses nine_context functions instead of applying the changes directly to nine_context. This will enable it to work with CSMT. v2: Fix nine_context_light_enable_stateblock The memcpy arguments were wrong, and the state wasn't set dirty. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	8d967abb98	st/nine: Decompose nine_context_set_texture Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	69f447752d	st/nine: Decompose nine_context_set_indices Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	08b717dfd3	st/nine: Decompose nine_context_set_stream_source Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	7ebdbb573b	st/nine: Do not use NineBaseTexture9 in nine_context Some fields are subject to modification outside of nine_context (SetLod, etc). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	152d007769	st/nine: Move Managed Pool handling out of nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	eb884a4ac2	st/nine: Integrate nine_pipe_context_clear to nine_context_clear Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	b95205b1f2	st/nine: Move pipe and cso to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	66ad5b1592	st/nine: Rename pipe to pipe_data in nine_context This patch is to avoid a name conflict when device->pipe is moved to nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	fc49f7df89	st/nine: Rename cso in nine_context to cso_shader This patch is to avoid a name conflict when device->cso is moved to nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	c7237e2c5c	st/nine: Access pipe_context via NineDevice9_GetPipe Except for nine_ff and nine_state. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	4a4eba8c05	st/nine: Remove NineDevice9_GetCSO Was useless. Remove useless usage in swapchain9. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	6a7541a5aa	st/nine: Move query9 pipe calls to nine_context This will enable to use threading for them. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	0a5252d25b	st/nine: Use atomics for nine_bind nine_bind didn't need atomics up to now, because its use was always protected by a mutex. We need to use atomics because with the next patches several threads may use nine_bind. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
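As a rough illustration of why 0a5252d25b matters once several threads bind and unbind the same object, a bind counter updated with atomic operations could look like the sketch below. It uses C11 <stdatomic.h> as a stand-in; the struct and function names are hypothetical and are not the st/nine implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object carrying a bind count shared between threads. */
struct example_object {
   atomic_int bind_count;
};

/* Increment atomically and return the new count. A plain "count++" is a
 * read-modify-write that can lose updates when two threads race. */
static int
example_bind(struct example_object *obj)
{
   return atomic_fetch_add(&obj->bind_count, 1) + 1;
}

/* Decrement atomically and return the new count. */
static int
example_unbind(struct example_object *obj)
{
   int previous = atomic_fetch_sub(&obj->bind_count, 1);
   assert(previous > 0); /* unbind without a matching bind is a bug */
   return previous - 1;
}
```

The return values let a caller detect the 0 to 1 and 1 to 0 transitions without taking a lock.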
Axel Davy	b748b8fd86	st/nine: Track dirty state groups in nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	a0a18920c7	st/nine: Back User Clip Planes to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	c6ca7c747e	st/nine: Back ps to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	d671190df9	st/nine: Back ds to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	1a735a99d0	st/nine: Back all ff states in nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	bb62ea925a	st/nine: Refactor LightEnable Call a helper function. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	cbe370020e	st/nine: Refactor SetLight Call a helper function to set the light. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	c5af96aebd	st/nine: Put ff data in a separate structure And make nine_state_access_transform take this new structure as input. Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	4a6d83ebc2	st/nine: Back viewport to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	9498613607	st/nine: Back scissor to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	7f6e01052b	st/nine: Back RT to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	aafbd62955	st/nine: Back current index buffer to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	b13b217243	st/nine: Back all shader constants to nine_context For device vs shader float constants and may_swvp, the same approach as for the other constant types is used. Also memset the constants properly. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	93ac6dfdcc	st/nine: Back sampler states to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:22 +01:00
Axel Davy	2a698c3df2	st/nine: Back vs to nine_context And move programmable_vs storage and computation. Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	43288cf376	st/nine: Back vdecl to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	63633e2a08	st/nine: Move stream freq data to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	848ffc81e4	st/nine: Move vtxbuf to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	aea7a019ef	st/nine: Move stream_usage_mask to nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	eed47b748f	st/nine: Back textures into nine_context Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	6bbb7b9fc5	st/nine: Move texture setting to nine_context_* And move samplers_shadow to nine_context. Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	c1871e829a	st/nine: Track changed.texture only for stateblocks Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	64e232bd60	st/nine: Move draw calls to nine_state Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. v2: Release buffers for Draw*Up functions in device9.c, instead of nine_context. This prevents a leak with csmt where the wrong pointers were released. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	f72d8719eb	st/nine: Move core of device clear to nine_state Part of the refactor to move all gallium calls to nine_state.c, and have all internal states required for those calls in nine_context. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	1b24d5e1f5	st/nine: Introduce nine_context nine_context is a new structure whose goal is to contain all internal states. It will hold the states of the second thread in the to-be-introduced CSMT mode. This patch moves several internal states to nine_context, while the next patches add the other fields. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	e3c59fbd25	st/nine: Implement WFOG properly We were advertising support for WFOG (like all win drivers), but we weren't implementing it. This patch implements the behaviour. See comments. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	b40f12ebf0	st/nine: Fix ff texture coordinate selection The code was wrongly detecting which texture coordinates to generate when the coordinate index was different to the stage index. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	c75415224f	st/nine: Convert redundant check to assert in ff ps We disable the alpha stage if the color stage is disabled. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	32f6f91617	st/nine: Fix two special cases in ff ps if first alpha stage is disabled and writes to temp, diffuse alpha is written to temp. Last stage always writes to current. Behaviour was deduced by tests with a test app. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	1efdc8f594	st/nine: Remove useless code in ff ps Current is already initialized to Diffuse. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	7afea63d4f	st/nine: Fix ff cases when stages should be disabled When a texture is read by a stage for colorop, it should be disabled, and disable following stages. When a texture is read for alphaop, 1.0f is read for the input, which is the behaviour for a dummy texture. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	191b90a35c	st/nine: Always initialize current in ff ps The check was not catching all possible cases. NVE4 should be fine. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	1ee978fa50	st/nine: Fix check for ff specular Fix the check for computing ff specular. This seems to match the opengl behavior, and give the correct output on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	89716b0b38	st/nine: Do not saturate illumination coefficients in ff Fixes bad rendering of a test app. Wine has the same behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	877dc0308f	st/nine: Fix ff COLOR0 w component computation The computation was wrong. COLOR0's last component should be equal to the material diffuse w component. The behaviour was checked with a test app on Windows. Wine has the same behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	e94ac230a2	st/nine: Fix specular enable for alpha Apparently specular enable doesn't affect the alpha channel. Fixes https://github.com/iXit/Mesa-3D/issues/253 Behaviour confirmed by looking in wine sources. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	85811d0e87	st/nine: Ignore MULTISAMPLEMASK when RT is not multisampled We were ignoring MULTISAMPLEMASK for non-maskable multisample modes, but we were missing the non-multisampled case. Fixes a crash in Halo. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	bce9fe8db2	driconf: Fix missing gettext DRI_CONF_NINE_OVERRIDEVENDOR was missing gettext for the description. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	59048e7548	st/nine: Add new driconf options to control DISCARD behaviour See the patch for the new controls added. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	06657fa203	st/nine: Rework buffer presentation path Use the new API for DISCARD. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	35ea402a24	st/nine: Fix a leak in Swapchain dtor Properly count the number of backbuffers, and use the new info to release the correct number of buffers. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	f78cbbdfaa	d3dadapter/present: Add precision for WaitBufferReleased Add precision on the behaviour of WaitBufferReleased. All implementers and users of the API were expecting that behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	0eef5491d3	d3dadapter/present: Add new API to ID3DPresent The API will enable better support for the commonly used DISCARD swapchain parameter. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	b2f17e5f62	st/nine: Silent warnings with guid_str In non-debug build, the variables are unused, and thus trigger a compilation warning. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	2cd8622fb3	st/nine: Do not generate gallium NOP on d3d NOP Some drivers crash if NOP is generated. Besides there is no point to generate NOP. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	461e03167e	st/nine: Fix leak in user constant upload path The new code properly releases the previous buffers allocated. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	c3e5140142	st/nine: Correctly release sw cursor image cursor.image is used for software cursor emulation. It wasn't released. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	9c0f65e08a	st/nine: Handle when cursor stride is not what is expected SetCursor assumes for now a 32x32 argb cursor with pitch 128. 32x32 argb doesn't have pitch 128 on all hw, thus use a temporary surface with the correct pitch when needed. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:21 +01:00
Axel Davy	e7a0f580a6	st/nine: Avoid crash on empty Draw*Up Ignore empty draw calls. Avoid assertion fault when such draw calls happen in u_upload_mgr. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	ada0c2ceaa	st/nine: Capture texturestage states in pixel stateblocks pixels stateblocks need to capture these. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	8b021be769	st/nine: Add missing changed states to pixel stateblocks Some states were not properly recorded in pixel stateblocks. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	b3b593b83b	st/nine: Add some debug info in stateblocks This is useful to check what is exactly recorded. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	fad0f147fb	st/nine: Remove useless check in surface9 ctor Textures already have the check in BaseTexture9. Non-Textures cannot be in the MANAGED Pool. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	503d729029	st/nine: Fix bad light initialization in stateblocks src was initialized instead of dst. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	9be94d5c1a	st/nine: Remove unused ff.changed.group It was unused. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	638b70985d	st/nine: Fix ps multisample check We want to use centroid for nonmaskable multisampling as well. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	d38215fb17	st/nine: Fix useless swapchain init checks In NineDevice9_SetDefaultState we can assume the implicit swapchain is properly initialized. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	409ad78777	st/nine: Don't update stream_usage_mask in sw path The variable is used only in the hw path. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	0630d3600b	st/nine: Remove useless call to nine_update_state The call was not needed. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	494ace4b85	st/nine: Add validation to SetSamplerState Check value validity and mimic Win behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	f4d5bc2555	st/nine: Improve doc of D3DPMISCCAPS_POSTBLENDSRGBCONVERT The cap should be advertised for d3d10 able cards, but only for Ex contexts. Unfortunately at this point Mesa has no way to know if Ex is used or not (the info is got later). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-12-20 23:44:20 +01:00
Axel Davy	c4268fd175	gallium-docs: Add documentation for when using several contexts Add documentation to make explicit what can be expected and what is allowed when using several contexts. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-20 23:44:20 +01:00
Axel Davy	1736ef6570	gallium-docs: Add documentation for threading requirements Add documentation for the requirements related to threading for screens and contexts. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-20 23:44:20 +01:00
Chad Versace	fbb4af96c6	egl: Check config's surface types in eglCreate*Surface() If the provided EGLConfig does not support the requested surface type, then emit EGL_BAD_MATCH. Fixes dEQP-EGL.functional.negative_api.create_pbuffer_surface on GBM. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-12-20 11:53:31 -08:00
Kenneth Graunke	62b8bcda1c	glsl: Use ir_var_temporary when generating inline functions. We were using ir_var_auto for the inlined function parameter variables, which is wrong, as it suggests that those are real variables declared by the program. Normally this doesn't matter. However, if you called built-ins at global scope, it would pollute the global variable namespace with these new parameter temporaries. If the shader already had variables with those names, the linker might see contradictory global variable declarations and raise an error. Making them temporaries indicates that these are just things generated by the compiler internally. This avoids confusing the linker. Fixes a new Piglit test: glsl-fs-multiple-builtins. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99154 Reported-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-20 11:41:29 -08:00
Kenneth Graunke	8fc5443a2b	i965: Don't bail on vertex element processing if we need draw params. BaseVertex, BaseInstance, DrawID, and some edge flag conditions need vertex buffer and elements structs. We can't bail early in this case. Gen4-7 already do this properly. Gen8+ did not. Thanks to Ilia Mirkin for helping track this down. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99144 Reported-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-20 11:41:28 -08:00
Jonathan Gray	d74c3e55b3	mesa: don't attempt to unlock an unlocked debug state mutex Commit `929fcee47e` introduced code that attempts to unlock an unlocked mutex which is undefined behaviour. On OpenBSD this leads to an abort: 0 0x0000124dadfa96ba in thrkill () at <stdin>:2 1 0x0000124dadf3da39 in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:52 2 0x0000124d2c1165b5 in _libpthread_pthread_mutex_unlock (mutexp=<optimized out>) at /usr/src/lib/librthread/rthread_sync.c:221 3 0x0000124d279c02e4 in init_attrib_groups (ctx=0x124df0fda000) at main/context.c:825 4 _mesa_initialize_context (ctx=ctx@entry=0x124df0fda000, api=api@entry=API_OPENGL_CORE, visual=visual@entry=0x7f7ffffbdfd0, share_list=share_list@entry=0x0, driverFunctions=driverFunctions@entry=0x7f7ffffbda60) at main/context.c:1204 5 0x0000124d27b507ec in st_create_context (api=api@entry=API_OPENGL_CORE, pipe=pipe@entry=0x124dc4910000, visual=visual@entry=0x7f7ffffbdfd0, share=share@entry=0x0, options=options@entry=0x7f7ffffbe128) at state_tracker/st_context.c:545 6 0x0000124d27b8639f in st_api_create_context (stapi=<optimized out>, smapi=0x124d1b608800, attribs=0x7f7ffffbe100, error=0x7f7ffffbe0fc, shared_stctxi=0x0) at state_tracker/st_manager.c:669 7 0x0000124d27cc5b9c in dri_create_context (api=<optimized out>, visual=0x124d8a0f8a00, cPriv=0x124de473f240, major_version=<optimized out>, minor_version=<optimized out>, flags=<optimized out>, notify_reset=false, error=0x7f7ffffbe2b4, sharedContextPrivate=0x0) at dri_context.c:123 8 0x0000124d27cc5029 in driCreateContextAttribs (screen=0x124d8a0f8400, api=<optimized out>, config=0x124d8a0f8a00, shared=<optimized out>, num_attribs=<optimized out>, attribs=<optimized out>, error=0x7f7ffffbe2b4, data=0x124d77814a00) at dri_util.c:448 9 0x0000124d8e109b00 in drisw_create_context_attribs (base=0x124df3e08700, config_base=0x124d7a0e7300, shareList=<optimized out>, num_attribs=<optimized out>, attribs=<optimized out>, error=0x7f7ffffbe2b4) at drisw_glx.c:476 10 
0x0000124d8e104b4a in glXCreateContextAttribsARB (dpy=0x124d533f0000, config=0x124d7a0e7300, share_context=0x0, direct=1, attrib_list=0x7f7ffffbe300) at create_context.c:78 Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-12-20 11:41:04 -08:00
Dave Airlie	ab8ea1b3d4	glsl: allow invariant on fragment shader outputs. From page 27 (page 33 of the PDF) of the GLSL 1.20 spec: " Only variables output from a vertex shader can be candidates for invariance." But this later changes to: From page 37 (page 43 of the PDF) of the GLSL 1.30 spec: " Only variables output from a shader can be candidates for invariance." We can also find: From page 37 (page 43 of the PDF) of the GLSL 1.30 spec: " Initially, by default, all output variables are allowed to be variant. To force all output variables to be invariant, use the pragma #pragma STDGL invariant(all) before all declarations in a shader. If this pragma is used after the declaration of any variables or functions, then the set of outputs that behave as invariant is undefined. It is an error to use this pragma in a fragment shader." But this needs to be corrected and it is being addressed at: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16140 Fixes GL45-CTS.shading_language_420pack.qualifier_order. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-12-20 11:44:34 +02:00
Timothy Arceri	f562b13bc7	i965: keep gl_program shader info in sync after gather info It's possible that nir_shader was cloned and it no longer contains a pointer to the shader_info in gl_program. So we need to copy shader_info back to gl_program if that is the case. Fixes a regression with NIR_TEST_CLONE=true Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98840	2016-12-20 15:43:53 +11:00
Ian Romanick	ee1f35eb69	nir: Trivial clean ups in the generated nir_constant_expressions.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:44 -08:00
Ian Romanick	3c7066c1ed	nir: Silence unused parameter warnings in nir_constant_expression.c nir/nir_constant_expressions.c:290:25: warning: unused parameter 'num_components' [-Wunused-parameter] evaluate_ball3(unsigned num_components, nir_const_value _src) ^ nir/nir_constant_expressions.c: In function 'evaluate_fddx': nir/nir_constant_expressions.c:1282:57: warning: unused parameter '_src' [-Wunused-parameter] evaluate_fddx(unsigned num_components, nir_const_value _src) ^ v2: Unconditionally mark the parameters as MAYBE_UNUSED instead of conditionally adding (void) casts to keep the generator simple. Suggested by Jason. Number of total warnings in my build reduced from 1575 to 1485 (reduction of 89). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:44 -08:00
Ian Romanick	4300693a07	nir: Silence missing field initializer warnings for vectors in nir_constant_expressions nir/nir_constant_expressions.c: In function 'evaluate_ball2': nir/nir_constant_expressions.c:279:7: warning: missing initializer for field 'z' of 'struct bool_vec' [-Wmissing-field-initializers] }; ^ nir/nir_constant_expressions.c:234:10: note: 'z' declared here bool z; ^ Number of total warnings in my build reduced from 2532 to 2304 (reduction of 228). v2: Initialize bool vectors with 0 instead of false to keep the generator simpler. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:44 -08:00
Ian Romanick	8bfe397974	glsl: Silence "unused parameter" warnings in ast_type.cpp glsl/ast_type.cpp: In function ‘bool validate_point_mode(YYLTYPE, _mesa_glsl_parse_state, const ast_type_qualifier&, const ast_type_qualifier&)’: glsl/ast_type.cpp:173:30: warning: unused parameter ‘loc’ [-Wunused-parameter] validate_point_mode(YYLTYPE loc, ^~~ glsl/ast_type.cpp:174:45: warning: unused parameter ‘state’ [-Wunused-parameter] _mesa_glsl_parse_state state, ^~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:44 -08:00
Ian Romanick	d7aee96cc6	glsl: Trivial whitespace fixes in link_uniforms.cpp Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:44 -08:00
Ian Romanick	dd4fada6cd	glsl: Silence unused parameter warning in propagate_invariance.cpp glsl/propagate_invariance.cpp: In member function ‘virtual ir_visitor_status {anonymous}::ir_invariance_propagation_visitor::visit_leave(ir_assignment)’: glsl/propagate_invariance.cpp:86:63: warning: unused parameter ‘ir’ [-Wunused-parameter] ir_invariance_propagation_visitor::visit_leave(ir_assignment ir) ^~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:44 -08:00
Ian Romanick	88cc9484f8	glsl: Minor formatting fixes in link_uniform_blocks.cpp Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	296407990b	glsl: Fix all the whitespace errors in link_uniform_block_active_visitor.cpp Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	1f2659ad4d	mesa: Silence numerous "unused parameter" warnings in dlist.c main/dlist.c: In function ‘save_DrawArraysInstancedARB’: main/dlist.c:1748:36: warning: unused parameter ‘mode’ [-Wunused-parameter] save_DrawArraysInstancedARB(GLenum mode, ^~~~ main/dlist.c:1749:35: warning: unused parameter ‘first’ [-Wunused-parameter] GLint first, ^~~~~ main/dlist.c:1750:37: warning: unused parameter ‘count’ [-Wunused-parameter] GLsizei count, ^~~~~ main/dlist.c:1751:37: warning: unused parameter ‘primcount’ [-Wunused-parameter] GLsizei primcount) ^~~~~~~~~ main/dlist.c: In function ‘save_DrawElementsInstancedARB’: main/dlist.c:1759:38: warning: unused parameter ‘mode’ [-Wunused-parameter] save_DrawElementsInstancedARB(GLenum mode, ^~~~ main/dlist.c:1760:39: warning: unused parameter ‘count’ [-Wunused-parameter] GLsizei count, ^~~~~ main/dlist.c:1761:38: warning: unused parameter ‘type’ [-Wunused-parameter] GLenum type, ^~~~ main/dlist.c:1762:45: warning: unused parameter ‘indices’ [-Wunused-parameter] const GLvoid indices, ^~~~~~~ main/dlist.c:1763:39: warning: unused parameter ‘primcount’ [-Wunused-parameter] GLsizei primcount) ^~~~~~~~~ main/dlist.c: In function ‘save_DrawElementsInstancedBaseVertexARB’: main/dlist.c:1771:48: warning: unused parameter ‘mode’ [-Wunused-parameter] save_DrawElementsInstancedBaseVertexARB(GLenum mode, ^~~~ main/dlist.c:1772:49: warning: unused parameter ‘count’ [-Wunused-parameter] GLsizei count, ^~~~~ main/dlist.c:1773:48: warning: unused parameter ‘type’ [-Wunused-parameter] GLenum type, ^~~~ main/dlist.c:1774:55: warning: unused parameter ‘indices’ [-Wunused-parameter] const GLvoid indices, ^~~~~~~ main/dlist.c:1775:49: warning: unused parameter ‘primcount’ [-Wunused-parameter] GLsizei primcount, ^~~~~~~~~ main/dlist.c:1776:47: warning: unused parameter ‘basevertex’ [-Wunused-parameter] GLint basevertex) ^~~~~~~~~~ main/dlist.c: In function ‘save_DrawArraysInstancedBaseInstance’: main/dlist.c:1785:45: warning: unused 
parameter ‘mode’ [-Wunused-parameter] save_DrawArraysInstancedBaseInstance(GLenum mode, ^~~~ main/dlist.c:1786:44: warning: unused parameter ‘first’ [-Wunused-parameter] GLint first, ^~~~~ main/dlist.c:1787:46: warning: unused parameter ‘count’ [-Wunused-parameter] GLsizei count, ^~~~~ main/dlist.c:1788:46: warning: unused parameter ‘primcount’ [-Wunused-parameter] GLsizei primcount, ^~~~~~~~~ main/dlist.c:1789:45: warning: unused parameter ‘baseinstance’ [-Wunused-parameter] GLuint baseinstance) ^~~~~~~~~~~~ main/dlist.c: In function ‘save_DrawElementsInstancedBaseInstance’: main/dlist.c:1797:47: warning: unused parameter ‘mode’ [-Wunused-parameter] save_DrawElementsInstancedBaseInstance(GLenum mode, ^~~~ main/dlist.c:1798:48: warning: unused parameter ‘count’ [-Wunused-parameter] GLsizei count, ^~~~~ main/dlist.c:1799:47: warning: unused parameter ‘type’ [-Wunused-parameter] GLenum type, ^~~~ main/dlist.c:1800:52: warning: unused parameter ‘indices’ [-Wunused-parameter] const void indices, ^~~~~~~ main/dlist.c:1801:48: warning: unused parameter ‘primcount’ [-Wunused-parameter] GLsizei primcount, ^~~~~~~~~ main/dlist.c:1802:47: warning: unused parameter ‘baseinstance’ [-Wunused-parameter] GLuint baseinstance) ^~~~~~~~~~~~ main/dlist.c: In function ‘save_DrawElementsInstancedBaseVertexBaseInstance’: main/dlist.c:1810:57: warning: unused parameter ‘mode’ [-Wunused-parameter] save_DrawElementsInstancedBaseVertexBaseInstance(GLenum mode, ^~~~ main/dlist.c:1811:58: warning: unused parameter ‘count’ [-Wunused-parameter] GLsizei count, ^~~~~ main/dlist.c:1812:57: warning: unused parameter ‘type’ [-Wunused-parameter] GLenum type, ^~~~ main/dlist.c:1813:62: warning: unused parameter ‘indices’ [-Wunused-parameter] const void indices, ^~~~~~~ main/dlist.c:1814:58: warning: unused parameter ‘primcount’ [-Wunused-parameter] GLsizei primcount, ^~~~~~~~~ main/dlist.c:1815:56: warning: unused parameter ‘basevertex’ [-Wunused-parameter] GLint basevertex, ^~~~~~~~~~ 
main/dlist.c:1816:57: warning: unused parameter ‘baseinstance’ [-Wunused-parameter] GLuint baseinstance) ^~~~~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	1b9f285265	mesa: Fix all the whitespace errors in dlist.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	ceea514d91	linker: Accurately mark a uniform block instance array element as used in a stage Now that information about which array-of-arrays elements are accessed is tracked, use that information to only mark an instance array element as used-by-stage if, in fact, it is. Fixes GL45-CTS.program_interface_query.uniform-block-types. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	d32956935e	glsl: Walk a list of ir_dereference_array to mark array elements as accessed Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	e92935089b	glsl: Mark a set of array elements as accessed using a list of array_deref_range Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	8d499f60c8	glsl: Add structures to track accessed elements of a single array Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	b7053b80f2	glsl: Add tracking for elements of an array-of-arrays that have been accessed If there's a better way to provide access to ir_array_refcount_entry private members to the test functions, I am very interested to know about it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Francisco Jerez <currojerez@riseup.net> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	5085b64031	glsl: Use simpler visitor to determine which UBO and SSBO blocks are used Very soon this visitor will get more complicated. The users of the existing ir_variable_refcount visitor won't need the coming functionality, and this use doesn't need much of the functionality of ir_variable_refcount. v2: ir_array_refcount_visitor::get_variable_entry cannot return NULL, so don't check it. Suggested by Timothy. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:43 -08:00
Ian Romanick	d56bd07bb3	glsl: Track the linearized array index for each UBO instance array element v2: Set linearizer_array_index in process_block_array_leaf. Suggested by Timothy. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:55:37 -08:00
Ian Romanick	300de78ab1	glsl: Fix wonky indentation left from previous commit Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:54:38 -08:00
Ian Romanick	8862fefba0	glsl: Split process_block_array into two functions One for the array parts and one for the leaf members. This will simplify later changes. The indentation is wonky after this patch. This was done to make it more obvious that the function is just getting split. The next patch will fix the indentation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-19 15:54:38 -08:00
Kenneth Graunke	4c4d9e4f03	glsl: Fix program interface queries relating to interface blocks. This fixes 555 dEQP tests (using the nougat-cts-dev branch), Piglit's arb_program_interface_query/arb_program_interface_query-resource-query, and GL45-CTS.program_interface_query.separate-programs-{tess-control, tess-eval,geometry}. Only one dEQP program interface failure remains. I would have liked to split this up into several distinct changes, but I wasn't sure how to do that given the tangled nature of these issues. So, the issues: * We need to treat interface blocks declared as an array of instances as a single block - removing the outer array. The resource list entry's name should not include the array length. Properties such as GL_ARRAY_SIZE should refer to the variable inside the block, not the interface block's array properties. * We need to do this prefixing even for structure variables. * We need to do this for built-ins (such as gl_PerVertex.gl_Position). * After interface array unwrapping, any variable which is an array should have [0] appended. It doesn't matter if it's a TCS/TES/GS input or TCS output - that looked like an attempt to unwrap for per-vertex variables, but that didn't consider per-patch variables, and as far as I can tell there's nothing to justify this. Several Mesa developers have suggested that Issue 16 contradicts the main specification, but I believe that it doesn't - the main spec just isn't terribly clear. The main ARB_program_interface_query spec says: "* For an active interface block not declared as an array of block instances, a single entry will be generated, using the block name from the shader source. * For an active interface block declared as an array of instances, separate entries will be generated for each active instance. The name of the instance is formed by concatenating the block name, the "[" character, an integer identifying the instance number, and the "]" character."
Issue 16 says that built-ins should be named "gl_PerVertex.gl_Position", but several people suggested the second bullet above means that it should be named "gl_PerVertex[array length].gl_Position". There are two important things to note. Those bullet points say "an active interface block", while the others say "variable" or "active shader storage block member". They also don't mention applying the rules recursively (unlike the other bullets). Both suggest that these rules apply to blocks themselves, not members of blocks. In fact, for GL_UNIFORM_BLOCK queries, we do have "block[0]", "block[1]", ... resource list entries - so those rules are real, and actually used. So if they don't apply to block members, then how should members be named? Unfortunately, I don't see any rules outside of issue 16 - where the rationale is very unclear. I hope to clarify the spec in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-12-19 15:43:09 -08:00
Kenneth Graunke	ad6d1d70ad	glsl: Drop bogus is_vertex_input from add_shader_variable(). stage_mask is a bitmask of shader stages, so the proper comparison would be (1 << MESA_SHADER_VERTEX), not MESA_SHADER_VERTEX itself. But we only care for structure types, and VS inputs cannot be structs. So we can just drop this entirely. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-12-19 15:40:47 -08:00
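The bitmask point in the commit above can be sketched in C. This is only an illustration: the enum, its values, and the helper names below are hypothetical stand-ins, not Mesa's actual definitions. It shows why comparing a stage bitmask against the raw enum value gives the wrong answer, while comparing against the shifted bit behaves as intended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a shader stage enum; values are illustrative. */
enum shader_stage {
    STAGE_VERTEX = 0,
    STAGE_TESS_CTRL = 1,
    STAGE_FRAGMENT = 4
};

/* Convert a stage index into its bit in a stage bitmask. */
static inline uint32_t stage_bit(enum shader_stage stage)
{
    assert((unsigned)stage < 32);
    return UINT32_C(1) << stage;
}

/* Buggy form: compares the bitmask against the enum value itself.
 * With STAGE_VERTEX == 0 this is only true for an empty mask. */
static bool is_vertex_only_buggy(uint32_t stage_mask)
{
    return stage_mask == (uint32_t)STAGE_VERTEX;
}

/* Intended form: compares the bitmask against the bit for the stage. */
static bool is_vertex_only(uint32_t stage_mask)
{
    return stage_mask == stage_bit(STAGE_VERTEX);
}
```

With these assumed values, a mask containing only the vertex bit is rejected by the buggy check and accepted by the corrected one. The commit itself sidesteps the question by dropping the check entirely.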
Kenneth Graunke	37d63b50b1	mesa/get: Convert stencil values to TYPE_UINT. These are listed as Z+ in the GL spec, and often have values of 0xFFFFFFFF. For glGetFloat, we should return 4294967295.0 rather than -1.0. Similarly, for glGetInteger64v, we should return 0xFFFFFFFF, not the sign extended 0xFFFFFFFFFFFFFFFF. Fixes 6 dEQP tests matching the pattern dEQP-GLES3.functional.state_query.integers.stencilvaluemask*getfloat when run in a single process (with state reset code happening between tests, which makes dEQP set the stencil value mask to 0xFFFFFFFF). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-19 11:33:40 -08:00
Kenneth Graunke	9f93afb9a5	mesa/get: Add TYPE_UINT for casting through a GLuint. The "State Tables" section of the OpenGL specification lists many values as belonging to Z+ (non-negative integers), not Z (all integers). For ordinary glGetInteger queries, this doesn't matter. However, when accessing Z+ values via glGetFloat or glGetInteger64, we need to treat the source value as an unsigned value. Otherwise, we'll produce a negative number when bit 31 is set. This commit merely adds the plumbing. It doesn't convert any values. v2: Gotta catch 'em all (add missing cases caught by Ilia) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-19 11:33:40 -08:00
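A minimal C sketch of the signed versus unsigned issue described in the two commits above. The function names are hypothetical and do not correspond to Mesa's get.c code; the sketch only shows what happens to a 32-bit state value of 0xFFFFFFFF when it is widened through a signed or an unsigned intermediate type. It assumes the usual two's-complement conversion of out-of-range values to int32_t.

```c
#include <stdint.h>

/* Widening through a signed 32-bit type: bit 31 is treated as a sign bit. */
static float float_via_int(uint32_t raw)
{
    return (float)(int32_t)raw;
}

/* Widening through an unsigned 32-bit type: the value stays non-negative. */
static float float_via_uint(uint32_t raw)
{
    return (float)raw;
}

/* Signed path sign-extends into the upper 32 bits. */
static int64_t int64_via_int(uint32_t raw)
{
    return (int64_t)(int32_t)raw;
}

/* Unsigned path zero-extends, leaving the upper 32 bits clear. */
static int64_t int64_via_uint(uint32_t raw)
{
    return (int64_t)raw;
}
```

For a stencil mask of 0xFFFFFFFF, the signed paths yield -1.0 and 0xFFFFFFFFFFFFFFFF, while the unsigned paths yield a value of roughly 4.29e9 and 0xFFFFFFFF. Note that a single-precision float cannot represent 4294967295 exactly, so the float result is the nearest representable value.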
Kenneth Graunke	78a391ed83	mesa/get: Make GetFloat/GetDouble of TYPE_INT_N not normalize things. GetFloat of integer valued things is supposed to perform a simple int -> float conversion. INT_TO_FLOAT is not that. Instead, it converts [-2147483648, 2147483647] to a normalized [-1.0, 1.0] float. This is only used for COMPRESSED_TEXTURE_FORMATS, which nobody in their right mind would try and access via glGetFloat(), but we may as well fix it. Found by inspection. v2: Gotta catch 'em all (fix another case of this caught by Ilia) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-19 11:33:40 -08:00
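The difference between a plain int -> float conversion and a normalizing one can be illustrated as follows. The normalizing formula here is a hypothetical example that maps [-2147483648, 2147483647] onto [-1.0, 1.0]; it is not claimed to be the exact definition of Mesa's INT_TO_FLOAT macro, and both function names are made up for this sketch.

```c
#include <stdint.h>

/* Plain conversion: the integer value is preserved (up to float precision). */
static float int_to_float_plain(int32_t i)
{
    return (float)i;
}

/* Hypothetical normalizing conversion: maps the full int32 range
 * onto [-1.0, 1.0]. Computed in double to avoid intermediate overflow. */
static float int_to_float_normalized(int32_t i)
{
    return (float)((2.0 * (double)i + 1.0) / 4294967295.0);
}
```

An enum-like value such as a compressed texture format token comes back as itself from the plain conversion, but as a tiny fraction near zero from the normalizing one, which is why the normalizing path is the wrong choice for a glGetFloat query of an integer-valued state.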
Michel Dänzer	52098fada7	Revert "cso: don't release sampler states that are bound" This reverts commit `6dc96de303`. No longer necessary with the previous change. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-19 17:51:38 +09:00
Michel Dänzer	95eb5e4eed	cso: Make sanitize_hash safe for samplers Remove currently bound sampler states from the hash table before pruning entries from the hash table, so they cannot accidentally be deleted by the pruning. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-19 17:51:34 +09:00
Michel Dänzer	745e2eaaec	cso: Store hash key in struct cso_sampler Preparation for following changes, no functional change intended. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-19 17:51:31 +09:00
Michel Dänzer	9e14238647	cso: Optimize cso_save/restore_fragment_samplers Only copy/memset the pointers that actually need to be. v2: * Cast info->nr_samplers to int for calculating delta (Nicolai) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-19 17:50:21 +09:00
Michel Dänzer	5e70f80c99	cso: Store pointers to struct cso_sampler in struct sampler_info Preparation for following changes, no functional change intended. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-19 17:50:17 +09:00
Michel Dänzer	3d661a12be	cso: Don't restore nr_samplers in cso_restore_fragment_samplers If info->nr_samplers > ctx->nr_fragment_samplers_saved, the assignment would prevent cso_single_sampler_done from unbinding the no longer used samplers from the driver, which could result in use-after-free. This is probably unlikely to happen in practice though. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-19 17:50:08 +09:00
Liu Zhiquan	e2610bf165	EGL/android: Enhance pbuffer implementation Some dri drivers will pass multiple bits in the buffer_mask parameter to droid_image_get_buffer(), more than the actually supported buffer type combination. In that case, go through all the bits and do not return an error when an unsupported buffer is requested; only return an error when the allocation of a supported buffer fails. v2: coding style and log changes v3: coding style changes and update patch format Signed-off-by: Liu Zhiquan <zhiquan.liu@intel.com> Signed-off-by: Long, Zhifang <zhifang.long@intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-12-19 08:26:32 +02:00
Bas Nieuwenhuizen	1d529cba02	radv: Use correct workgroup size limits. Not sure where the 16k comes from, but pretty sure 2k is the max. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 22:18:14 +01:00
Dave Airlie	6229994ab7	radv: expose the compute queue v2: Don't expose the SDMA queue and use the CIK check also in the second if. (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:55 +01:00
Bas Nieuwenhuizen	442735d35d	radv: Only emit PFP ME syncs for DMA on the GFX queue. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:51 +01:00
Bas Nieuwenhuizen	f2523ebf52	radv: Create an empty CS per ring type. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:47 +01:00
Bas Nieuwenhuizen	accc5fc026	radv: Don't enable CMASK on compute queues. We can't fast clear on compute queues. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:41 +01:00
Bas Nieuwenhuizen	bfee9866ea	radv: Use RELEASE_MEM packet for MEC timestamp query. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:37 +01:00
Bas Nieuwenhuizen	9b0efc98ba	radv: Implement indirect dispatch for the MEC. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:33 +01:00
Bas Nieuwenhuizen	3a559029e2	radv: update vkCmdUpdateBuffer for the MEC. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:29 +01:00
Bas Nieuwenhuizen	b3499557a2	radv: Implement cache flushing for the MEC. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:26 +01:00
Dave Airlie	72aaa83f4b	radv: add semaphore support Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:26 +01:00
Dave Airlie	d270b5fac3	radv: pass queue index into winsys submission This is so we can submit on separate queues if needed Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:26 +01:00
Dave Airlie	d0e6fb0574	radv: init compute queue and avoid initing transfer queues Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:26 +01:00
Bas Nieuwenhuizen	71dabe1c16	radv/winsys: Make WaitIdle queue aware. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:20 +01:00
Dave Airlie	d028bd7b55	radv/meta: update header info Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:20 +01:00
Dave Airlie	4bd666a319	radv: hook compute clears into clear image api. These aren't used yet but we will want to use them when we implement a separate compute queue. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:20 +01:00
Dave Airlie	f11ea8779d	radv: clear image implementation for compute queue Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:20 +01:00
Dave Airlie	9839ce282b	radv/meta: split clear image out into a separate layer clear function This will make it easier to add support for clears on compute queues. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:20 +01:00
Dave Airlie	ef5f59c9a9	radv: implement image->image copies using compute shader This is required for having a separate compute queue, we probably can't use this on GFX queue due to DCC. v2: Set coord_components = 2 for itoi texture fetch. (Bas) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:20 +01:00
Dave Airlie	983af3a6d1	radv: add a compute shader implementation for buffer to image This implements the reverse of the current buffer->image path and can be used when we need to do image transfer on compute queues. This just adds the code turned off, as we don't support separate compute queues yet, and we don't want to use this path on the GFX queues for DCC reasons. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:20 +01:00
Bas Nieuwenhuizen	35cf08ef64	radv: Use correct pitch for views with different block size. Needed when accessing a compressed texture as R32G32B32A32 from a shader. This was not encountered previously, as we used the CB for the reinterpretation, which does not use this pitch. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:15 +01:00
Dave Airlie	94a7434bbc	radv: Store queue family in command buffers. v2: Added helper (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:15 +01:00
Dave Airlie	c20701f4be	radv: start fixing up queue allocate for multiple queues v2: Fix error handling and zero init the device (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:15 +01:00
Dave Airlie	59c9a131f4	radv/winsys: start adding support for DMA/compute queue Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-18 20:52:15 +01:00
Bas Nieuwenhuizen	86cb418bd4	radv/winsys: Expose number of compute/dma rings. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-18 20:52:08 +01:00
Rob Clark	2c0dfd48f0	freedreno/a5xx: border color support Not 100% sure it works if you have border color in VS.. but it might be right. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:49:45 -05:00
Rob Clark	939486d3d3	freedreno/a5xx: use MRT0 to import linear zs A bit of a hack, but we need to do this until we can do tiled zs in sysmem (and associated tile/untile blits for transfer_map). Fixes xonotic and glmark2 "refract", when reorder wasn't enabled. (reorder would paper over the issue by avoiding the extra round-trip to system memory and back to gmem.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:48:10 -05:00
Rob Clark	bea8602e5b	freedreno: fdN_gmem_restore_format() is not gen specific Refactor out into a common helper, since this is the same across generations when we need equiv z/s gmem restore format. Next patch needs this in a5xx, rather than creating yet another helper push this into core. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:48:03 -05:00
Rob Clark	6f93c75a47	freedreno/a5xx: cargo-cult end-batch sequence more faithfully Fixes some issues at least with GMEM bypass mode, where we'd sometimes end up with some FS quads not hitting memory. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:54 -05:00
Rob Clark	d35022f24d	freedreno/a5xx: misc fix Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:47 -05:00
Rob Clark	651f2655a8	freedreno/a5xx: fix (at least some) vtx formats Swap/component-order doesn't seem to be quite what that is. At least blob was always setting it to XYZW ('11') but we weren't. Causing problems w/ formats like sint16.. Hard-coding this instead at least seems to get glamor working. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:38 -05:00
Rob Clark	2540226f66	freedreno/a5xx: more formats Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:31 -05:00
Rob Clark	c768461c1f	freedreno/a5xx: fixup caps Might not be 100% accurate, mostly just copy from a4xx to get started. We are defn lying about occlusion query at this point (not implemented yet) but need it to expose anything higher than gl1.4 (glamor needs gl2.1) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:18 -05:00
Rob Clark	abcf8f5b58	freedreno/a5xx: fix random faults on first sysmem draw Not sure what this event is, but blob writes it.. and it seems to solve random write faults at mystery address that would sometimes happen on first BYPASS draw. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:08 -05:00
Rob Clark	54537fa1dc	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:47:00 -05:00
Rob Clark	5e632b3a83	freedreno/a5xx: fix stride/size for mem->gmem blits <brownpaperbag>these should be the in-GMEM dimensions</brownpaperbag> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-18 13:46:48 -05:00
Dave Airlie	0f2e9a8986	radv/winsys: consolidate request->fence code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-17 16:30:16 +01:00
Dave Airlie	7ad1c24e2a	radv: handle fence allocation failing Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-17 16:29:57 +01:00
Bas Nieuwenhuizen	b2b4f7248b	radv: Don't bail out on pipeline create failure. The spec says we have to try to create all, and only set failed pipelines to VK_NULL_HANDLE. If one of them fails, we have to return an error, but as far as I can see, the spec does not care which of the suberrors. Fixes dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline dEQP-VK.api.object_management.alloc_callback_fail_multiple.graphics_pipeline Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-17 11:41:53 +01:00
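The create-all-then-report behaviour described in the commit above can be sketched generically in C. All types and names here (result_t, handle_t, create_all, fail_on_odd) are hypothetical and stand in for the Vulkan pipeline creation entry points; this is not radv's code. The sketch attempts every element, nulls out the handles that failed, and returns one of the errors encountered.

```c
#include <stddef.h>

typedef int result_t;      /* hypothetical result code type */
typedef void *handle_t;    /* hypothetical object handle type */
#define RESULT_SUCCESS 0
#define RESULT_OUT_OF_MEMORY (-1)
#define NULL_HANDLE NULL

typedef result_t (*create_one_fn)(size_t index, handle_t *out);

/* Try to create every object instead of bailing out on the first failure.
 * Failed entries are set to NULL_HANDLE; the last error seen is returned. */
static result_t create_all(size_t count, create_one_fn create_one,
                           handle_t *handles)
{
    result_t result = RESULT_SUCCESS;

    for (size_t i = 0; i < count; i++) {
        result_t r = create_one(i, &handles[i]);
        if (r != RESULT_SUCCESS) {
            handles[i] = NULL_HANDLE;
            result = r;
        }
    }
    return result;
}

/* Example callback for illustration: fails for odd indices. */
static int dummy_object;
static result_t fail_on_odd(size_t index, handle_t *out)
{
    if (index % 2 == 1)
        return RESULT_OUT_OF_MEMORY;
    *out = &dummy_object;
    return RESULT_SUCCESS;
}
```

Calling create_all(3, fail_on_odd, handles) leaves handles[0] and handles[2] valid, sets handles[1] to NULL_HANDLE, and returns the error. Which error to return when several entries fail is a design choice; the commit message notes that the spec does not appear to mandate one.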
Ilia Mirkin	6493b4f4dd	spirv/nir: add support for ImageGatherExtended The strategy is to do the same thing that the GLSL lower_offset_arrays pass does - create 4 separate texture gather ops, one per offset, and read in the results from each gather's w component to recreate the desired result. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-16 20:27:37 -05:00
Francisco Jerez	79d08ed3d2	anv: Fix uniform and storage buffer offset alignment limits. This fixes a regression in a bunch of image store vulkan CTS tests from commit `ad38ba1134`, which started using OWORD block read messages to implement UBO loads. The reason for the failure is that we were giving bogus buffer alignment limits to the application (1B), so the CTS would happily come back with descriptor sets pointing at not even word-aligned uniform buffer addresses. Surprisingly the sampler messages used to fetch pull constants before that commit were able to cope with the non-texel aligned addresses, but the dataport messages used to fetch pull constants after that commit and the ones used to access storage buffers (before and after the same commit) aren't as permissive with unaligned addresses. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097 Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-16 14:12:54 -08:00
Thomas Helland	a66818830a	nir: Remove nir_array from lower_locals_to_regs We do nothing but initialize it, add to it, and delete it. This is a fallout from removing constant initializer support. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-16 12:02:28 -08:00
Bruce Cherniak	79b66ec05e	swr: Implement fence attached work queues for deferred deletion. Work can now be added to fences and triggered by fence completion. This allows for deferred resource deletion, and other asynchronous tasks. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-12-16 11:29:02 -06:00
Timothy Arceri	3421b3f5a3	nir: Turn imov/fmov of undef into undef Reverting the previous attempt at this `a5502a721f` resulted in the following Vulkan test failing. dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex This time we use the num_components from the alu dest rather than num_inputs to the op to determine the size of the undef. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99100	2016-12-16 20:32:59 +11:00
Eric Engestrom	08fc74663b	egl/x11: cleanup init code No functional change, just rewriting it in an easier-to-understand way. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-15 11:48:31 +00:00
Iago Toral Quiroga	47351b843a	nir/lower_tex: fix number of components in replace_gradient_with_lod() We should make the dest in the textureLod() operation have the same number of components as the destination in the original textureGrad() Fixes regression in ES3-CTS.gtf.GL3Tests.shadow Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99072 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-15 08:34:55 +01:00
Timothy Arceri	a5502a721f	Revert "nir: Turn imov/fmov of undef into undef." This reverts commit `6aa730000f`. This was changing the size of the undef to always be 1 (the number of inputs to imov and fmov) which is wrong, we could be moving a vec4 for example. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-15 17:05:12 +11:00
Kenneth Graunke	84e19322d3	i965/vec4: Fix TCS output reads with non-zero component qualifiers. We want to perform the URB read to a vec4 temporary, with no writemask, then issue a MOV to swizzle the data and store it to the actual destination, using the final writemask. We were doing this wrong. For example, let's say we wanted to read a vec2 stored in components 2-3 of a vec4. We would generate a URB read message of: SEND <actual destination>.XY <header with mask set to XY> MOV <actual destination>.XY <actual destination>.ZW This doesn't work, because the URB message reads the .XY components of the vec4, rather than the ZW. It writes to the right place, but with the wrong data. Then the MOV comes along and overwrites it with data that didn't even come from the URB at all. Instead we want to do: SEND <temporary> <header with mask set to ZW> MOV <actual destination>.XY <temporary>.ZW Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-14 21:15:39 -08:00
Francisco Jerez	fd3120d85c	i965/disasm: Decode dataport constant cache control fields. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:27 -08:00
Francisco Jerez	23caf75182	i965/fs: Remove the FS_OPCODE_SET_SIMD4X2_OFFSET virtual opcode. Not used anymore. It was just a scalar MOV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:27 -08:00
Francisco Jerez	e014058195	i965/fs: Drop useless access mode override from pull constant generator code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:27 -08:00
Francisco Jerez	b56fa830c6	i965/fs: Fetch one cacheline of pull constants at a time. Asking the DC for less than one cacheline (4 owords) of data for uniform pull constants is suboptimal because the DC cannot request less than that from L3, resulting in wasted bandwidth and unnecessary message dispatch overhead, and exacerbating the IVB L3 serialization bug. The following table summarizes the overall framerate improvement (with statistical significance of 5% and sample size ~10) from the whole series up to this patch for several benchmarks and hardware generations: \| SKL \| BDW \| HSW SynMark2 OglShMapPcf \| 24.63% ±0.45% \| 4.01% ±0.70% \| 10.31% ±0.38% GfxBench4 gl_manhattan31 \| 5.93% ±0.35% \| 3.92% ±0.31% \| 6.62% ±0.22% GfxBench4 gl_4 \| 2.52% ±0.44% \| 1.23% ±0.10% \| N/A Unigine Valley \| 0.83% ±0.17% \| 0.23% ±0.05% \| 0.74% ±0.45% Note that there are two versions of the Manhattan demo shipped with GfxBench4, one of them is the original gl_manhattan demo which doesn't use UBOs, so this patch will have no effect on it, and another one is the gl_manhattan31 demo based on GL 4.3/GLES 3.1, which this patch benefits as shown above. I haven't observed any statistically significant regressions in the benchmarks I have at hand. Note that the comparatively huge improvement on SKL in the OglShMapPcf test case is due to the combined effect of this patch and the register pressure benefit on SKL+ of "i965/fs: Switch to the constant cache for uniform pull constants.", part of the same series. Going up to 8 oword blocks would improve performance of pull constants even more, but at the cost of some additional bandwidth and register pressure, so it would have to be done on-demand based on the number of constants actually used by the shader. v2: Fix for Gen4 and 5. v3: Non-trivial rebase. Rework to allow the visitor to specify arbitrary pull constant block sizes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:27 -08:00
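The cacheline arithmetic behind the commit above can be sketched in C. The only facts taken from the commit message are that one cacheline is 4 owords; the 16-byte oword size is the standard definition, and the struct and function names are hypothetical, not the i965 visitor's actual code. The sketch rounds a constant's byte offset down to the start of its cacheline and keeps the remainder as the offset within the fetched block.

```c
#include <stdint.h>

enum {
    OWORD_BYTES = 16,                                  /* one oword is 16 bytes */
    CACHELINE_OWORDS = 4,                              /* per the commit message */
    CACHELINE_BYTES = OWORD_BYTES * CACHELINE_OWORDS   /* 64 bytes */
};

/* Hypothetical description of a cacheline-sized pull constant fetch. */
struct pull_block {
    uint32_t block_start;      /* cacheline-aligned byte offset to fetch from */
    uint32_t offset_in_block;  /* where the wanted constant sits in the block */
};

/* Round the requested offset down to a cacheline boundary so that one
 * 4-oword fetch covers the constant and its neighbours. */
static struct pull_block cacheline_block_for(uint32_t byte_offset)
{
    struct pull_block b;

    b.block_start = byte_offset & ~(uint32_t)(CACHELINE_BYTES - 1);
    b.offset_in_block = byte_offset - b.block_start;
    return b;
}
```

Under these assumptions, a constant at byte offset 100 is served by the block starting at byte 64, at offset 36 within it, and any other constant in bytes 64 to 127 can reuse the same fetch.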
Francisco Jerez	9b22a0d295	i965/fs: Expose arbitrary pull constant load sizes to the IR. Change the FS generator to ask the dataport for enough owords worth of constants to fill the execution size of the instruction -- Which means that the visitor now needs to set the execution size correctly for uniform pull constant load instructions, which we were kind of neglecting until now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:26 -08:00
Francisco Jerez	7a6aadb76f	i965: Factor out oword block read and write message control calculation. We'll need roughly the same logic in other places and it would be annoying to duplicate it. Instead factor it out into a function-like macro that takes the number of dwords per block (which will prove more convenient than taking the same value in owords or some other unit). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:26 -08:00
Francisco Jerez	ad38ba1134	i965/fs: Switch to the constant cache for uniform pull constants. This reverts to using the oword block read messages for uniform pull constant loads, as used to be the case until `4c1fdae0a0`. There are two important differences though: Now the L3 cacheability bits are set up correctly for UBOs (since `11f5d8a5d4`), and we target the constant cache instead of the data cache. The latter used to get no L3 way allocation on boot on all platforms that existed at the time, so oword read messages wouldn't get cached on L3 regardless of the MOCS bits, what probably explains the apparent slowness of oword fetches. Constant cache loads seem to perform better than SIMD4x2 sampler loads in a number of cases, they alleviate some of the cache thrashing caused by the competition with textures for the L1/L2 sampler caches, and they allow fetching up to 128B worth of constants with a single oword fetch message. Note that IVB devices suffer from a hardware bug that leads to serialization of L3 read requests overlapping the same cacheline as result of a (on IVB buggy) mechanism of the L3 to preserve coherency. Since read requests for matching cachelines from any L3 client are not pipelined, throughput may decrease in cases where there are no non-overlapping requests left in the queue that can be processed between them. This situation should be relatively uncommon as long as we make sure that we don't use the 1/2 oword messages in cases where the shader intends to read from any other location of the same cacheline at some other point. This is generally a good idea anyway on all generations because using the 1 and 2 oword messages is expected to waste bandwidth since the minimum L3 request size for the DC is exactly 4 owords (i.e. one cacheline). A future commit will have this effect. 
I haven't been able to find any real-world example where this would still result in a regression on IVB, but if someone happens to find one it shouldn't be too difficult to add an IVB-specific check to have it fall back to the sampler cache for pull constant loads. Note that on SKL+ this change has the additional benefit of reducing the register footprint of pull constant loads. The following table summarizes the effect of the whole series on several shader-db stats: Total instructions Total cycles BWR: 4571248 -> 4568342 (-0.06%) 123375740 -> 123373296 (-0.00%) ELK: 3989020 -> 3985402 (-0.09%) 98757068 -> 98754058 (-0.00%) ILK: 6383591 -> 6376787 (-0.11%) 143649910 -> 143648914 (-0.00%) SNB: 7528395 -> 7501446 (-0.36%) 103503796 -> 102460370 (-1.01%) IVB: 6949221 -> 6943317 (-0.08%) 60592262 -> 60584422 (-0.01%) HSW: 6409753 -> 6403702 (-0.09%) 60609070 -> 60604414 (-0.01%) BDW: 8043467 -> 7976364 (-0.83%) 68427730 -> 68483042 (0.08%) CHV: 8045019 -> 7977916 (-0.83%) 68297426 -> 68352756 (0.08%) SKL: 8204037 -> 7939086 (-3.23%) 66583900 -> 65624378 (-1.44%) Lost->Gained Total spills Total fills BWR: 5 -> 5 1488 -> 1488 (0.00%) 1957 -> 1957 (0.00%) ELK: 5 -> 5 1489 -> 1489 (0.00%) 1958 -> 1958 (0.00%) ILK: 1 -> 4 1449 -> 1449 (0.00%) 1921 -> 1921 (0.00%) SNB: 0 -> 0 549 -> 549 (0.00%) 52 -> 52 (0.00%) IVB: 13 -> 3 1271 -> 1271 (0.00%) 1162 -> 1162 (0.00%) HSW: 11 -> 0 1271 -> 1271 (0.00%) 1162 -> 1162 (0.00%) BDW: 12 -> 0 1340 -> 1340 (0.00%) 1452 -> 1452 (0.00%) CHV: 12 -> 0 1340 -> 1340 (0.00%) 1452 -> 1452 (0.00%) SKL: 0 -> 120 1269 -> 375 (-70.45%) 1563 -> 690 (-55.85%) v3: Non-trivial rebase. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:26 -08:00
Francisco Jerez	3c78d31374	i965: Let the caller of brw_set_dp_write/read_message control the target cache. brw_set_dp_read_message already had a target_cache argument, but its interpretation was rather convoluted (on Gen6 the render cache was used if the caller asked for it, otherwise it was ignored using the sampler cache instead), and the constant cache wasn't representable at all. brw_set_dp_write_message used the data cache on Gen7+ except for RENDER_TARGET_WRITE messages, in which case it would use the render cache. On Gen6 the render cache was always used. Instead of the above, provide the shared unit SFID that the caller expects will be used. Makes no functional changes. v3: Non-trivial rebase. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:26 -08:00
Francisco Jerez	591e14ec08	i965/gen6+: Invalidate constant cache on brw_emit_mi_flush(). In order to make sure that the constant cache is coherent with previous rendering when we start using it for pull constant loads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-14 16:50:26 -08:00
Kenneth Graunke	e0c1ec3b09	genxml: Make Gen8 3DSTATE_DS SIMD8 enable work like Gen9+. This will let us avoid ifdefs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-12-14 14:59:06 -08:00
Kenneth Graunke	000b563a1b	genxml: Rename "DS Function Enable" to "Function Enable". This makes Gen7/7.5 match Gen8-9. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-12-14 14:59:06 -08:00
Chad Versace	72ffe8318d	anv: Reject VkMemoryAllocateInfo::allocationSize == 0 The Vulkan 1.0.33 spec says "allocationSize must be greater than 0". Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-12-14 12:04:58 -08:00
Chad Versace	5e97b8f5ce	egl: Fix crashes in eglCreate*Surface() Don't dereference a null EGLDisplay. Fixes tests dEQP-EGL.functional.negative_api.create_pbuffer_surface dEQP-EGL.functional.negative_api.create_pixmap_surface Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99038 Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-14 12:03:15 -08:00
Jason Ekstrand	b18cd8ce2c	i965/miptree: Use intel_miptree_copy for maps What we're really doing is copying a texture not blitting it in the sense of glBlitFramebuffers. Also, the intel_miptree_copy function is capable of properly handling compressed textures which intel_miptree_blit is not. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97473 Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-13 15:48:34 -08:00
Jason Ekstrand	157971e450	i965/blit: Fix the src dimension sanity check in miptree_copy Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-13 15:48:13 -08:00
Lionel Landwerlin	9fe3f2649e	docs: add INTEL_conservative_rasterization to release notes for 13.1.0 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-13 16:28:00 +00:00
Lionel Landwerlin	60330d730b	main: add INTEL_conservative_rasterization enum query support v2: add extra parameter (Ilia) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-13 16:27:59 +00:00
Lionel Landwerlin	d4b753a50b	glapi: add missing INTEL_conservative_rasterization v2: put enum directly in gl_API.xml (Ilia) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-13 16:27:56 +00:00
Lionel Landwerlin	47285d4602	extensions: update INTEL_conservative_rasterization dependencies Suggested by Ilia. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-13 16:27:54 +00:00
Lionel Landwerlin	300d96a433	main: don't error when enabling conservative rasterization on gles Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-12-13 16:27:51 +00:00
Lionel Landwerlin	9854a3ba8b	main: use new driver flag for conservative rasterization state Suggested by Marek. v2: Use new driver flag (Marek) v3: Fix i965 comments (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-13 16:27:33 +00:00
Iago Toral Quiroga	da3389a331	nir/lower_tex: lower gradients on shadow cube maps if lower_txd_shadow is set Even if lower_txd_cube_map isn't. Suggested by Ken to make the flag more consistent with its name. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-13 10:33:29 +01:00
Iago Toral Quiroga	44873ad0a4	i965: remove brw_lower_texture_gradients This has been ported to NIR now so we don't need to keep the GLSL IR lowering any more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-13 10:33:20 +01:00
Iago Toral Quiroga	77f65b3b64	i965/nir: enable lowering of texture gradient for shadow samplers This gets the lowering on the Vulkan driver too, which is required for hardware that does not have the sample_l_d message (up to IvyBridge). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-13 10:33:14 +01:00
Iago Toral Quiroga	5be2e785b1	nir/lower_tex: add lowering for texture gradient on shadow samplers This is ported from the Intel lowering pass that we use with GLSL IR. This takes care of lowering texture gradients on shadow samplers other than cube maps. Intel hardware requires this for gen < 8. v2 (Ken): - Use the helper function to retrieve ddx/ddy - Swizzle away size components we are not interested in v3: - Get rid of the ddx/ddy helper and use nir_tex_instr_src_index instead (Ken, Eric) v4: - Add a 'continue' statement if the lowering makes progress because it replaces the original texture instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v3)	2016-12-13 10:32:52 +01:00
Iago Toral Quiroga	f90da64fc6	i965/nir: enable lowering of texture gradient for cube maps This gets the lowering on the Vulkan driver too. Fixes Vulkan CTS cube map texture gradient tests in: dEQP-VK.glsl.texture_functions.texturegrad.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-13 10:32:46 +01:00
Iago Toral Quiroga	a8e740c354	nir/lower_tex: add lowering for texture gradient on cube maps This is ported from the Intel lowering pass that we use with GLSL IR. The NIR pass only handles cube maps, not shadow samplers, which are also lowered for gen < 8 on Intel hardware. We will add support for that in a later patch, at which point we should be able to remove the GLSL IR lowering pass. v2: - added a helper to retrieve ddx/ddy parameters (Ken) - No need to make size.z=1.0, we are only using component x anyway (Iago) v3: - Get rid of the ddx/ddy helper and use nir_tex_instr_src_index instead (Ken, Eric) v4: - When emitting the textureLod operation, copy all texture parameters from the original textureGrad() (except for ddx/ddy) using a loop - Add a 'continue' statement if the lowering makes progress because it replaces the original texture instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v3)	2016-12-13 10:32:00 +01:00
Iago Toral Quiroga	bac303c286	nir/lower_tex: generalize get_texture_size() This was written specifically for RECT samplers. Make it more generic so we can call this from the gradient lowerings too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-13 10:31:38 +01:00
Ilia Mirkin	fd249c803e	treewide: s/comparitor/comparator/ git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g' Just happened to notice this in a patch that was sent and included one of the tokens in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-12 22:13:07 -05:00
Ian Romanick	a0ce9ff8c4	nir: Only float and double types can be matrices In `19a541f` (nir: Get rid of nir_constant_data) a number of places that operated on nir_constant::values were mechanically converted to operate on the whole array without regard for the base type. Only GLSL_TYPE_FLOAT and GLSL_TYPE_DOUBLE can be matrices, so only those types can have data in the non-0 array element. See also `b870394`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Iago Toral Quiroga <itoral@igalia.com>	2016-12-12 17:17:12 -08:00
Tim Rowley	75149088be	swr: [rasterizer core/memory] StoreTile: AVX512 progress Fixes to 128-bit formats. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-12 17:52:39 -06:00
Matt Turner	ac6646129f	nir: Move fsat outside of fmin/fmax if second arg is 0 to 1. instructions in affected programs: 550 -> 544 (-1.09%) helped: 6 cycles in affected programs: 6952 -> 6850 (-1.47%) helped: 6 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-12 12:39:27 -08:00
Matt Turner	7bed52bb5f	i965/fs: Reject copy propagation into SEL if not min/max. We shouldn't ever see a SEL with conditional mod other than GE (for max) or L (for min), but we might see one with predication and no conditional mod. total instructions in shared programs: 8241806 -> 8241902 (0.00%) instructions in affected programs: 13284 -> 13380 (0.72%) HURT: 62 total cycles in shared programs: 84165104 -> 84166244 (0.00%) cycles in affected programs: 75364 -> 76504 (1.51%) helped: 10 HURT: 34 Fixes generated code in at least Sanctum 2, Borderlands 2, Goat Simulator, XCOM: Enemy Unknown, and Shogun 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92234 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-12 12:38:55 -08:00
Matt Turner	091a8a04ad	i965/fs: Add unit tests for copy propagation pass. Pretty basic, but it's a start. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-12 12:38:50 -08:00
Matt Turner	6014da50ec	i965/fs: Rename opt_copy_propagate -> opt_copy_propagation. Matches the vec4 backend, cmod propagation, and saturate propagation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-12 12:38:43 -08:00
Nicolai Hähnle	ec0a0a60cc	radeonsi: shrink the GSVS ring to account for the reduced item sizes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:05:17 +01:00
Nicolai Hähnle	6fdef7d265	radeonsi: shrink each vertex stream to the actually required size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:05:13 +01:00
Nicolai Hähnle	2f2e941e2d	radeonsi: use a single descriptor for the GSVS ring We can hardcode all of the fields for swizzling in the geometry shader. The advantage is that we use fewer descriptor slots and we no longer have to update any of the (ring) descriptors when the geometry shader changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:05:05 +01:00
Nicolai Hähnle	18616e7551	radeonsi: pack GS output components for each vertex stream contiguously Note that the memory layout of one vertex stream inside one "item" (= memory written by one GS wave) on the GSVS ring is: t0v0c0 ... t15v0c0 t0v1c0 ... t15v1c0 ... t0vLc0 ... t15vLc0 t0v0c1 ... t15v0c1 t0v1c1 ... t15v1c1 ... t0vLc1 ... t15vLc1 ... t0v0cL ... t15v0cL t0v1cL ... t15v1cL ... t0vLcL ... t15vLcL t16v0c0 ... t31v0c0 t16v1c0 ... t31v1c0 ... t16vLc0 ... t31vLc0 t16v0c1 ... t31v0c1 t16v1c1 ... t31v1c1 ... t16vLc1 ... t31vLc1 ... t16v0cL ... t31v0cL t16v1cL ... t31v1cL ... t16vLcL ... t31vLcL ... t48v0c0 ... t63v0c0 t48v1c0 ... t63v1c0 ... t48vLc0 ... t63vLc0 t48v0c1 ... t63v0c1 t48v1c1 ... t63v1c1 ... t48vLc1 ... t63vLc1 ... t48v0cL ... t63v0cL t48v1cL ... t63v1cL ... t48vLcL ... t63vLcL where tNN indicates the thread number, vNN the vertex number (in the order of EMIT_VERTEX), and cNN the output component (vL and cL are the last vertex and component, respectively). The vertex streams are laid out sequentially. The swizzling by 16 threads is hard-coded in the way the VGT generates the offset passed into the GS copy shader, and the jump every 16 threads is calculated from VGT_GSVS_RING_OFFSET_n and VGT_GSVS_RING_ITEMSIZE in a way that makes it difficult to deviate from this layout (at least that's what I've experimentally confirmed on VI after first trying to go the simpler route of just interleaving the vertex streams). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:05:00 +01:00
Nicolai Hähnle	edf034ac14	radeonsi: do not write non-existent components through the GSVS ring Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:58 +01:00
Nicolai Hähnle	af976f12a5	radeonsi: only write values belonging to the stream when emitting GS vertex Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:54 +01:00
Nicolai Hähnle	bdf1bf1cb5	radeonsi: generate an explicit switch instruction over vertex streams SimplifyCFG generates a switch instruction anyway when all four streams are present, but is simultaneously not smart enough to eliminate some redundant jumps that it generates. The generated assembly is still a bit silly, probably because the control flow annotation doesn't know how to handle a switch with uniform condition. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:49 +01:00
Nicolai Hähnle	bae929f96e	radeonsi: fetch only outputs of current vertex stream from the GSVS ring Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:46 +01:00
Nicolai Hähnle	dfb69cac33	radeonsi: only export from GS copy shader for vertex stream 0 When running the copy shader for vertex streams != 0, the SX does not need any data from us (there is no rasterization for the higher vertex streams, only streamout). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:43 +01:00
Nicolai Hähnle	21f2bb22a3	radeonsi: do not export VS outputs from vertex streams != 0 This affects for GS copy shaders. When an output is meant for vertex stream != 0, then we don't have to make it available to the pixel shader. There is a minor inefficiency here because the GLSL varying packing pass does not group varyings of the same vertex stream together, but it shouldn't be important in practice. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:36 +01:00
Nicolai Hähnle	fc0e009aa7	radeonsi: pull iteration over vertex streams into GS copy shader logic The iteration is not needed for normal vertex shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:33 +01:00
Nicolai Hähnle	180ae18ec5	radeonsi: group streamout writes by vertex stream Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:30 +01:00
Nicolai Hähnle	d89592836a	radeonsi: load the streamout buf descriptors closer to their use LLVM can still decide to hoist the loads since they're marked invariant. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:27 +01:00
Nicolai Hähnle	564f17f0d7	radeonsi: extract writing of a single streamout output Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:24 +01:00
Nicolai Hähnle	b41dd00235	radeonsi: separate the call to si_llvm_emit_streamout from exports Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:22 +01:00
Nicolai Hähnle	5ad6e56ca3	radeonsi: plumb the output vertex_stream through to si_shader_output_values Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:19 +01:00
Nicolai Hähnle	2985708fa0	radeonsi: rename members of si_shader_output_values Be a bit more verbose and avoid confusion in future patches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:16 +01:00
Nicolai Hähnle	88509518b0	radeonsi: fix an off-by-one error in the bounds check for max_vertices The spec actually says that calling EmitStreamVertex is undefined when you exceed max_vertices. But we do need to avoid trampling over memory outside the GSVS ring. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:13 +01:00
Nicolai Hähnle	7655bccce8	radeonsi: do not kill GS with memory writes Vertex emits beyond the specified maximum number of vertices are supposed to have no effect, which is why we used to always kill GS that reached the limit. However, if the GS also writes to memory (SSBO, atomics, shader images), then we must keep going and only skip the vertex emit itself. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:10 +01:00
Nicolai Hähnle	7b5b3d63c5	radeonsi: update all GSVS ring descriptors for new buffer allocations Fixes GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_geometry_instanced. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:06 +01:00
Nicolai Hähnle	2eaacba7f2	st/glsl_to_tgsi: plumb the GS output stream qualifier through to TGSI Allow drivers to emit GS outputs in a smarter way. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:03 +01:00
Nicolai Hähnle	cc34a6f0bd	tgsi/scan: collect information about output usagemasks Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:04:01 +01:00
Nicolai Hähnle	cf8e9778fc	tgsi/scan: collect information about output vertex streams Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:03:57 +01:00
Nicolai Hähnle	81d0dc5e55	gallium: extract individual streamout output structure So that we can pass pointers to individual array entries around. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:03:54 +01:00
Nicolai Hähnle	04811354c8	tgsi: add Stream{X,Y,Z,W} fields to tgsi_declaration_semantic This is for geometry shader outputs. Without it, drivers have no way of knowing which stream each output is intended for, and have to conservatively write all outputs to all streams. Separate stream numbers for each component are required due to output packing. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:03:51 +01:00
Nicolai Hähnle	173d80b401	glsl: remember per-component vertex streams for packed varyings Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:03:47 +01:00
Grazvydas Ignotas	6092169b96	i965/blorp: fix release build unused variable warning Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-12-12 07:09:33 +01:00
Edward O'Callaghan	5e6b2b05a5	virgl: Fix a strict-aliasing violation in the encoder As per the C spec, it is illegal to alias pointers to different types. This results in undefined behaviour after optimization passes, resulting in very subtle bugs that happen only on a full moon.. Use a memcpy() as a well defined coercion between the double to uint64_t interpretations of the memory. V.2: Use static_assert() instead of assert(). V.3: Use C99 compat STATIC_ASSERT() over C11 static_assert(). Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Dave Airlie <airlied@redhat.com>	2016-12-12 16:50:15 +11:00
Kenneth Graunke	35c5a9a64d	i965: Print out cycle estimates at the start of block annotations. We now print START B15 <-B14 (42774 cycles) indicating that we estimate B15 will take 42,774 cycles. Printing this should make it easier to see where time is spent in the program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-12-11 16:33:05 -08:00
Kenneth Graunke	713cd23d8e	mesa: Return LINEAR encoding for winsys FBO depth/stencil. GetFramebufferAttachmentParameteriv should return GL_LINEAR for the window system default framebuffer's GL_DEPTH or GL_STENCIL attachments when there are zero depth or stencil bits. The GL 4.5 spec's GetFramebufferAttachmentParameteriv section says: "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is not NONE, these queries apply to all other framebuffer types: [...] If attachment is not a color attachment, or no data storage or texture image has been specified for the attachment, then params will contain the value LINEAR." Note that we already return LINEAR for the case where there is an actual depth or stencil renderbuffer attached. In the case modified by this patch, FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE returns FRAMEBUFFER_DEFAULT rather than NONE. Fixes a CTS test when run in a visual without depth / stencil buffers: GL45-CTS.gtf30.GL3Tests.framebuffer_srgb.framebuffer_srgb_default_encoding Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-12-11 16:33:05 -08:00
Grazvydas Ignotas	b58d1eecc6	intel/aubinator: fix 32bit shift overflow warning Doesn't look like this can work on 32bit; this just gets rid of an annoying warning. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-12-11 20:04:15 +01:00
Grazvydas Ignotas	3a1b15c392	anv: fix release build unused variable warnings Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-12-11 20:03:14 +01:00
Grazvydas Ignotas	90c29784c6	radv/ac: fix some maybe-uninitialized warnings Mark some paths unreachable so that the compiler knows variables are initialized in all valid paths. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-10 21:46:56 +01:00
Grazvydas Ignotas	ec08666a28	radv/meta: use VK_NULL_HANDLE for handles Otherwise we get 32bit warnings because handle is plain uint64_t there and NULL is not suited to initialize that. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-10 21:46:56 +01:00
Grazvydas Ignotas	9bff2c9884	radv: fix release build unused variable warnings Just mark with MAYBE_UNUSED. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-12-10 21:46:56 +01:00
Grazvydas Ignotas	15e12ab8fc	softpipe: fix release build unused variable warning Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-12-10 21:25:45 +01:00
Grazvydas Ignotas	c81a89f662	radeonsi: fix release build unused variable warnings Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-12-10 21:19:59 +01:00
Chad Versace	42011be1e2	i965/mt: Disable HiZ when sharing depth buffer externally (v2) intel_miptree_make_shareable() discarded and disabled CCS. Fix it so that it discards and disables HiZ too. Fixes dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer on Skylake. v2: Actually do what the commit message says. Discard the HiZ buffer. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: Haixia Shi <hshi@chromium.org> Cc: mesa-stable@lists.freedesktop.org	2016-12-10 08:05:11 -08:00
Chad Versace	1c8be049be	i965/mt: Disable aux surfaces after making miptree shareable The entire goal of intel_miptree_make_shareable() is to permanently disable the miptree's aux surfaces. So set intel_mipmap_tree:disable_aux_buffers after the function is done discarding the aux surfaces. References: https://bugs.freedesktop.org/show_bug.cgi?id=98329 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: Haixia Shi <hshi@chromium.org> Cc: mesa-stable@lists.freedesktop.org	2016-12-10 08:05:11 -08:00
Jason Ekstrand	da1c49171d	spirv: Use a simpler and more correct implementation of tanh() The new implementation is more correct because it clamps the incoming value to 10 to avoid floating-point overflow. It also uses a much reduced version of the formula which only requires 1 exp() rather than 2. This fixes all of the dEQP-VK.glsl.builtin.precision.tanh.* tests. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2016-12-09 18:38:21 -08:00
Jason Ekstrand	9807f502eb	glsl: Use a simpler formula for tanh The formula we have used in the past is a trivial reduction from the definition by simply multiplying both the numerator and denominator of the formula by 2. However, multiplying by e^x, you can further reduce it. This allows us to get rid of one side of the clamp and two of exponential functions which should make it faster. The new formula still passes the dEQP precision tests for tanh so it should be fine. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-09 18:38:21 -08:00
Edward O'Callaghan	efe9d1cde3	anv: Clean up some unused variables Following on from the spirit of commit `011e5570f`. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-10 11:59:59 +11:00
Tim Rowley	2a127b780b	swr: [rasterizer common/core/jitter] fetch support for GL_FIXED v2: use fmul(1/65536) instead of fdiv(65535) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-09 16:20:13 -06:00
Emil Velikov	d0d21532f9	configure: cleanup GLX_USE_TLS handling Mesa requires ax_pthread_ok = yes, thus we can fold/rewrite the conditional to follow the more common "if test" pattern. No functional change intended. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-12-09 19:22:03 +00:00
Emil Velikov	b83153e77b	configure: enable glx-tls by default In the (not too) distant future we'd want to remove this option and effectively drop the other codepath(s) we have in our dispatch. Linux distributions have been using --enable-glx-tls for a number of years. Some/most BSD platforms still don't support this, yet this should serve as an encouragement to move things forwards. Note: many bug reports were opened due to the wrong default option. See the list below for details. v2: - Correct default option in help string (Andreas) - Add bugzilla references. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70623 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72902 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73778 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89043 Cc: Jean-Sébastien Pédron <dumbbell@FreeBSD.org> Cc: Jonathan Gray <jsg@jsg.id.au> Cc: mesa-maintainers@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2016-12-09 19:21:41 +00:00
Emil Velikov	0715ba4be6	docs: document how to (self-) reject stable patches Document what has been the unofficial way to self-reject stable patches. Namely: drop the mesa-stable tag and push the commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-12-09 17:37:30 +00:00
Emil Velikov	26541a1fcc	egl: add and enable EGL_KHR_config_attribs Extension is already implemented in the main code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-12-09 17:36:28 +00:00
Emil Velikov	bf384a2d85	egl/surfaceless: remove duplicate KHR_image_base enablement Already set by the core code - dri2_create_screen/dri2_setup_screen Cc: Chad Versace <chadversary@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-12-09 17:36:26 +00:00
Eric Engestrom	9e1d35ca75	egl: unexport _eglConvertIntsToAttribs Nobody else makes use of this function. We can always re-export it if someone ever needs it. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-09 17:33:43 +00:00
Eric Engestrom	4729e1b511	egl: rename static functions to match convention Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-09 17:33:36 +00:00
Haixia Shi	d4983390a8	compiler/glsl: fix precision problem of tanh Clamp input scalar value to range [-10, +10] to avoid precision problems when the absolute value of input is too large. Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test failures. v2: added more explanation in the comment. v3: fixed a typo in the comment. Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2016-12-09 09:14:20 -08:00
Tim Rowley	7aea08667c	swr: [rasterizer core/memory] Finish R24_UNORM_X8_TYPELESS for AVX512 This one-off specialization was missed. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-09 10:41:31 -06:00
Bas Nieuwenhuizen	53e1c970ef	radv: Use enum for memory types. Inspired by patches from Eric Engestrom. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-09 08:53:05 +01:00
Bas Nieuwenhuizen	4ae84efbc5	radv: Use enum for memory heaps. Inspired by patches from Eric Engestrom. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Cc: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-09 08:53:05 +01:00
Bas Nieuwenhuizen	011e5570f8	radv: Clean up some unused variables. Leftovers from anv? Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-09 08:53:05 +01:00
Timothy Arceri	8977cd4fdd	i965: delay adding built-in uniforms to Parameters list This is a step towards using NIR optimisations over GLSL IR optimisations. Delaying adding built-in uniforms until after we convert to NIR gives it a chance to optimise them away. V2: move the new code back to brw_link_shader() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-09 16:29:10 +11:00
Ilia Mirkin	429e2ec324	swr: [rasterizer core] supply proper clip distances to point sprites Large points become pairs of triangles when rasterized, so we must feed it three clip distances, one for each vertex. The clip distance is not subject to sprite coord replacement, so there's no interpolation of it. We just take its value and put it in the "z" component of the barycentric-ready plane equation. (We could also just cull it at an earlier point in time, but that would require larger changes.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-08 22:47:39 -05:00
Ilia Mirkin	192317dfeb	swr: [rasterizer core] perform perspective division on clip distances Clip distances need to be perspective-divided. This fixes all the interpolation-*-{distance,vertex} piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-08 22:47:27 -05:00
Dave Airlie	bd56de88df	radv/ac: no need to pass nir to the post outputs handling We don't use the nir shader in here at all. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:10:34 +00:00
Dave Airlie	d38eece4e6	radv: fix warnings in ubo load code. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:30 +00:00
Dave Airlie	0fafe94a39	radv/ac: pass a mask of array params not a number. This makes it easier to add new params before the array ones. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:18 +00:00
Dave Airlie	257866ae46	radv: split out a chunk of variant filling code. This code will have use for copy shaders etc. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:14 +00:00
Dave Airlie	6cde094bf7	radv/meta: don't pass rect into blit2d src function. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:10 +00:00
Dave Airlie	71a9574ffa	radv/meta: cleanup image info setup. This just passes the subresource info in and uses it. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:06 +00:00
Dave Airlie	6f08dcd398	radv/meta: split copyimage api into api and meta function This makes it easier to add multiple queues later. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:06:02 +00:00
Dave Airlie	0689b8f485	radv/meta: clean up buffer->image code. Removes some unnecessary functions and pull some stuff out of the loop. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:05:58 +00:00
Dave Airlie	c46c376977	radv/ac: don't pass nir to create_function This isn't needed for later things like geom shader copy shaders, we won't have NIR. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:05:52 +00:00
Dave Airlie	2a33049c70	radv: add missing license file to radv_meta_bufimage. Just noticed this file was missing license and any explanation of what is in it. (stable just for license header reasons) Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:05:26 +00:00
Dave Airlie	e54af02567	radv/ac: use build_gep0 instead of opencoding it. Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-08 23:03:39 +00:00
Marek Olšák	31f988a9d6	radeonsi: disable the constant engine (CE) on Carrizo and Stoney It must be disabled until the kernel bug is fixed, and then we'll enable CE based on the DRM version. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-08 15:54:38 +01:00
Michel Dänzer	26ba8c920d	radeonsi: Fix typo: "llvm.fs.interp" => "llvm.SI.fs.interp" Fixes lots of pixel shaders failing to compile with LLVM 3.9 or older. Trivial. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99013#c4	2016-12-08 10:51:03 +09:00
Dave Airlie	c7dc1b010a	radv: make push constants optional We don't set the push constants slot up unless something will cause us to need it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:26:19 +00:00
Dave Airlie	dfef9c7c1f	radv: only emit descriptor sgprs when needed This only emits enough descriptor sgprs for the number of sets in the layout, and only emits the descriptors necessary for the current stage. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:54 +00:00
Dave Airlie	ae61ddabe8	radv: move userdata sgpr ownership to compiler side. This isn't fully what we want yet, but is a good step on the way. This allows the compiler to create the information structures for the state setting side, however the state setting still expects things to be pretty much in 2 sgpr wide register sets, and can't handle the indirect setting yet. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:49 +00:00
Dave Airlie	221ab77956	radv: refactor out the constant setting user sgpr code. This just refactors out some common code to make future changes easier to understand. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:45 +00:00
Dave Airlie	11208f0049	radv: refactor out the descriptor user sgpr setting. This just splits some common code into a utility function. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:42 +00:00
Dave Airlie	a74a4edc90	radv: only bind descriptor sets to stages that need them This copies the push constant code and only binds descriptor sets to the stages that need them. It also now has to dirty descriptors on pipeline binds. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:38 +00:00
Dave Airlie	85118a1e4d	radv: move descriptor set userdata emission to draw flush time. This is another step towards having the compiler decide the user sgpr layout. This still emits the descriptors sets for all shader types, but we will fix this later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:28 +00:00
Dave Airlie	a5d10844ee	radv: refactor descriptor set userdata emission out. This just moves this into a separate function. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:18 +00:00
Dave Airlie	f847676990	radv: pass pipeline to constant flush function I'll need this later rather than just the layout. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:25:15 +00:00
Dave Airlie	eb2ba5c8df	radv: consolidate compute pipeline flushing (v1.1) This just moves some common code into a utility function to avoid having to change multiple places later. v1.1: rename function to better reflect what it does. (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-07 23:24:53 +00:00
Marek Olšák	13c34cf8ca	radeonsi: wait for outstanding LDS instructions in memory barriers if needed Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 19:40:29 +01:00
Marek Olšák	16ba04d6de	tgsi: fix the src type of TGSI_OPCODE_MEMBAR It's a literal integer. The next commit will need this. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 19:40:29 +01:00
Marek Olšák	16f49c16c7	radeonsi: wait for outstanding memory instructions in TCS barriers Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 19:40:29 +01:00
Marek Olšák	15e96c70b0	radeonsi: allow specifying simm16 of emit_waitcnt at call sites The next commit will use this. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 19:40:29 +01:00
Marek Olšák	57b9d75af5	radeonsi: write shader descriptors into hang reports Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 19:40:29 +01:00
Marek Olšák	6caa558ca6	radeonsi: check for sampler state CSO corruption It really happens. v2: declare "magic" in debug builds only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-12-07 19:40:03 +01:00
Marek Olšák	f2b0c66c3c	radeonsi: properly declare context sampler states Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 18:46:54 +01:00
Marek Olšák	38d4859b94	radeonsi: fix incorrect FMASK checking in bind_sampler_states Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 18:46:54 +01:00
Marek Olšák	b3a2aa9cba	radeonsi: always restore sampler states when unbinding sampler views Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 18:46:54 +01:00
Marek Olšák	d205faeb6c	radeonsi: take LDS into account for compute shader occupancy stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 18:46:54 +01:00
Marek Olšák	132b69c4ed	st/mesa: round lod_bias to a multiple of 1/256 This reduces the number of sampler states 3.6x in Batman Arkham: Origins. (from ~7200 to ~2000) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 18:46:54 +01:00
Marek Olšák	4b0d8b2da0	gallium: decrease the size of pipe_sampler_state fields We've had unused bits. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-12-07 18:46:54 +01:00
Marek Olšák	6dc96de303	cso: don't release sampler states that are bound This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and probably many other games too. cso_cache deletes sampler states when the cache size is too big and doesn't check which sampler states are bound, causing use-after-free in drivers. Because of that, radeonsi uploaded garbage sampler states and the hardware went bananas. Other drivers may have experienced similar issues. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-12-07 18:46:54 +01:00
Jordan Justen	e9133dd90e	i965: Increase max texture to 16k for gen7+ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98297 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Jordan Justen	d6526d7247	intel/blorp_blit: Add split_blorp_blit_debug switch Enabling this debug switch causes surface shrinking to happen by default, and lowers the surface size limit which causes blorp blits to be split. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Jordan Justen	da381ae647	intel/blorp_blit: Enable splitting large blorp blits Detect when the surface sizes are too large for a blorp blit. When it is too large, the blorp blit will be split into a smaller operation and attempted again. For gen7, this fixes the cts test: ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit It will also enable us to increase our renderable size from 8k x 8k to 16k x 16k. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Jordan Justen	efea8e7244	intel/blorp_blit: Move RGB=>R conversion to follow blit splitting In blorp_copy, when RGB surfaces are copied, we convert the destination surface to a Red only surface, but 3 times as wide. This introduces an implicit restriction of "mod 3" for the destination width. It is easier to handle the blorp split buffer offsetting with the original RGB surface, and do the RGB=>R after this. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Jordan Justen	edf3113aed	intel/blorp_blit: Adjust blorp surface parameters for split blits If try_blorp_blit() previously returned that a blit was too large, shrink_surface_params() will be used to update the surface parameters for the smaller blit so the blit operation can proceed. v2: * Use double instead of float. (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Jordan Justen	12e0a6e259	intel/blorp_blit: Split blorp blits if they are too large We rename do_blorp_blit() to try_blorp_blit(), and add a return error if the surface size for the blit is too large. Now, do_blorp_blit() is rewritten to try to split the blit into smaller operations if try_blorp_blit() fails. Note: In this commit, try_blorp_blit() will always attempt to blit and never return an error, which matches the previous behavior. We will enable the size checking and splitting in a future commit. The motivation for this splitting is that in some cases when we flatten an image, its dimensions grow, and this can then exceed the programmable hardware limits. An example is w-tiled+MSAA blits. v2: * Use double instead of float. (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Jordan Justen	b74d4f6ca0	intel/blorp_blit: Create structure for src & dst coordinates This will be useful for splitting blits into smaller sizes. We also make the coordinates of type double rather than float. Since we will be splitting and scaling the coordinates, we might require extra precision in the calculations. v2: * Use double instead of float. (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 09:00:49 -08:00
Edward O'Callaghan	a77426fd92	vulkan: use STATIC_ASSERT instead of static_assert Following the spirit of commit `23d1799f`, fixes compilation warnings on Android build due to lack of C11 features. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 22:32:38 +11:00
Lionel Landwerlin	e9f17e9fb0	i965: enable INTEL_conservative_rasterization on Gen9+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-12-07 11:02:19 +00:00
Lionel Landwerlin	039d836d6e	mesa: add support for GL_INTEL_conservative_rasterization Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-12-07 11:02:16 +00:00
Plamena Manolova	0ff74a8990	i965: Add i965 plumbing for ARB_post_depth_coverage for i965 (gen9+). This extension allows the fragment shader to control whether values in gl_SampleMaskIn[] reflect the coverage after application of the early depth and stencil tests. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-07 11:01:50 +00:00
Plamena Manolova	8481386892	mesa: Add GL and GLSL plumbing for ARB_post_depth_coverage for i965 (gen9+). This extension allows the fragment shader to control whether values in gl_SampleMaskIn[] reflect the coverage after application of the early depth and stencil tests. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-12-07 11:01:50 +00:00
Nicolai Hähnle	d3931a355f	radeonsi: fix isolines tess factor writes to control ring Fixes piglit arb_tessellation_shader/execution/isoline{_no_tcs}.shader_test. Cc: mesa-stable@lists.freedesktop.org	2016-12-07 11:21:32 +01:00
Kenneth Graunke	9871bde351	i965: Drop redundant key->outputs_written initialization. This was already set to the same value earlier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-12-06 22:14:58 -08:00
Kenneth Graunke	09ffc5c84f	i965: Initialize "separate" flag in VUE maps. This was uninitialized, which resulted in weird looking printouts where it appeared that the TCS output and TES input patch URB entries differed in SSO/non-SSO layout. There is no "separable" layout for both, as they're tied together. It should have no other actual effect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-12-06 22:14:58 -08:00
Ian Romanick	b87039499b	nir: In split_var_copies_block, uint, int, and bool types cannot be matrices Noticed while adding support for 64-bit integer types. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-06 17:30:38 -08:00
Tom Stellard	4c8c13b356	radeonsi: Use amdgcn intrinsics for fs interpolation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 00:42:40 +00:00
Rob Clark	a9383ae6d6	freedreno/a5xx: fix draw packet size with index buffer gpuaddr of idx buffer is now two dwords (64b). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	ec24f009ca	freedreno/a5xx: gmem bypass mode Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	85a3057f65	freedreno/a5xx: fix emit_string_marker() Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	c1e9cca696	freedreno: pitch alignment should match gmem alignment Deal w/ differing gmem tile size alignment between generations, and make sure texture pitch matches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	8f4da2ff63	freedreno/a5xx: more formats Bunch of stuff we can at least turn on for vbo formats. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	b337099849	freedreno/a5xx: fix fragface Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	f143eeaffa	freedreno/a5xx: fix fragcoord Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	f5c5f76255	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	3ec4d1f809	freedreno/a5xx: fix alpha test GRAS_SU_DEPTH_PLANE_CNTL doesn't in fact seem to be anything to do with alpha test. This fixes xonotic and (other than some iommu faults) gets gnome-shell working. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Rob Clark	2b305725e2	freedreno/a5xx: fix VPC_VAR[n].DISABLE bits We don't need varying interpolators enabled for pos/psize out of the VS (despite the fact that they show up in VS_OUT map), so emit these before we append pos/psize to the linkage. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-06 18:01:31 -05:00
Nanley Chery	72db1570b4	anv/TODO: Document sampling from HiZ Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-06 14:51:30 -08:00
Kenneth Graunke	05a4e3a009	i965: Don't force SSO layout for VS->TCS. This was a hack which worked around the VS and TCS disagreeing on their shared interface due to the lack of varying packing. In particular, it was needed by Piglit's tcs-input-read-array-interface test. However, that was just one case where things could go awry, so the previous commit forcibly made interfaces match. This hack is no longer necessary. It also seems to be broken, though I'm not sure why. It fixes Piglit regressions in spec/arb_shader_image_load_store/semantics from commit `ec1f159ac8`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98893 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-06 12:36:21 -08:00
Kenneth Graunke	44fd85d8eb	i965: Unify shader interfaces explicitly. A while ago, I made i965 start compiling shaders independently. The VUE map layouts were based entirely on each shader's input/output bitfields. Assuming the interfaces match, this works out well - both sides will compute the same layout, and outputs are correctly routed to inputs. At the time, I had assumed that the linker would guarantee that the interfaces match. While it usually succeeds, it unfortunately seems to fail in some cases. For example, Piglit's tcs-input-read-array-interface test has a VS output array with two elements, but the TCS only reads one. The linker isn't able to eliminate the unused element from the VS, which makes the interfaces not match. Another case is where a shader other than the last writes clip/cull distances. These should be demoted to ordinary varyings, but they currently aren't - so we think they still have some special meaning, and prevent them from being eliminated. Fixing the linker to guarantee this in all cases is complicated. It needs to be able to optimize out dead code. It's tied into varying packing and other messiness. While we can certainly improve it---and should---I'd rather not rely on it being correct in all cases. This patch ORs adjacent stages' input/output bitfields together, ensuring that their interface (and hence VUE map layout) will be compatible. This should safeguard us against linker insufficiencies. Fixes line rendering in Dolphin, and the Piglit test based on it: spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97232 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-12-06 12:34:23 -08:00
Jason Ekstrand	eb7b51d62a	genxml/gen9: Change the default of MI_SEMAPHORE_WAIT::RegisterPoleMode We would really like it to be false as that's what you get on hardware that doesn't have RegisterPoleMode (Sky Lake for example). While we're at it, we change it to a boolean. This fixes dEQP-VK.synchronization.smoke.events on Broxton. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-12-06 11:35:13 -08:00
Roland Scheidegger	8ac3c1bf1a	gallivm: optimize 16bit->32bit gather path a bit LLVM can't really optimize anything which crosses scalar/vector boundaries, so help a bit with some particular gather operations when the width is expanded (only do it for 16->32bit expansion for now), by doing expansion after fetch. That is probably a better solution anyway even if llvm would recognize it, makes for cleaner IR... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-06 20:06:06 +01:00
Roland Scheidegger	fd5f420fbb	gallivm: handle 16bit float fetches in lp_build_fetch_rgba_soa Note that we really want to _never_ reach the bottom of the function, which resorts to AoS fetch. Half floats can be handled just like other formats which fit into 32bit vectors (so, only 1x16 and 2x16 formats, albeit with more channels things are not THAT bad), with minimal plumbing. I've seen code size go down nearly by a factor of 3 for a complete texture sampling function (including bilinear filtering) using R16F. (What we should do for everything not special cased is to do AoS gather, shuffle/shift things into SoA vectors, and then do the conversion there. Otherwise it's particularly bad with 1 or 2 channel formats - that r16f format with either 4 or 8-wide vectors was still doing one element at a time, essentially doing exactly the same work as for rgba16f. Also replacing the channels with SWIZZLE0/1 (particularly the latter) adds even more work, as it has to be done per aos vector, and not just straightforward at the end with the SoA vector.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-06 20:06:06 +01:00
Roland Scheidegger	775a244645	util: (trivial) ETC1 meets the criteria for fitting into unorm8 Just like other similar compressed formats. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-12-06 20:06:06 +01:00
Matt Turner	43cdbb3e6a	i965: Emit proper NOPs. The PRMs for HSW and newer say that other than the opcode and DebugCtrl bits of the instruction word, the rest must be zero. By zeroing the instruction word manually, we avoid using any of the state inherited through brw_codegen. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96959 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-12-06 10:42:46 -08:00
Roland Scheidegger	9c95ad24cc	glsl: (trivial) fix type typo Accidentally changed the type of a constant in `df33f11b39` causing assertion failures.	2016-12-06 17:44:21 +01:00
Kenneth Graunke	a41f5dcb14	i965: Allocate at least some URB space even when max_vertices = 0. Allocating zero URB space is a really bad idea. The hardware has to give threads a handle to their URB space, and threads have to use that to terminate the thread. Having it be an empty region just breaks a lot of assumptions. Hence, why we asserted that it isn't possible. Unfortunately, it /is/ possible prior to Gen8, if max_vertices = 0. In theory a geometry shader could do SSBO/image access and maybe still accomplish something. In reality, this is tripped up by conformance tests. Gen8+ already avoids this problem by placing the vertex count DWord in the URB entry header. This fixes things on earlier generations. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2016-12-05 20:47:03 -08:00
Roland Scheidegger	cd9bb4b918	main: allow NEAREST_MIPMAP_NEAREST for stencil texturing As per GL 4.5 rules, which fixed a spec mistake in GL_ARB_stencil_texturing. The extension spec wasn't updated, but just allow it with older GL versions as well, hoping there aren't any crazy tests which want to see an error there... (Compile tested only.) Reported by Józef Kucia <joseph.kucia@gmail.com> Acked-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-06 04:10:43 +01:00
Roland Scheidegger	df33f11b39	glsl: fix ldexp lowering if bitfield insert lowering is also requested Trivial, this just resurrects the code which was there once upon a time (the code can't lower instructions generated in the lowering pass there, and even if it could it would probably be suboptimal). This fixes piglit mesa_shader_integer_functions fs-ldexp.shader_test and vs-ldexp.shader_test with llvmpipe. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-12-06 04:10:43 +01:00
Nayan Deshmukh	3015a23fe0	radv: fix resource leak in radv_amdgpu_ctx_create CovID: 1396387 V2. Fixup bad whitespace. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folkore1984.net>	2016-12-06 11:49:01 +11:00
Andy Furniss	5338fb34d6	st/omx/enc Raise default encode level Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91281 Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-12-05 19:39:47 -05:00
Andy Furniss	2a38a5b2b2	radeon/vce Handle H.264 level 5.2 Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91281 v2: explicitly add case 52 Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-12-05 19:39:47 -05:00
Jason Ekstrand	7db009b59e	nir: Remove some unused fields from nir_variable All of these are happily set from glsl_to_nir or spirv_to_nir but their values are never used for anything. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:10 -08:00
Jason Ekstrand	50e0b0bee3	nir: Delete most of the constant_initializer support Constant initializers have been a constant (ha!) pain for quite some time. While they're useful from a language perspective, people writing passes or backends really don't want to deal with them most of the time. This commit removes most of the constant initializer support from NIR. It is expected that you call nir_lower_constant_initializers VERY EARLY to ensure that they're gone before you do anything interesting. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	2f19c19b5d	nir: Simplify nir_lower_gs_intrinsics It's only ever called on single-function shaders. At this point, there are a lot of helpers that can make it all much simpler. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	257aa5a1c4	nir/lower_returns: Stop using constant initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	507626304c	glsl/nir: Call nir_lower_constant_initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	c5d664f9dc	anv/pipeline: Call nir_lower_constant_initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	f5232db9e5	nir: Add a pass for lowering away constant initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	0291bf4db2	Revert "i965: use nir_lower_indirect_derefs() for GLSL" This reverts commit `9404439a75`. I didn't intend to push it and it breaks clip and cull distance.	2016-12-05 15:21:20 -08:00
Jason Ekstrand	5f0e4c7c79	i965: Delete the meta-base CopyImageSubData implementation When I originally implemented the ARB_copy_image extension, the fast-path was written in meta using texture views. This path only worked if both images were uncompressed color images. All of the other cases fell back to the blitter or, in the worst case, mapping and memcpy on the CPU. Now that we have the blorp path, it handles all copies ever and the old meta, blitter, and CPU paths are only used on gen5 and below. The primary reason why we needed the meta path (apart from having a slow blitter on later hardware) was to handle multisampling which gen5 and earlier don't support anyway. Since the blitter is reasonably fast on gen5, we can just delete the meta path and get rid of all that terrible code. If we decide that we're ok with just disabling ARB_copy_image on gen5 and earlier (I personally am), then we could get rid of another 300 lines or so of semi-hairy code. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-12-05 14:00:35 -08:00
Jason Ekstrand	06d864921e	i965/copy_image: Re-implement the blitter path with emit_miptree_blit By using emit_miptree_blit which does chunking, this fixes the blitter path for the case where the image is too tall to blit normally. We also pull it into intel_blit as intel_miptree_copy. This matches the naming of the blorp blit and copy functions brw_blorp_blit and brw_blorp_copy. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2016-12-05 14:00:35 -08:00
Jason Ekstrand	6c74e7f492	i965/blit: Break the guts of intel_miptree_blit into a helper Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2016-12-05 14:00:35 -08:00
Timothy Arceri	9404439a75	i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so that it is called by both OpenGL and Vulkan, and removes the call to the old GLSL IR pass lower_variable_index_to_cond_assign(). We want to do this pass in nir to be able to move loop unrolling to nir. There is an increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space Program shaders that increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0	2016-12-05 14:00:35 -08:00
Tim Rowley	0c70b26a2d	swr: mark PIPE_CAP_NATIVE_FENCE_FD unsupported Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-05 13:42:39 -06:00
Tim Rowley	efc3ca64ba	swr: include llvm version and vector width in renderer string Uses llvmpipe's string formatting. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-05 13:42:39 -06:00
Tim Rowley	b035d9cab5	gallivm: use getHostCPUFeatures on x86/llvm-4.0+. Use llvm provided API based on cpuid rather than our own manually maintained list of mattr enabling/disabling. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-12-05 13:42:39 -06:00
Juan A. Suarez Romero	48416b6f4d	st/va: declare vlVaBuffer before vlVaContext And declare coded_buf in vlVaContext as "vlVaBuffer " instead of "struct vlVaBuffer ". This fixes several warnings later about assignment from incompatible pointer type. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 17:03:57 +00:00
Juan A. Suarez Romero	5a585d019e	st/va: remove unused variable pbuff Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2016-12-05 17:03:56 +00:00
Emil Velikov	510722d146	st/va: automake: cleanup C{PP,}FLAGS Remove some transitional left overs from the gallium pipe-loader rework and kill off unneeded AM_CPPFLAGS. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-12-05 17:03:56 +00:00
Rob Clark	8ca14b04e1	add EGL_TEXTURE_EXTERNAL_WL to WL_bind_wayland_display spec Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2016-12-05 16:01:21 +00:00
Emil Velikov	d09da32cfa	docs: add news item and link release notes for 12.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 15:42:58 +00:00
Emil Velikov	d7747cccaf	docs: add sha256 checksums for 12.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `6b1c3c3aa0`)	2016-12-05 15:38:56 +00:00
Emil Velikov	ef0417e8d1	docs: add release notes for 12.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `01579a9d00`)	2016-12-05 15:38:55 +00:00
Tobias Droste	0e9a5be7e7	configure.ac: Create correct LLVM_VERSION_INT with minor >= 10 This makes sure that we handle LLVM minor version >= 10 correctly. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	a66cd76b16	configure.ac: Get complete LLVM version from header Major and minor version are included in the header file since LLVM version 3.1.0. Since the minimal required version is 3.3.0 we can remove the workaround if no values for major/minor were found in the header. Since LLVM 3.6.0 the patch version is inside the header file of LLVM. Only radeon drivers need the patch version and they depend on LLVM >= 3.6.0, so this is safe too. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	5db89531bc	configure.ac: Add required LLVM versions to the top Consolidate the required LLVM versions at the top where the other versions for dependencies are listed. v5: Split out separate changes (see patch 19 and 20) Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	45c8a4ea0a	configure.ac: Only add default LLVM components if needed LLVM components are only added when LLVM is needed. This means gallium adds this as soon as "--enable-gallium-llvm" is "yes" and radv + opencl add it explicitly. v5: Removed hunk that disabled LLVM for gallium if it was not found. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	a8ae340f7e	configure.ac: Reorder arguments in radeon_llvm_check Use the same order as llvm_check_version_for. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	3f42859367	configure.ac: Move radv check to the Vulkan section This moves the LLVM check for radv to the corresponding driver section. No functional change. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	c702369bf5	configure.ac: Move LLVM ac_subst closer to usage This moves llvm_set_environment_variables to its final destination and moves all the LLVM AC_SUBST() below the function call. No functional change. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	62f4e6f272	configure.ac: Move oCL LLVM checks to the oCL section The LLVM checks can be anywhere below line 1161 now. Move the openCL LLVM checks to the section with the other openCL checks. No functional change. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: s/ipos/ipo/, drop "yes" argument from llvm_add_component] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:47 +00:00
Tobias Droste	9d14a25bee	configure.ac: Move llvm_set_environment_variables higher. This moves the function to get the LLVM environment variables higher in the file. It still needs to be below the "--enable-opencl" because it uses $enable_opencl. It can be called without condition now as it only throws errors if openCL is enabled. v5: HAVE_MESA_LLVM is only used for gallium. Rename it to HAVE_GALLIUM_LLVM. In order to only link LLVM when it is needed, HAVE_GALLIUM_LLVM is only set if "$enable-gallium-llvm" is yes. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	19ff3975de	configure.ac: Remove swr_llvm_check() No need for an additional function here. Use the same style for LLVM checks as the other drivers (e.g. r300, llvmpipe) that don't need a load of other checks. Instead of open coding the LLVM version check, use the function used by other drivers. "enable_gallium_llvm" is checked by gallium_require_llvm(). Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	b3119a3360	configure.ac: Check gallium LLVM version in gallium_require_llvm This moves the LLVM version check to the helper function gallium_require_llvm() and uses the llvm_check_version_for() helper instead of open coding the LLVM version check. gallium_require_llvm is functionally the same as before, because "enable_gallium_llvm" is only set to "yes" if the host cpu is x86: if test "x$enable_gallium_llvm" = xauto; then case "$host_cpu" in i*86\|x86_64\|amd64) enable_gallium_llvm=yes;; esac fi This function is also only called now when needed. Before this patch llvmpipe would call this as soon as LLVM is installed. Now it only gets called by llvmpipe if gallium LLVM is actually enabled (i.e. only on x86). Both reasons mentioned above remove the need to check host cpu in the gallium_require_llvm function. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	44a672ef0e	configure.ac: Use short names for r600 and r300 There are no non gallium r300 and r600 drivers anymore. No need to explicitly mention gallium here. Just cosmetics, no functional change. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	1ca0486147	configure.ac: Remove useless oCL LLVM check This is handled by "llvm_check_version_for" for openCL. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	8c98e27074	configure.ac: Move llvm-config searching outside the function There's no harm in always searching llvm-config. This way it's available as soon as possible for all functions. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	0cc4ffd67b	configure.ac: Move LLVM functions to the top This just moves code around so that all LLVM related stuff is at the top of the file in the correct order. No functional change. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	bf4b0fc33b	configure.ac: Move LLVM version check to the top A function with the LLVM version checked is moved to the top. The function is called where the old code was. No functional change. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: s/ipos/ipo/, drop "yes" argument from llvm_add_component] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	9a3bccc75e	configure.ac: Use new helper function for LLVM Use the new helper function to add LLVM targets and components. The components are added one by one to later find out which component is missing in case there is one. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: s/ipos/ipo/, drop "yes" argument from llvm_add_component] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	d434633b76	configure.ac: Use new llvm_add_default_components Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	352831c5d9	configure.ac: Add helper function for targets/components Add functions to add and check targets/components. Not used in this patch. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Tobias Droste	2350387d24	configure.ac: Don't search llvm-config if it's known This way LLVM_CONFIG can be set from an env variable if it's outside the $llvm_prefix. This is not a must, but it helps testing. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Boyuan Zhang	3949d7c6ea	st/va: fix gop size for rate control The gop_size in rate control is the budget window for internal rate control calculation, and shouldn't always be equal to the idr period. Define a coefficient to let the budget window contain a number of idr periods for proper rate control calculation. Adjust the number of i/p frames remaining accordingly. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-12-05 09:23:38 -05:00
Boyuan Zhang	8206882392	st/va: force to submit two consecutive single jobs The gop_size in rate control is the budget window for internal rate control calculation, and shouldn't always be equal to the idr period. Define a coefficient to let the budget window contain a number of idr periods for proper rate control calculation. Adjust the number of i/p frames remaining accordingly. v2: fixed regression issues introduced by previous version Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-12-05 09:23:38 -05:00
Nayan Deshmukh	7b811c362a	st/vdpau: fix compiler warning in vlVdpVideoMixerRender Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-12-05 11:20:55 +01:00
Topi Pohjolainen	5b27405eff	i965: Release aux buffer when disabling ccs Otherwise subsequent render cycles keep on using compression and/or fast clear. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-05 09:20:05 +02:00
Bas Nieuwenhuizen	92d7563fba	ac/nir: Only use the first component for SSBO atomics. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-05 01:40:54 +01:00
Dave Airlie	8033f78f94	radv: fix another regression since shadow fixes. This fixes: dEQP-VK.glsl.texture_gather.basic.2d.depth32f.* Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-05 10:14:37 +10:00
Iago Toral Quiroga	66e7effc85	spirv: Builtin Layer is an input for fragment shaders This change makes it so we emit a load_input intrinsic when Layer is read in a fragment shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-03 20:50:57 +01:00
Bruce Cherniak	a7b510f656	swr: Fix active_queries count The active_query count was incorrect for query types that don't require a begin_query. Removed the unnecessary assert. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-02 14:36:28 -06:00
George Kyriazis	2085088033	swr: Fix type to match parameters of std::max() Include propagation of comparisons further down. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-02 14:36:28 -06:00
Tim Rowley	f1ca377ab1	swr: [rasterizer jitter] include cstdarg in builder_misc.cpp Fixes build problem with llvm-svn. v2: use cstdarg instead of stdarg.h Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-02 14:36:28 -06:00
Jason Ekstrand	19a541f496	nir: Get rid of nir_constant_data This has bothered me for about as long as NIR has been around. Why do we have two different unions for constants? No good reason other than one of them is a direct port from GLSL IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-02 10:53:32 -08:00
Timothy Arceri	c45d84ad83	Revert "st/mesa: get Version from gl_program rather than gl_shader_program" This reverts commit `6bf63b0119`. A patch that adds a reference to gl_shader_program_data to gl_program needs to land before this one.	2016-12-02 16:44:44 +11:00
Timothy Arceri	6bf63b0119	st/mesa: get Version from gl_program rather than gl_shader_program Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-02 13:54:54 +11:00
Timothy Arceri	ab8c01386a	st/mesa/glsl: move Version to gl_shader_program_data This is mostly just used during linking however the st uses it when updating textures. In order to store gl_program in the CurrentProgram array rather than gl_shader_program we need to move this field to the shared gl_shader_program_data struct. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-02 13:54:47 +11:00
Rob Clark	534917495d	freedreno: no-op render when we need a fence If app tries to create a fence but there is no rendering to submit, we need a dummy/no-op submit. Use a string-marker for the purpose.. mostly since it avoids needing to realize that the packet format changes in later gen's (so one less place to fixup for a5xx). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-01 20:24:59 -05:00
Rob Clark	0b98e84e9b	freedreno: native fence fd support Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-01 20:24:46 -05:00
Rob Clark	16f6ceaca9	freedreno: some fence cleanup Prep-work for next patch, mostly move to tracking last_fence as a pipe_fence_handle (created now only in fd_gmem_render_tiles()), and a bit of superficial renaming. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-01 20:16:31 -05:00
Rob Clark	026a7223a6	gallium: support for native fence fd's This enables gallium support for EGL_ANDROID_native_fence_sync, for drivers which support PIPE_CAP_NATIVE_FENCE_FD. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-12-01 20:16:31 -05:00
Rob Clark	72cc1ca58d	gallium: wire up server_wait_sync This will be needed for explicit synchronization with devices outside the gpu, ie. EGL_ANDROID_native_fence_sync. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-01 20:16:31 -05:00
Rob Clark	0201f01dc4	egl: add EGL_ANDROID_native_fence_sync With fixes from Chad squashed in, plus fixes for issues that Rafael found while writing piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:57:35 -08:00
Rob Clark	21b1acfcfe	dri: extend fence extension to support native fd fences Required to implement EGL_ANDROID_native_fence_sync. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:57:35 -08:00
Rob Clark	2ba4c7e154	egl: un-fallthrough sync attr parsing Doesn't work so well when you start having more than one possible attrib. Prep-work for next patch. Signed-off-by: Rob Clark <robdclark@gmail.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:57:24 -08:00
Rob Clark	cce04a4630	egl: initialize SyncCondition after attr parsing Reduce the noise in the next patch. For EGL_SYNC_NATIVE_FENCE_ANDROID the sync condition is conditional on EGL_SYNC_NATIVE_FENCE_FD_ANDROID attribute. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:52:55 -08:00
Tim Rowley	05f35a868c	tgsi: store writes_primid when scanning tgsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-01 11:33:01 -06:00
Ilia Mirkin	7c16552f8d	mesa: only verify that enabled arrays have backing buffers We were previously also verifying that no backing buffers were available when an array wasn't enabled. This has no basis in the spec, and it causes GLupeN64 to fail as a result. Fixes: `c2e146f487` ("mesa: error out in indirect draw when vertex bindings mismatch") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-12-01 06:35:13 -05:00
Eric Anholt	51244859e3	vc4: Avoid false scheduling dependencies for LOAD_IMMs. Noticed in shaders with branching, where we ended up scheduling delay slots near the start of a block for the uniforms reset setup. total instructions in shared programs: 93970 -> 93951 (-0.02%) instructions in affected programs: 3117 -> 3098 (-0.61%) 3DMMES performance +0.423087% +/- 0.133521% (n=9,10)	2016-11-30 19:58:09 -08:00
Eric Anholt	6c34084d8e	vc4: Try to schedule QIR instructions between writing to and reading math. This helps us get the delay slots between SFU writes and reads filled. total instructions in shared programs: 94494 -> 93970 (-0.55%) instructions in affected programs: 59206 -> 58682 (-0.89%) 3DMMES performance +1.89967% +/- 0.157611% (n=10,9)	2016-11-30 19:58:09 -08:00
Eric Anholt	d182740ac8	vc4: Improve interleaving of texture coordinates vs results. The latency_between was trying to handle the delay between the coordinate write ("before") and the corresponding sample read ("after"), but we were handing in the two instructions swapped. This meant that we tried to fit things between a tex_s and its preceding tex_result. This made us only interleave normal texture coordinates by accident, and pessimized UBO reads by pushing the tex_result collection earlier until there was nothing but it (and then its preceding coordinate setup) left. In addition to latency reduction, things end up packing better (probably due to reduced live ranges of the texture results): total instructions in shared programs: 98121 -> 94775 (-3.41%) instructions in affected programs: 91196 -> 87850 (-3.67%) 3DMMES performance +1.15569% +/- 0.124714% (n=8,10)	2016-11-30 19:58:09 -08:00
Eric Anholt	1f9daf7cd1	vc4: Fix stray "." on no-op MUL packs. This happened when the PM bit was set for R4 unpacks, where the MUL pack was NOP.	2016-11-30 19:58:09 -08:00
Eric Anholt	98d7e87488	vc4: Allow merging instructions with SF set where the other writes NOP. I'm not sure how I managed to write the SF merge code (`7d8b79f398`) without allowing merges with NOPs. Everything we try to merge with will have a NOP on one or the other side of the instruction, and that's why that commit showed no benefit. total instructions in shared programs: 99347 -> 95128 (-4.25%) instructions in affected programs: 91906 -> 87687 (-4.59%) 3DMMES performance +2.57105% +/- 0.135276% (n=6,8)	2016-11-30 19:58:09 -08:00
Eric Anholt	8e5ec33f11	vc4: In a loop break/continue, jump if everyone has taken the path. This should be a win for most loops, which tend to have uniform control flow. More importantly, it exposes important information to live variables: that the break/continue here means that our jump target may have access to values that were live on our input. Previously, we were just setting the exec mask and letting control flow fall through, so an intervening def between the break and the end of the loop would appear to live variables as if it screened off the variable, when it didn't actually. Fixes a regression in glsl-vs-loop-redundant-condition.shader_test when a perturbing of register allocation caused a live variable to get stomped. Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-11-30 19:58:09 -08:00
Ilia Mirkin	fda1d0187d	anv: expose support for VK_KHR_sampler_mirror_clamp_to_edge This is already supported in genX_state.c, expose the extension string. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-11-30 20:49:04 -05:00
Jason Ekstrand	27433b26b1	anv/cmd_buffer: Actually use the stencil dimension In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I accidentally kept setting the SurfaceType to 2D in the stencil-only case thanks to a copy+paste error. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-30 17:42:42 -08:00
Ilia Mirkin	ef59cb0820	swr: add streamout buffer offset into pBuffer pointer The buffer_size does not take the offset into account. Just add the offset into the pointer which lines up the structures much better. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:36:03 -05:00
Ilia Mirkin	3d837a8871	swr: fix assertion for max number of so targets The number has to be less than or equal to the max, not just less than. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:36:00 -05:00
Ilia Mirkin	02b2efa5eb	swr: properly report max number of SO components The components count the number of individual values, not the number of slots. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:35:56 -05:00
Ilia Mirkin	ab3bbe06ed	swr: turn off queries around blits Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:35:53 -05:00
Ilia Mirkin	d8ce8acdfa	swr: don't advertise stream pause/resume There is no support for resuming streamout. Furthermore, this also controls glDrawTransformFeedback functionality which requires the same ability to query how many primitives were sent out of TF. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:35:43 -05:00
Ilia Mirkin	632c11e857	swr: fix range computation for instanced client-side arrays We need to take the instance divisor and number of instances into account for instanced client-side arrays, rather than the vertex parameters. Loosely based on the comparable nvc0 logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-30 20:35:33 -05:00
Ilia Mirkin	3b736acf1b	swr: [rasterizer memory] assert when trying to convert an unknown format Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:35:16 -05:00
Ilia Mirkin	763c015ce5	swr: remove warning about multi-layer surfaces We now support clearing these, and actually rendering to multiple layers would require GS support, which will fail in much more spectacular ways for now. Once that is hooked up, there won't be anything else to do here. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:35:06 -05:00
Ilia Mirkin	a9d292f5bd	swr: [rasterizer core] don't attempt to load another RTAI when storing Since we don't pass a renderTargetArrayIndex in, and the current hot tile may be for a different index, we may end up loading the RTAI=0 into the hot tile for no reason. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:34:55 -05:00
Marek Olšák	77014a0ad3	radeonsi: document a CP DMA bug that doesn't need a workaround yet This one is easy to miss, because it's not documented in any internal doc. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	bacf9b4e73	radeonsi: apply the double EVENT_WRITE_EOP workaround to VI as well Internal docs don't mention it, but they also don't mention that the bug has been fixed (like other CI bugs fixed in VI). Vulkan does this too. v2: also update r600_gfx_write_fence_dwords Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-12-01 02:16:51 +01:00
Marek Olšák	a816c7fe07	radeonsi: add a tess+GS hang workaround for VI dGPUs ported from Vulkan Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	da7453666a	radeonsi: don't apply the Z export bug workaround to Hainan not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	78c4528ae7	radeonsi: apply a tessellation bug workaround for SI Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	72e46c9889	radeonsi: apply a TC L1 write corruption workaround for SI Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	72d48fcd8e	radeonsi: apply a multi-wave workgroup SPI bug workaround to affected CIK chips All codepaths are handled except for clover. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	ec36c63b4f	radeonsi: consolidate max-work-group-size computation The next commit will need this. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Timothy Arceri	966567aa12	mesa: reset linked_stages bitmask when re-linking `34953f8907` added this bitmask but it wasn't being reset when a program was relinked. If a stage was removed from the new program then it could cause a crash as we expect the linked shader for that stage to not be null. Fixes crashes in: ESEXT-CTS.tessellation_shader.single.xfb_captures_data_from_correct_stage ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98917	2016-12-01 10:24:16 +11:00
Rob Clark	45eef9af03	freedreno/a5xx: fix negative branches Looks like immed branch offset size increased again.. making what we think is a small negative number look to hw like a huge positive number. And things go badly when shader tries to jump to hyperspace. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 17:32:54 -05:00
Rob Clark	ef30e91fe6	freedreno: fix android build with a5xx Android doesn't build all the files that normal linux/autotools build does (mainly standalone ir3_compiler).. but possibly we should pull C_SOURCES + aNxx_SOURCES into a single variable picked up by both Android.mk and Makefile.am? (Suggested by Rob H.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 17:32:54 -05:00
Rob Clark	8b6f8f2576	freedreno/a5xx: fix discard Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 17:32:54 -05:00
Ville Syrjälä	676c0cf287	anv: Prefer in-tree headers to out-of-tree headers Set the include paths to consider in-tree headers before out-of-tree headers. Avoids the build failing due to stale headers being present in $prefix. Previously 'make -ki install' or something similar was required to update the out-of-tree headers to allow the build to succeed. Also avoids having to rebuild the entire thing after every 'make install'. Cc: Rob Clark <robdclark@gmail.com> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-11-30 20:01:00 +02:00
Rob Clark	946cf4eb68	freedreno/a5xx: initial support Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:35:49 -05:00
Rob Clark	fcba3046e1	freedreno: update generated headers Pull in a5xx Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:25:48 -05:00
Rob Clark	8c56789f60	freedreno: make gmem tile size alignment configurable a5xx seems to prefer 64 pixel alignment, in at least some cases. Make this configurable per generation. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:25:48 -05:00
Rob Clark	728e2c4d38	freedreno/ir3: don't offset inloc by 8 On a3xx/a4xx, the SP_VS_VPC_DST_REG.OUTLOCn is offset by 8, so we used to add this offset into fs->inputs[n].inloc. But a5xx drops this extra offset-by-8. So instead make inloc zero based and add the offset when we emit OUTLOCn values (for the gen's that need the offset). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:25:48 -05:00
Rob Clark	7a59157287	freedreno/a3xx: use new shader linkage helper Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:25:48 -05:00
Rob Clark	98c83b5d1c	freedreno/a4xx: use new shader linkage helper Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:25:48 -05:00
Rob Clark	1be5670c8d	freedreno/ir3: add new helper for shader linkage Helps simplify things on a5xx, where pos/psize get added to the vs-out map. And anyways, simplifies a3xx and a4xx. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-30 12:25:48 -05:00
Nicolai Hähnle	f60374aa68	st/mesa: skip lower_output_reads when possible Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-30 09:10:02 +01:00
Nicolai Hähnle	0a58b258ca	st/glsl_to_tgsi: swizzle PROGRAM_OUTPUTs correctly in src_register translation This is required for reading directly from fragment shader stencil and depth outputs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-30 09:09:59 +01:00
Nicolai Hähnle	611166b8ed	gallium: add PIPE_CAP_TGSI_CAN_READ_OUTPUTS Drivers that support this benefit by saving one lowering pass in the GLSL-to-TGSI conversion. radeonsi already supports this because all outputs are stored in temporary variables before the export (except for TCS outputs, which have always been readable in TGSI anyway due to their special semantics). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-30 09:09:50 +01:00
Bas Nieuwenhuizen	abc887faa1	ac/nir: Fix out of bounds array access. With nir_intrinsic_ssbo_atomic_comp_swap we run out of params. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-30 07:09:38 +01:00
Kristian H. Kristensen	d3d7cab812	aubinator: Add support for enum types Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	7fc659d8d5	intel/genxml: Fix ksp for INTERFACE_DESCRIPTOR_DATA This one was split across two dwords as "Kernel Start Pointer" and "Kernel Start Pointer High", which looks like it works when the driver only accesses "Kernel Start Pointer". This breaks, of course, with BO offsets > 4G. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	99e573b4e0	intel/genxml: Use enum 3D_Logic_Op_Function where applicable Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	374d19ac00	intel/genxml: Use blend function and factor enums where applicable Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	09fe8ad010	intel/genxml: Use enum 3D_Vertex_Component_Control where applicable Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	54e71e5851	intel/genxml: Use enum 3D_Stencil_Operation where applicable Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	193c1b72e0	intel/genxml: Use enum SURFACE_FORMAT where applicable Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	0799022bf9	intel/genxml: Use enum 3D_Prim_Topo_Type where applicable Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	993babc014	intel/genxml: Use 3D_Compare_Function for gen8+ test functions When the state fields were shuffled around for gen8, the compare function enums were downgraded to just uints. Change them to enum 3D_Compare_Function. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	fc2225b1af	intel/genxml: Emit genxml enums as C enums The previous commits got rid of any clashes between #defines and enum values and we can now emit the genxml enums as debugger friendly C enums. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	8fc74b879e	intel/genxml: Remove duplicate COMPAREFUNCTION values These values were defined both as an enum and as inline values. Remove the inline values and reference the 3D_Compare_Function enum instead. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	5814fc1bb7	intel/genxml: Allow referencing enums in type attributes This lets us reference enums in the type attribute of a field. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	3b6b6f6463	anv: Emit cherryview SF state without including gen9_pack.h Cleaner this way and we avoid including gen9_pack.h when we compile with gen8_pack.h. We also avoid the if (cherryview) condition for non-gen8 gens that don't need it. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	908febcf21	anv: Don't include two different pack headers The batch chain logic only needs the pre-gen8 size of MI_BATCH_BUFFER_START, which seems like something we can make a special case for. The other two gen7 references, MI_BATCH_BUFFER_END and MI_NOOP, are the same on all gens. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	be9c2ab23b	intel/genxml: Move enums above structs We'll need to define them before we can reference them in structs and instructions. Enums have no dependencies, so move them first in the file. Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Kristian H. Kristensen	ce26486115	genxml: Add values for Barycentric Interpolation Mode Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 22:02:49 -08:00
Ilia Mirkin	ed0b3cbd09	anv: remove per-sample shading from TODO This was done some time ago. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-11-30 00:17:56 -05:00
Ilia Mirkin	be92b3f49d	anv: clean up VkPhysicalDeviceFeatures list Remove duplicate .alphaToOne, add missing .shaderResourceMinLod, and reorder a few entries to match their vulkan.h order. All the sparse features are still left out entirely. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-11-30 00:17:56 -05:00
Michel Dänzer	550cd272b4	vulkan/wsi/x11: Destroy Present event context when destroying swapchain Without this, the X server may accumulate stale Present event contexts if a client creates and destroys multiple swapchains using the same window. v2: Based on Chris Wilson's review: * Use xcb_present_select_input_checked so that protocol errors generated by old X servers can be handled gracefully * Use xcb_discard_reply() instead of free(xcb_request_check()) v3: Rebased on top of this code having been refactored out of anv Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-30 12:31:25 +09:00
Timothy Arceri	2ea021a1eb	glsl: use linked_shaders bitmask to iterate stages for subroutine fields This should be faster than looping over every stage and null checking, but will also make the code a bit cleaner when we switch to getting more fields from gl_program rather than from gl_linked_shader as we can just copy the pointer and not need to worry about null checking then copying. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-11-30 14:13:52 +11:00
Timothy Arceri	6d3458cbfb	mesa: optimise interleaved sso validation Now that we have a linked_stages bitfield we can use this to check if the program is used at a later stage. This change is also required to be able to use gl_program rather than gl_shader_program in the CurrentProgram array. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-11-30 14:13:52 +11:00
Timothy Arceri	34953f8907	mesa/glsl: add bitmask to track stages a program was linked against This will be used to enable us to store the current gl_program rather than gl_shader_program in the gl_pipline_object allowing us to simplify handling of validation. Also we should not be depending on _LinkedShader for this information as it may contain shaders from a failed linking attempt rather than the current program still in use. We could also use this mask to iterate over the stages during linking with _mesa_bit_scan() rather than the current method of NULL checking each stage. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-11-30 14:13:52 +11:00
Ilia Mirkin	ddf0f097e7	swr: [rasterizer jit] use signed integer representation for logic op Instead of (incorrectly) biasing the snorm value to make it look like a unorm, just use signed integer math. This fixes arb_color_buffer_float-render GL_RGBA8_SNORM Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:55:00 -05:00
Ilia Mirkin	8ed703cfa6	swr: add missing rgbx8_srgb variant Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:54:57 -05:00
Ilia Mirkin	d6a06228a6	swr: reorder renderable formats, add grouping comments Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:54:54 -05:00
Ilia Mirkin	53ca06be8f	swr: use util_copy_framebuffer_state helper Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:54:50 -05:00
Ilia Mirkin	86f7932b1e	swr: enable cubemap arrays Everything is in place for these. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:54:46 -05:00
Ilia Mirkin	8dd9853516	swr: rearrange caps into limits/supported/unsupported groups I find this a lot more readable and compact - much easier to scan through the list and see what's on and what's off. No functional change intended. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:54:43 -05:00
Ilia Mirkin	9f568e5db1	swr: only store up to the LOD size Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-29 20:54:36 -05:00
Tim Rowley	f7ab0e4b7e	swr: [rasterizer common] add SwrTrace() and macros Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-29 19:36:46 -06:00
Marek Olšák	662b9c24d0	radeonsi: don't fetch 8 dwords for samplerBuffer and imageBuffer The compiler doesn't shrink s_load_dwordx8, so we always wasted 4 SGPRs. Also, the extraction of the descriptor created some really ugly asm code with lots of VALU bitwise ops and v_readfirstlane. Totals from affected shaders: SGPRS: 13880 -> 13253 (-4.52 %) VGPRS: 15200 -> 15088 (-0.74 %) Code Size: 499864 -> 459816 (-8.01 %) bytes Max Waves: 1554 -> 1564 (0.64 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	dbbdc6bb5a	radeonsi: disable XNACK to free 2 SGPRs on APUs My LLVM commit disables it for dGPUs, but not APUs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	274fb601c2	radeonsi: count and report temp arrays in scratch separately v2: only do this if debug output of shader dumping is enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-11-29 23:52:31 +01:00
Marek Olšák	a91add9369	radeonsi: don't try to eliminate trivial VS outputs for PS and CS PS and CS don't have any param exports, so it's a no-op. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	5e5573b1bf	radeonsi: disable RB+ blend optimizations for dual source blending This fixes dual source blending on Stoney. The fix was copied from Vulkan. The problem was discovered during internal testing. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	ff50c44a5f	radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending copied from Vulkan Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	87b208a54e	radeonsi: always set all blend registers better safe than sorry Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	fc9f7fc9d0	radeonsi: set the smallest possible CB_TARGET_MASK better safe than sorry; set_framebuffer_state always makes this dirty Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	ea43d0b5e8	radeonsi: don't print bodies of header-only packets Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	7abd94c9b0	radeonsi: print unknown registers with correct formatting Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	9e1dc10432	ddebug: fix hang detection with deferred flushes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Dave Airlie	048143b9d9	radv: set spi_baryc_cntl.pos_float_location to 0 This fixes: dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_sample_position.* This should probably be 2 when sample shading is enabled, but I'm not sure. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:48:23 +00:00
Dave Airlie	f3a3fea973	radv: force persample shading when required. We need to force persample shading when a) shader uses sample_id b) shader uses sample_position c) shader uses sample qualifier. Also since ps_iter_samples can now change independently of the rasterizer samples we need to move setting the regs more often. This fixes: dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.* dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.* dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.* dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:48:03 +00:00
Dave Airlie	6a62026dd4	nir: print var binding in dumps. This is only useful for spir-v shaders, but I keep finding myself having to add it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:07:13 +00:00
Eric Engestrom	fae5e1dc74	docs: fix small typo Fixes: `ba28f2136f` ("docs: add note about r-b/other tags when resending") Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-29 22:02:57 +00:00
Matt Turner	218fec66cc	i965/sched: Schedule trivial blocks. In commit `45cd76e342` schedule_instructions(bblock_t *) began setting bblock_t::cycle_count, but that function was not called on trivial blocks. Remove the code to skip trivial blocks so that cycle_count is set. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Matt Turner	cab0952d4b	i965/sched: Make 'time' a local variable. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Matt Turner	b0156702fa	i965/cfg: Initialize bblock_t::cycle_count. schedule_instructions(bblock_t *) isn't called on blocks with a single instruction, and since it is the only thing that set cycle_count, cycle_count would be uninitialized. A non-empty block with bblock_t::cycle_count == 0 is arguably a bug. That'll be fixed in the next commit. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Matt Turner	ca9e30e002	i965/cfg: Initialize cfg_t::cycle_count. This reverts commit `b4001af174`. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Bas Nieuwenhuizen	b8c9ce4459	ac/nir: Fix accessing an uninitialized value. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 20:13:28 +01:00
Bas Nieuwenhuizen	029e8ff81c	radv: Initialize the shader_stats_dump flag. Meta was using it before it was set. I suspect we typically don't want to dump meta shaders, so just set it to false in the beginning. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 20:13:28 +01:00
Eric Anholt	d40a3212ae	vc4: Add a note for the future about texture latency calculation. Debugging a shader-db reported cycle count regression from the tex coalescing, I eventually figured out that the texture latencies were totally bogus. Really fixing it will probably involve mirroring vc4_qir_schedule.c's texture fifo management here.	2016-11-29 09:01:23 -08:00
Eric Anholt	4690a93b12	vc4: Add support for coalescing ALU ops into tex_[srtb] MOVs. This isn't as complete as I would like (can't merge interpolation because of the implicit r5 dependency, doesn't work with control flow), but this was cheap and easy. Improves 3DMMES Taiji performance by 1.15353% +/- 0.299896% (n=29, 16) total instructions in shared programs: 99810 -> 99059 (-0.75%) instructions in affected programs: 10705 -> 9954 (-7.02%)	2016-11-29 08:52:50 -08:00
Eric Anholt	f4baf80993	vc4: Restructure VPM write optimization into two passes. For texturing, there won't be a fixed limit on how many writes there are, so we need to compute uses up front.	2016-11-29 08:38:59 -08:00
Eric Anholt	a025983dd9	vc4: Make qir_for_each_inst_inorder() safe against removal. The dead code elimination wants it to be safe, and I actually got segfaults due to it being unsafe with the new coalescing pass.	2016-11-29 08:38:59 -08:00
Eric Anholt	27544ea8d3	vc4: Split optimizing VPM writes from VPM reads. The VPM write logic will be basically the same as the texture coordinate write logic we need, and it's not really related to the VPM read logic other than the reuse of the use_count array.	2016-11-29 08:38:59 -08:00
Eric Anholt	d4c20e82ae	vc4: Restructure texture insts as ALU ops with tex_[strb] as the dst. For now we're still just generating MOVs, but this will let us fold into other ops in the future. No difference on shader-db.	2016-11-29 08:38:59 -08:00
Eric Anholt	314f0c57e4	vc4: Refactor qir_get_op_nsrc(enum qop) to qir_get_nsrc(struct qinst *). Every caller was dereffing the qinst, and this will let us make the number of sources vary depending on the destination of the qinst so that we can have general ALU ops that store to tex_[strb] and get an implicit uniform.	2016-11-29 08:38:59 -08:00
Eric Anholt	51087327f2	vc4: Replace the qinst src[] with a fixed-size array. This may have made a tiny bit of sense when we had one 4-arg inst per shader, but if we only ever put 2 things in, having a pointer to 2 things almost every instruction is pointless indirection.	2016-11-29 08:38:59 -08:00
Eric Anholt	a220f1b5a9	vc4: Remove qir_inst4(). This was used originally for unorm4x8 packs, but we now represent those as a series of packed movs.	2016-11-29 08:38:59 -08:00
Ilia Mirkin	7a8def8c18	anv: bump the texture gather offset limits This matches what NVIDIA and AMD hardware expose, as well as what Intel hardware supports. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 07:44:01 -08:00
Ilia Mirkin	62b8dbf35e	i965/gen7: expose larger gather offsets This matches the capabilities of the hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 07:44:01 -08:00
Ilia Mirkin	4f2d1d6ea7	i965: support constant gather offsets larger than 4 bits Offsets that don't fit into 4 bits need to force gather_po to be selected. Adjust the logic so that this happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 07:44:01 -08:00
Jason Ekstrand	faf20df143	i965/fs: Refactor handling of constant tg4 offsets Previously, we had an OFFSET_VALUE source for logical texture instructions that was intended to mean exactly what it says, "offset". In reality, we only fully used it for tg4 offsets. We used offset_value.file == IMM to mean, "you have a constant offset, go look in instr->offset" and didn't actually use the contents of the register at all in that case except for in nir_emit_texture where we used it as a temporary before we copy it into instr->offset. This commit renames OFFSET_VALUE to TG4_OFFSET and restricts its usage to indirect tg4 offsets only. The nir_emit_texture code is refactored so that we explicitly build a header_bits value which is placed in instr->offset and the constant offset values (both for tg4 and regular texture operations) are used to construct header_bits and don't go through the offset source at all. Finally, we stop passing offset_value in to lower_sampler_logical_send_gen5 because we can't do indirect offsets until gen7 anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-29 07:44:01 -08:00
Bas Nieuwenhuizen	05533ce418	radv: Use different intrinsic for ubo loads. Not sure about the deprecation path, but this intrinsic can be lowered to SMEM loads. This results in a significant Talos performance improvement. v2: Fix for LLVM attribute changes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 08:36:16 +01:00
Timothy Arceri	0303201dfb	mesa: fix active subroutine uniforms properly `07fe2d565b` introduced a big hack in order to return NumSubroutineUniforms when querying ACTIVE_RESOURCES for <shader>_SUBROUTINE_UNIFORM interfaces. However, this is the wrong fix: we are meant to be returning the number of active resources, i.e. the count of subroutine uniforms in the resource list, which is what the code was previously doing. Anything else will cause trouble when trying to retrieve the resource properties based on the ACTIVE_RESOURCES count. The real problem is that NumSubroutineUniforms was counting array elements as separate uniforms, but the innermost array is always considered a single uniform, so we fix that count instead, which was counted incorrectly in `7fa0250f9`. Ideally we could probably completely remove NumSubroutineUniforms and just compute its value when needed from the resource list, but this works for now. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-11-29 15:29:51 +11:00
Jason Ekstrand	f469235a6e	anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculation The 1-D special case doesn't actually apply to depth or HiZ. I discovered this while converting BLORP over to genxml and ISL. The reason is that the 1-D special case only applies to the new Sky Lake 1-D layout which is only used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers, the old gen4 2-D layout is used and the QPitch should be in rows. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 20:17:29 -08:00
Jason Ekstrand	d4ef87c1bb	anv/cmd_buffer: Set the correct surface type for depth/stencil Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-28 20:17:16 -08:00
Ilia Mirkin	e6847f24f0	anv: enable drawIndirectFirstInstance This was already piped through in the CmdDraw(Indexed)Indirect handling. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	d2280a007a	anv: expose depthBiasClamp, it is already set The gen7/8_cmd_buffer logic already sets the clamp, and it's piped through via the dynamic state. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	e2c669a56b	anv: bump maxFramebufferLayers to 2048 This matches maxImageArrayLayers, as well as the same setting in the GL frontend. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	76b97d544e	anv: enable storage image extended formats These are all regularly available in desktop GL, so the backend fully supports them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	a34f89c5e6	anv: expose imageCubeArray functionality This appears to be fully supported already. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:32:13 -08:00
Dave Airlie	eaf0768b8f	radv: set maxFragmentDualSrcAttachments to 1 Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 13:27:26 +10:00
Dave Airlie	f9ab60202d	anv: set maxFragmentDualSrcAttachments to 1 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 13:26:53 +10:00
Ilia Mirkin	e0fc18a435	swr: [rasterizer memory] only clear up to the LOD size Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 20:14:48 -05:00
Ilia Mirkin	2fca08e550	swr: [rasterizer memory] hook up stencil clears for ClearTile Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 20:14:48 -05:00
Ilia Mirkin	5582610ea1	swr: [rasterizer memory] add support for clearing Z32F_X32 and Z16 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 20:14:48 -05:00
Jason Ekstrand	6bc8bef1a1	intel/aubinator: Pull useful information from the AUB header This commit does two things. One is to pull useful and/or interesting information from the AUB file header and display it as a header above your decoded batches. Second, it is now capable of pulling the PCI ID from the AUB file comment left by intel_aubdump. This removes the need to use the --gen flag all the time. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	da5ebeffdf	intel/aubinator: Wait to setup decoders until we parse the aub header This requires that a few more state bits become global. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	e6c01fb17d	intel/aubinator: Rework handling of the --gen flag This makes it just store the pci_id instead of a struct pointer Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	12f2eae7e7	intel/aubinator: Trust the packet size in the header for SUBOPCODE_HEADER We were reading from the "comment size" dword and incrementing by that amount. This never caused a problem because that field was always zero. However, experimenting with actual aub file comments indicates, the simulator seems to include the comment size in the packet size provided in the header. We should do the same. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	89bb515e91	intel/aubinator: Add a get_offset helper The helper automatically handles masking for us so we don't have to worry about whether or not something is in the bottom bits. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	318cf3ffa4	intel/aubinator: Fix the kernel start pointer for 3DSTATE_HS Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	294daaa36f	intel/aubinator: Add a get_address helper This new helper automatically handles 32 vs. 48-bit GTT issues. It also handles 48-bit canonical addresses on Broadwell and above. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	d6cef32047	intel/aubinator: Properly handle batch buffer chaining The original aubinator that Kristian wrote had a bug in the handling of MI_BATCH_BUFFER_START that propagated into the version in upstream mesa. In particular, it ignored the "2nd level" bit which tells you whether this MI_BATCH_BUFFER_START is a subroutine call (2nd level) or a goto. Since the Vulkan driver uses batch chaining, this can lead to a very confusing interpretation of the batches. In some cases, depending on how things are laid out in the virtual GTT, you can even end up with infinite loops in batch processing. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-28 16:45:08 -08:00
Ilia Mirkin	0a5e1b02cf	swr: don't clear all dirty bits when changing so targets Among other things, blits would clear existing SO targets which would cause a bunch of updates from u_blitter to be missed. Fixes fbo-scissor-blit fbo, probably among many others. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-28 19:41:23 -05:00
Ilia Mirkin	8a70a4d984	swr: [rasterizer core] fix typo in scissor tile-alignment logic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 19:41:13 -05:00
Kenneth Graunke	15d3fc167a	anv: Fix cache UUID generation. I asked Emil to switch from 0 (success) vs. -1 (fail) to use a boolean in my review comments. The "not" went missing. Easy mistake, but the result is that nothing runs at all :) Fix whitespace while we're here too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 13:40:04 -08:00
Gwan-gyeong Mun	65ea559465	vulkan/wsi: Fix resource leak in success path of wsi_queue_init() It fixes leakage of pthread_condattr resource on wsi_queue_init() Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-11-28 21:11:25 +00:00
Gwan-gyeong Mun	b178652b41	anv: Update the teardown in reverse order of the anv_CreateDevice This updates releasing of resource in reverse order of the anv_CreateDevice to anv_DestroyDevice. And it fixes resource leak in pthread_mutex, pthread_cond, anv_gem_context. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 21:11:25 +00:00
Gwan-gyeong Mun	ca4706960c	anv: drop the return type for anv_queue_init() anv_queue_init() always returns VK_SUCCESS, so caller does not need to check return value of anv_queue_init(). Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 21:11:25 +00:00
Gwan-gyeong Mun	ecc618b0d8	anv: Add missing error-checking to anv_block_pool_init (v2) When memfd_create() or u_vector_init() fails in anv_block_pool_init(), this patch makes it return VK_ERROR_INITIALIZATION_FAILED. When all of the initialization in anv_block_pool_init() succeeds, it returns VK_SUCCESS. CID 1394319 v2: Fixes from Emil's review: a) Add the return type for propagating the return value to caller. b) Changed anv_block_pool_init() to return VK_ERROR_INITIALIZATION_FAILED on failure of initialization. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 21:11:25 +00:00
Chandu Babu Namburu	02bf1bbe6e	st/omx/dec/h264: consider POC as signed instead of unsigned picture order count can be a negative value Reviewed-by: Christian König <christian.koenig@amd.com>	2016-11-28 15:31:51 -05:00
Emil Velikov	7c277eae98	radv: don't return VK_SUCCESS if radv_device_get_cache_uuid() fails If radv_device_get_cache_uuid() fails result will be VK_SUCCESS as set by the radv_init_wsi() call above. Fixes: `d943839` (radv: Use library mtime for cache UUID.) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-11-28 19:51:31 +00:00
Emil Velikov	78707a15f2	radv: don't leak the fd if radv_physical_device_init() succeeds radv_amdgpu_winsys_create() does not take ownership of the fd, thus we end up leaking it as we return with VK_SUCCESS. Cc: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-11-28 19:51:22 +00:00
Emil Velikov	a1cf494f77	anv: don't leak memory if anv_init_wsi() fails brw_compiler_create() rzalloc-ates memory which we forgot to free. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:47:34 +00:00
Emil Velikov	3af8171547	anv: don't double-close the same fd Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 19:47:28 +00:00
Emil Velikov	2f1a1f589e	configure.ac: remove no longer used TIMESTAMP_CMD Good bye, you shall not be missed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 19:47:24 +00:00
Emil Velikov	2d42a34566	anv: automake: don't generate anv_timestamp.h No longer used as of last commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 19:47:17 +00:00
Emil Velikov	83548e1292	anv: Use library mtime for cache UUID. Inspired by a similar commit for radv. Rather than recomputing the timestamp on each make invocation, just fetch it at runtime. Thus we no longer get the constant rebuild of anv_device.c and the follow-up libvulkan_intel.so link, when nothing has changed. I.e. using make && make install is a little bit faster. v2: Use bool return type (Ken). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 19:46:45 +00:00
Emil Velikov	de138e9ced	anv: Store UUID in physical device. Port of an equivalent commit for radv. v2: Move the call just after MMAP_VERSION (Ken). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:46:05 +00:00
Emil Velikov	3f9397753b	isl: Make isl_finishme only warn once per call-site Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 19:12:49 +00:00
Emil Velikov	f3a1c17b96	radv: Make radv_finishme only warn once per call-site Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-11-28 19:12:48 +00:00
Emil Velikov	7feac8bdb9	anv: use do { } while (0) in the anv_finishme macro Use the generic construct instead of the current GCC specific one. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 19:12:38 +00:00
Emil Velikov	6dae5be806	docs: add git tips how to do commit fixups and squash them Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 17:49:04 +00:00
George Kyriazis	ba28f2136f	docs: add note about r-b/other tags when resending [Emil Velikov: split from the typos fixes] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 17:48:50 +00:00
Andres Gomez	28158c3e54	docs: fix small typos in the submit patches page Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 17:46:12 +00:00
Emil Velikov	028d29b8b3	docs/releasing: use correct page title Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 17:46:12 +00:00
Emil Velikov	f9959ca92e	docs/releasing: correctly document touch-testing I've used an ancient version of the script which did not cover: - version expansion (cd mesa-* does not work) - --enable-glx-tls - EGL and es2* testing - Vulkan and DOTA2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 17:46:12 +00:00
Emil Velikov	a7a416f347	docs/release: drop references to patchwork The changes to release.sh have landed, so all we need is a recent checkout of xorg-utils. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 17:45:11 +00:00
Emil Velikov	5ce7a32068	docs: add news item and link release notes for 13.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-28 15:30:13 +00:00
Emil Velikov	ad7879bbb4	docs: add sha256 checksums for 13.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2722144bed`)	2016-11-28 15:29:06 +00:00
Emil Velikov	bc5c299b4f	docs: add release notes for 13.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c9e993ba13`)	2016-11-28 15:29:05 +00:00
Dave Airlie	09c0c17bc3	radv: fix 3D clears with baseMiplevel This fixes: dEQP-VK.api.image_clearing.clear_color_image.3d* These were hitting an assert as the code wasn't taking the baseMipLevel into account when minifying the image depth. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 07:10:12 +00:00
Dave Airlie	020978af12	radv: brown-paper bag for a forgotten else. This fixes the fix: radv/ac/llvm: fix regression with shadow samplers fix Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 16:23:10 +10:00
Dave Airlie	b2e217369e	radv/ac/llvm: fix regression with shadow samplers fix This fixes `b56b54cbf1`: radv/ac/llvm: shadow samplers only return one value It makes sure we only do that for shadow sampling, as opposed to sizing requests. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 15:43:59 +10:00
Dave Airlie	b56b54cbf1	radv/ac/llvm: shadow samplers only return one value. The intrinsic engine asserts in llvm due to this. Reported-by: Christoph Haag <haagch+mesadev@frickel.club> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-27 23:05:01 +00:00
Dave Airlie	9838db8f64	radv/si: fix optimal micro tile selection The same fix was posted for radeonsi, so port it here. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-27 23:03:20 +00:00
Emil Velikov	a025c5b2c7	radv: honour the number of properties available Cap up-to the number of properties available while copying the data. Otherwise we might crash and/or leak data. Cc: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-27 23:03:01 +00:00
Mun Gwan-gyeong	0a27dd458b	radv: drop the return type for radv_queue_init() radv_queue_init() always returns VK_SUCCESS, so caller does not need to check return value of radv_queue_init(). Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-27 23:00:57 +00:00
Rob Clark	8cb965b112	freedreno: fix slice size for imported buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-27 17:26:05 -05:00
Rob Clark	f4ffe2786b	freedreno/a3xx: make _emit_const() static Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-27 17:26:05 -05:00
Rob Clark	b8b800d18a	freedreno/a4xx: make _emit_const() static Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-27 17:26:05 -05:00
Jason Ekstrand	af98c6c31d	anv/pipeline: Make is_dual_src_blend_factor inline It's not used on gen8+ so it causes unused function warnings. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-26 11:58:59 -08:00
Jason Ekstrand	e41f7c3063	anv/pipeline: Make the temp blend attachment state pointer const This fixes a "discards const" warning since blend is const. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-26 11:55:09 -08:00
Samuel Pitoiset	8fdb800bda	gm107/ir: optimize 32-bit CONST load to mov This is not allowed for indirect accesses because the source GPR might be erased by a subsequent instruction (WaR hazard) if we don't emit a read dep bar. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-26 19:05:11 +01:00
Samuel Pitoiset	948cce0196	gm107/ir: do not combine CONST loads This will allow using MOV instead of LD. The main advantage is that MOV doesn't require a read dependency barrier while LD does, and so this will reduce both barrier pressure and the number of stall counts needed to read data from constant memory. This is currently only for user uniform accesses. I should do something similar when loading from the driver constant buffer but it seems a bit tricky to handle for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-26 19:05:08 +01:00
Jason Ekstrand	fa6bbb5c00	anv/device: Remove a bogus finishme comment We've been properly detecting bit6 swizzling for a long time now.	2016-11-25 21:46:11 -08:00
Ben Widawsky	2a7db18890	i965: Enable fast clears for multi-lod On SKL (also fast clear is used for level 0, layer 0): Manhattan 3.0: 3.88434% +/- 0.814659% Manhattan 3.0 off: 3.25542% +/- 0.101149% Trex: 3.43501% +/- 0.31223% Trex off: 4.13781% +/- 0.0993569% On BDW: Manhattan 3.0: 1.37079% +/- 0.571208% Manhattan 3.0 off: 1.74029% +/- 0.267499% v2 (Ben, Matt): Fix rebase error by removing the perf warning v3 (Topi): Rebased on top of revised eligibility logic Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	3aec6bce5b	i965: Allow single-sampled miptree to be resolved and shared Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	17d7c5a037	i965/gen8: Relax asserts prohibiting arrayed/mipmapped fast clears Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	544ed74315	i965: Use ISL for CCS layouts One can now also delete intel_get_non_msrt_mcs_alignment(). v2 (Jason): Do not leak aux buf but allocate only after getting ISL surfaces. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	96dbe765e1	i965: Resolve non-compressed fast clears prior to layered rendering Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	dea8e7fb07	i965: Restrict fast color clear on first slice only Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	d41fc8dc9f	i965: Track fast color clear state in level/layer granularity Note that RESOLVED is not tracked in the map explicitly. Absence of item implicitly means RESOLVED state. v2: Added intel_resolve_map_clear() into intel_miptree_release() v3 (Jason): Properly handle the assumption of resolve map not containing any items with state RESOLVED. Removed unnecessary intel_miptree_set_fast_clear_state() call in brw_blorp_resolve_color() preventing intel_miptree_set_fast_clear_state() from asserting against RESOLVED. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	28dc3f6199	i965: Move fast clear state enumeration into resolve map Status is still tracked per miptree. Next patch will switch to resolve map per slice/level. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	6859d2ba2e	i965: Refactor check if color resolve is needed Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	ea2c419600	i965: Add plumbing for fast clear layer/level details Until now fast clear has been supported only for non-layered and non-mipmapped buffers. However, from gen8 onwards there is hardware support also for layered/mipmapped. Once this is enabled, fast clear operations target specific layer/level and call for the state to be tracked in the same granularity. This is the first step providing the details from callers to the state tracking. Patch introduces new interface for reading and writing the state hiding the upcoming bookkeeping changes in the call sites. There are a bunch of sanity checks added that will be relaxed per hardware generation later on when the actual functionality is enabled. v2: Rebased on top of current master setting the state in blorp_surf_for_miptree(). v3: Replace open-coded resolved check in surface state emission with intel_miptree_has_color_unresolved(). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	d07cf68a97	i965: Add interface for checking multiple slices if any is unresolved Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:07 +02:00
Topi Pohjolainen	17e6a214fd	i965: Provide slice details to renderbuffer fast clear state tracker This patch also introduces getter and setter for fast clear state preparing for tracking the state per slice. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:06 +02:00
Topi Pohjolainen	cec30a6669	i965: Split per miptree and per slice/level fast clear bits Currently the status bits for fast clear include the flag telling if non-multisampled mcs buffer should be used at all. Once the state tracking is changed to follow individual levels/layers one still needs to have the mcs enabling information in the miptree. Therefore simply split it out to its own boolean. Possible follow-up work is to combine disable_aux_buffers and no_ccs into single enum. v2 (Jason): Changed no_msrt_mcs to no_ccs and updated comment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:06 +02:00
Topi Pohjolainen	9c7717c066	i965: Provide slice details to color resolver v2: Make intel_miptree_resolve_color() take start layer and layer count. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:06 +02:00
Topi Pohjolainen	12010b9226	i965: Add new interface for full color resolves Upcoming patches will introduce fast clear in level/layer granularity like the driver does already for depth/hiz. This patch introduces equivalent full resolve option. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:06 +02:00
Topi Pohjolainen	71d48d6f42	i965: Refactor lossless compression state tracking Essentially this moves fast clear state update away from surface state setup into brw_postdraw_set_buffers_need_resolve() that gets called just after draw submission. Calling intel_miptree_used_for_rendering() can be dropped for gen6 and earlier as it is a no-op. v2: Rebased on top of current master setting the state in blorp_surf_for_miptree(). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 16:57:06 +02:00
Andres Gomez	b27be186cb	Revert "glsl: allow layout qualifier overrides with ARB_shading_language_420pack" This reverts commit `aaa69c79cd`. The commit was erroneous because the ast_layout_expression class is meant to hold a list used for an after check that all the declared values for a layout-qualifier-name are consistent. Therefore, the check for the possibility of duplicated values was previously fixed to happen much sooner, in the GLSL parser and the merge of layout qualifiers, and the process_qualifier_constant method only needs to check that the values are consistent. By now, those layout-qualifier-name represented as a ast_layout_expression are "max_vertices", "invocations", "vertices", "local_size_[x\|y\|z]" and "xfb_stride". From page 40 (page 46 of the PDF) of the GLSL 1.50 spec: " All geometry shader output layout declarations in a program must declare the same layout and same value for max_vertices." From page 44 (page 50 of the PDF) of the GLSL 4.00 spec: " If an invocation count is declared, all such declarations must specify the same count." From page 47 (page 53 of the PDF) of the GLSL 4.00 spec: " All tessellation control shader layout declarations in a program must specify the same output patch vertex count." From page 60 (page 66 of the PDF) of the GLSL 4.30 spec: " Also, if such a layout qualifier is declared more than once in the same shader, all those declarations must set the same set of local work-group sizes and set them to the same values; otherwise a compile-time error results. If multiple compute shaders attached to a single program object declare local work-group size, the declarations must be identical; otherwise a link-time error results." From page 73 (page 79 of the PDF) of the GLSL 4.40 spec: " While xfb_stride can be declared multiple times for the same buffer, it is a compile-time or link-time error to have different values specified for the stride for the same buffer." Fixes GL44-CTS.enhanced_layouts.xfb_duplicated_stride Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:31 +02:00
Andres Gomez	2a47c83d7e	Revert "glsl: geom shader max_vertices layout must match." This reverts commit `4c86399378`. The commit was erroneous because the ast_layout_expression class was created to hold a list of values for a layout-qualifier-name which is allowed to appear in more than one expression in the same shader/program but not to hold different values. In other words, the list is used for an after check that all the declared values for a layout-qualifier-name are consistent. Therefore, the values stored must match always, not just for "max_vertices" or any other eventual layout-qualifier-name. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:31 +02:00
Andres Gomez	e5041c6409	glsl: push layout-qualifier-name values from variable declarations to global After the previous modifications in the merging of the layout-qualifier-name values, we no longer push the final value in a declaration to the global values. This regression happens because we don't call for merging on the right-most layout qualifier of a declaration which is also the overriding one in case of multiple appearances. Now, we add a new method to push these values to the global ones and we call for this just after all the layout-qualifier collapsing has happened in a declaration. This simplifies how this was working in two ways; we make a clear differentiation of when we are pushing this to the global values since before it was mixed in the merging call and we only run this once all the processing for layout-qualifiers in a declaration has happened. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	5132d0c7b6	glsl: simplified error checking for duplicated layout-qualifiers The GLSL parser has been simplified to check for the needed GL_ARB_shading_language_420pack extension just when merging the qualifiers in the proper cases. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	93f90d7795	glsl: simplified ast_type_qualifier::merge_into_[in\|out]_qualifier API Since we modified the way in which multiple repetitions of the same layout-qualifier-name in a single declaration collapse into the ast_type_qualifier class, we can simplify the merge_into_[in\|out]_qualifier APIs through removing the create_node parameter. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	be54a58da3	glsl: ignore all but the rightmost layout qualifier name from the rightmost layout qualifier From page 46 (page 52 of the PDF) of the GLSL 4.20 spec: " More than one layout qualifier may appear in a single declaration. If the same layout-qualifier-name occurs in multiple layout qualifiers for the same declaration, the last one overrides the former ones." Consider this example: " #version 150 #extension GL_ARB_shading_language_420pack: enable layout(max_vertices=2) layout(max_vertices=3) out; layout(max_vertices=3) out;" Although different values for "max_vertices" result in a compilation error, the above code is valid because max_vertices=2 is ignored. Hence, when merging qualifiers in an ast_type_qualifier, we now ignore new appearances of a same layout-qualifier-name if the new "is_multiple_layouts_merge" parameter is on, since the GLSL parser works in this case from right to left. In addition, any special treatment for the buffer, uniform, in or out layout defaults has been moved in the GLSL parser to the rule triggered just after any previous processing/merging on the layout-qualifiers has happened in a single declaration since it was run too soon previously. Fixes GL44-CTS.shading_language_420pack.qualifier_override_layout Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	b95793b9a7	glsl: refactor duplicated validations between 2 layout-qualifiers Several layout-qualifier validations are duplicated in the merge_qualifier and validate_in_qualifier methods. We would rather have them refactored into single calls. Suggested by Timothy. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	ae1ce8ecd3	glsl: assert on incoherent point mode layout-id-qualifier validation The point mode value in an ast_type_qualifier can only be true if the flag is already set since this layout-id-qualifier can only be or not be present in a shader. Hence, it is useless to check for its value if the flag is already set. Just replaced with an assert. V2: assert instead of checking for coherence and raising a compilation error. Suggested by Timothy. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	a5d6ae2f51	glsl: remove unneeded check for incompatible primitive types in GS The validation of the default in layout qualifier already assures that we won't have 2 ast_gs_input_layout objects with different primitive type values. In fact, the validation already assures that we won't have 2 ast_gs_input_layout objects in the AST tree at all. The check for an error in the shader has been replaced by an assert. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	0ecfff0d08	glsl: simplifies the merge of the default in layout qualifier The merge into the default in layout qualifier duplicates a lot of code that can be reused from the generic merge method. Now, we use the generic merge method inside the specific merge for the default in layout qualifier. The generic merge method has been completed with some bits that were only present in the merge for the default in layout qualifier and the specific validation bits have been moved to the validation method for the default in layout qualifier. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	65df02c002	glsl: split default in layout qualifier merge Currently, the default in layout qualifier merge performs specific validation and merge. We want to split out the validation from the merge so they can be done independently. Additionally, for simplification, the direction of the validation and merge is changed so the ast_type_qualifier calling the method is the one validated and merged against the default in qualifier. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	fe5c522edd	glsl: split default out layout qualifier merge Currently, the default out layout qualifier merge performs specific validation and merge. We want to split out the validation from the merge so they can be done independently. Additionally, for simplification, the direction of the validation and merge is changed so the ast_type_qualifier calling the method is the one validated and merged against the default out qualifier. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	70456aca8d	glsl: merge layouts into the default one as the last step in interface blocks Consider this example: " #version 150 core #extension GL_ARB_shading_language_420pack: require #extension GL_ARB_explicit_attrib_location: require layout(location=0) out vec4 o; layout(binding=2) layout(binding=3, std140) uniform U { vec4 a; } u[2];" As there are 2 layout-qualifiers for the uniform U and the binding layout-qualifier-id is duplicated, the rules set by the ARB_shading_language_420pack spec state that the rightmost should prevail. Our ast_type_qualifier merges with others in a way that if the value for a layout-qualifier-id is set in both, the object being merged overwrites the value of the object invoking the merge. Hence, the merge has to happen from the left layout towards the right one and this was not happening for interface blocks because we were merging into the default layout qualifier. Now, the merge is done from left to right and, as a last step, we merge into the default layout qualifier if needed, so the values of the explicit layouts prevail over it. V2: added a default_layout variable instead of a layout_helper and make the merge directly over the layout one. Suggested by Timothy. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Andres Gomez	9f13d0c64b	glsl: ignore all but the rightmost layout-qualifier-name When a layout contains a duplicated layout-qualifier-name in a single declaration, only the last occurrence should be taken into account. From page 59 (page 65 of the PDF) of the GLSL 4.40 spec: " More than one layout qualifier may appear in a single declaration. Additionally, the same layout-qualifier-name can occur multiple times within a layout qualifier or across multiple layout qualifiers in the same declaration. When the same layout-qualifier-name occurs multiple times, in a single declaration, the last occurrence overrides the former occurrence(s)." Consider this example: " #version 150 #extension GL_ARB_enhanced_layouts: enable layout(max_vertices=2, max_vertices=3) out; layout(max_vertices=3) out;" Although different values for "max_vertices" result in a compilation error, the above code is valid because max_vertices=2 is ignored. When merging qualifiers in an ast_type_qualifier, we now simply ignore new appearances of a same layout-qualifier-name if the "is_single_layout_merge" parameter is true. This works because the GLSL parser processes qualifiers from right to left. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-11-25 13:18:30 +02:00
Iago Toral Quiroga	b3fca51617	anv/state: if enabled, use anisotropic filtering also with VK_FILTER_NEAREST Fixes multiple Vulkan CTS tests that combine anisotropy and VK_FILTER_NEAREST in dEQP-VK.texture.filtering_anisotropy.* Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-25 08:20:28 +01:00
Vedran Miletić	95ddb37708	clover: Restore support for LLVM <= 3.9. The commit `8e430ff8b0` broke support for LLVM 3.9 and older versions in Clover. This patch restores it and refactors the support using Clover compatibility layer for LLVM. v2: merged #ifdef blocks v3: added support for LLVM 3.6-3.8 v4: add missing #ifdef around <memory> v5: simplify using templates and lambda Signed-off-by: Vedran Miletić <vedran@miletic.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98740 Tested-by[v4]: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-11-24 16:40:29 -08:00
Vinson Lee	f07da5aa5e	scons: Recognize LLVM_CONFIG environment variable. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-24 13:37:33 -08:00
Bas Nieuwenhuizen	a794f09017	radv: Don't generate radv_timestamp.h Not needed anymore. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-24 19:25:03 +01:00
Dave Airlie	bb8ac18340	radv: fix texel fetch offset with 2d arrays. The code didn't limit the offsets to the number supplied, so if we expected 3 but only got 2 we were accessing undefined memory. This fixes random failures in: dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-24 18:06:05 +10:00
Eduardo Lima Mitev	116fed80ff	mesa/getteximage: Add validation of target to glGetTextureImage There is a specific list of texture targets that can be used with glGetTextureImage. From OpenGL 4.5 spec, section '8.11 Texture Queries', page 234 of the PDF: "An INVALID_ENUM error is generated if the effective target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D, TEXTURE_1D_ARRAY, TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, TEXTURE_RECTANGLE, one of the targets from table 8.19 (for GetTexImage and GetnTexImage only), or TEXTURE_CUBE_MAP (for GetTextureImage only)." We are currently not validating the target for glGetTextureImage. As an example, calling this function on a texture with target GL_TEXTURE_2D_MULTISAMPLE should return INVALID_ENUM, but instead it hits an assertion down the road in the i965 driver. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-24 08:24:07 +01:00
Eduardo Lima Mitev	89cbe0d21f	main/texobj: Check that texture id > 0 before looking it up in hash-table _mesa_lookup_texture_err() does not currently check whether the texture-id is zero, but _mesa_HashLookup() doesn't expect the key to be zero, and will fail an assertion. Considering that _mesa_lookup_texture_err() is called from _mesa_GetTextureImage and _mesa_GetTextureSubImage with user provided arguments, we must validate the texture-id before looking it up in the hash-table. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-24 08:23:45 +01:00
Charmaine Lee	3233a9fe0b	util: fix memory leak from the fragment shaders for SINT<->UINT blits This patch deletes those fragment shaders in util_blitter_destroy(). Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-23 22:53:08 -08:00
Kenneth Graunke	ec1f159ac8	i965: Always reserve clip distance VUE slots in SSO mode. This fixes rendering in Dolphin on Vulkan since we enabled clip distances. (Dolphin on GL has a similar bug because the linker fails to eliminate unused clip distance built-in arrays, but it isn't using SSO...so that needs more fixing.) Also fixes a Piglit test: spec/glsl-1.50/execution/geometry.clip-distance-vs-gs-out-sso Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 21:23:38 -08:00
Ilia Mirkin	8cdf73c324	anv/gen7: only enable dual-source blending when there are dual-source factors Apparently the hw wedges otherwise, as mentioned in i965 comments. Reported-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 19:40:00 -08:00
Ilia Mirkin	a783b67e17	swr: clear every layer of the attached surfaces Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-23 20:34:02 -05:00
Ilia Mirkin	1a80ec0cd1	swr: [rasterizer core] pipe renderTargetArrayIndex through to clears Currently clears only operate on the 0th array index (ignoring surface layout parameters). Instead normalize to take a RTAI like all the load/store tile logic does, and use ComputeSurfaceAddress to properly take the surface state's lod/array index into account. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-23 20:33:50 -05:00
Ilia Mirkin	cec515999c	swr: [rasterizer core] clear data now comes in as float The non-fast-clear path was never updated after clear colors were passed in as floats. Remove the now-harmful conversion from unorm8. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-23 20:33:36 -05:00
Ilia Mirkin	74943db82c	swr: [rasterizer core] actually perform clear before store in GetHotTile When switching render target array indexes (as might happen in a GS, or in a future change, with layered clears), if the previous state is HOTTILE_CLEAR, we should actually clear the tile before saving it off. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-23 20:33:32 -05:00
Kenneth Graunke	5da84a7e12	i965: Fix a mistake from porting the URB allocation code to arrays. Commit `6d416bcd84` (i965: Use arrays in Gen7+ URB code.) introduced a regression which caused us to fail to allocate all of our URB space. - total_wants -= ds_wants; + total_wants -= additional; The new line should have been total_wants -= wants[i]. Fixes a large performance regression in TessMark. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98815 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-11-23 16:57:29 -08:00
Kenneth Graunke	903056e016	i965: Use 3DSTATE_CLIP's User Clip Distance Enable bitmask on Gen8+. Gen6-7.5 specify the user clip distance enable bitmask in 3DSTATE_CLIP. Gen8+ normally uses the new internal signalling mechanism to select the one specified in the last enabled shader stage (3DSTATE_VS, DS, or GS). This is a pretty good fit for Vulkan, or even newer GL, where the bitmask comes entirely from the shader. But with glClipPlane(), this is dynamic state, and we have to listen to _NEW_TRANSFORM. Clip plane enables are the only reason the VS/DS/GS atoms need to listen to _NEW_TRANSFORM. 3DSTATE_CLIP already has to listen to it in order to support ARB_clip_control settings. Setting the "Use the 3DSTATE_CLIP bitmask" force enable bit allows us to drop _NEW_TRANSFORM from all the shader stage atoms, so we can re-emit them less often. Improves performance of OglBatch7 (version 6) by 2.70773% +/- 0.491257% (n = 38) at 1024x768 on Cherryview. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-11-23 16:57:29 -08:00
Dave Airlie	3b6893b678	radv: fix flipped blits This fixes: dEQP-VK.api.copy_and_blit.blit_image.simple_tests.mirror* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-23 23:49:32 +00:00
Dave Airlie	b06568873d	radv/meta: just local vars for src/dst subresources. This is just a cleanup before I rework this code to fix mirrored blits. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-23 23:49:23 +00:00
Fredrik Höglund	28c781b574	radv: add support for VK_AMD_draw_indirect_count Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-24 08:19:27 +10:00
Fredrik Höglund	eff7bbc47e	radv: add support for VK_AMD_negative_viewport_height The driver already supports this extension in practice. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-24 08:19:24 +10:00
Fredrik Höglund	2c748c5c8a	radv: add support for VK_KHR_sampler_mirror_clamp_to_edge radv_tex_wrap() already supports VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE, so all that's needed is to advertise support for the extension. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-24 08:19:20 +10:00
Fredrik Höglund	5cbcbc75f4	radv: add support for anisotropic filtering on SI-CI Ported from radeonsi. Note that si_make_texture_descriptor() already sets img7 to the mask value referred to in the comment. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-24 08:19:06 +10:00
Jordan Justen	72c00e7c47	i965/gen7: Only advertise 4 samples for RGBA32F on GLES We can't render to 8x MSAA if the width is greater than 64 bits. (see brw_render_target_supported) Fixes ES31-CTS.sample_variables.mask.rgba32f.samples_8.mask_* Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-23 11:15:31 -08:00
Marek Olšák	76e953788a	radeonsi: print new opt flags in si_dump_shader_key Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-23 18:49:10 +01:00
Marek Olšák	e5302ad936	radeonsi: add a debug flag that disables optimized shader variants Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-23 18:49:10 +01:00
Aaron Watry	ac458d2ae8	compiler/glsl/tests: Fix print format when building 32-bit binaries on 64-bit host Avoids two warnings. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-11-23 10:15:00 -06:00
Aaron Watry	60c3a0a67c	compiler/glsl/tests: Fix print format when building 32-bit binaries on 64-bit host Avoids three warnings. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-11-23 10:15:00 -06:00
Emil Velikov	5cc07d854c	anv: fix enumeration of properties Driver should enumerate only up-to min2(num_available, num_requested) properties and return VK_INCOMPLETE if the # of requested props is smaller than the ones available. Presently we assert out in such cases. Inspired by a similar fix for RADV. v2: Use MIN2 + typed_memcpy (Jason). Should fix: dEQP-VK.api.info.device.extensions Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 14:13:47 +00:00
Ben Widawsky	0a0ce884ea	i965: Restructure fast clear eligibility decision v2 (Jason): - Use PRM citation for SKL now that it is available - Also return false for gen < 8 mipmapped/arrayed Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:53 +02:00
Topi Pohjolainen	f4c7989408	i965: Set initial msaa fast clear status explicitly instead of in intel_miptree_init_mcs(). For lossless compression the status is immediately overwritten in intel_miptree_alloc_non_msrt_mcs() while the status for non-compressed non-msaa miptrees is explicitly set in do_blorp_clear(). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:53 +02:00
Topi Pohjolainen	dfd6088b3a	i965: Declare read-only input to level/layer check const Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:53 +02:00
Topi Pohjolainen	07d070f324	i965/fbo: Prepare layer multiplier for render buffer compression This path is not yet taken for fast cleared or compressed buffers but later patches will enable it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:53 +02:00
Topi Pohjolainen	a2d029dc5f	i965: Add multi-slice getter for resolve maps This is useful when checking if any slice is in unresolved state. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:53 +02:00
Topi Pohjolainen	7c75fd9a59	i965/meta: Split conversion of color and setting it And fix a mangled comment while at it. v2 (Ben): Return the converted color. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:53 +02:00
Topi Pohjolainen	f19e0967c9	intel/blorp: Fix rectangle size for level-not-zero resolves Needed to prevent gpu hangs when mip-mapped compression gets enabled. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:52 +02:00
Topi Pohjolainen	ca84e190a4	i965/miptree: Don't shrink textures when augmenting for more levels This was detected when examining CCS_E failures with piglit test: "fbo-generatemipmap-formats". The test creates a 2D texture with dimensions 293x277. It manually loops over all levels and calls glTexImage2D(). Level one triggers creation of the full miptree: intel_alloc_texture_image_buffer() realizes that there is only one level in the miptree and calls intel_miptree_create_for_teximage() to re-allocate the miptree with all 9 levels. However, the end result is a miptree with level zero dimensions of 292x276. Related, and possibly calling for treatment of its own, is mip-map generation: After calling glTexImage2D() against every level, the test continues by replacing content for levels one to eight with data derived from level zero by calling glGenerateMipmapEXT(). This results in the miptree being allocated anew for every level: Mip-map generation goes thru meta which ends up validating the texture (brw_validate_textures()->intel_finalize_mipmap_tree()-> intel_miptree_match_image()) where one finds a texture with base level size 292:276. This results in a new miptree being created for the npot size 293:277. Only here intel_finalize_mipmap_tree() is asked for only one level, and therefore such is created. Generation for level one in turn finds the right base level size but only one level when two are needed. And the same goes on for all eight levels. This patch prevents the shrink, maintaining the NPOT size of 293x277. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 11:06:52 +02:00
Eduardo Lima Mitev	6e8f12619f	main/getteximage: Use the height argument to calculate memcpy copy size In get_tex_memcpy, when copying texture data directly from source to destination (when row strides match for both src and dst), the copy size is currently calculated using the full texture height instead of the sub-region height parameter that was passed. This can cause a read past the end of the mapped buffer when y-offset is greater than zero, leading to a segfault. Fixes CTS test (from crash to pass): * GL45-CTS/get_texture_sub_image/functional_test v2: (Jason) Use the passed 'height' instead of copying til the end of the buffer (tex-height - yoffset). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-23 09:22:32 +01:00
Iago Toral Quiroga	e062eb6415	nir/spirv: implement ordered / unordered floating point comparisons properly Besides the logical operation involved, these also require that we test if the operands are ordered / unordered. For ordered operations, both operands must be ordered (and they must pass the conditional test) while for unordered operations it is sufficient if only one of the operands is unordered (or they pass the logical test). Fixes the following Vulkan CTS tests: dEQP-VK.spirv_assembly.instruction.compute.opfunord.equal dEQP-VK.spirv_assembly.instruction.compute.opfunord.greater dEQP-VK.spirv_assembly.instruction.compute.opfunord.greaterequal dEQP-VK.spirv_assembly.instruction.compute.opfunord.less dEQP-VK.spirv_assembly.instruction.compute.opfunord.lessequal v2: Fixed typo: s/nir_eq/nir_feq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-23 08:07:44 +01:00
Dave Airlie	9ce5926476	anv: fix segfault in anv_BindImageMemory Since bind image memory started memsetting surfaces, the device node can't be NULL, since we lookup device->info.has_llc. Not sure why it ever was NULL before. Fixes some things on my Ivybridge. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-23 16:11:03 +10:00
Tim Rowley	9c13cc9451	swr: [rasterizer core] fix cast for stencil clear value Bad type cast for stencil clear value was picking up structure padding bytes. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-22 20:06:17 -06:00
Ilia Mirkin	f6f644ea12	swr: color interpolation is also supposed to get perspective division Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	7cbfe59cf3	swr: add sprite coord enable mask to fs key This fixes gl-coord-replace-doesnt-eliminate-frag-tex-coords Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	6d6ef3fb55	swr: rework vert <-> frag shader linkage logic Fixes a few things: - sprite coords only apply to generic varyings, and are a bitmask - back color only applies in 2-sided lighting mode - handle some odd situations between only some front/back colors being there. This is only semi-legal in GL, but we shouldn't start crashing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	2595aebd91	swr: flatshading makes color outputs flat, it doesn't affect others We were previously not marking the "regular" flat outputs as flat when flatshading was enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	37be598dda	swr: only broadcast color0 value, not all color values The way that dual-source blending is described for GLES2 is very odd, and we end up with a shader that both has this property set and has a color1 value to be used as the second source. While changing the state tracker is an option, it seems more reliable to verify that the broadcast is only done on color0. Fixes arb_blend_func_extended-fbo-extended-blend-pattern_gles2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	2234a4330e	swr: report a reasonable max lod bias This is the same value that llvmpipe uses. Since swr uses the same sampler logic, makes sense for this value to also be the same. Most applications don't care. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	2b7bdff83f	swr: avoid using exceptions for expected condition handling I was getting a weird segfault from GCC 4.9.3: 0x00007ffff54f27aa in strlen () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff54f27aa in strlen () from /lib64/libc.so.6 #1 0x00007ffff4f128e5 in get_cie_encoding (cie=cie@entry=0x7ffff6e09813) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:272 #2 0x00007ffff4f1318e in classify_object_over_fdes (ob=ob@entry=0xd7bb90, this_fde=0x7ffff7f11010) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:628 #3 0x00007ffff4f135ba in init_object (ob=0xd7bb90) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:749 #4 search_object (ob=ob@entry=0xd7bb90, pc=pc@entry=0x7ffff4f11f4d <_Unwind_RaiseException+61>) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:961 #5 0x00007ffff4f13e62 in _Unwind_Find_registered_FDE (bases=0x7fffffffd358, pc=0x7ffff4f11f4d <_Unwind_RaiseException+61>) at /gcc-4.9.3/libgcc/unwind-dw2-fde.c:1025 #6 _Unwind_Find_FDE (pc=0x7ffff4f11f4d <_Unwind_RaiseException+61>, bases=bases@entry=0x7fffffffd358) at /gcc-4.9.3/libgcc/unwind-dw2-fde-dip.c:450 #7 0x00007ffff4f11197 in uw_frame_state_for (context=context@entry=0x7fffffffd2b0, fs=fs@entry=0x7fffffffd100) at /gcc-4.9.3/libgcc/unwind-dw2.c:1245 #8 0x00007ffff4f11b15 in uw_init_context_1 (context=context@entry=0x7fffffffd2b0, outer_cfa=outer_cfa@entry=0x7fffffffd660, outer_ra=0x7ffff518d23b <__cxa_throw+91>) at /gcc-4.9.3/libgcc/unwind-dw2.c:1566 #9 0x00007ffff4f11f4e in _Unwind_RaiseException (exc=0xd7c250) at /gcc-4.9.3/libgcc/unwind.inc:88 #10 0x00007ffff518d23b in __cxa_throw () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/libstdc++.so.6 #11 0x00007ffff51ed556 in std::__throw_out_of_range(char const*) () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/libstdc++.so.6 #12 0x00007fffea778be0 in std::map<pipe_format, SWR_FORMAT, std::less<pipe_format>, std::allocator<std::pair<pipe_format const, SWR_FORMAT> > >::at ( this=0x7fffebeb4c40 <mesa_to_swr_format(pipe_format)::mesa2swr>, __k=@0x7fffffffd73c: PIPE_FORMAT_RGTC1_UNORM) at 
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/include/g++-v4/bits/stl_map.h:549 #13 0x00007fffea776aee in mesa_to_swr_format (format=PIPE_FORMAT_RGTC1_UNORM) at swr_screen.cpp:597 We can just avoid this whole issue by not using exceptions in the first place. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	946a7abd1c	swr: remove formats from mapping table that don't have StoreTile impls This table exists for the purpose of determining renderable formats. Without a StoreTile implementation, that can't happen. This basically removes rendering support to all L/LA/I formats. They can be re-added when/if StoreTile implementations are added. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	2e12d2ba72	swr: remove unnecessary -1 entries in format mapping table Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-22 20:27:20 -05:00
Ilia Mirkin	7cfb364b1a	swr: rework resource layout and surface setup This is a bit of a mega-commit, but unfortunately there's no great way to break this up since a lot of different pieces have to match up. Here we do the following: - change surface layout to match swr's Load/StoreTile expectations - fix sampler settings to respect all sampler view parameters - fix stencil sampling to read from secondary resource - respect pipe surface format, level, and layer settings - fix resource map/unmap based on the new layout logic - fix resource map/unmap to copy proper parts of stencil values in and out of the matching depth texture These fix a massive quantity of piglits, including all the tex-miplevel-selection ones. Note that the swr native miptree layout isn't extremely space-efficient, and we end up using it for all textures, not just the renderable ones. A back-of-the-envelope calculation suggests about 10%-25% increased memory usage for miptrees, depending on the number of LODs. Single-LOD textures should be unaffected. There are a handful of regressions as a result of this change: - Some textureGrad tests, these failures match llvmpipe. (There are debug settings allowing improved gallivm sampling accuracy.) - Some layered clearing tests as swr doesn't currently support that. It was getting lucky before because enough other things were broken. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-22 20:27:20 -05:00
Charmaine Lee	5d2b5996e1	util: fix missing swizzle components in the SINT <-> UINT conversion string Fixes tgsi error introduced in commit `3817a7a`. The error complains about a missing swizzle component in the conversion string "UMIN TEMP[0], TEMP[0], IMM[0].x". Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-23 01:54:57 +01:00
Eric Anholt	414dbb2d5c	vc4: Don't conditionalize the src1 mov of qir_SEL(). My thought in having both arguments conditionally moved was that it should theoretically save some power by not doing work in those channels. However, it ends up costing us instructions because we can't register-coalesce the first of the MOVs, and it also introduces extra scheduling dependencies. The instruction cost would swamp whatever power benefit I was hoping for. shader-db results: total instructions in shared programs: 100548 -> 99741 (-0.80%) instructions in affected programs: 42450 -> 41643 (-1.90%) With obvious outliers removed (I had an X11 emacs running over the network in the "after" case), 3DMMES Taiji showed 1.07231% +/- 0.488241% fps improvement (n=18, 30).	2016-11-22 16:46:03 -08:00
Eric Anholt	1f0ba902f0	vc4: Re-add R4 to the "any" register class. I screwed this up in `fdad4d2402` which was supposed to be making this code more maintainable. What's amazing is multithreaded FS showed the wins it did despite this bug. shader-db results: total instructions in shared programs: 103535 -> 100548 (-2.89%) instructions in affected programs: 83794 -> 80807 (-3.56%)	2016-11-22 16:46:03 -08:00
Eric Anholt	9728887e7f	vc4: Disable MSAA rasterization when the job binning is single-sampled. Gallium core just changed to start setting MSAA enabled in the rasterizer state even with samples==1 buffers. This caused disagreements in our driver between binning and rasterization state, which the simulator threw assertion failures about. Keep the single-sampled samples==1 behavior for now.	2016-11-22 16:46:03 -08:00
Eric Anholt	ff018e0979	vc4: Make sure we don't overflow texture input/output FIFOs when threaded. I dropped the first hunk of this change last minute when I decided it wasn't actually needed, and apparently failed to piglit it in simulation. The simulator threw an assertion in gl-1.0-drawpixels-color-index, which queued up 5 coordinates (3 before a switch, two after) before loading the result.	2016-11-22 16:46:03 -08:00
Dave Airlie	ea417f5335	radv: move pipeline barrier image transitions after src flushing This seems like it would conform better with the spec. noticed while digging into fast clears. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-23 10:16:34 +10:00
Jason Ekstrand	3fd79558be	anv: Enable fast clears on gen7-8 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	5e8069a572	anv: Add support for fast clears on gen9 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	dae8e52030	anv/blorp: Rework flushing around resolves It turns out that the flushing required around resolves is a bit more extensive than I first thought. You actually need render cache flush and a CS stall both before and after the resolve. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	8d1ccd6729	anv/cmd_buffer: Apply remaining flushes in EndCommandBuffer Otherwise, some pipe flushes may just never happen. This is unlikely to cause problems depending on how the kernel schedules batches, but we shouldn't count on it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	878499d323	anv/blorp: Use regular blorp clears for subpass clears At vkCmdNextSubpass time, we have the actual framebuffer so we can use regular blorp_clear for subpass clears. For fast clears, there is no attachment version, so this will make fast clears a bit easier. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	772d223c9c	anv: Add a vk_to_isl_color helper Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	d1d6b78898	anv/cmd_buffer: Make setup_attachments take a RenderPassBeginInfo Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:13:53 -08:00
Jason Ekstrand	1d5ac0a462	anv: Set up binding tables and surface states for input attachments This commit adds the last remaining bits to support input attachments in the Intel Vulkan driver. For color and depth attachments, we allocate an input attachment surface state during vkCmdBeginRenderPass like we do for the render target surface states. This is so that we can incorporate the clear color and aux information as used in rendering. For stencil, we just treat it like a regular texture because there is no aux. Also, only having to worry about at most one input attachment surface for each attachment makes some of the vkCmdBeginRenderPass code simpler. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	140d041fac	anv/pipeline: Handle depth/stencil self-dependencies Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	0b01262844	anv: Use pass attachment information to insert flushes Input and resolve attachments can cause an implicit dependency in the pipeline. It's our job to insert the needed flushes. Fortunately, we can easily reuse the usage tracking that we use for CCS resolves. This fixes 159 Vulkan CTS tests on Haswell because we're now flushing in between drawing and MSAA resolves. I have no idea how they were passing before on newer hardware. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	57174d6042	anv/cmd_buffer: Fix pipeline barriers for input attachments We were using VK_IMAGE_ACCESS_COLOR_ATTACHMENT_READ_BIT to detect an input attachment read. We should use VK_IMAGE_ACCESS_INPUT_ATTACHMENT_READ_BIT instead. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	0acb28e0cf	anv/pipeline: Add a input_attachment_index to the bindings This allows us to go from the binding to either the descriptor or the input attachment at will. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	3f1eda0b42	anv/pass: Calculate the combined image usage of attachments Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	347f43c8ec	anv: Add an input attachment lowering pass Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	2e311e4211	i965/fs: Implement load_layer_id for fragment shaders Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:03:31 -08:00
Jason Ekstrand	08441dae59	nir: Add a layer_id system value intrinsic Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:03:29 -08:00
Jason Ekstrand	2e44799f50	spirv: Stop warning about input attachments Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:03:23 -08:00
Jason Ekstrand	c54097cc48	spirv: Handle the InputAttachmentIndex decoration Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:02:35 -08:00
Jason Ekstrand	111d57e7d2	compiler: Add the rest of the subpassInput types There are actually 6 of them according to the GL_KHR_vulkan_glsl spec. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:02:29 -08:00
Jason Ekstrand	7a2cfd4adb	anv/cmd_buffer: Emit CS push constants after binding tables Emitting binding tables can cause push constants to be dirtied if the shader uses images so we need to handle push constants later.	2016-11-22 10:10:38 -08:00
Marek Olšák	a3f6bea69a	gallium: fix more occurrences of u_hash.h this fixes compile failures since `86514d84e0`	2016-11-22 18:28:18 +01:00
Marek Olšák	d219720d19	mesa: use special checksums for unset checksums and fixed-func shaders for debugging Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-22 18:07:16 +01:00
Marek Olšák	b818df1e71	glsl: add gl_linked_shader::SourceChecksum for debugging v2: wrap all checksums in #ifdef DEBUG Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-22 18:05:51 +01:00
Marek Olšák	6dfdf52b6a	mesa: use util_hash_crc32 instead of _mesa_str_checksum Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-22 18:05:51 +01:00
Marek Olšák	86514d84e0	util: import CRC32 implementation from gallium Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-22 18:05:51 +01:00
Jason Ekstrand	3ef8dff865	anv/cmd_buffer: Add an assert on emit_binding_table failure The != VK_SUCCESS case is really only capable of handling the one error. This assert makes things a bit safer if something else goes wrong. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-22 08:50:27 -08:00
Lucas Stach	d9a3ad94ca	gbm: request correct version of the DRI2_FENCE extension There is no version 2 of the DRI2_FENCE extension. So only a request for version 1 has a chance to succeed. Fixes: `74b1969d71` (gbm: wire up fence extension) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-22 15:56:44 +00:00
Jason Ekstrand	f680a01ad4	anv/cmd_buffer: Emit a CS stall before setting a CS pipeline Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	054e48ee0e	anv/cmd_buffer: Re-emit MEDIA_CURBE_LOAD when CS push constants are dirty This can happen even if the binding table isn't changed. For instance, you could have dynamic offsets with your descriptor set. This fixes the new stress.lots-of-surface-state.cs.dynamic crucible test. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	722ab3de9f	anv/cmd_buffer: Handle running out of binding tables in compute shaders If we try to allocate a binding table and fail, we have to get a new binding table block, re-emit STATE_BASE_ADDRESS, and then try again. We already handle this correctly for 3D and blorp but it never got handled for CS. This fixes the new stress.lots-of-surface-state.cs.static crucible test. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	a8ef92b031	i965/compiler: Disable trig workarounds on KBL+ The precision of our trig instructions appears to have been fixed on Kaby Lake. Neither Ben nor I can find any documentation for this. However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail on Sky Lake. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	767b163e47	intel/common: Add an is_kabylake field to gen_device_info Most of the 3-D engine on Kaby Lake is identical to Sky Lake. However, there are a few small differences that we need to be able to detect. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-11-22 08:06:33 -08:00
Gwan-gyeong Mun	e074a08a6d	anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTEL Since both pCreateInfo->strideInBytes and pCreateInfo->extent.height are of uint32_t type, 32-bit arithmetic will be used. Fix unintentional integer overflow by casting to uint64_t before multiplying. CID 1394321 Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> [Emil Velikov: cast only one of the arguments] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-22 15:15:45 +00:00
Gwan-gyeong Mun	69cc7d90f9	util/disk_cache: close a previously opened handle in disk_cache_put (v2) We're missing the close() to the matching open(). CID 1373407 v2: Fixes from Emil Velikov's review Update the teardown in reverse order of the setup/init. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)	2016-11-22 15:13:42 +00:00
Gwan-gyeong Mun	0e8dc81c3a	docs: get rid of duplicated description from sourcetree.html Fixes: `438086efb1` (docs: sourcetree.html misc updates) Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-22 15:13:42 +00:00
Emil Velikov	a2283b50e6	docs/submitting patches: mention get_reviewers.pl Mention the script - why/how to use alongside a useful trick to make it work interactively (thanks Rob!). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2016-11-22 15:13:41 +00:00
Timothy Arceri	e260bfec04	docs/submitting patches: add git tips v2: [Emil Velikov] - Add the shorthand git send-email -vX - Move to submittingpatches.html - Add to the TOC. v3: [Emil Velikov] - Use @~8 instead of HEAD~8 (Nicolai) Cc: Timothy Arceri <t_arceri@yahoo.com.au> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2016-11-22 15:13:41 +00:00
Emil Velikov	29c8a4a4ce	auxiliary/vl/dri: call get_xcb_screen() only once Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-11-22 15:13:41 +00:00
Emil Velikov	7c6babb22c	egl/x11: store xcb_screen_t *screen instead of int screen Just fetch and store it once, rather than doing the xcb_setup_roots_iterator + get_xcb_screen dance five times. v2: Call xcb_disconnect() on error (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2016-11-22 15:13:41 +00:00
Emil Velikov	b9880d2e93	egl/x11: factor out dri2_get_xcb_connection() Identical throughout dri2, dri3 and drisw. Next patch will add more common code, so rather than duplicating it factor out the function. Note: this also sets eglError on failure. Something that's quite inconsistent throughout the codebase. v2: Call xcb_disconnect() on error (Eric) Note: use xcb_disconnect() even in the xcb_connection_has_error() case as per the manual: ... memory will not be freed until xcb_disconnect... Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2016-11-22 15:13:41 +00:00
Timothy Arceri	a56a505db7	mesa/glsl: remove unused uses_builtin_functions field This has been unused since `943b69cddd` Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-11-23 00:17:13 +11:00
Kenneth Graunke	38a8507f79	i965: Use NIR-based clip/cull lowering for OpenGL as well. The old approach works fine, and this approach isn't necessarily better. But it at least has the advantage that Vulkan and GL use the same approach. I originally wrote it to gain additional testing for the new paths. shader-db statistics show 0 instruction count changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:24 -08:00
Kenneth Graunke	a4d7a5bd1e	anv: Enable clip and cull distance support. Everything is now in place, and we appear to pass the tests on Gen7+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:24 -08:00
Kenneth Graunke	f182e5eafc	i965/vec4: Handle component qualifiers on non-generic varyings. ARB_enhanced_layouts only requires component qualifier support for generic varyings, so this is all the vec4 backend knew how to handle. This patch extends the backend to handle it for all varyings, so we can use store_output intrinsics with a component set for things like clip/cull distances. We may want to use that for other VUE header fields in the future as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-22 00:29:24 -08:00
Kenneth Graunke	b63f7671a3	i965/fs: Handle compact outputs. We need to calculate the number of vec4 slots correctly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:24 -08:00
Kenneth Graunke	536af43fe3	spirv: Silence unsupported capability warnings for Clip/CullDistance. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:24 -08:00
Kenneth Graunke	7471bb5fa4	anv: Set clip/cull distances fields in packets. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:23 -08:00
Kenneth Graunke	a9eabd539c	anv: Combine ClipDistance and CullDistance arrays. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:23 -08:00
Kenneth Graunke	9a179f2db0	nir: add a pass to compact clip/cull distances. v2: Use nir_is_per_vertex_io() rather than is_arrays_of_arrays(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:23 -08:00
Kenneth Graunke	663b2e9a92	nir: Add a "compact array" flag and IO lowering code. Certain built-in arrays, such as gl_ClipDistance[], gl_CullDistance[], gl_TessLevelInner[], and gl_TessLevelOuter[] are specified as scalar arrays. Normal scalar arrays are sparse - each array element usually occupies a whole vec4 slot. However, most hardware assumes these built-in arrays are tightly packed. The new var->data.compact flag indicates that a scalar array should be tightly packed, so a float[4] array would take up a single vec4 slot, and a float[8] array would take up two slots. They are still arrays, not vec4s, however. nir_lower_io will generate intrinsics using ARB_enhanced_layouts style component qualifiers. v2: Add nir_validate code to enforce type restrictions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-22 00:29:23 -08:00
Dave Airlie	f395e3445d	radv: add support for shader stats dump I've started working on a shader-db alike for Vulkan, it's based on vktrace and it records pipelines, this adds support to dump the shader stats exactly like radeonsi does, so I can reuse the shader-db scripts it uses. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 07:20:17 +00:00
Dave Airlie	220912e214	radv: fix sample id loading The sample id is packed into bits 8-12, so adjust things properly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:57 +10:00
Dave Airlie	3c6151ccaf	radv/ac: add implementation of load_sample_pos intrinsic. This fixes a bunch of crashes in CTS tests looking for this. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:54 +10:00
Dave Airlie	5697cfb7ec	radv/ac: cleanup ddxy emission This cleans up the ddxy emission along the same lines as radeonsi. It also means we don't use LDS on VI chips; we use the dspermute interface instead. It also removes some duplicated code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:43 +10:00
Dave Airlie	fa57b77105	radv/meta: cleanup resolve vertex state emission For the hw resolve there is no need to emit any sort of texture coordinates, so drop them all in the meta path. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-22 17:15:37 +10:00
Bas Nieuwenhuizen	24427e31ef	radv: Incorporate GPU family into cache UUID. Invalidates the cache when someone switches cards. Signed-off-by: Bas Nieuwenhuizen <basni@google.com>	2016-11-22 07:58:35 +01:00
Bas Nieuwenhuizen	d94383970f	radv: Use library mtime for cache UUID. We want to also invalidate the cache when LLVM gets changed. As the specific LLVM revision is not fixed at build time, we will need to check at runtime. Computing a checksum for LLVM is going to be very expensive, so just use the mtime. Tested on my computer that the returned DSO for the LLVM symbol is actually the LLVM DSO. Signed-off-by: Bas Nieuwenhuizen <basni@google.com>	2016-11-22 07:58:35 +01:00
Bas Nieuwenhuizen	43ee4917ca	radv: Store UUID in physical device. No sense in repeatedly determining it. Also, it might be dependent on the device as shaders get compiled differently for SI/CIK/VI etc. Signed-off-by: Bas Nieuwenhuizen <basni@google.com>	2016-11-22 07:58:35 +01:00
Timothy Arceri	581bd1d12a	glsl: fix NULL check Fixes copy and paste error in `9d96d3803a`	2016-11-22 14:40:26 +11:00
Ilia Mirkin	807bc6ea9e	swr: calculate viewport width/height based on the scale The former calculations were for min/max y. The width/height don't take translate into account. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-21 21:11:26 -05:00
Ilia Mirkin	c3dd5b2e3f	swr: don't claim to allow setting layer/viewport from VS This may ultimately be possible to support, but for now it's not hooked up and the swr core only supports this output from GS. This normally wouldn't matter, but we lie about supporting GL 3.2, and also the blitter and st/mesa will make use of this functionality if claimed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-21 21:11:26 -05:00
Ilia Mirkin	d48740568f	swr: allocate all scratch space in one go for vertex buffers Multiple buffers may reference client arrays. When this happens, we might reach for scratch space multiple times, which could cause later arrays to invalidate the pointers allocated for the earlier ones. This fixes copyteximage 2D_ARRAY. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-21 21:11:26 -05:00
Ilia Mirkin	16d42f2f3d	swr: call swr_update_derived unconditionally when drawing/clearing Currently a sequence like draw/map/draw/map will cause the second map to not wait for the second draw. This is because the first map will clear the resource business bit, and the second draw won't reset it since no state has changed. swr_update_derived does a tiny bit of extra work, including updating the SWR_BACKEND_STATE as well as waiting for pending fences. If that's a problem, we could call swr_update_resource_status directly from draw/clear handlers. Fixes clearbuffer-stencil, clearbuffer-depth, clearbuffer-depth-stencil, and clearbuffer-display-lists. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-21 21:11:26 -05:00
Ilia Mirkin	ee0b6597a9	swr: [rasterizer memory] minify texture width before alignment The minification should happen before alignment, not after. See similar logic on ComputeLODOffsetY. The current logic requires unnecessarily large textures when there's an initial NPOT size. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-21 21:11:26 -05:00
Ilia Mirkin	c5a654786b	swr: [rasterizer memory] minify original sizes for block formats There's no guarantee that mip width/height will be a multiple of the compressed block size. Doing a divide by the block size first yields different results than GL expects, so we do the divide at the end. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-21 21:11:26 -05:00
Marek Olšák	bf75ef3f92	radeonsi: remove all varyings for depth-only rendering or rasterization off Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	ef6c84b301	radeonsi: eliminate VS outputs that aren't used by PS at runtime A past commit added the ability to compile "optimized" shader variants asynchronously (not stalling the app). This commit builds upon that and adds what is basically a runtime shader linker. If a VS output isn't used by the currently-bound PS, a new VS compilation is started without that output. The new shader variant is used when it's ready. All apps using separate shader objects I've seen had unused VS outputs. Eliminating unused/useless VS outputs also eliminates the corresponding vertex attribute loads. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	7e76f9a7a8	radeonsi: record information about all written and read varyings It's just tgsi_shader_info with DEFAULT_VAL varyings removed. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	c7f3e5c647	radeonsi: make si_shader_io_get_unique_index stricter Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	ed3190b3f3	radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabled This is the first user of optimized monolithic shader variants. Cull distances can't be disabled by states. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	d984a324bf	radeonsi: add infrastr. for compiling optimized shader variants asynchronously Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	d2a56985d7	radeonsi: don't set vs.epilog.export_prim_id if TES is bound there is no VS epilog in this case Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	fee71fec25	radeonsi: simplify checking for monolithic compilation Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	e6aee45db4	radeonsi: print all flags in si_dump_shader_key Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	6d5c2a8b5c	radeonsi: split the shader key into 3 logical parts key->part.: prolog and epilog flags only key->as_{ls,es}: special flags key->mono.: flags for monolithic compilation only Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	d4e9f409e9	radeonsi: fix culling if clip & cull distances are used at the same time Fixed piglits: - arb_cull_distance/clip-cull-3 - arb_cull_distance/clip-cull-4 Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	9d8db805ef	radeonsi: clean up si_emit_clip_regs Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	e59389d738	radeonsi: assume that a VS without POSITION is LS Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	7dbf83af54	tgsi/scan: record if a shader writes the position output Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	8a2251911e	tgsi/scan: use a big switch for scanning outputs Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	bdd860e307	radeonsi: decrease the number of texture slots to 24 Company Of Heroes 2 needs only 24. This saves 512 bytes of CE RAM per shader stage. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	fa476e0566	radeonsi: fast exit si_emit_derived_tess_state early Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	79a8e674ae	winsys/amdgpu: set addrlib flag opt4Space Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	72d1669ed2	radeonsi: check for !is_linear in do_hardware_msaa_resolve We don't want opt4Space here. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	49fa4a4e60	gallium/radeon: add RADEON_SURF_OPTIMIZE_FOR_SPACE FORCE_TILING should disable it. It has no effect now, but that may change soon. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Mun Gwan-gyeong	44a3f2ee09	radeonsi: Add missing error-checking to si_create_compute_state (v2) When the uploading of shader fails on si_shader_binary_upload(), it returns -ENOMEM. We should handle si_shader_binary_upload() failure path on si_create_compute_state(). CID 1394027 v2: Fixes from Edward O'Callaghan's review a) Update explicitly return value check with "si_shader_binary_upload() < 0" b) Update commit message. Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 21:09:06 +01:00
Roland Scheidegger	e442db8e98	draw: drop some overflow computations It turns out that no one actually cares if the address computations overflow, be it the stride mul or the offset adds. Wrap around seems to be explicitly permitted even by some other API (which is a _very_ surprising result, as these overflow computations were added just for that and made some tests pass at that time - I suspect some later fixes fixed the actual root cause...). So the requirements in that other api were actually sane there all along after all... Still need to make sure the computed buffer size needed is valid, of course. This ditches the shiny new widening mul from these codepaths, ah well... And now that I really understand this, change the fishy min limiting indices to what it really should have done. Which is simply to prevent fetching more values than valid for the last loop iteration. (This makes the code path in the loop minimally more complex for the non-indexed case as we have to skip the optimization combining two adds. I think it should be safe to skip this actually there, but I don't care much about this especially since skipping that optimization actually makes the code easier to read elsewhere.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	2471aaa02f	draw: simplify fetch some more Don't keep the ofbit. This is just a minor simplification, just adjust the buffer size so that there will always be an overflow if buffers aren't valid to fetch from. Also, get rid of control flow from the instanced path too. Not worried about performance, but it's simpler and keeps the code more similar to ordinary fetch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	4e1be31f01	draw: unify linear and elts draw jit functions The code for elts and linear paths was nearly 100% identical by now - with the elts path simply having some additional gather for the elements in the main loop (with some additional small differences before the main loop). Hence nuke the separate functions and decide this at jit shader execution time (simply based on the presence of the elts pointer). Some analysis shows that the generated vs jit functions seem to be just very minimally more complex than the former elts functions, and almost none of the additional complexity is in the main loop (basically just the branch logic for the branch fetching the actual indices). Compared to linear, the codesize of the function is of course a bit larger, however the actual executed code in the main loop appears to be near 100% identical (the additional code looking up indices is skipped as expected). So, I would not expect a (meaningful) performance difference with the generated code, neither with elts nor linear, this does however roughly halve the compilation time (the compiled shaders should also use only half the memory of course). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	8cf7edff7d	draw: use same argument order for jit draw linear / elts functions This is a bit simpler. Mostly to make it easier to unify the paths later... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	78a997f728	draw: drop unnecessary index overflow handling from vsplit code This was kind of strange, since it replaced indices which were only overflowing due to bias with MAX_UINT. This would cause an overflow later in the shader, except if stride was 0, however the vertex id would be essentially random then (-1 + eltBias). No test cared about it, though. So, drop this and just use ordinary int arithmetic wraparound as usual. This is much simpler to understand and the results are "more correct" or at least more consistent (vertex id as well as actual fetch results just correspond to wrapped around arithmetic). There's only one catch, it is now possible to hit the cache initialization value also with ushort and ubyte elts path (this wouldn't be an issue if we'd simply handle the eltBias itself later in the shader). Hence, we need to make sure the cache logic doesn't think this element has already been emitted when it has not (I believe some seriously bad things could happen otherwise). So, borrow the logic which handled this from the uint case, but not before fixing it up... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
Roland Scheidegger	7a55c436c6	draw: simplify vsplit elts code a bit vsplit_get_base_idx explicitly returned idx 0 and set the ofbit in case of overflow. We'd then check the ofbit and use idx 0 instead of looking it up. This was necessary because DRAW_GET_IDX used to return DRAW_MAX_FETCH_IDX and not 0 in case of overflows. However, this is all unnecessary, we can just let DRAW_GET_IDX return 0 in case of overflow. In fact before `bbd1e60198` the code already did that, not sure why this particular bit was changed (might have been one half of an attempt to get these indices to actual draw shader execution - in fact I think this would make things less awkward, it would require moving the eltBias handling to the shader as well). Note there's other callers of DRAW_GET_IDX - those code paths however explicitly do not handle index buffer overflows, therefore the overflow value doesn't matter for them. Also do some trivial simplification - for (unsigned) a + b, checking res < a is sufficient for overflow detection, we don't need to check for res < b too (similar for signed). And an index buffer overflow check looked bogus - eltMax is the number of elements in the index buffer, not the maximum element which can be fetched. (Drop the start check against the idx buffer though, this is already covered by end check and end < start). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-21 20:02:53 +01:00
George Kyriazis	9aae167e94	gallium: Add support for SWR compilation Include swr library and include -DHAVE_SWR in the compile line. v3: split to a separate commit Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	5b4d1500dd	gallium: swr: Added swr build for windows v4: Add windows-specific gen_knobs.{cpp|h} changes v5: remove aggressive squashing of gen_knobs.py to this commit; added SConscript to EXTRA_DIST in Makefile.am Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	9e4e1f5190	swr: Modify gen_knobs.{cpp|h} creation script Modify gen_knobs.py so that each invocation creates a single generated file. This is more similar to how the other generators behave. v5: remove SConscript edits from this commit; moved to commit that first adds SConscript Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	9085f1a9cc	scons: Add swr compile option To build the SWR driver (currently optional, not compiled by default) v3: add option as opposed to target Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:47 -06:00
George Kyriazis	bc26e8d4a7	swr: Windows-related changes - Handle dynamic library loading for windows - Implement swap for gdi - fix prototypes - update include paths on configure-based build for swr_loader.cpp v2: split to multiple patches v3: split and reshuffle some more; renamed title v4: move Makefile.am changes to other commit. Modify header files Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	87bd28210f	swr: renamed duplicate swr_create_screen() There are 2 swr_create_screen() functions. One in swr_loader.cpp, which is used during driver init, and the other is hiding in swr_screen.cpp, which ends up in the arch-specific .dll/.so. Rename the second one to swr_create_screen_internal(), to avoid confusion in header files. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	974d280e81	swr: Handle windows.h and NOMINMAX Reorder header files so that we have a chance to define NOMINMAX before mesa include files include windows.h v3: split from bigger patch Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	915b4b0d49	gallium: Added SWR support for gdi Added hooks for screen creation and swap. Still keep llvmpipe the default software renderer. v2: split from bigger patch v3: reword commit message Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	30ae2cbf82	scons: add llvm 3.9 support. v2: reworded commit message Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	2da28dbd11	scons: ignore .hpp files in parse_source_list() Drivers that contain C++ .hpp files need to ignore them too, along with .h files, when building source file lists. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
George Kyriazis	c323180733	mesa: removed redundant #else Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 12:44:46 -06:00
Jordan Justen	44c5ed02d1	i965/hsw: Set integer mode in sampling state for stencil texturing Fixes: ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-21 10:10:53 -08:00
Emil Velikov	8e0e2478ba	reviewers: add Rob H for the Android EGL+build parts Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 16:01:06 +00:00
Emil Velikov	7a39a0091d	docs: recommend using --enable-mangling over the manual -DUSE... Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:27 +00:00
Emil Velikov	0fa854aea5	docs: rework/update install.html Still far from perfect, but a few small steps in the right direction. - Split build systems, compilers, third party tools - Mention building mesa for Android (part of AOSP) - Drop explicit "other" dependencies. Reference to distro methods to get them. - HTML 4.01 Traditional compliance fixes - mixed ul and br tags. - nuke dead links README.{CYGWIN,VMS} v2: Squash typos, add note about buggy flex 2.6.2 (Eric), add Suse zypper command (Tobias). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:23 +00:00
Emil Velikov	438086efb1	docs: sourcetree.html misc updates A mixed bag of updates/fixes - mostly aiming at removing no longer applicable directories. Add a few more state-trackers, drivers, etc. alongside "XXX more" where applicable. Attribute for the GLSL/NIR movement and nukage of src/egl/docs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:20 +00:00
Emil Velikov	2edc29ab1e	docs: flesh out releasing.html Properly document the whole process: - Brief on what, when, where - Picking, testing, branchpoints, pre-release announcement - Releasing, announcement, website and bugzilla updates Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:18 +00:00
Emil Velikov	b571c075e9	docs/submittingpatches: fix tags mis/abuse Fix the odd tag so that we're HTML 4.01 Traditional compliant Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:14 +00:00
Emil Velikov	07384468af	docs/submittingpatches: flesh out "how to nominate" methods Currently they are buried within the text, making it hard to find. Move them to the top and be clear what is _not_ a good idea. v2: Minor commit polish, use only "resending" as suggested by Matt. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:12 +00:00
Emil Velikov	019f055f32	docs/autoconf: update glx driver / enable-debug text With earlier commit we folded all the xlib handling in --enable-glx, but we forgot to update the documentation. Elaborate on --enable-debug and drop mentions about dependencies. v2: Grammar - s|haven't|hasn't| (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:09 +00:00
Emil Velikov	49ac732651	docs/repository: refer to Submitting patches v2: Improve grammar - add missing "to" (Eric). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:07 +00:00
Emil Velikov	259e65c03e	docs: split Submitting Patches into separate document Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:05 +00:00
Emil Velikov	e561737c52	docs: split Coding style into separate document Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:04 +00:00
Emil Velikov	edbf3ebe1f	docs: mention/suggest testing your patch against dEQP Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:08:02 +00:00
Emil Velikov	f2d9c7b60c	docs: mention that coding style can differ between drivers ... and point people to use/honour the EditorConfig/Emacs files, where applicable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 15:07:59 +00:00
Emil Velikov	4fbeac398a	reviewers: add Tomasz for the Android/EGL implementation As mentioned/requested on the mailing list. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 14:46:40 +00:00
Emil Velikov	4f12fcb6d3	mesa: fold always true conditional Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 14:46:40 +00:00
Emil Velikov	e70d0d22a2	mesa: drop unneeded assert As seen a couple of lines above - there's no way for the assert to trigger. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-21 14:46:40 +00:00
Emil Velikov	130b12f96a	egl/wayland: remove non-applicable destroyDrawable from error path If we fail to create the drawable there's not much point in attempting to destroy it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-11-21 14:46:40 +00:00
Emil Velikov	b421fec958	loader: automake: whitespace cleanup Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-11-21 14:46:40 +00:00
Emil Velikov	4ffa9b2847	gbm: automake: remove unused defines Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-11-21 14:46:40 +00:00
Gwan-gyeong Mun	d3780e2e4d	intel: aubinator: Fix resource leak in gen_spec_load_from_path This fixes resource leak in gen_spec_load_from_path XML_ParserCreate failure path CID 1373564 Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-21 14:38:11 +00:00
Tomasz Figa	51727b1cf5	egl/android: Use gralloc::lock_ycbcr for resolving YUV formats (v2) There is an interface that can be used to query YUV buffers for their internal format. Specifically, if gralloc:lock_ycbcr() is given no SW usage flags, it's supposed to return plane offsets instead of pointers. Let's use this interface to implement support for YUV formats in Android EGL backend. v2: Fixes from Emil's review: a) Added comments for parts that might be not clear, b) Changed get_fourcc_yuv() to return -1 on failure, c) Changed is_yuv() to use bool. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-21 13:27:47 +00:00
Tomasz Figa	859d0b0121	egl/android: Get gralloc module in dri2_initialize_android() (v2) Currently droid_open_device() gets a reference to the gralloc module only for its own use and does not store it anywhere. To make it possible to call gralloc methods from code added in further patches, let's refactor current code to get gralloc module in dri2_initialize_android() and store it in dri2_dpy. v2: fixes from Emil's review: a) remove duplicate initialization of 'err'. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-21 13:27:41 +00:00
Tomasz Figa	925ff0b534	egl/android: Remove handling of RGB_888 pixel format It is currently completely broken, as it ends up using RGBX_8888 on hardware side, due to no way of distinguishing between these two in the DRI API, while HAL_PIXEL_FORMAT_RGB_888 is clearly defined to be the 3-byte per pixel RGB format. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-21 13:27:01 +00:00
Gwan-gyeong Mun	9c5b1c7990	radeonsi: Fix resource leak in gs_copy_shader allocation failure path CID 1394028 Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-22 00:04:59 +11:00
Nicolai Hähnle	0e11290ef5	glsl/lower_output_reads: remove unused mem_ctx Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 08:21:52 +01:00
Nicolai Hähnle	a3b98edf6f	glsl/lower_output_reads: bail early in tessellation control shaders This whole pass is a no-op. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 08:21:41 +01:00
Nicolai Hähnle	0d383a79a8	glsl/lower_output_reads: fix geometry shader output handling with conditional emit Consider a geometry shader that contains code like this: some_out = expr; if (cond) { ... EmitVertex(); } else { ... EmitVertex(); } Both branches should see the correct value of some_out. Since this is a rather subtle and rare case, I'm submitting a piglit test for this as well. GLSL says that the values of output variables are undefined after EmitVertex(). With this change, the values will now be defined and unmodified. This may reduce optimization opportunities in the probably quite rare case where subsequent compiler passes cannot prove that the value of the output variable is overwritten. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 08:21:31 +01:00
Nicolai Hähnle	42d5e91a2a	radeonsi: store group_size_variable in struct si_compute For compute shaders, we free the selector after the shader has been compiled, so we need to save this bit somewhere else. Also, make sure that this type of bug cannot re-appear, by NULL-ing the selector pointer after we're done with it. This bug has been there since the feature was added, but was only exposed in piglit arb_compute_variable_group_size-local-size by commit `9bfee7047b` (which is totally unrelated). Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 08:19:43 +01:00
Nicolai Hähnle	47db6b4600	glsl: don't flatten if-blocks with dynamic array indices This fixes the regression of radeonsi in glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr caused by commit `74e39de932`. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-21 08:18:47 +01:00
Iago Toral Quiroga	39c47e7698	anv/state: enable coordinate address rounding for Min/Mag filters This patch improves pass rate of dEQP-VK.texture.explicit_lod.2d.sizes.* from 68.0% (98/144) to 83.3% (120/144) by enabling sampler address rounding mode when the selected filter is not nearest, which is the same thing we do for OpenGL. These tests check texture filtering for various texture sizes and mipmap levels. The failures (without this patch) affect cases where the target texture has odd dimensions (like 57x35) and either the Min or the Mag filter is not nearest. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-21 08:01:54 +01:00
Jason Ekstrand	a8b85f1f77	anv: Implement a depth stall restriction on gen7 Fixes around 60 Vulkan CTS tests on Haswell Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-20 20:40:40 -08:00
Ilia Mirkin	9145873b15	nvc0/ir: use levelZero flag when the lod is set to 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-11-20 18:13:12 -05:00
Dave Airlie	b1340fd708	radv: spir-v allows texture size query with and without lod. The translation to llvm was failing here due to required lod. This fixes some new SteamVR shaders. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-21 09:00:22 +10:00
Dave Airlie	6d7be52d90	radv: fix image view creation for depth and stencil only This fixes the image view for sampling just the depth. It removes some pointless swizzle code, and adds a missing case for the x8_d24 format. Fixes: dEQP-VK.renderpass.formats.d32_sfloat_s8_uint.input.* dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.* dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.* Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-21 08:58:03 +10:00
Dave Airlie	51a44c0021	radv: make sure to flush input attachments correctly. This fixes 9 of the dEQP-VK.renderpass.attachment_allocation.input_output.* tests. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-21 08:57:31 +10:00
Brian Paul	e899d47bc9	tnl: remove unneeded #include "util/simple_list.h" Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-11-20 06:41:42 -07:00
Brian Paul	a6e849c672	radeon: remove unneeded #include "util/simple_list.h" Compile tested only. Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-11-20 06:41:42 -07:00
Brian Paul	36678e97e4	r200: remove unneeded #include "util/simple_list.h" And include "util/simple_list.h" where it is needed in r200_state.c Compile tested only. Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-11-20 06:41:41 -07:00
Brian Paul	5d7b5d8627	i915: remove unneeded #include "util/simple_list.h" Compile tested only. Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-11-20 06:41:41 -07:00
Brian Paul	da0bc7b646	mesa: remove unneeded #includes in errors.c Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-11-20 06:41:41 -07:00
Brian Paul	0d1e240a4f	mesa: remove trailing whitespace in errors.c Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-11-20 06:41:41 -07:00
Kenneth Graunke	9c1609f0d6	nir: Add a C wrapper for glsl_type::is_array_of_arrays(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-19 12:30:26 -08:00
Kenneth Graunke	a1a292d177	i965: Store a clip_distance_mask field similar to cull_distance_mask. This isn't useful for legacy GL, but will be used in Vulkan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-19 12:30:25 -08:00
Kenneth Graunke	19c652b29c	i965: Use shader_info for brw_vue_prog_data::cull_distance_mask. This also allows us to move it from a GL specific location to a part of the compiler shared by both GL and Vulkan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-19 12:30:25 -08:00
Kenneth Graunke	c447ca64c1	compiler: Store the clip/cull distance array sizes in shader_info. We switched from a boolean to array lengths in gl_program a while back. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-19 12:30:25 -08:00
Kenneth Graunke	c4be6e0b8d	i965: Fix GS push inputs with enhanced layouts. We weren't taking first_component into account when handling GS push inputs. We hardly ever push GS inputs, so this was not caught by existing tests. When I started using component qualifiers for the gl_ClipDistance arrays, glsl-1.50-transform-feedback-type-and-size started catching this. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-19 12:30:25 -08:00
Kenneth Graunke	45aee6be02	i965: Delete unused variable. I forgot to delete this in `9ef2b9277d`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-19 12:30:25 -08:00
Kenneth Graunke	9ef2b9277d	intel: Share URB configuration code between GL and Vulkan. This code is far too complicated to cut and paste. v2: Update the newly added genX_gpu_memcpy.c; const a few things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:40:01 -08:00
Kenneth Graunke	6d416bcd84	i965: Use arrays in Gen7+ URB code. So much of this code was cut and pasted per stage. We can accomplish much of it by looping over shader stages. Improves performance of OglBatch7 (version 6) by 1.50783% +/- 0.287049% (n = 71) at 1024x768 on Cherryview. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:40:00 -08:00
Kenneth Graunke	6656dd4b92	i965: Drop brw->urb.{nr_*_entries,*_start} assignments from gen7_urb.c. The context fields are for Gen4-5; setting them has always been useless. There's no point in spending the cost in the hottest path in the driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:40:00 -08:00
Kenneth Graunke	74d8612eed	i965: Switch to roundf in HS/DS URB code. Matt intentionally switched the VS calculation to be float-based in commit `c1da15709a`. Tessellation support was written before this and rebased forward, and missed the change. Now it's consistent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:39:59 -08:00
Kenneth Graunke	c87b5dee11	i965: Make URB code use prog_data for GS/tessellation enable checks. If geometry/tessellation shaders are disabled, prog_data will be NULL (see brw_state_upload.c). This consolidates dirty bits a little. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:39:58 -08:00
Kenneth Graunke	639af2a7c6	intel: Convert devinfo->urb.min_*_entries into an array. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:39:56 -08:00
Kenneth Graunke	58c09e72b1	intel: Convert devinfo->urb.max_*_entries into an array. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-19 11:39:45 -08:00
Brian Paul	2acfd36479	docs: document MESA_DEBUG=context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-11-19 08:44:03 -07:00
Ilia Mirkin	ea276512a0	swr: mark streamout buffers as written Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-19 10:40:37 -05:00
Timothy Arceri	203c8794a1	st/mesa/glsl/nir/i965: make use of new gl_shader_program_data in gl_shader_program Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-19 15:45:46 +11:00
Timothy Arceri	65cd0a0d7f	mesa: create new gl_shader_program_data struct This will be used to share data between gl_program and gl_shader_program allowing for greater code simplification as we can remove a number of awkward uses of gl_shader_program. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-19 15:45:46 +11:00
Timothy Arceri	0c85d2fea4	glsl: add new program driver function to standalone compiler This fixes a regression with the standalone compiler caused by `9d96d3803a`. Note that we change standalone_compiler_cleanup() to no longer explicitly free the linked shaders as they will be freed when we free the parent ctx whole_program. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98774	2016-11-19 15:00:12 +11:00
Kenneth Graunke	ff0253a5ed	i965: Disable depth writes when depth test is GL_EQUAL. There's no point in performing depth writes when the depth test comparison function is set to GL_EQUAL - it would just write out the same value that's already there (if it is written at all). While this is harmless from a functional perspective, it hurts performance. Obviously, writing to memory is not free, but there's another more subtle impact as well: it can prevent early depth optimizations. Depth writes aren't supposed to happen for pixels that are killed by fragment shader discard statements or the alpha test. So, with depth writes enabled and either of those, the pixel shader must be invoked to determine whether or not to perform the write. This is fairly stupid in the EQUAL case - we're running a shader to decide whether to replace the existing depth value with itself. By disabling these pointless writes, we allow early depth even with discards and alpha testing, allowing the hardware to skip the pixel shader altogether if the depth test fails. Improves performance of Unigine Valley: - Skylake GT2: +17.8% - Broadwell GT3e: +11.5% - Cherrytrail: +19.4% Huge thanks to Mark Janes for building frameretrace [1], the performance analysis tool that helped us find this issue, and to Robert Bragg for providing us performance metrics on Linux. Mark also spent the time to analyze Valley performance on Windows vs. Linux and discovered a discrepancy in early depth test metrics. Once he had isolated a draw call and drawn attention to the problem, fixing it was pretty simple. [1] https://github.com/janesma/apitrace/wiki/frameretrace-branch Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-18 14:48:52 -08:00
Timothy Arceri	adb3a83c09	glsl: tidy up entries temporary Here we just move initialisation of entries to where it is needed i.e. outside the loop and after the continue checks. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-19 09:35:58 +11:00
Timothy Arceri	c20564ae3e	glsl/i965: move per stage AtomicBuffers list to gl_program Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-19 09:35:58 +11:00
Timothy Arceri	9d96d3803a	glsl: create gl_program at the start of linking rather than the end This will allow us to directly store metadata we want to retain in gl_program. This metadata is currently stored in gl_linked_shader and will be lost if relinking fails, even though the program will remain in use and is still valid according to the spec. "If a program object that is active for any shader stage is re-linked unsuccessfully, the link status will be set to FALSE, but any existing executables and associated state will remain part of the current rendering state until a subsequent call to UseProgram, UseProgramStages, or BindProgramPipeline removes them from use." This change will also help avoid the double handling that happens in _mesa_copy_linked_program_data(). Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-19 07:42:33 +11:00
Timothy Arceri	2b8f97d0ff	st/mesa/i965: simplify gl_program references and stop leaking In i965 we were calling _mesa_reference_program() after creating gl_program and then later calling it again with NULL as a param to get the refcount back down to 1. This changes things to not use _mesa_reference_program() at all and just have gl_linked_shader take ownership of gl_program since refcount starts at 1. The st and ir_to_mesa linkers were worse as they were both getting in a state where the refcount would never get to 0 and we would leak the program. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-19 07:42:33 +11:00
Nanley Chery	9db5cc829f	anv/cmd_buffer: Enable stencil-only HZ clears The HZ sequence modifies less state than the blorp path and requires less CPU time to generate the necessary packets. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-18 12:12:55 -08:00
Nanley Chery	37c07d64b4	anv/cmd_buffer: Manage Anv state around HZ op emission Move the assignment to a less surprising location. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-18 12:12:50 -08:00
Nanley Chery	6ff4c24fdd	anv/cmd_buffer: Clarify HZ rectangle behavior This behavior differs from what's described in the PRMs and was observed by analyzing CTS test results. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-18 12:12:34 -08:00
Nanley Chery	63318d34ac	mesa/fbobject: Update CubeMapFace when reusing textures Framebuffer attachments can be specified through FramebufferTexture* calls. Upon specifying a depth (or stencil) framebuffer attachment that internally reuses a texture, the cube map face of the new attachment would not be updated (defaulting to TEXTURE_CUBE_MAP_POSITIVE_X). Fix this issue by actually updating the CubeMapFace field. This bug manifested itself in BindFramebuffer calls performed on framebuffers whose stencil attachments internally reused a depth texture. When binding a framebuffer, we walk through the framebuffer's attachments and update each one's corresponding gl_renderbuffer. Since the framebuffer's depth and stencil attachments may share a gl_renderbuffer and the walk visits the stencil attachment after the depth attachment, the uninitialized CubeMapFace forced rendering to TEXTURE_CUBE_MAP_POSITIVE_X. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77662 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-18 11:58:19 -08:00
Lionel Landwerlin	9a806d2d15	mesa: add NV_image_formats extension support This extension can be enabled automatically as it is a subset of ARB_shader_image_load_store. v2: Replace helper function by qualifier struct field (Ilia) Enable NV_image_formats using ARB_shader_image_load_store (Ilia) v3: Drop extension field from gl_extensions (Ilia) Release notes (Ilia) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98480 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-18 13:27:28 +00:00
Timothy Arceri	88fe2c308e	mesa: fix old classic drivers to use ralloc for ARB asm programs These changes were missed in `0ad69e6b5`. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98767	2016-11-18 23:39:40 +11:00
Nicolai Hähnle	da2a51129b	st/mesa: silence warnings in optimized builds Mark variables and static functions that only occur in assert()s as MAYBE_UNUSED. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-18 09:49:22 +01:00
Nicolai Hähnle	9882ed85bd	radeonsi: emit sample locations also when nr_samples == 1 Since the state tracker now enables MSAA in the hardware for the case nr_samples == 1 as well, we need to set sample locations correctly for this case. The Polaris override is still needed for the non-MSAA case (when nr_samples == 0). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-18 09:48:46 +01:00
Nicolai Hähnle	70454f5b55	radeonsi: allow sample mask export for single-sample framebuffers This fixes GL45-CTS.sample_variables.mask.*.samples_1.*. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-18 09:48:43 +01:00
Nicolai Hähnle	ceac3397fb	st/mesa: remove a redundant call to _mesa_is_multisample_enabled We called it immediately prior, so re-use the previously returned value. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-18 09:48:39 +01:00
Nicolai Hähnle	adba706122	mesa/main: consider multisampling enabled when number of samples == 1 There are some differences between how non-multisampled framebuffers (i.e. samples == 0) and multisampled framebuffers with a single sample should be treated. For example, alpha to coverage and writing to gl_SampleMask has an effect with single-sample multisample framebuffers, but not on non-multisample framebuffers. This fixes GL45-CTS.sample_variables.mask.*.samples_1.* at least for Gallium drivers (and possibly others, though at least radeonsi needs an additional fix). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-18 09:48:14 +01:00
Kenneth Graunke	14af96007f	i965: Delete fs_visitor::nir_setup_single_output_varying prototype. I deleted this function in `59864e8e02`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-18 00:29:11 -08:00
Tapani Pälli	ec4e71f75e	mesa: fix empty program log length In case we have empty log (""), we should return 0. This fixes Khronos WebGL conformance test 'program-infolog'. From OpenGL ES 3.1 (and OpenGL 4.5 Core) spec: "If pname is INFO_LOG_LENGTH , the length of the info log, including a null terminator, is returned. If there is no info log, zero is returned." v2: apply same fix for get_shaderiv and _mesa_GetProgramPipelineiv (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97321 Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-18 07:42:41 +02:00
Roland Scheidegger	5ec3a7333f	draw: finally optimize bool clip mask generation lp_build_any_true_range is just what we need, though it will only produce optimal code with sse41 (ptest + set) - but even without it on 64bit x86 the code is still better (1 unpack, 2 movq + or + set), on 32bit x86 it's going to be roughly the same as before. While here also make it a "real" 8bit boolean - cuts one instruction but more importantly similar to ordinary booleans. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-18 01:25:21 +01:00
Roland Scheidegger	b16f06fd05	draw: use vectorized calculations for fetch (v2) Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to llvm not recognizing it's all the same fetch, since it would have been possible some of the fetches getting replaced with zeros in case vector size exceeds remaining fetch count - the values of such fetches don't matter at all though). Also, for elts gathering, use vectorized code as well. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). v3: skip the fake index buffer, not needed due to the jit code never seeing the real index buffer in the first place. Fix a bug with mask expansion (needs SExt, not ZExt). Also, be really really careful to keep the behavior the same, even in cases where it looks wrong, and add comments why the code is doing the seemingly wrong stuff... Fortunately it's not actually more complex in the end... Also change function order slightly just to make the diff more readable. No piglit change. Passes some internal testing with another api too... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-18 01:25:21 +01:00
Jordan Justen	0cee3fd5c7	i965/gen7: Minify blit size for stencil tree copy Found by the piglit 'fbo-depth-array stencil-clear' test when implementing blorp blit splitting for gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-17 14:15:44 -08:00
Kenneth Graunke	9bfee7047b	mesa: Drop PATH_MAX usage. GNU/Hurd does not define PATH_MAX since it doesn't have such arbitrary limitation, so this failed to compile. Apparently glibc does not enforce PATH_MAX restrictions anyway, so it's kind of a hoax: https://www.gnu.org/software/libc/manual/html_node/Limits-for-Files.html MSVC uses a different name (_MAX_PATH) as well, which is annoying. We don't really need it. We can simply asprintf() the filenames. If the filename exceeds an OS path limit, presumably fopen() will fail, and we already check that. (We actually use ralloc_asprintf because Mesa provides that everywhere, and it doesn't look like we've provided an implementation of GNU's asprintf() for all platforms.) Fixes the build on GNU/Hurd. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98632 Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 14:14:37 -08:00
Kenneth Graunke	ca76e6b521	i965: Fix compute shader crash. Fixes crashes when starting Deus Ex: Mankind Divided. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-11-17 14:14:06 -08:00
Jason Ekstrand	3da7adc755	anv/TODO: Check off render buffer compression There's still a tiny bit of work to do for storage images but it's otherwise pretty much done at this point.	2016-11-17 12:03:24 -08:00
Jason Ekstrand	4e91f158e6	anv: Enable "permanent" compression for immutable format images This commit extends our support of color compression to surfaces without the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT set. These images will never have an image view created with a different format than the one set at image creation time so it's safe to always use compression. We still bail if the image is used as a storage image because that sometimes ends up using a different format. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	2b5644e94d	intel/blorp: Properly handle color compression in blorp_copy Previously, blorp copy operations were CCS-unaware so you had to perform resolves on the source and destination before performing the copy. This commit makes blorp_copy capable of handling CCS-compressed images without any resolves. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	89f9c46a74	intel/blorp: Always use UINT formats on SKL+ Many of these UINT formats aren't available prior to Sky Lake so we used UNORM formats. Using UINT formats is a bit nicer because it guarantees we don't run into rounding issues. Also, we will need it in the next commit for handling copies with CCS enabled. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	c8357b5d34	i965/blorp: Rework resolve handling This commit moves the handling of resolves into blorp_surf_for_miptree(). Instead of each helper doing resolves and checks itself, it simply tells blorp_surf_for_miptree which aux modes are supported by the given blorp operation and blorp_surf_for_miptree will resolve as-needed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	edb7f67bd9	anv/image: Add an aux_usage field for "default" aux Initially, the field is set to ISL_AUX_USAGE_NONE so this commit shouldn't bring any functional changes. Setting this field to something else will cause all sampled and storage image views to be created with AUX and blorp will start trying to respect it so set with care.	2016-11-17 12:03:24 -08:00
Jason Ekstrand	338cdc172a	anv: Add initial support for Sky Lake color compression This commit adds basic support for color compression. For the moment, color compression is only enabled within a render pass and a full resolve is done before the render pass finishes. All texturing operations still happen with CCS disabled.	2016-11-17 12:03:24 -08:00
Jason Ekstrand	e2f5880839	anv/pass: Precompute some subpass usage information	2016-11-17 12:03:24 -08:00
Jason Ekstrand	9b9fb6d212	util/vk_alloc: Add a vk_zalloc2 helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	a512565b2b	anv/image: Memset all aux surfaces (not just HiZ) to 0 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	c3eb58664e	anv/image: Rename hiz_surface to aux_surface	2016-11-17 12:03:24 -08:00
Jason Ekstrand	ccdf9af392	anv/blorp: Ignore clears for attachments first used as resolve destinations Otherwise, we'll try to clear it the first time it's used as a draw so if you do some multisampled rendering, resolve to an attachment, and then draw on top of the single-sampled attachment, we might accidentally clear it. Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	1ba2f05bc0	intel/blorp: Take a fast_clear_op in ccs_resolve Eventually, we may want to just have a single blorp_ccs_op function that does both clears and resolves. For now we'll stick to just making the ccs_resolve function we have now a bit more configurable. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Pohjolainen, Topi	7c560e8ccc	intel/blorp: Add plumbing for color resolve slice details Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	d7bd8c15d6	intel/isl: Allow non-2D CCS surfaces The CCS calculations in ISL are already correct for 1-D and 3-D CCS surfaces since they have exactly the same layout as 2-D array surfaces (at least on Sky Lake). The only problem was that we weren't passing in the right dimensionality and we weren't passing in the depth. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	26c8bb7bc0	intel/isl: Rework the asserts and fails in isl_surf_get_ccs There are some invariants such as number of samples on which we should assert. However, most other things should silently return false since they're much easier for isl_surf_get_ccs to check than the caller. We also update the checking to be a bit more complete.	2016-11-17 12:03:24 -08:00
Jason Ekstrand	818c7bfb31	anv/cmd_buffer: Refactor surface state relocation handling Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	9be9f5f1c7	anv/cmd_buffer: Pull add_surface_state_reloc into genX_cmd_buffer.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	0c403df310	anv/image: Stop force-disabling AUX Auxiliary surfaces have to be created manually anyway so force-disabling it does nothing whatsoever at the moment. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Tom Stellard	929fcee47e	mesa: Add missing call to _mesa_unlock_debug_state(ctx); v2 `cd724208d3` added a call to _mesa_lock_debug_state(ctx) but wasn't unlocking the debug state. This fixes a hang in glsl-fs-loop piglit test with MESA_DEBUG=context. v2: - Remove unrelated changes. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-17 18:32:35 +00:00
Eric Engestrom	9702f91366	egl: fix helper function name I introduced this code last month, but didn't follow the naming convention. Fix this. Fixes: `0a606a400f` ("egl: add eglSwapBuffersWithDamageKHR") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Eric Engestrom <eric@engestrom.ch>	2016-11-17 09:33:25 +02:00
Eric Engestrom	8b780a543a	egl/x11: misc style fixes Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-17 09:32:48 +02:00
Eric Engestrom	41b5d98b28	egl: fix function name in debug string Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-17 09:32:11 +02:00
Jason Ekstrand	9557147592	nir/spirv: Fix handling of gl_PrimitiveId Before, we were always treating it as an output, which is bogus. The only stage in which it can be an output is the geometry stage. In all other stages, it's an input which, in the back-end, we actually want to be a system value. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-16 20:07:23 -08:00
Jason Ekstrand	1c97432ce8	anv/fence: Handle ANV_FENCE_CREATE_SIGNALED_BIT Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-16 20:07:23 -08:00
Jason Ekstrand	49f08ad77f	anv: Handle null in all destructors This fixes a bunch of new CTS tests which look for exactly this. Even in the cases where we just call vk_free to free a CPU data structure, we still handle NULL explicitly. This way we're less likely to forget to handle NULL later should we actually do something less trivial. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-16 20:07:23 -08:00
Jason Ekstrand	d0646c8015	util/vk_alloc: Ensure NULL is handled correctly in vk_free Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-16 20:07:23 -08:00
Jason Ekstrand	18266247a0	anv/device: Silence a 32-bit warning	2016-11-16 20:07:20 -08:00
Eric Anholt	80786a67cf	nir: Avoid an extra NIR op in integer divide lowering. NIR bools are ~0 for true, so ((unsigned)a >> 31) != 0 -> ((int)a >> 31). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-16 19:45:01 -08:00
Eric Anholt	7f27ad5597	vc4: Try compiling our FSes in multithreaded mode on new kernels. Multithreaded fragment shaders let us hide texturing latency by a hyperthreading-style switch to another fragment shader. This gets us up to 20% framerate improvements on glmark2 tests.	2016-11-16 19:45:01 -08:00
Eric Anholt	45c022f2b0	vc4: Add support for ETC1 textures if the kernel is new enough. The kernel changes for exposing the param have now been merged, so we can expose it here.	2016-11-16 19:45:01 -08:00
Eric Anholt	7130260d12	vc4: Fix simulator mode missing-GETPARAM debug info. The value is 0 since we didn't set it, we wanted to see the param.	2016-11-16 19:45:01 -08:00
Mun Gwan-gyeong	20c1623a11	vc4: Fix resource leak in register allocation failure path. CID 1394322 Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>	2016-11-16 19:45:01 -08:00
Timothy Arceri	686dad657f	glsl: stub out _mesa_reference_program() in standalone compiler The following patch will call this directly from the linker; the shader cache will also start calling these from the compiler.	2016-11-17 12:53:12 +11:00
Timothy Arceri	c3df65c123	st/mesa/r200/i915/i965: move ARB program fields into a union It's common for games to compile 2000 programs or more, so at 32 bits x 2000 programs x 22 fields x 2 (at least) stages, this should give us something like 352 kilobytes in savings once we add some more glsl only fields. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:53:12 +11:00
Timothy Arceri	d6bdb3a862	st/mesa: stop initialising Instructions and NumInstructions Since gl_program is now created with rzalloc() they should already be initialised. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:59 +11:00
Timothy Arceri	0ad69e6b51	mesa: make use of ralloc when creating ARB asm gl_program fields This will allow us to move the ARB asm fields in gl_program into a union as we will be able to call ralloc_free() on the entire struct when destroying the context. In this change we switch over to using ralloc for the Instructions, String and LocalParams fields of gl_program. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	9c9589f1e2	mesa: remove unused Comment field in prog_instruction Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	67b9c26342	i965: get num_abos from shader_info rather than gl_linked_shader This is a step towards freeing gl_linked_shader after linking. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	5581f2a8f2	mesa/glsl: copy num_abos to gl_program We should be able to free gl_linked_shader after linking; in order to do so, we need to switch to getting values from gl_program instead. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	ba40c8b03c	i965: get num_images from shader_info rather than gl_linked_shader This is a step towards freeing gl_linked_shader after linking. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	9c2042f2ce	mesa/glsl: copy num_images to gl_program We should be able to free gl_linked_shader after linking; in order to do so, we need to switch to getting values from gl_program instead. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	6b82e957be	nir: add support for counting AoA uniforms in nir_shader_gather_info() Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Timothy Arceri	c3b8bf9bc9	i965: only try print GLSL IR once when using INTEL_DEBUG to dump ir Since we started releasing GLSL IR after linking the only time we can print GLSL IR is during linking. When regenerating variants only NIR will be available. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-17 12:52:24 +11:00
Jason Ekstrand	2e2160969e	anv/descriptor_set: Put the whole state in the state free list We're not really saving much by just putting the offset in there. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-16 17:07:35 -08:00
Jason Ekstrand	37537b7d86	anv/descriptor_set: Write the state offset in the surface state free list. When Kristian reworked descriptor set allocation, somehow he forgot to actually store the offset in the free list. Somehow, this completely missed CTS testing until now... This fixes all 2744 of the new dEQP-VK.texture.filtering.* tests in the latest CTS. Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-16 17:07:29 -08:00
Timothy Arceri	8af1b2a2ce	compiler: remove now unused copy_shader_info() declaration Left over from `4ac66861` Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-17 11:02:25 +11:00
Timothy Arceri	29ade71af9	compiler: include shader_enums.h in shader_info.h We make use of some enums here. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-17 11:02:19 +11:00
Tim Rowley	a456ea17fb	swr: [rasterizer core] fix clear with multiple color attachments Fixes fbo-mrt-alphatest v2: styling fixes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-16 14:21:04 -06:00
Ben Widawsky	0272f76741	Partial revert "i965: "Fix" aux offsets" This partially reverts commit `0d241085f7`. HiZ buffer cannot do this properly now, and it's not required, so remove it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-16 11:40:12 -08:00
Ben Widawsky	0d241085f7	i965: "Fix" aux offsets When 1 BO is used for aux data, it needs to point to the correct offset, which will not be the BOs offset but instead an offset from the BOs offset. Since today there are always multiple BOs for aux, this doesn't actually change anything. Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-16 11:24:33 -08:00
Jason Ekstrand	e50bf059b0	anv/blorp: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachments From the Vulkan 1.0.29 spec for vkCmdClearAttachments: "If the subpass’s depth/stencil attachment is VK_ATTACHMENT_UNUSED, then the clear has no effect." and "If colorAttachment is VK_ATTACHMENT_UNUSED then the clear has no effect." I have no idea why it's spec'd this way; it seems very anti-Vulkan to me, but that's what it says and it's really not much work to support. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-16 10:32:20 -08:00
Jason Ekstrand	633677194f	Allocate a null state whenever there is depth/stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:32:20 -08:00
Jason Ekstrand	a380f95461	anv: Set framebuffer to NULL in secondary command buffers Nothing that is allowed to be called within a secondary now relies on the framebuffer. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-16 10:32:15 -08:00
Jason Ekstrand	9fcaf4e37a	anv/blorp: Use the new clear_attachments entrypoint for attachment clears This allows us to re-use the surface states emitted from the Vulkan driver instead of blorp creating its own. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	e371850d94	anv/blorp: Break the guts of alloc_binding_table into a shared helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	3c1ee052bd	anv: Bring back anv_cmd_buffer_emit_state_base_address This reverts most of commit `52904ba85c`. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	72878f9f53	intel/blorp: Add a clear_attachments entrypoint Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	0aea29cc1c	intel/blorp: Add capability to use pre-baked binding tables When a pre-baked binding table is requested, no binding table is created, instead the binding table offset (relative to surface state base address) provided by the user is used verbatim. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	f7f768d195	intel/blorp: Add support for vertex shaders Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	768c8dd718	intel/blorp: Use an actual chunk of vertex buffer for the VUE header We're about to start passing other things in as a sort of "VS header" for vertex shaders and we need a place to put them. Since we want the instance id to be one of them, it makes sense to have one vec4 that's either VUE header or VS header. Always uploading some handy zeros makes the code a bit simpler. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	8c8095c260	blorp/exec: Use uint32_t for copying varying data Some things may not be floats and intel CPUs are known for mangling bits when a float type is used for copying integers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	21943c35f7	intel/blorp: Handle NIR clear inputs the same way as blit inputs By using offsetof() we can ensure that adding fields to wm_inputs is always safe as long as we maintain alignment. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	570a0e844b	intel/blorp: Remove NIR support for uniforms Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	99b436ae5c	intel/blorp: Add a shader type to make keys more unique Depending on how the driver using blorp implements its shader caching, there is a small chance of shader collisions due to identical keys between blit and clear programs. Adding a small shader type at the top of the key alleviates this problem. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	1acebeb191	intel/blorp: Make the number of samples an explicit parameter Previously, we always inferred it from params->dst which meant that references to params->dst were scattered all throughout the state upload code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:29 -08:00
Jason Ekstrand	6614234fc9	anv/cmd_buffer: Stop relying on the framebuffer for 3DSTATE_SF on gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	d2b4a9da03	anv: Rework the way render target surfaces are allocated This commit moves the allocation and filling out of surface state from CreateImageView time to BeginRenderPass time. Instead of allocating the render target surface state as part of the image view, we allocate it in the command buffer state at the same time that we set up clears. For secondary command buffers, we allocate memory for the surface states in BeginCommandBuffer but don't fill them out; instead, we use our new SOL-based memcpy function to copy the surface states from the primary command buffer. This allows us to handle secondary command buffers without the user specifying the framebuffer ahead-of-time. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	e283cd549c	anv/cmd_buffer: Expose add_surface_state_reloc as an inline helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	858b75563f	anv/cmd_buffer: Use the surface state alloc helper in null_surface_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	3d9747780b	anv: Add a helper for doing buffer copies with nothing but VF and SOL. This method of doing copies has the advantage of touching very little of the GPU state. While it does disable all the shader stages, it doesn't have to blow away binding tables, viewports, scissors, or any other bits of dynamic state other than VBO 32 which is already reserved. All of the state that it does touch is contained within a pipeline anyway so that's the only thing that has to be dirtied. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	184bbfd69b	intel/genxml: Add SO_WRITE_OFFSET registers for gen7-9 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:10:26 -08:00
Jason Ekstrand	b3bc806855	intel/isl: Add some basic info about RENDER_SURFACE_STATE to isl_device Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:10:26 -08:00
Jason Ekstrand	ba349e106e	anv/pipeline: Use get_scratch_space/address for compute shaders Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:18 -08:00
Jason Ekstrand	d33e2ad67c	anv: Move INTERFACE_DESCRIPTOR_DATA setup to the pipeline There are a few dynamic bits, namely binding table and sampler addresses, but most of it is static and really belongs in the pipeline. It certainly doesn't belong in flush_compute_descriptor_set. We'll use the same state merging trick we use for gen7 DEPTH_STENCIL. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:16 -08:00
Jason Ekstrand	8db6f2e6eb	anv/pipeline: Roll genX_pipeline_util.h into genX_pipeline.c Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:14 -08:00
Jason Ekstrand	68c58edcfa	anv/pipeline: Unify graphics_pipeline_create Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:12 -08:00
Jason Ekstrand	9359835fcb	anv/pipeline: Re-order state emission and make it consistent This commit makes both gen7 and gen8 pipeline setup emit state packets in exactly the same order. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:09:10 -08:00
Jason Ekstrand	5706d2590f	anv/pipeline: Rework the 3DSTATE_VF_TOPOLOGY helper It gets a new name and moved to genX_pipeline_util.h. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:09:08 -08:00
Jason Ekstrand	3f480d5dd3	anv/pipeline: Move 3DSTATE_PS_EXTRA setup into a helper Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:09:07 -08:00
Jason Ekstrand	8be164d05a	anv/pipeline: Unify 3DSTATE_WM emission Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:05 -08:00
Jason Ekstrand	1587ac1edc	intel/genxml: Make 3DSTATE_WM more consistent across gens Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:03 -08:00
Jason Ekstrand	23ad998246	anv/pipeline: Unify 3DSTATE_PS emission Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:01 -08:00
Jason Ekstrand	f989d04f39	anv/pipeline/gen7: Properly set 3DSTATE_PS::DualSourceBlendEnable Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:08:59 -08:00
Jason Ekstrand	fb02d2d13b	intel/genxml: Make some 3DSTATE_PS fields more consistent Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:58 -08:00
Jason Ekstrand	5a10ab8a15	anv/pipeline: Make emit_3dstate_sbe safe to call without a FS Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:08:56 -08:00
Jason Ekstrand	7fe6655aad	anv/pipeline: Unify 3DSTATE_GS emission Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:54 -08:00
Jason Ekstrand	f3783f1249	anv/pipeline/gen8: Set 3DSTATE_GS::InstanceControl Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:53 -08:00
Jason Ekstrand	9da442b63a	intel/genxml: Make some 3DSTATE_GS fields more consistent Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:51 -08:00
Jason Ekstrand	4a48d19d93	anv/pipeline: Unify 3DSTATE_VS emission With this commit, a few fields are now specified on gen7 which weren't before. However, the values specified are zero which is the default so the final hardware packet remains the same. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:48 -08:00
Jason Ekstrand	c3e908e9d3	anv/pipeline/gen8: Enable VS statistics Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:08:44 -08:00
Jason Ekstrand	23d1919fe3	anv/pipeline: Stop claiming to support running without a vertex shader From the Vulkan spec version 1.0.32 docs for vkCreateGraphicsPipelines: The stage member of one element of pStages must be VK_SHADER_STAGE_VERTEX_BIT Since a vertex shader is always required, this hasn't been used since we deleted meta. Let's get rid of the complexity. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:42 -08:00
Jason Ekstrand	bda247d3fd	intel/genxml: Make some VS/GS fields consistent across gens We use the names from gen8+ Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:40 -08:00
Jason Ekstrand	623e1e06d8	anv/pipeline: Get rid of the kernel pointer fields Now that we have anv_shader_bin, they're completely redundant with other information we have in the pipeline. For vertex shaders, we also go through way too much work to put the offset in one or the other field and then look at which one we put it in later. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:38 -08:00
Jason Ekstrand	0087064f26	anv/pipeline: Set correct binding table and sampler counts Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:36 -08:00
Brian Paul	cd724208d3	mesa: if MESA_DEBUG=context, create a debug context A number of drivers report useful debug/perf information accessible through GL_ARB_debug_output and with debug contexts (i.e. setting the GLX_CONTEXT_DEBUG_BIT_ARB flag). But few applications actually use the GL_ARB_debug_output extension. This change lets one set the MESA_DEBUG env var to "context" to force-set a debug context and report debug/perf messages to stderr (or whatever file MESA_LOG_FILE is set to). This is a useful debugging tool. The small change in st_api_create_context() is needed so that st_update_debug_callback() gets called to hook up the driver debug callbacks when ST_CONTEXT_FLAG_DEBUG was not set, but MESA_DEBUG=context. v2: use %.*s format string instead of allocating temporary buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-16 09:34:10 -07:00
Nicolai Hähnle	fb17b7f99d	u_simple_shaders: try to un-break the Windows build Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 13:25:35 +01:00
Nicolai Hähnle	6403a9e074	radeonsi: fix a subtle bounds checking corner case with 3-component attributes I'm also sending out a piglit test, gl-2.0/vertexattribpointer-size-3, which exposes this corner case. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:42 +01:00
Nicolai Hähnle	50c95d0c54	radeonsi: reject some 3-component formats as buffer textures Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:39 +01:00
Nicolai Hähnle	78314c57cb	st/mesa: swap bytes in the fallback format translation path of GetTexImage Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:36 +01:00
Nicolai Hähnle	22360406f7	st/mesa: simplify and fix st_GetTexSubImage By using _mesa_image_address, the code becomes simpler _and_ fixes the bug that GL_PACK_SKIP_IMAGES was applied even on non-3D textures. Also, converting a whole slice at a time simplifies the format translation fallback path. Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore. v2: fix a silly mistake during code movement Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:34 +01:00
Nicolai Hähnle	7cdf292dc3	st/mesa: fix SINT <-> UINT conversion during PBO upload / download This fixes use cases like glReadPixels from an RGBA8I framebuffer into a PBO with type GL_INT by clamping values appropriately when they fall outside the range of the destination format. Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:31 +01:00
Nicolai Hähnle	5e10a3d6e5	st/mesa: change st_pbo_create_upload_fs to st_pbo_get_upload_fs For consistency with st_pbo_get_download_fs. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:28 +01:00
Nicolai Hähnle	2fb4b5bdf6	st/mesa: fix ReadPixels into packed formats with PBO When using the GPU download path, we bind the PBO as a buffer texture, so call is_format_supported accordingly. On radeonsi, this means that GPU downloads aren't used for UNSIGNED_SHORT_5_6_5 destinations, for example. Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:25 +01:00
Nicolai Hähnle	3817a7a1d7	util/blitter: add clamping during SINT <-> UINT blits Even though glBlitFramebuffer cannot be used for SINT <-> UINT blits, we still need to handle this type of blit here because it can happen as part of texture uploads / downloads, e.g. uploading a GL_RGBA8I texture from GL_UNSIGNED_INT data. Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:21 +01:00
Nicolai Hähnle	ab5fd10eaa	util/blitter: index texfetch_col shaders by type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-16 10:31:07 +01:00
Lionel Landwerlin	25a8e8bbd5	i965: miptree: prevent potential NULL pointer access If the mcs buffer allocation fails we might get a NULL pointer. This was reported by Coverity and should only happen if we run out of memory. v2: return failure at the point of allocation (Chris) CID: 1394290 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-16 08:56:08 +00:00
Jordan Justen	615ccf44cf	intel/blorp: Use designated initializers in surf_convert_to_single_slice Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-15 22:51:19 -08:00
Liu Zhiquan	b663753f3b	EGL/android: pbuffer implementation Android path didn't support pbuffer, so add pbuffer support to fix most failing dEQP and CTS pbuffer test cases. Patch adds a single buffer config to support pbuffer, and creates image in getBuffers for pbuffer when surface type is front surface. The EGL 1.5 spec states that pbuffers have a back buffer but no front buffer. Single-buffered surfaces with no front buffer confuse Mesa, so we deviate from the spec, following the precedent of Mesa's EGL X11 platform. V3: update commit message and code review changes. Signed-off-by: Liu Zhiquan <zhiquan.liu@intel.com> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-16 08:15:59 +02:00
Eric Engestrom	25c60fa6a2	egl: add missing error-checking to eglReleaseTexImage() Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-16 08:02:16 +02:00
Ben Widawsky	19a01f8139	i965: Fix KBL typo in string Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-15 17:34:37 -08:00
Ben Widawsky	37370f6bfc	i965: Consolidate GEN9 LP definition Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-15 17:34:37 -08:00
Ben Widawsky	2193fb0e1f	i965/glk: Add basic Geminilake support v2: s/bdw/gen; Add the 2x6 config v3: Add min_ds_entries Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-15 17:34:37 -08:00
Vinson Lee	ed6694d511	util: Fix Clang trivial destructor check. Check for Clang before GCC. Clang defines __GNUC__ == 4 and __GNUC_MINOR__ == 2 and matches the GCC check but not the GCC version for trivial destructor. Fixes: `98ab905af0` ("mesa: Define introspection macro to determine whether a type is trivially destructible.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98526 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-15 17:35:56 -08:00
Ilia Mirkin	dafffd2f11	swr: mark color clamping as unsupported There is no functionality in swr to clamp either vertex or frag colors. This could be added in swr_shader, at which point these could be re-enabled. Fixes arb_color_buffer_float-render Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-15 20:26:32 -05:00
Ilia Mirkin	2b6b15ab3f	swr: always enable adding start/base vertex to gl_VertexId Fixes gl-3.2-basevertex-vertexid Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-15 20:26:29 -05:00
Ilia Mirkin	6364491a0b	swr: add support for upper-left fragcoord position Fixes glsl-arb-fragment-coord-conventions. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-15 20:26:11 -05:00
Ilia Mirkin	a2c1d58ddb	swr: make sure that all rendering is finished on shader destroy Rendering could still be ongoing (or have yet to start) when the shader is deleted. There's no refcounting on the shader text, so insert a pipeline stall unconditionally when this happens. [Note, we should instead introduce a way to attach work to fences, so that the freeing can be done in the current fence.] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:48 -05:00
Ilia Mirkin	7caed50ff4	swr: disable blending for integer formats The EXT_texture_integer test says that blending and alphatest should all be disabled. st/mesa takes care of alphatest already. Fixes the ext_texture_integer-fbo-blending piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:43 -05:00
Ilia Mirkin	2f19a974a5	swr: mark rgb9_e5 as unrenderable The support in swr requires shaders to output the components as UINTs. This is not how GL or Gallium work, and since this is not a required-renderable format, just leave it out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:35 -05:00
Ilia Mirkin	6fd398f48e	swr: no support for shader stencil export Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:28 -05:00
Ilia Mirkin	96291478ea	swr: mark both frag and vert textures read, don't forget about cbs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:22 -05:00
Ilia Mirkin	8c0f76e961	swr: fix texture layout for compressed formats Fixes the texsubimage piglit and lets the copyteximage one get further. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:15 -05:00
Ilia Mirkin	00efbbc38c	swr: add archrast generated files to gitignore Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:25:08 -05:00
Ilia Mirkin	b53a33feef	swr: [rasterizer jitter] don't bother quantizing unused channels In a BGR10X2 or BGR5X1 situation, there's no need to try to quantize the X channel - the default will have the proper quantization required. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:24:50 -05:00
Ilia Mirkin	5dd0b8d3c6	swr: [rasterizer memory] fix store tile for 128-bit ymajor tiling Noticed by inspection. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:24:41 -05:00
Ilia Mirkin	45d9cd36fe	swr: [rasterizer memory] add support for R32_FLOAT_X8X24 formats This is the format used for the primary surface of a PIPE_FORMAT_Z32_FLOAT_S8X24_UINT resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-15 20:24:18 -05:00
Dave Airlie	713522fb8d	ac/nir/llvm: fix channel in texture gather lowering code. This fixes a number of CTS tests like: dEQP-VK.glsl.texture_gather.basic.2d.rgba8ui.size_npot.clamp_to_edge_repeat Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-16 09:18:15 +10:00
Dave Airlie	38ab625c5f	radv: don't crash on null swapchain destroy. Just return if the passed in swapchain is NULL. Fixes: dEQP-VK.wsi.xlib.swapchain.destroy.null_handle Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-16 09:18:03 +10:00
Dave Airlie	253fa25d09	wsi: fix VK_INCOMPLETE for vkGetSwapchainImagesKHR This fixes the x11 and wayland backends to not assert: dEQP-VK.wsi.xcb.swapchain.get_images.incomplete Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-16 09:17:34 +10:00
Jordan Justen	0ac57afa6f	isl: Fix height calculation in isl_msaa_interleaved_scale_px_to_sa No known fixed tests, but it looks like a typo from: commit `8ac99eabb6` intel/isl: Add a helper for getting the size of an interleaved pixel Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-15 14:27:27 -08:00
Emil Velikov	75a39cca8d	amd: automake: android: rename sources lists to foo_FILES Autotools goes smart on us warning that foo_SOURCES variable is present yet a target with name foo is missing. Rename things (like we do throughout the build) to silence the warnings. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 20:04:37 +00:00
Mauro Rossi	95ed2d9d2c	amd: flatten amd/common makefile structure This pulls amd/common build rules into upper level makefile, along with amd/addrlib which is already there. v2: [Emil Velikov] - Move NEED_RADEON_LLVM conditional, drop amd/common from SUBDIRS - Drop AM_ from common_libamd_common_la* Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 20:04:37 +00:00
Marek Olšák	74e39de932	radeonsi: set IF_THRESHOLD to 3 Piglit regressions (radeonsi or LLVM bugs, they pass on softpipe): - glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr - glsl-1.10/execution/variable-indexing/vs-output-array-vec4-index-wr - glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-col-row-wr - glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-row-wr Totals: SGPRS: 1132185 -> 1168801 (3.23 %) VGPRS: 907856 -> 906204 (-0.18 %) Spilled SGPRs: 2011 -> 2425 (20.59 %) Spilled VGPRs: 368 -> 96 (-73.91 %) Scratch VGPRs: 1344 -> 1060 (-21.13 %) dwords per thread Code Size: 35916164 -> 35705372 (-0.59 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 194010 -> 194921 (0.47 %) Wait states: 0 -> 0 (0.00 %) Before: VGPR SPILLING APPS Shaders SpillVGPR ScratchVGPR alien_isolation 2938 38 40 bioshock-infinite 1769 245 732 dirt-showdown 548 85 72 f1-2015 776 0 320 ue4_lightroom_inter.. 74 0 180 After: VGPR SPILLING APPS Shaders SpillVGPR ScratchVGPR alien_isolation 2938 38 40 bioshock-infinite 1769 0 480 dirt-showdown 548 58 40 f1-2015 776 0 320 ue4_lightroom_inter.. 74 0 180 Bioshock and DiRT benefit. If I set IF_THRESHOLD=4, tesseract starts spilling VGPRs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:23:40 +01:00
Marek Olšák	537b897f51	glsl_to_tgsi: lower small branches based on the CAP Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:23:40 +01:00
Marek Olšák	72217d4335	gallium: add PIPE_SHADER_CAP_LOWER_IF_THRESHOLD Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:23:40 +01:00
Marek Olšák	e33440070a	glsl/lower_if: conditionally lower if-branches based on their size Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:23:39 +01:00
Marek Olšák	83d9b8a6f6	glsl/lower_if: don't lower branches touching tess control outputs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:23:35 +01:00
Marek Olšák	654e9466b5	glsl/lower_if: check more node types in check_control_flow -> check_ir_node Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:22:52 +01:00
Marek Olšák	68f35005ed	glsl/lower_if: move and rename found_control_flow I'll want to update more variables in check_control_flow, so using the visitor is convenient. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 20:22:52 +01:00
Marek Olšák	a6ff2a3378	util/disk_cache: use unambiguous naming Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 20:22:28 +01:00
Marek Olšák	31727300e1	util: import cache.c/h from glsl It's not dependent on GLSL and it can be useful for shader caches that don't deal with GLSL. v2: address review comments v3: keep the other 3 lines in configure.ac Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 20:22:28 +01:00
Marek Olšák	5b8876609e	gallivm: limit use of setFastMathFlags to LLVM 3.8 and later Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-15 20:22:28 +01:00
Kenneth Graunke	341fc0073a	intel: Set min_ds_entries on Broxton. This was missing. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-11-15 10:45:47 -08:00
Christian Gmeiner	0c73a3b7d0	dri: make use of loader_get_extensions_name(..) helper Changes since v1: - removed not needed includes - use the loader version of the helper v2 [Emil Velikov] - Keep the includes - they are required. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 18:15:16 +00:00
Emil Velikov	fb10c89877	Revert "dri: make use of dri_get_extensions_name(..) helper" This reverts commit `1a21d21580`. Pushed the wrong version of the patch.	2016-11-15 18:15:15 +00:00
Marek Olšák	358079da2d	radeonsi: set unsafe fpmath on FP instructions when allowed by R600_DEBUG Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 19:17:56 +01:00
Marek Olšák	41d20d4920	gallivm: add lp_create_builder with an unsafe_fpmath option Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 19:17:56 +01:00
Marek Olšák	171e349782	radeonsi: fold some shader context initialization to si_llvm_context_init Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-15 19:17:56 +01:00
Christian Gmeiner	e4b01c97c4	loader: fixup driver names if needed This makes it possible to 'use' the imx-drm driver. Remember that it is not possible to have symbol names in C/C++ with a '-' in them. Changes since v1: - move the fix to loader.c Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 15:59:23 +00:00
Christian Gmeiner	1a21d21580	dri: make use of dri_get_extensions_name(..) helper Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 15:56:52 +00:00
Christian Gmeiner	0890aa6f7f	loader: add loader_get_extensions_name(..) helper Changes since v1: - renamed function to loader_get_extensions_name - moved function into loader Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> V2: [Emil Velikov] - Use local define. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 15:55:33 +00:00
Gurchetan Singh	0639e253a5	egl: Use pkg-config for Android NDK build It's possible to build Mesa for Android using the traditional autotools workflow [1]. ChromiumOS fetches Android prebuilts and puts them in a sysroot. We now want to use pkg-config to specify the location of system headers and libraries [2]. To enable this, let's add the required pkg-config checks and link against them. [1] https://developer.android.com/ndk/guides/standalone_toolchain.html [2] https://chromium-review.googlesource.com/#/c/403237/ v2: Bundle pkg-config checks together (Emil) v3: Provide further context on standalone NDK Mesa build (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 15:49:20 +00:00
Gurchetan Singh	e23608db1c	configure.ac: Don't look for pthreads in Android platform In Android, the pthreads libs are in bionic. When building Mesa for Android with the autotools workflow, we shouldn't set -lpthread or -pthread. [Emil Velikov] Other platforms could use a similar fix, although that is left as separate exercise. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-15 15:48:02 +00:00
Eduardo Lima Mitev	e73513f3c8	meta/GetTexSubImage: Account for GL_PACK_SKIP_IMAGES on compressed textures This option was being ignored when packing compressed 3D and cube textures. Fixes CTS test (on gen8+): * GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore v2: Drop API checks. v3 (Ken): Just apply the existing code in more cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-15 12:26:13 +01:00
Iago Toral Quiroga	277f868e66	anv/format: handle unsupported formats earlier Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-15 08:50:48 +01:00
Samuel Iglesias Gonsálvez	308b06d471	main: return error if asking for GL_TEXTURE_BORDER_COLOR in TEXTURE_2D_MULTISAMPLE{_ARRAY} through TexParameter{i,Ii,Iui}v() OpenGL ES 3.2 says in section 8.10. "TEXTURE PARAMETERS", at the end of the section: "An INVALID_ENUM error is generated if target is TEXTURE_2D_MULTISAMPLE or TEXTURE_2D_MULTISAMPLE_ARRAY, and pname is any sampler state from table 21.12." GL_TEXTURE_BORDER_COLOR is present in that table. v2: - Add check to _mesa_texture_parameteriv() (Kenneth) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98250 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-15 07:44:12 +01:00
Lionel Landwerlin	a46bc3f70a	anv: fix multi level clears with VK_REMAINING_MIP_LEVELS A commit from the CTS suite on the 1.0-dev branch started using VK_REMAINING_MIP_LEVELS, we're not dealing with it properly for clears. Fixes: dEQP-VK.api.image_clearing.clear_color_image.* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-14 17:40:58 +00:00
Andres Gomez	5923088d75	dir-locals.el: Adds White Space support Trailing white spaces will be now always highlighted, not just in prog-mode. Also, the White Space package, which is available since GNU Emacs 22, is loaded and activated locally in prog-mode. Additionally, using White Space variables, we set highlighting through faces on wrong indentation and the maximum length of a coding line. Notice that: - The highlighting for the characters beyond the set length of a coding line is not activated by default, only for wrong indentations. - If the White Space package is not available, errors on loading or activation are ignored. - If the White Space mode is not activated the set variables would not have any effect. v2: Removed too long lines trail highlighting, as suggested by Ilia Mirkin. Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-14 19:17:49 +02:00
Iago Toral Quiroga	9730f2734b	anv/format: support VK_FORMAT_R8G8B8_SRGB Fixes dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-14 17:13:42 +01:00
Iago Toral Quiroga	35deeda66f	anv/format: handle unsupported formats properly According to the spec for vkGetPhysicalDeviceImageFormatProperties: "If format is not a supported image format, or if the combination of format, type, tiling, usage, and flags is not supported for images, then vkGetPhysicalDeviceImageFormatProperties returns VK_ERROR_FORMAT_NOT_SUPPORTED." Makes the following Vulkan CTS tests report 'Not Supported' instead of crashing: dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_unorm dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_snorm dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uscaled dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sscaled dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uint dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sint dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_srgb dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_unorm dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_snorm dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uscaled dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sscaled dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uint dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sint dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_srgb dEQP-VK.api.image_clearing.clear_color_image.1d_r4g4_unorm_pack8 dEQP-VK.api.image_clearing.clear_color_image.1d_r8_srgb dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8_srgb dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb dEQP-VK.api.image_clearing.clear_color_image.1d_b5g5r5a1_unorm_pack16 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-14 17:13:42 +01:00
Vedran Miletić	8e430ff8b0	clover: adapt to new error API since LLVM r286752 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-11-14 15:50:29 +00:00
Tim Rowley	c8a51fa75d	swr: [rasterizer core] remove driverType Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:03:10 -06:00
Tim Rowley	ddc898aaf3	swr: [rasterizer archrast] move to pass by value Move to pass by value since most events are very small in size. We can look at pass by reference but will need to create multiple versions to handle temp objects. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:03:04 -06:00
Tim Rowley	23e459b606	swr: [rasterizer core] add mode for aux buffer in the SWR_SURFACE_STATE Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:59 -06:00
Tim Rowley	e9a3ad164d	swr: [rasterizer common] don't bleed NOMINMAX definition after <windows.h> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:53 -06:00
Tim Rowley	cd8d840ce1	swr: [rasterizer archrast] add events Added events for tracking early/late Depth and stencil events, TE patch info, GS prim info, and FrontEnd/BackEnd DrawEnd events. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:48 -06:00
Tim Rowley	7c3ca2e704	swr: [rasterizer core] fix culling issues - Do proper culling of wireframe triangles (including non-culling of degenerates) - Fix degenerate culling of CCW front-facing triangles in wireframe and conservative rast Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:42 -06:00
Tim Rowley	cee66dd2aa	swr: [rasterizer core/jitter] fix alpha test bug Alpha from render target 0 should always be used for alpha test for all render targets, according to GL and DX9 specs. Previously we were using alpha from the current render target. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:36 -06:00
Tim Rowley	5912552947	swr: [rasterizer core] various code style changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:31 -06:00
Tim Rowley	584b65ad44	swr: [rasterizer archrast] don't generate empty files Don't generate files when no events have been generated outside the header events. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:25 -06:00
Tim Rowley	e6f7d8a094	swr: [rasterizer archrast] fix open file handle limit issue Buffer events ourselves, then write the contents to file when the buffer is full or the context is being destroyed. Previously, we relied on ofstream to buffer for us. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:17 -06:00
Tim Rowley	2c697754a9	swr: [rasterizer archrast] fix double free issue Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:11 -06:00
Tim Rowley	dc8408920c	swr: [rasterizer core] separate frontend/backend stats enables Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:02:04 -06:00
Tim Rowley	937b7d8e5a	swr: [rasterizer core] 16-wide tile store nearly completed * All format combinations coded * Fully emulated on AVX2 and AVX * Known issue: the MSAA sample locations need to be adjusted for 8x2 Set ENABLE_AVX512_SIMD16 and USE_8x2_TILE_BACKEND to 1 in knobs.h to enable Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-14 09:00:59 -06:00
Emil Velikov	f233bcda89	docs: add news item and link release notes for 13.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-14 11:39:01 +00:00
Emil Velikov	0a2b7c16c4	docs: add sha256 checksums for 13.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b47ce6ddb8`)	2016-11-14 11:37:52 +00:00
Emil Velikov	eeedb52f75	docs: add release notes for 13.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f2f487ebbb`)	2016-11-14 11:37:51 +00:00
Juan A. Suarez Romero	7b9a9a0c5d	i965/vec4: skip registers already marked as no_spill Do not evaluate spill costs for registers that were already marked as no_spill. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-14 10:09:30 +01:00
Kenneth Graunke	151aecabe4	glsl: Don't crash on function names with invalid identifiers. Karol Herbst's fuzzing efforts noticed that we would segfault on: void bug() { 2(0); } We just need to bail if the function name isn't an identifier. Based on a bug fix by Karol Herbst. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97422 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-12 22:08:15 -08:00
Kenneth Graunke	9c676a6427	glsl: Fix assert fails when assignment expressions are in array sizes. Karol Herbst's fuzzing efforts discovered that we would hit the following assert: assert(dummy_instructions.is_empty()); when processing an illegal array size expression of float[(1=1)?1:1] t; In do_assignment, we realized we needed an rvalue for (1 = 1), and generated a temporary variable and assignment from the RHS. We've already flagged an error (non-lvalue in assignment), and return a bogus value as the rvalue. But process_array_size sees the bogus value, which happened to be a constant expression, and rightly assumes that processing a constant expression shouldn't have generated any instructions. To handle this, make do_assignment not generate any temps or assignments when it's already raised an error - just return an error value directly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98694 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-12 22:08:15 -08:00
Jonas Pfeil	5debfeb86f	vc4: Add simulator kernel validation for multithreaded fragment shaders. This is Jonas Pfeil's code from the kernel, brought back to Mesa by anholt.	2016-11-12 19:21:46 -08:00
Eric Anholt	96ffee2d02	vc4: Mark threaded FSes as non-singlethread in the CL.	2016-11-12 19:21:46 -08:00
Eric Anholt	ace0d810e5	vc4: Flag the last thread switch in the program as the last. We don't allow the last thread switch to be inside control flow, to be sure that we hit the last state exactly once. If the last texturing was in control flow, fall back to single threaded.	2016-11-12 19:21:46 -08:00
Eric Anholt	67f72c5f5d	vc4: Add THRSW nodes after each tex sample setup in multithreaded mode. This is a suboptimal implementation, but Jonas Pfeil found that it was still a massive performance gain.	2016-11-12 19:21:46 -08:00
Eric Anholt	e3c620e868	vc4: Add some spec citations about texture fifo management.	2016-11-12 18:46:35 -08:00
Eric Anholt	fd2aff858b	vc4: Use ra14/rb14 as the spilling registers. This makes the raddr fixups compatible with FS threading.	2016-11-12 18:46:35 -08:00
Eric Anholt	755037173d	vc4: Add support for register allocation for threaded shaders. We have two major requirements: Make sure that only the bottom half of the physical reg space is used, and make sure that none of our values are live in an accumulator across a switch.	2016-11-12 18:46:35 -08:00
Eric Anholt	fdad4d2402	vc4: Split register class setup for physical files from accumulators.	2016-11-12 18:46:35 -08:00
Eric Anholt	8e704dca7f	vc4: Use register allocator CLASS_BIT_R0_R3 to clean up CLASS_B. We have had no reason to separate ability to store in an accumulator from ability to store in B, but with FS threading, we need to be able to force values to be stored only in the physical regfiles.	2016-11-12 18:46:35 -08:00
Eric Anholt	1ee503c74d	vc4: Add support for QPU scheduling of thread switch instructions. This is vaguely based off of Jonas Pfeil's thread switch support branch.	2016-11-12 18:46:35 -08:00
Eric Anholt	4f527f1260	vc4: Add a thread switch QIR instruction. This will eventually be generated at the QIR level, so that vc4_qir_schedule.c can arrange the separation of tex_strb from tex_result correctly. It will also be important so that register allocation sets the register classes appropriately for values that are live across the switch.	2016-11-12 18:46:35 -08:00
Eric Anholt	93cdae44de	vc4: Add a bit of QPU validation for threaded shaders. These are both bugs we've run into along the way writing multithreaded FS support.	2016-11-12 18:46:35 -08:00
Eric Anholt	977d8b526b	vc4: Fix register class handling of DDX/DDY arguments. I had this exactly backwards, but apparently the piglit tests were all landing in r0-r3 anyway. Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-12 18:46:35 -08:00
Darren Salt	9b121512ac	radv/pipeline: Don't dereference NULL dynamic state pointers This is a port of commit `a4a5917248`: Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts of pCreateInfo members are moved to the earliest points at which they should not be NULL. This fixes a segfault, related to pColorBlendState, seen in Talos Principle which I've observed after startup is completed and when exiting the menus, depending on when Vulkan rendering is selected. v2: moved the NULL check in radv_pipeline_init_blend_state to after the declarations. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-11-12 15:06:27 +01:00
Rob Clark	dfc001dccc	freedreno/ir3: fixup ralloc fallout Fixes fallout from `acc23b04` ("ralloc: remove memset from ralloc_size"). We were still depending on zero'd allocations in a couple of places. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-11-12 08:57:03 -05:00
Steinar H. Gunderson	2e2562cabb	Fix races during _mesa_HashWalk(). There is currently no protection against walking a hash (using _mesa_HashWalk()) and modifying it at the same time, for instance by inserting or deleting elements. This leads to segfaults in multithreaded code if e.g. someone calls glTexImage2D (which may have to walk the list of FBOs) while another thread is calling glDeleteFramebuffers on another thread with the two contexts sharing lists. The reason for this is that _mesa_HashWalk() doesn't actually take the mutex that normally protects the hash; it takes an entirely different mutex. Thus, walks are only protected against other walks, and there is also no outer lock taking this. There is an old comment saying that this is to fix problems with deadlock if the callback needs to take a mutex; we solve this by changing the mutex to be recursive. A demonstration Helgrind hit from a real application: ==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1 ==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400 ==13412== at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395) ==13412== by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350) ==13412== by 0x1EE98174: _mesa_HashRemove (hash.c:365) ==13412== by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669) ==13412== by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473) ==13412== by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442) [...] 
==13412== This conflicts with a previous read of size 8 by thread #20 ==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318 ==13412== at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415) ==13412== by 0x1EE982A8: _mesa_HashWalk (hash.c:426) ==13412== by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683) ==13412== by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043) ==13412== by 0x1EED9410: teximage (teximage.c:3073) ==13412== by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105) ==13412== by 0x166A68: operator() (mixer.cpp:454) There are many more interactions than just these two possible. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Steinar H. Gunderson <steinar+mesa@gunderson.no> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-12 12:42:07 +11:00
Kenneth Graunke	07566ad4b6	i965: Drop tabs in brw_state.h.	2016-11-11 17:12:46 -08:00
Daniel Scharrer	0b98e885e7	ac/nir/llvm: Fix setting function attributes for intrinsics This fixes a NULL pointer dereference for intrinsics with more than one function attribute introduced in commit `2fdaf38`. The fix is ported from the lp_build_intrinsic changes in commit `8bdd52c`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-11-11 22:40:32 +01:00
Kenneth Graunke	74d5d393df	i965: Update a comment: s/brw_state_cache/brw_program_cache/g Tim renamed this recently - stop referring to it by the old name. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 13:19:18 -08:00
Laurent Carlier	3ff9f8c532	clover: fix building since llvm r286566 pretty trivial fix	2016-11-11 19:45:22 +00:00
Emil Velikov	6ff948ece1	egl/wayland: fix return value in dri2_wl_swrast_commit_backbuffer The function returns "void" rather than int. We could rework that, yet again there will be no benefit since all the callers have no use of it. Fixes: `9ca6711faa` ("Revert "wayland: Block for the frame callback in get_back_bo not dri2_swap_buffers"") Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-11-11 17:33:37 +00:00
Brian Paul	92ec47a6ba	glsl: define __STDC_FORMAT_MACROS to get PRIx64 macro Otherwise, inttypes.h may not define the macro for C++ on MinGW. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98681 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-11 09:43:29 -07:00
Brian Paul	f9052536c9	mesa: fix comment indentation in bind_buffers_check_offset_and_size() Trivial.	2016-11-11 09:43:29 -07:00
Emil Velikov	db45f1eaab	glsl: automake: add opt_add_neg_to_sub.h to the sources list Otherwise it'll be missing in the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-11 14:46:12 +00:00
Timothy Arceri	e36f0878cf	i965: update gl_PrimitiveIDIn to be a system variable Now that we have switched to using nir_shader_gather_info() we can remove the hacks and just use the system variable. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 20:39:09 +11:00
Timothy Arceri	00620782c9	i965: use nir_shader_gather_info() over do_set_program_inouts() This takes us one step closer to being able to drop the GLSL IR optimisation passes during linking in favour of the NIR passes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 20:39:09 +11:00
Timothy Arceri	e86fc2c285	i965: remove remaining tabs in brw_program_cache.c Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 20:39:09 +11:00
Timothy Arceri	663fc64965	i965: rename brw_state_cache_check_size() to brw_program_cache_check_size() Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 20:39:09 +11:00
Timothy Arceri	0d897be973	i965: rename brw_state_cache.c -> brw_program_cache.c Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 20:39:09 +11:00
Jason Ekstrand	a5e88e66e6	i965/gs: Allow primitive id to be a system value This allows for gl_PrimitiveId to come in as a system value rather than as an input. This is the way it will come in from SPIR-V. We keep the input path working for now so we don't break GL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 22:43:59 -08:00
Jason Ekstrand	e73d136a02	vulkan/wsi/x11: Implement FIFO mode. This implements VK_PRESENT_MODE_FIFO_KHR for X11. Unfortunately, due to the way the present extension works, we have to manage the queue of presented images in a separate thread. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 22:40:59 -08:00
Jason Ekstrand	4fa0ca80ee	vulkan/wsi: Report the correct min/maxImageCount From the Vulkan spec 1.0.32 section 29.6 docs for vkAcquireNextImageKHR: "Let n be the total number of images in the swapchain, m be the value of VkSurfaceCapabilitiesKHR::minImageCount, and a be the number of presentable images that the application has currently acquired (i.e. images acquired with vkAcquireNextImageKHR, but not yet presented with vkQueuePresentKHR). vkAcquireNextImageKHR can always succeed if a ≤ n - m at the time vkAcquireNextImageKHR is called. vkAcquireNextImageKHR should not be called if a > n - m with a timeout of UINT64_MAX; in such a case, vkAcquireNextImageKHR may block indefinitely." With minImageCount == 2 (as it was previously), the client is allowed to acquire all but one image without blocking. If we really need 4 images for mailbox mode + pageflipping, then we need to request a minimum of 4 images up-front. This is a bit unfortunate because it means we will always consume 4 images. In the future, we may be able to optimize this a bit by waiting until the server starts to flip and returning OUT_OF_DATE to get the client to re-allocate with more images or something like that. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 22:40:51 -08:00
Kevin Strasser	932bb3f0dd	vulkan/wsi: Add a thread-safe queue implementation In order to support FIFO mode without blocking the application on calls to vkQueuePresentKHR it is necessary to enqueue the request and defer calling the server until the next vblank period. The xcb present api doesn't offer a way to register a callback, so we will have to spawn a worker thread that will wait for a request to be added to the queue, call to the server, and then make the image available for reuse. This commit introduces the queue data structure needed to implement this. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 22:40:44 -08:00
Tapani Pälli	3ca600fe71	android/i965: add libmesa_i965_compiler static library this will be shared between OpenGL and Vulkan drivers Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-11 07:59:29 +02:00
Tapani Pälli	1c2de8977b	android: add SPIRV_FILES to libmesa_nir Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-11 07:59:29 +02:00
Tapani Pälli	23d1799f7d	anv: use STATIC_ASSERT instead of static_assert fixes following compilation warnings on Android build: "warning: implicit declaration of function 'static_assert' is invalid in C99 [-Wimplicit-function-declaration]" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-11 07:59:29 +02:00
Tapani Pälli	ec725dc140	util: use STATIC_ASSERT instead of static_assert fixes following compilation warnings on Android build: "warning: implicit declaration of function 'static_assert' is invalid in C99 [-Wimplicit-function-declaration]" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-11 07:59:29 +02:00
Dave Airlie	98969808ff	vulkan: import latest public vulkan headers + and fix drivers. I just noticed the new vulkan headers changed a prototype, so I've decided to import them and fix the drivers to use the new API. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-11 12:33:07 +10:00
Emil Velikov	e4c465e230	docs: add news item and link release notes for 12.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-11 01:57:50 +00:00
Emil Velikov	33c2958930	docs: add sha256 checksums for 12.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `7b9d7257b2`)	2016-11-11 01:56:19 +00:00
Emil Velikov	d82bbf34df	docs: add release notes for 12.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `3776e97f9d`)	2016-11-11 01:56:18 +00:00
Brian Paul	d881e1c024	glsl: include inttypes.h for PRIx64 macro To fix MinGW build. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-10 17:59:18 -07:00
Dave Airlie	2de85eb97a	radv: fix texturesamples to handle single sample case We can only read the valid samples if this is an MSAA texture, which means the type field must be 0x14 or 0x15. This fixes: dEQP-VK.glsl.texture_functions.query.texturesamples.* Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-11 09:35:43 +10:00
Jason Ekstrand	a6c3d0f92b	anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4 This fixes hangs in Dota2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 15:21:18 -08:00
Jason Ekstrand	1e3e347fd5	anv/cmd_buffer: Take a command buffer instead of a batch in two helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 15:21:18 -08:00
Ian Romanick	e9acae8486	glsl/standalone: Add the ability to generate ir_builder code Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-10 14:30:49 -08:00
Ian Romanick	191d9a5195	glsl: Add a C++ code generator that uses ir_builder to rebuild a program This is only in libstandalone currently because it will only be used in the stand-alone compiler. v2: Change the signature of the generated function. The ir_factory is created in the generator, and an availability predicate is taken as a parameter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 14:30:49 -08:00
Ian Romanick	984f16bbd7	glsl: Generate strings that are the enum names without the ir_*op_ prefix For many expressions, this is different from the printable name. The printable name for ir_binop_add is "+", but we want "add". This is needed for ir_builder_print_visitor. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 14:30:49 -08:00
Ian Romanick	d0028b2e1c	glsl/standalone: Enable par-linking If the user did not request full linking, link the shader with the built-in functions, inline them, and eliminate them. Previous to this you'd see all these calls to "dot" and "max" in the output. This prevented a lot of expected optimizations and cluttered the output. This gives it some chance of being useful. v2: Rebase on top of Ken's "built-ins now" work. v3: Don't do_common_optimizations if par-linking fails. Update expected output of warnings tests to prevent 'make check' regressions. v4: Optimize harder. Most important, do function inlining. Otherwise it's quite impractical for one function in a file to call another function in the same file. v5: Add some code simplifications and an assertion suggested by Iago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-10 14:30:49 -08:00
Ian Romanick	4dc759c8c2	glsl/standalone: Optimize dead variable declarations We didn't bother with this in the regular compiler because it doesn't change the generated code. In the stand-alone compiler, this can clutter the output with useless variables. It's especially bad after functions are inlined but the foo_retval declarations remain. v2: Use set_foreach. Suggested by Tapani. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-10 14:30:49 -08:00
Ian Romanick	f45a2a93ae	glsl/standalone: Optimize add-of-neg to subtract This just makes the output of the standalone compiler a little more compact. v2: Fix indexing typo noticed by Iago. Move the add_neg_to_sub_visitor to its own header file. Add a unit test that exercises the visitor. Both the neg_a_plus_b and neg_a_plus_neg_b tests reproduced the bug that Iago discovered. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 14:30:49 -08:00
Ian Romanick	9788b3b6f3	glsl/linker: Allow link_intrastage_shaders when there is no main() This enables a sort of par-linking. The primary use for this feature is resolving built-in functions in the stand-alone compiler. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-10 14:30:49 -08:00
Timothy Arceri	7372d2153a	nir: update nir_gather_info to only mark used array/matrix elements This is based on the code from the GLSL IR pass however unlike the GLSL IR pass it also supports arrays of arrays. As well as implementing the logic from the GLSL IR pass we add some additional intrinsic cases to catch more system values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 09:17:07 +11:00
Kenneth Graunke	53d1f4251f	mesa/compiler: move MAX_VARYING to shader_enums.h Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-11 09:17:07 +11:00
Timothy Arceri	cd52b4fb16	nir: add more helpers to nir_types.cpp These new helpers will be used in nir_gather_info.c in a following patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-11 09:17:07 +11:00
Kenneth Graunke	ad9d4a4f8d	nir: Generalize the "is per-vertex variable?" helpers and export them. I want this function for nir_gather_info(), and realized it's basically the same as the ones in nir_lower_io(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-11 09:17:07 +11:00
Samuel Pitoiset	561f2208bd	nvc0: support MP performance counters on Maxwell This adds some performance counters/metrics for SM50/SM52. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-11-10 22:13:49 +01:00
Tim Rowley	b9578b683d	gallium: detect avx512 cpu features v3: fix check for xmm/ymm test v2: style code, add avx512 to cpu dump Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-10 15:03:21 -06:00
Ian Romanick	c8c46641af	glsl: Parse 0 as a preprocessor INTCONSTANT This allows a more reasonable error message for '#version 0' of 0:1(10): error: GLSL 0.00 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.00 ES, 3.00 ES, 3.10 ES, and 3.20 ES instead of 0:1(10): error: syntax error, unexpected $undefined, expecting INTCONSTANT Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Karol Herbst <karolherbst@gmail.com>	2016-11-10 10:57:59 -08:00
Ian Romanick	e85a747e29	glcpp: Handle '#version 0' and other invalid values The #version directive can only handle decimal constants. Enforce that the value is a decimal constant. Section 3.3 (Preprocessor) of the GLSL 4.50 spec says: The language version a shader is written to is specified by #version number profile opt where number must be a version of the language, following the same convention as __VERSION__ above. The same section also says: __VERSION__ will substitute a decimal integer reflecting the version number of the OpenGL shading language. Use a separate flag to track whether or not the #version line has been encountered. Any possible sentinel (0 is currently used) could be specified in a #version directive. This would lead to trying to (internally) redefine __VERSION__. Since there is no parser location for this addition, NULL is passed. This eventually results in a NULL dereference and a segfault. Attempts to use -1 as the sentinel would also fail if '#version 4294967295' or '#version 18446744073709551615' were used. We should have piglit tests for both of these. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Karol Herbst <karolherbst@gmail.com>	2016-11-10 10:57:59 -08:00
Ian Romanick	cbba5e13ac	linker: Remove unnecessary overload of program_resource_visitor::visit_field It looks like I added this version as a short-hand for users that didn't need the fuller version. I don't think there's any real utility in that. I'm not sure what my thinking was there. Maybe if those users overloaded the recursion function could just call the compact version to avoid passing some parameters? None of the users do that. Either way, having this extra overload is not useful. Delete it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-10 10:57:46 -08:00
Emil Velikov	b359f62456	radv: automake: list correct file in the EXTRA_DIST Earlier commit renamed the file radeon_icd.json{,.in} but missed one reference of the file - in EXTRA_DIST. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Fixes: `0f434a68a` ("radv: Suffix the radeon_icd file with the host CPU") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-10 18:50:13 +00:00
Marek Olšák	f500c36339	mesa: remove LowerShaderSharedVariables always true for compute shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-10 18:34:55 +01:00
Marek Olšák	0f6360eedb	glsl: handle partial swizzles in opt_dead_code_local correctly Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-10 18:34:55 +01:00
Marek Olšák	e27333a568	glsl: don't run loop passes if loop unrolling is disabled Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-10 18:34:55 +01:00
Marek Olšák	ce3f453f01	radeonsi: fix r600_texture::tc_compatible_htile htile_size is now always non-zero if HTILE is allocated. It seems to have caused no issues. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 18:34:55 +01:00
Marek Olšák	ce3189cbe6	radeonsi: accept is_store in image_fetch_rsrc instead of dcc_off Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 18:34:55 +01:00
Marek Olšák	f83b2f524a	radeonsi: don't rely on tgsi_scan::images_buffers the instruction knows the target Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 18:34:55 +01:00
Marek Olšák	4e00e20074	radeonsi: re-order cases in si_get_shader_param Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 18:34:55 +01:00
Marek Olšák	3f6e0063c8	radeonsi: increase MAX_CONTROL_FLOW_DEPTH AKA MaxIfDepth we don't want to lower deep IFs unconditionally Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 18:34:55 +01:00
Emil Velikov	0187f88453	Revert "configure.ac: honour LLVM_LIBDIR when linking against LLVM" This reverts commit `a39ad18593`. The commit aims to address "missing" -L/foo/bar during linking stage. At the same time it doesn't add the -L and yet the LLVM_LDFLAGS [which provide -L/foo/bar] are already used throughout. Seems like something pretty unique (broken?) on my end. Since the commit introduces issues (due to the missing -L) revert until we get to the root of it (PEBKAC or a genuine issue).	2016-11-10 15:10:34 +00:00
Nicolai Hähnle	b21912e2e9	radeonsi: fix/silence unused variable warnings in optimized builds I'm leaving num_out_sgpr around since it's not in a fast path, and besides the compiler should be able to optimize it away easily. The alternative with #if/#endif would be extremely ugly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-10 13:18:16 +01:00
Nicolai Hähnle	b46a9c570f	gallivm: fix [IU]MUL_HI regression harder The fix in commit `88f791db75` was insufficient for radeonsi because the vector case was not handled properly. It seems piglit only covers the scalar case, unfortunately. Fixes GL45-CTS.shader_bitfield_operation.[iu]mulExtended.* Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-10 13:17:10 +01:00
Daniel Stone	9ca6711faa	Revert "wayland: Block for the frame callback in get_back_bo not dri2_swap_buffers" This reverts commit `25cc889004`, though since the code has changed, it was applied manually. The intent of moving blocking from SwapBuffers to get_back_bo, was to avoid unnecessary triple-buffering by ensuring that the compositor had fully processed the previous frame before we started rendering. This means that the only time we would have to resort to triple-buffering would be when the buffer is directly scanned out, thus saving an extra buffer for composition anyway. The 'repaint window' changes introduced in Weston since then, however, have narrowed the window of time between the frame event being sent and the repaint loop needing to conclude, to 7ms by default, in order to reduce latency. This means however that blocking in get_back_bo gives a maximum of 7ms for the entire GL submission to begin and complete. Not only this, but if a client is using buffer_age to avoid full repaints, the buffer-age request will stall in get_back_bo until the frame callback completes, meaning that the client cannot even calculate the repaint area before the 7ms window. The combination of the two meant that WebKit-GTK+ was failing to achieve full framerate on a Minnowboard, due to spending a great deal of its time attempting to query the age of the next buffer before redraw. Revert to the previous behaviour of allowing rendering to begin but delaying SwapBuffers, unless and until we can find a more gentle behaviour. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jonas Ådahl <jadahl@gmail.com> Reviewed-by: Derek Foreman <derekf@osg.samsung.com> Tested-by: Derek Foreman <derekf@osg.samsung.com> Cc: Kristian Høgsberg <krh@bitplanet.net>	2016-11-10 10:25:03 +00:00
Iago Toral Quiroga	8933417565	glsl: validate output blocks against input blocks Until now we were validating in/out blocks by listing the inputs in the consumer stage and then, for each output of the producer, we checked that it was a match if it was consumed. This method does not catch the case where the consumer has an input that is not present as an output in the producer stage, because it only generates link errors for outputs present in the producer stage that don't match the inputs in the consumer stage. The current method does catch the case where an output from the producer stage is not consumed, which is irrelevant and is ignored. By reversing the way we do this, we can detect this situation, so this patch lists the outputs of the producer stage and then validates inputs of the consumer stage against them. If we see an input in the consumer for which there is no associated output in the producer, we produce a link error. The only exception to this is the special built-in input block gl_in[], since this is implicitly generated for geometry and tessellation stages, but we don't generate it if the producer stage does not write to any of the pre-defined outputs (for example, if the vertex shader does not write to gl_Position, etc). Since writing to these is not mandatory, do not produce a link error in that case. There is a CTS tessellation test (GL45-CTS.tessellation_shader.program_object_properties) that has an empty vertex shader (so it does not produce gl_in[]) and would fail to link if we don't do this. This fixes the following dEQP test: dEQP-GLES31.functional.shaders.linkage.io_block.missing_output_block Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98245 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-10 08:08:07 +01:00
Dave Airlie	19decd8ce4	radv: fixup botched llvm API changes. Reported-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 14:12:32 +10:00
Dave Airlie	2fdaf38c01	ac/nir/llvm: adopt to new LLVM attribute API. Ported from corresponding changes to gallivm. tested build against 3.9 and master. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 13:29:12 +10:00
Jason Ekstrand	302f641d14	vulkan/wsi/wayland: Clean up some error handling paths This gets rid of all the memory leaks reported by the WSI CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 18:18:42 -08:00
Jason Ekstrand	3b6abfc69a	vulkan/wsi/wayland: Include pthread.h We use pthreads and, for some reason, it wasn't getting included Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 18:18:36 -08:00
Jason Ekstrand	b1217eada9	anv/device: Implicitly unmap memory objects in FreeMemory From the Vulkan spec version 1.0.32 docs for vkFreeMemory: "If a memory object is mapped at the time it is freed, it is implicitly unmapped." Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>	2016-11-09 18:17:55 -08:00
Jason Ekstrand	920f34a2d9	anv/device: Return the right error for failed maps Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>	2016-11-09 18:17:48 -08:00
Jason Ekstrand	73ef9c8f04	anv/device: Add some asserts to MapMemory Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-09 18:17:41 -08:00
Jason Ekstrand	843775bab7	anv: Rework fences Our previous fence implementation was very simple. Fences had two states: signaled and unsignaled. However, this didn't properly handle all of the edge-cases that we need to handle. In order to handle the case where the client calls vkGetFenceStatus on a fence that has not yet been submitted via vkQueueSubmit, we need a three-status system. In order to handle the case where the client calls vkWaitForFences on fences which have not yet been submitted, we need more complex logic and a condition variable. It's rather annoying but, so long as the client doesn't do that, we should still hit the fast path and use i915_gem_wait to do all our waiting. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 18:17:29 -08:00
Jason Ekstrand	73701be667	anv/wsi: Set the fence to signaled in AcquireNextImageKHR Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 18:17:21 -08:00
Jason Ekstrand	71397042fe	anv/gen8: Stall when needed in Cmd(Set\|Reset)Event Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 18:17:06 -08:00
Ilia Mirkin	73f53c097a	glsl: record number of components used in each slot for varying packing Instead of packing varyings into vec4's, keep track of how many components each slot uses and create varyings with matching types. This ensures that we don't end up using more components than the original shader, which is especially important for geometry shader output limits. This comes up for NVIDIA hw, where the limit is 1024 output components for a GS, and the hardware complains loudly if you even think about going over. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-09 20:26:48 -05:00
Ilia Mirkin	885c788017	glsl: fix slot_end calculations and simplify reserved_slots check The previous code was confused about whether slot_end was inclusive or exclusive. Make it so that it is inclusive consistently, and use it for setting the new location. This also avoids discrepancies in how num_components is calculated vs the more manual approach taken for the former reserved_slots check. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-09 20:25:15 -05:00
Ilia Mirkin	828faaef40	swr: correct setting of independentAlphaBlendEnable This setting is for whether color and alpha have different blend settings, not for whether blending is enabled on a per-RT basis. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-09 20:11:57 -05:00
Ilia Mirkin	5be635d5e4	swr: [rasterizer] add a .dir-locals.el to support 4-space indents Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-09 20:11:39 -05:00
Ilia Mirkin	36e5d68cad	swr: set halfz rasterizer setting Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-09 20:11:10 -05:00
Ilia Mirkin	4b5b87e7ab	swr: [rasterizer core] allow an OpenGL driver to specify halfz clipping With ARB_clip_control, GL may also do 0..1 depth clipping, not just -1..1. This removes clip's reliance on driver type. DX users will need to be updated to set the new clipHalfZ flag to get proper clipping functionality. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-09 20:10:52 -05:00
Ilia Mirkin	4af25e7131	swr: fix support for inverted depth scales Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-09 20:10:44 -05:00
Ilia Mirkin	aed517f985	swr: [rasterizer jitter] fix logic op to work with unorm/snorm Most logic op usage is probably going to end up with normalized textures. Scale the floating point values and convert to integer before performing the logic operations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-09 20:10:25 -05:00
Eric Anholt	08d51487e3	vc4: Clamp the shadow comparison value. Fixes piglit glsl-fs-shadow2D-clamp-z. Cc: <mesa-stable@lists.freedesktop.org>	2016-11-09 15:33:56 -08:00
Eric Anholt	e887341d3f	vc4: Don't pair up TLB scoreboard locking instructions early in QPU sched. Jonas Pfeil noticed that we were putting passthrough tlb_z writes early in the shader, despite QIR and QPU scheduling both trying to delay scoreboard locking for as long as possible. The problem was that when trying to pair up QPU instructions, at some point the passthrough tlb_z would be the last one available and it would get paired, even if the other half would open up other instructions to be scheduled and we could have paired tlb_z with something later in the program. Also, since passthrough z is just a mov, it pairs up really easily. The proper fix would probably be to flip the order of scheduling instructions so we went from bottom to top (also relevant for branch delay slot scheduling). However, we can do a quick fix here to just not schedule a TLB lock until there's nothing but TLB left in the program, at a slight instruction cost (est .61% cycle count in shader-db) but a major fragment shader parallelism win. glmark2 results: texture:texture-filter=linear: +1.24481% +/- 0.626117% (n=15) bump:bump-render=height: 1.24991% +/- 0.154793% (n=136,133 -- screensaver outliers removed)	2016-11-09 15:33:56 -08:00
Eric Anholt	695a2e2ffa	vc4: Print a reg pressure estimate in our reg allocation failure dump.	2016-11-09 15:33:56 -08:00
Eric Anholt	4d019bd703	vc4: Don't abort when a shader compile fails. It's much better to just skip the draw call entirely. Getting this information out of register allocation will also be useful for implementing threaded fragment shaders, which will need to retry non-threaded if RA fails. Cc: <mesa-stable@lists.freedesktop.org>	2016-11-09 15:33:56 -08:00
Kenneth Graunke	aaee3daa90	mesa: Fix pixel shader scratch space allocation on Gen9+ platforms. We had missed a bit of errata - PS scratch needs to be computed as if there were 4 subslices per slice, rather than 3. Per platform (actual slices / total subslices / subslices for PS scratch): Skylake GT1 (1/3/4), GT2 (1/3/4), GT3 (2/6/8), GT4 (3/9/12); Broxton 2x6 (1/2/4), 3x6 (1/3/4); Kabylake GT1 (1/2/4), GT1.5 (1/3/4), GT2 (1/3/4), GT3 (2/6/8), GT4 (3/9/12). Note that Skylake GT1-3 already worked because we allocated 64 * 9 (trying to use a value that would work on GT4, with 9 subslices), and the actual required values were 64 * 4 or 64 * 8. However, all others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated, which can lead to scratch writes trashing random process memory, and rendering corruption or GPU hangs. Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that spill. Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.* now runs successfully with no hangs and renders correctly. This may fix problems on Broxton and Kabylake as well. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-11-09 15:30:59 -08:00
Kevin Strasser	1d6fe13c13	mesa/extensions: expose OES_vertex_half_float for ES2 Half float support already exists for desktop GL. Reuse the ARB_half_float_vertex enable bit and account for the different enum to enable the extension for ES2. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-09 14:35:20 -08:00
Emil Velikov	aeaf21ab3e	Revert "egl: remove explicit config_id management from dri2_add_config()" This reverts commit `3652d1d594`. Self nack/reject on this one. The base.ConfigID is overwritten immediately after we store the current value, thus one memcpy [further down] the wrong value will be copied.	2016-11-09 21:48:50 +00:00
Brian Paul	5b92008ae2	util: add MSVC HAS_TRIVIAL_DESTRUCTOR implementation Based on a patch by George Kyriazis but changed to test for _MSC_VER >= 1800 (Visual Studio 2015). This fixes the failed CANARY assertion in src/util/ralloc.c:get_header() on Windows. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98595 Tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Brian Paul <brianp@vmware.com>	2016-11-09 14:55:10 -07:00
Emil Velikov	0f434a68a3	radv: Suffix the radeon_icd file with the host CPU Port of the anv commit `d96345de98` ("anv: Suffix the intel_icd file with the host CPU"). v2: s/intel_icd/radeon_icd/ in commit summary (Gražvydas) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC)	2016-11-09 21:36:45 +00:00
Emil Velikov	abe110df01	radv: use correct .specVersion for extensions Analogous to previous commit. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC)	2016-11-09 21:36:36 +00:00
Emil Velikov	f373a91a52	anv: use correct .specVersion for extensions Vulkan has introduced the concept of .specVersion which can be used to attribute changes of the said extension. The current loader does not check the value, thus it has gone unnoticed that the driver exposes an old version of the following extensions: VK_KHR_xcb_surface (Rev 6) VK_KHR_xlib_surface (Rev 6) VK_KHR_wayland_surface (Rev 5) - Updated the surface create function to take a pCreateInfo structure VK_KHR_swapchain (Rev 68) - Moved the "validity" include for vkAcquireNextImage to be in its proper place, after the prototype and list of parameters. ... According to the documentation: * pname:specVersion is the version of this extension. It is an integer, incremented with backward compatible changes. Based on the history of vk.xml the above (latest) revision has been available since Vulkan 1.0 so even if there were any backwards incompatible change(s) [as hinted by the revision log] those should be safe. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-09 21:36:35 +00:00
Emil Velikov	190bae7685	amd/addrlib: limit fastcall/regparm to GCC i386 The use of regparm causes an error on arm/arm64 builds with clang. fastcall is allowed, but still throws a warning. As both options only have effect on 32-bit x86 builds, limit them to that case. v2: keep the __i386__ within GCC (Nicolai) Cc: 13.0 <mesa-stable@lists.freedesktop.org> Cc: Rob Herring <robh@kernel.org> Cc: Nicolai Hähnle <nhaehnle@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Herring <robh@kernel.org>	2016-11-09 21:36:35 +00:00
Emil Velikov	a39ad18593	configure.ac: honour LLVM_LIBDIR when linking against LLVM Currently if one uses a non-default prefix, the path won't get propagated and we'll fail at link-time. A very quick and easy example is to install to /usr/local. At this point, llvm-config will be picked even without the --with-llvm-prefix, but regardless of the latter linking will fail. Currently people can workaround that via LD_LIBRARY_PATH. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Cc: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-09 21:36:14 +00:00
Emil Velikov	3652d1d594	egl: remove explicit config_id management from dri2_add_config() Currently we only saved the id to memcpy the whole _EGLConfig to write back the exact same id value. Remove the unneeded and confusing/misleading code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-11-09 21:18:48 +00:00
Ian Romanick	084105c213	linker: Accurately track gl_uniform_block::stageref As the linked per-stage shaders are processed, mark any block that has a field that is accessed as referenced. When combining all the linked shaders, combine the per-stage stageref masks. This fixes a number of GLES CTS tests: ES31-CTS.core.geometry_shader.program_resource.program_resource ES32-CTS.core.geometry_shader.program_resource.program_resource ESEXT-CTS.geometry_shader.program_resource.program_resource piglit.gl45-cts.geometry_shader.program_resource.program_resource However, it makes quite a few more fail: ES31-CTS.functional.program_interface_query.buffer_variable.random.6 ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.compute.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.separable_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment_only_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment_only_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment_only_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment_only_fragment.unnamed_block.float ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.random.6 
ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.compute.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.separable_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment_only_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment_only_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment_only_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment_only_fragment.unnamed_block.float ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment.unnamed_block.float I have diagnosed the failures, but I'm not sure whether we or the tests are wrong. After optimizations are applied, all of the tests are of the form: buffer X { float f; } x; void main() { x.f = x.f; } The test then queries that x is referenced by that shader stage. We eliminate the assignment of x.f to itself, and that removes the last reference to x. We report that x is not referenced, and the test fails. I do not know whether or not we are allowed to eliminate that assignment of x.f to itself. After discussions with the OpenGL ES group in Khronos, we believe that Mesa's behavior is correct. I will provide patches to the CTS tests to Khronos. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-09 12:47:51 -08:00
Ian Romanick	392fabcfee	linker: Slight code rearrange to prevent duplication in the next commit Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-09 12:47:51 -08:00
Ian Romanick	a529acfb2b	linker: Trivial coding standards fixes v2: Revert the unreachable to assert in parcel_out_uniform_storage::visit_field. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-09 12:47:51 -08:00
Ian Romanick	fbc1a4b7d2	glsl: Add some comments to methods of ir_variable_refcount_visitor It was not obvious from just the .h file what the hash table contained. It was also not obvious that get_variable_entry would create a new entry in the hash table. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-09 12:47:51 -08:00
Aaron Watry	1492633070	llvmpipe: Fix build after removal of deprecated attribute API v2 Applies on top of v3 of Tom's gallivm change. v2: - Tom Stellard: Use enums instead of strings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> CC: Tom Stellard <thomas.stellard@amd.com> CC: Jan Vesely <jan.vesely@rutgers.edu>	2016-11-09 20:13:27 +00:00
Tom Stellard	8bdd52c8f3	gallivm: Fix build after removal of deprecated attribute API v3 v2: Fix adding parameter attributes with LLVM < 4.0. v3: Fix typo. Fix parameter index. Add a gallivm enum for function attributes. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-09 20:13:27 +00:00
Dave Airlie	fb50245ac1	radv: fix GetFenceStatus for signaled fences if a fence is created pre-signaled we should return that in GetFenceStatus even if it hasn't been submitted. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-09 19:49:26 +00:00
Dave Airlie	3c9af7578f	radv: enable conditional discard optimisation on radv. This fixes a bunch of GPU hangs introduced in some CTS tests like dEQP-VK.memory.pipeline_barrier.host_write_uniform_buffer.65536 It works around an issue seen in the LLVM backend, but also makes the radv code work more like the radeonsi stack. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 05:46:49 +10:00
Dave Airlie	b16dff2d88	nir: add conditional discard optimisation (v4) This is ported from GLSL and converts if (cond) discard; into discard_if(cond); This removes a block, but also is needed by radv to workaround a bug in the LLVM backend. v2: handle if (a) discard_if(b) (nha) cleanup and drop pointless loop (Matt) make sure there are no dependent phis (Eric) v3: make sure only one instruction in the then block. v4: remove sneaky tabs, add cursor init (Eric) Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 05:46:33 +10:00
Dave Airlie	dd77faeca2	ac/nir: add support for discard_if intrinsic (v2) We are going to start lowering to this in NIR code, so prepare radv for it. v2: handle conversion to kilp properly (nha) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-10 05:46:20 +10:00
Kristian Høgsberg Kristensen	b3a29f2e9e	anv: Do relocations in userspace before execbuf ioctl Since our surface state buffer is shared by all batches, the kernel does a full stall and sync with the CPU between batches every time we call execbuf2 because it refuses to do relocations on an active buffer. Doing them in userspace and passing the NO_RELOC flag to the kernel allows us to perform the relocations without stalling. This improves the performance of Dota 2 by around 30% on a Sky Lake GT2. v2 (Jason Ekstrand): - Better comments (Chris Wilson) - Fixed write_reloc for correct canonical form (Chris Wilson) v3 (Jason Ekstrand): - Skip relocations which aren't needed - Provide an environment variable to always use the kernel - More comments about correctness (Chris Wilson) v4 (Jason Ekstrand): - More comments (Chris Wilson) v5 (Jason Ekstrand): - Rebase on top of moving execbuf2 setup to QueueSubmit Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:14 -08:00
Jason Ekstrand	8b61c57049	anv: Move relocation handling from EndCommandBuffer to QueueSubmit Ever since the early days of the Vulkan driver, we've been setting up the lists of relocations at EndCommandBuffer time. The idea behind this was to move some of the CPU load out of QueueSubmit which the client is required to lock around and into command buffer building which could be done in parallel. Then QueueSubmit basically just becomes a bunch of execbuf2 calls. Technically, this works. However, when you start to do more in QueueSubmit than just execbuf2, you start to run into problems. In particular, if a block pool is resized between EndCommandBuffer and QueueSubmit, the list of anv_bo's and the execbuf2 object list can get out of sync. This can cause problems if, for instance, you wanted to do relocations in userspace. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:12 -08:00
Jason Ekstrand	595400d577	anv/batch: Move last_ss_pool_bo_offset to the command buffer The original reason for putting it in the batch_bo was to allow primaries to share it across secondaries or something like that. However, the relocation lists in secondary command buffers are always left alone and copied into the primary command buffer's relocation list. This means that the offset really applies at the command buffer level and putting it in the batch_bo doesn't make sense. This fixes a couple of potential bugs around re-submission of command buffers that are not likely to be hit but are bugs nonetheless. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:10 -08:00
Jason Ekstrand	0fe6829427	anv: Add an anv_execbuf helper struct This commit adds a little helper struct for storing everything we use to build an execbuf2 call. Since the add_bo function really has nothing to do with a command buffer, it makes sense to break it out a bit. This also reduces some of the churn in the next commit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:08 -08:00
Jason Ekstrand	095c48a496	anv/batch_chain: Improve write_reloc The old version wasn't properly handling large addresses where we have to sign-extend to get it into the "canonical form" expected by the hardware. Also, the new version is capable of doing a clflush of the newly written reloc if requested. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:06 -08:00
Jason Ekstrand	d46bfb6297	anv: Initialize anv_bo::offset to -1 Since -1 is an invalid GPU address, this lets us know whether or not we have a valid address for a buffer. We don't get a valid address until the first time that buffer is used in an execbuf2 ioctl. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:04 -08:00
Jason Ekstrand	bd0f8d5070	anv/allocator: Simplify anv_scratch_pool The previous implementation was being overly clever and using the anv_bo::size field as its mutex. Scratch pool allocations don't happen often, will happen at most a fixed number of times, and never happen in the critical path (they only happen in shader compilation). We can make this much simpler by just using the device mutex. This also means that we can start using anv_bo_init_new directly on the bo and avoid setting fields one-at-a-time. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:01 -08:00
Jason Ekstrand	6283b6d56a	anv: Add a new bo_pool_init helper This ensures that we're always setting all of the fields in anv_bo Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:30:59 -08:00
Jason Ekstrand	ba1eea4f95	anv: Don't presume to know what address is in a surface relocation Because our relocation processing happens at EndCommandBuffer time and because RENDER_SURFACE_STATE objects may be shared by batches, we really have no clue whatsoever what address is actually written to the relocation offset in the BO. We need to stop making such claims to the kernel and just let it relocate for us. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:30:57 -08:00
Jason Ekstrand	db9f4b2a2b	anv: Add a cmd_buffer_execbuf helper This puts the actual execbuf2 call in anv_batch_chain.c along with the other relocation stuff. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:30:55 -08:00
Jason Ekstrand	07798c9c3e	anv/device: Add an execbuf wrapper This wrapper ensures that we always update all anv_bo::offset fields based on the offsets returned by the kernel. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:30:42 -08:00
Jason Ekstrand	64b140498d	anv: Make anv_finishme only warn once per call-site When you fire up Dota2 on Haswell you get spammed with thousands of "Implement Gen7 HZ ops" finishme's. The point of anv_finishme is to act as a reminder that there is something left to implement. Printing it once should be sufficient. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-09 10:26:37 -08:00
Jordan Justen	7bcb94bc2f	i965/compute: Allow ARB_compute_shader in compat profile Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97447 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Evan Odabashian <eodabash@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-09 08:23:33 -08:00
Roland Scheidegger	4d5346aaac	Revert "draw: use vectorized calculations for fetch" Trivial. There's some regressions internally, related to overflow behavior. I'll have to look at it at another time, some interactions with vsplit/vcache are actually mind-blowing. This reverts commit `3fa10ffb49`.	2016-11-09 05:53:16 +01:00
Ilia Mirkin	f037afb701	swr: disable logic op when the rt format is float or srgb Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Ilia Mirkin	e2e40e236f	swr: fix AND_INVERTED logic op conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Ilia Mirkin	bef4a48d1c	swr: add support for EXT_depth_bounds_test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Ilia Mirkin	aa62fa8fb7	swr: [rasterizer core] set depth hottile when depth bounds test enabled Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Anuj Phogat	b9df2251c1	i965: Fix GPU hang related to multiple render targets and alpha testing This patch should have been part of commit `e592f7df`. In a situation when there are multiple render targets with alpha testing enabled, if fragment shader doesn't write to draw buffer zero, it causes the GPU hang on SKL. No GPU hang is seen on HSW. Simulator gives a warning for all gen6+ h/w: "Illegal render target write message length 0xa expected 0xc" This patch fixes the GPU hang as well as the simulator warning with new piglit test fbo-mrt-alphatest-no-buffer-zero-write: https://patchwork.freedesktop.org/patch/118212 No regressions in Jenkins CI system. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-11-08 14:22:53 -08:00
Tim Rowley	95ed1c19bf	swr: allow alphatest without blend or logicop We need to compile a blend function when alphatest is enabled. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-08 14:18:47 -06:00
Dave Airlie	bafc75b437	radv: emit correct last export when Z/stencil export is enabled I was getting a random GPU hang in the renderpass simple tests, it turns out sometimes radv emitted the wrong thing "last". This fixes the logic to emit Z/stencil last if they occur, and not mark a color output as last. Also this relies on the Z/STENCIL being the first two fragment outputs, which they are so yay. Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs) Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-09 06:05:03 +10:00
Marek Olšák	bdd48e47c0	tgsi/scan: turn a huge if-else-if.. chain into a switch statement Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-08 17:56:42 +01:00
Marek Olšák	f864547fa9	tgsi/scan: fix images_buffers regression The first IF statement disabled the second one. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98599 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-08 17:56:42 +01:00
Jason Ekstrand	6b7cc8a9ec	anv: Document cmd_buffer_alloc_binding_table Some of the details of this function are very confusing and have a long history. We should document that history and this seems like the best place to do it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-08 08:32:55 -08:00
Jason Ekstrand	406cd9d126	intel/blorp: Emit all the binding tables At least on Sky Lake, after emitting 3DSTATE_CONSTANT_*, you are required to re-emit the 3DSTATE_BINDING_TABLE_POINTERS packet for the corresponding stage. If you don't, double-buffering may fail and you may get the wrong constants. It turns out that you need to do this even if you have no push constants to speak of or else the next 3DSTATE_CONSTANT packet you emit for that stage may not work correctly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-08 08:32:55 -08:00
Jordan Justen	112a2ba276	i965/gen9: Allow sampling with hiz when supported For gen9+ this will indicate when we should allow hiz based sampling during rendering. Improves performance in : - Synmark's OglDeferred by 2.2% (n=20) - Synmark's OglShMapPcf by 0.44% (n=20) v2 by Ben: Add spec reference, and make it fit with some of the changes made on the previous patches Change the check from mt->aux_buf to mt->num_samples. The presence of an aux_buf isn't enough to determine there isn't a HiZ buffer to use. v3: It seems all depth surfaces end up with num_samples = 0 by default, so allow sampling from depth HiZ if num_samples <= 1. (Lionel) Allow sampling from HiZ only if all LOD are available from the HiZ buffer. (Lionel) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v2) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Ben Widawsky	3b0c2bc417	i965/gen9: Add HiZ auxiliary buffer support The original functionality this patch introduces was authored by a patch from Ken (the commit subject was the same). Since I ended up changing so many patches in the code before this one, I had some non-trivial decisions to make, and I didn't feel it was appropriate to keep Ken's name as author (mostly because he might not like what I've done). Ken's original patch was like 2 LOC :-) In either case, some credit needs to go to Ken, and to Jordan for a few small other changes in that original patch. v2: Back to a smaller diff now that ISL handles most of the actual programming (Lionel) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v1) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Jordan Justen	c0f505c7ef	i965: Add function to indicate when sampling with hiz is supported Currently it indicates that this is never supported, but soon it will be supported for gen8+^w gen9+ v2 by Ben: - Explicitly disable aux_hiz for gen < 9 (with comment) - squashed in next patch to avoid unused and useless functions i965: Support sampling with hiz during rendering For gen8, we can sample from depth while using the hiz buffer. This allows us to sample depth without resolving from hiz to the depth texture. To do this we must resolve to hiz before drawing so we can use the hiz buffer to sample while rendering. Hopefully the hiz buffer will already be resolved in most cases because it was previously rendered, meaning the hiz resolve is a no-op. Note that this is still controlled by the intel_miptree_sample_with_hiz function, and we will enable hiz sampling for gen8 in a separate patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v2) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Ben Widawsky	c53e9c9780	i965/miptree: Create a hiz mcs type This seems counter to the goal of consolidating hiz, mcs, and later ccs buffers. Unfortunately, hiz on gen6 is a thing the code supports, and this wart will be helpful to achieve that. Overall, I believe it does help unify AUX buffers on gen7+. I updated the size field which I introduced in the previous patch, even though we have no use for it. XXX: As I mentioned in the last patch, the height given to the MCS buffer allocation in intel_miptree_alloc_mcs() looks wrong, but I don't claim to fully understand how the MCS buffer is laid out. v2: rebase on master (Lionel) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v1) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Ben Widawsky	36d1c555ed	i965: Drop the aux mt when not used This patch will preserve the BO & offset, and not the miptree for the aux_mcs buffer. Eventually it might make sense to pull out the sizing function in miptree creation, but for now this should be sufficient and not too hideous. v2: Save BO's offset too (Lionel) v3: Squash previous patch storing the size of the allocated aux buffer (Lionel) Fix memory leak with mcs_buf->bo (Lionel) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v1) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Ben Widawsky	42db7ab179	i965/miptree: Directly gtt map the mcs buffer The next patch will change the map type, and this will make sure there are no regressions as a result of the other stuff. Since the miptree is newly created, I believe it is always safe to just map. It is possible to CPU map this buffer on LLC platforms (it additionally requires rounding up to tile size). I did experiment with that patch, and found no performance gains to be had. I've added in error handling while here. Generally GTT mapping is an operation which is highly unlikely to fail, but we may as well handle it when it does. v2: rebase on master (Lionel) v3: print out error if gtt mapping fails (Topi) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v1) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Jordan Justen	0041169cac	i965: Wrap MCS miptree in intel_miptree_aux_buffer This will allow us to treat HiZ and MCS the same when using as an auxiliary surface buffer. v2: (Ben) Minor rebase conflict resolution. Rename mcs_buf to aux_buf to address upcoming change for hiz specific buffers. That second thing is essentially a squash of: i965/gen8: Use intel_miptree_aux_buffer for auxiliary buffer - which didn't need to be separate in my opinion. v3: rebase on master (Lionel) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> (v2) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-08 16:13:57 +00:00
Nicolai Hähnle	88f791db75	gallivm: fix [IU]MUL_HI regression This patch does two things: 1. It separates the host-CPU code generation from the generic code generation. This guards against accidentally breaking things for radeonsi in the future. 2. It makes sure we actually use both arguments and don't just compute a square :-p Fixes a regression introduced by commit `29279f44b3` Cc: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-08 16:25:54 +01:00
Roland Scheidegger	3fa10ffb49	draw: use vectorized calculations for fetch Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. Because llvm is complete fail with the zero-extend widening mul, roll our own even... To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to apparently llvm not being able to deduce it's really all the same with a couple instanced elements). Also, for elts gathering, use vectorized code as well - provide a fake elt buffer if there's no valid one bound. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). No piglit change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-08 03:41:26 +01:00
Roland Scheidegger	29279f44b3	gallivm: introduce 32x32->64bit lp_build_mul_32_lohi function This is used by shader umul_hi/imul_hi functions (and soon by draw). It's actually useful separating this out on its own, however the real reason for doing it is because we're using an optimized sse2 version, since the code llvm generates is atrocious (since there's no widening mul in llvm, and it does not recognize the widening mul pattern, so it generates code for real 64x64->64bit mul, which the cpu can't do natively, in contrast to 32x32->64bit mul which it could do). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-08 03:41:26 +01:00
Anuj Phogat	b0554c25e7	i965: Add space before paren Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-11-07 16:13:57 -08:00
Anuj Phogat	501d608e56	i965: Remove unnecessary white space Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-11-07 16:13:57 -08:00
Anuj Phogat	329ae922bd	i965: Fix alpha-to-coverage and alpha test enabled checks Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-11-07 16:13:02 -08:00
Anuj Phogat	a1bd2f6950	mesa: Add helper function _mesa_is_alpha_to_coverage_enabled() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-11-07 16:13:02 -08:00
Anuj Phogat	0295c792b4	mesa: Add helper function _mesa_is_alpha_test_enabled() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-11-07 16:13:02 -08:00
Anuj Phogat	7fed07766d	mesa: Use separate line for function return type Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-11-07 16:13:02 -08:00
Samuel Pitoiset	e32e5d214e	nvc0: simplify draw parameters upload for vertex shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-07 22:50:17 +01:00
Steven Toth	381edca826	gallium/hud: protect against an initialization race In the event that multiple threads attempt to install a graph concurrently, protect the shared list. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Steven Toth	5a58323064	gallium/hud: close a previously opened handle We're missing the closedir() to the matching opendir(). Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Steven Toth	6ffed08679	gallium/hud: fix a problem where objects are free'd while in use. Instead of trying to maintain a reference counted list of valid HUD objects, and freeing them accordingly, creating race conditions between unanticipated multiple threads, simply accept they're allocated once and never released until the process terminates. They're a shared resource between multiple threads, so accept they're always available for use. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Rob Clark	a5e733c6b5	mesa: drop current draw/read buffer when ctx is released This fixes a problem seen with gallium drivers vs android wallpaper. Basically, what happens is: EGLSurface tmpSurface = mEgl.eglCreatePbufferSurface(mEglDisplay, mEglConfig, attribs); mEgl.eglMakeCurrent(mEglDisplay, tmpSurface, tmpSurface, mEglContext); int[] maxSize = new int[1]; Rect frame = surfaceHolder.getSurfaceFrame(); glGetIntegerv(GL_MAX_TEXTURE_SIZE, maxSize, 0); mEgl.eglMakeCurrent(mEglDisplay, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT); mEgl.eglDestroySurface(mEglDisplay, tmpSurface); ... check maxSize vs frame size and bail if needed ... mEglSurface = mEgl.eglCreateWindowSurface(mEglDisplay, mEglConfig, surfaceHolder, null); ... error checking ... mEgl.eglMakeCurrent(mEglDisplay, mEglSurface, mEglSurface, mEglContext); When the window-surface is created, it ends up with the same ptr address as the recently freed tmpSurface pbuffer surface. Which after many levels of indirection, results in st_framebuffer_validate() ending up with the same/old framebuffer object, and in the end never calling the DRIimageLoaderExtension::getBuffers(). Then in droid_swap_buffers(), the dri2_surf is still the old pbuffer surface (with dri2_surf->buffer being NULL, obviously, so when wallpaper app calls eglSwapBuffers() nothing gets enqueued to the compositor). Resulting in a black/blank background layer. Note that at the EGL layer, when the context is unbound, EGL drops its references to the draw and read buffer as well. Signed-off-by: Rob Clark <robdclark@gmail.com> Tested-by: Robert Foss <robert.foss@collabora.com> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2016-11-07 10:23:26 -05:00
Serge Martin	cc495055cd	clover: Add CL_PROGRAM_BINARY_TYPE support (CL1.2). v3 [Francisco Jerez]: Loosely based on Serge's v1 of this patch in order to avoid CL-specific enums in the clover module binary format. In addition to other changes made in v2: Represent the CL program binary type as the section type instead of adding a CL API-specific enum, check that the binary types of the input objects are valid during clLinkProgram(), pass section type as argument to build_module_library() instead of using separate function. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-06 15:56:54 +01:00
Serge Martin	05fcc73f08	clover: add missing clGetDeviceInfo CL1.2 queries Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-11-06 15:56:49 +01:00
Samuel Pitoiset	8cc4a74971	nvc0: get rid of NVE4_COMPUTE_MP_PM_{A,B}_SIGSEL_XXX Instead, hardcode group sigsel because there are a bunch of unknown groups, especially on SM50/SM52. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-11-05 19:28:25 +01:00
Samuel Pitoiset	a295364596	gm107/ir: emit RED instead of ATOM when no dst This is similar to NVC0 and GK110 emitters where we emit reduction operations instead of atomic operations when the destination is not used. Found after writing some tests which check if performance counters return the expected value. In that case, gred_count returned 0 on gm107 while at least gk106 returned the correct value. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-05 19:27:35 +01:00
Brian Paul	cfb5a9ab23	st/mesa: initialize members of glsl_to_tgsi_instruction in emit_asm() This fixes random crashes with MSVC release builds. It seems the members are implicitly initialized to zero with gcc, but not MSVC. In particular, the tex_offset_num_offset field was non-zero causing a loop over the NULL tex_offsets array to crash. Zero-init those fields and a few others to be safe. The regression began with `acc23b04cf` "ralloc: remove memset from ralloc_size". Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-05 12:09:40 -06:00
Mauro Rossi	0148313ea3	android: amd/common: add support for libmesa_amd_common Fixes the following building error introduced with commit `7115e56` and related amd/common dependencies: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:6861: error: undefined reference to 'ac_is_sgpr_param' external/mesa/src/gallium/drivers/radeonsi/si_shader.c:6951: error: undefined reference to 'ac_is_sgpr_param' clang++: error: linker command failed with exit code 1 (use -v to see invocation) ninja: build stopped: subcommand failed. build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed make: *** [ninja_wrapper] Error 1 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-11-05 18:42:29 +01:00
Marek Olšák	0f72f7292a	winsys/radeon: don't call surface_best for FMASK Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98518 Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-05 18:36:26 +01:00
Kenneth Graunke	0c17b0b6f0	mesa: Add linear ETC2/EAC to the compressed format list with ES3 compat. GL_ARB_ES3_compatibility brings ETC2/EAC formats to desktop GL. The meaning of the GL compressed format list is pretty vague - it's supposed to return formats for "general-purpose usage". (GL 4.2 deprecates the list because of this.) Basically everyone interprets this as "linear RGB/RGBA". ETC2/EAC meets that criteria, so while we shouldn't be required to add it to the list, there's also little harm in doing so, at least on platforms with native support. I doubt anyone is using this list for much anyway, so even on platforms without native support, it's probably not a big deal. Makes the following GL45-CTS.gtf43 tests pass: * GL3Tests.eac_compression_r11.gl_compressed_r11_eac * GL3Tests.eac_compression_rg11.gl_compressed_rg11_eac * GL3Tests.eac_compression_signed_r11.gl_compressed_signed_r11_eac * GL3Tests.eac_compression_signed_rg11.gl_compressed_signed_rg11_eac * GL3Tests.etc2_compression_rgb8.gl_compressed_rgb8_etc2 * GL3Tests.etc2_compression_rgb8_pt_alpha1.gl_compressed_rgb8_pt_alpha1_etc2 * GL3Tests.etc2_compression_rgba8.gl_compressed_rgba8_etc2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-11-04 16:10:20 -07:00
Eric Anholt	283d4d18e5	vc4: Use Newton-Raphson on the 1/W write to fix glmark2 terrain. The 1/W was apparently not accurate enough, and we were getting sparklies in the distance. The closed driver also did a N-R step here. Cc: <mesa-stable@lists.freedesktop.org>	2016-11-04 15:34:38 -07:00
Eric Anholt	70fc3a941a	vc4: Make sure that vertex shader texture2D() calls use LOD 0. I noticed this while trying to debug glmark2 terrain (which does vertex shader texturing, but no mipmaps on its textures sampled from the VS).	2016-11-04 15:34:38 -07:00
Nicolai Hähnle	2c875158e2	radeonsi: fix vertex fetches for 2_10_10_10 formats The hardware always treats the alpha channel as unsigned, so add a shader workaround. This is rare enough that we'll just build a monolithic vertex shader. The SINT case cannot actually happen in OpenGL, but I've included it for completeness since it's just a mix of the other cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 21:30:18 +01:00
Nicolai Hähnle	322483f71b	st/mesa: fix the layer of VDPAU surface samplers A (latent) bug in VDPAU interop was exposed by commit `e5cc84dd43`. Before that commit, the st_vdpau code created samplers with first_layer == last_layer == 1 that the general texture handling code would immediately delete and re-create, because the layer does not match the information in the GL texture object. This was correct behavior at least in the DMABUF case, because the imported resource is supposed to have the correct offset already applied. In the non-DMABUF case, this was just plain wrong but apparently nobody noticed. After that commit, the state tracker assumes that an existing sampler is correct at all times. Existing samplers are supposed to be deleted when they may become invalid, and they will be created on-demand. This meant that the sampler with first_layer == last_layer == 1 stuck around, leading to rendering artefacts (on radeonsi), command stream failures (on r600), and assertions (in debug builds everywhere). This patch fixes the problem by simply not creating a sampler at all in st_vdpau_map_surface. We rely on the generic texture code to do the right thing, adding the layer_override to make the non-DMABUF case work. v2: add the layer_override Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98512 Cc: 13.0 <mesa-stable@lists.freedesktop.org> Cc: Christian König <deathsimple@vodafone.de> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Christian König <christian.koenig@amd.com>	2016-11-04 21:26:29 +01:00
Dave Airlie	d0d5f7600c	Revert "st/vdpau: use linear layout for output surfaces" This reverts commit `d180de3532`. This is a radeon specific hack that causes problems on nouveau when combined with the SHARED flag later. If radeonsi needs a fix for this, please fix it in the driver. [chk] Using linear surfaces for this makes sense because tiling isn't beneficial and the surfaces can potentially be shared with other GPUs using the VDPAU OpenGL interop. [airlied] I think we need a flag that isn't SHARED/LINEAR that is more SHARED_OTHER_GPU. [mareko] Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate into linear ? [mareko] My only concern is decoding performance. If the decoder works in 64x1 blocks, tiling will hurt. That's the theory. I don't know how the decoder works. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)	2016-11-04 15:04:21 +00:00
Marek Olšák	00baaa4752	radeonsi: fix an assertion failure in si_decompress_sampler_color_textures This fixes a crash in Deus Ex: Mankind Divided. Release builds were unaffected, so it's not too serious. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 11:30:47 +01:00
Marek Olšák	64c2593a5c	glx: make interop ABI visible again This was broken when the GLAPI use was removed from mesa_glinterop.h. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-04 11:30:47 +01:00
Marek Olšák	ee39d4456e	egl: make interop ABI visible again This was broken when the GLAPI use was removed from mesa_glinterop.h. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-04 11:30:47 +01:00
Marek Olšák	bf51b45313	egl: use util/macros.h I need the definition of PUBLIC. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-04 11:30:47 +01:00
Nicolai Hähnle	84a74be9e4	radeonsi: enable GLSL 4.50 Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-04 10:33:50 +01:00
Nicolai Hähnle	e4b378800e	st/glsl_to_tgsi: fix dvec[34] loads from SSBO When splitting up loads, we have to add 16 bytes to the offset for the high components, just like already happens for stores. Fixes arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-shader. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 10:31:02 +01:00
Nicolai Hähnle	aef7eb4cac	glsl/cache: correct asprintf error handling From the manpage of asprintf: "If memory allocation wasn't possible, or some other error occurs, these functions will return -1, and the contents of strp are undefined." Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-04 10:28:08 +01:00
Michel Dänzer	8ce7ef75f5	gallium/radeon: Multiply bpe by nsamples in surf_winsys_to_drm For symmetry with surf_drm_to_winsys. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 16:51:18 +09:00
Michel Dänzer	356458363d	gallium/radeon: Use flags parameter in radeon_winsys_surface_init Fixes valgrind warnings about surf_ws->flags being uninitialized while starting X. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 16:49:39 +09:00
Michel Dänzer	6f844a30c1	gallium/radeon: Only convert stencil info if RADEON_SURF_SBUFFER is set Fixes valgrind warnings about using uninitialized memory when starting X. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 16:48:59 +09:00
Michel Dänzer	38fb9aa1aa	gallium/radeon: Only loop up to last_level for drm<->winsys conversion Fixes spurious assertion failure in surf_level_drm_to_winsys when starting X, due to processing a miplevel which was never initialized. Fixes: `e9c76eeeaa` ("gallium/radeon: remove radeon_surf_level::pitch_bytes") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 16:47:43 +09:00
Tapani Pälli	1e3f7bfc9a	anv: use limits.h instead of deprecated/obsolete values.h Mesa uses limits.h elsewhere, and this makes it possible to compile anv_allocator.c on Android. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-11-04 08:35:43 +02:00
Eric Anholt	80157466cd	vc4: Add miptree/texture state support for ETC1 compressed textures. The format isn't flagged as enabled at runtime yet, because we need kernel validation support.	2016-11-03 18:42:58 -07:00
Eric Anholt	bedb996087	vc4: Fix use of undefined values since the ralloc zeroing changes. reralloc() no longer zeroes the new contents, so switch to using rzalloc_array() instead.	2016-11-03 18:42:58 -07:00
Eric Anholt	49936364e4	nir: Make sure to set the texsrc type in nir drawpixels/bitmap lowering. We were leaving an undefined value since the ralloc zeroing changes. Fixes nir_validate() failures on vc4. v2: Fix the color-index case of drawpixels as well. Reviewed-by: Rob Clark <robdclark@gmail.com> (v1)	2016-11-03 18:42:58 -07:00
Roland Scheidegger	572a952126	draw: fix undefined input handling some more... Previous fixes were incomplete - some code still iterated through the number of elements provided by velem layout instead of the number stored in the key (which is the same as the number defined by the vs). And also actually accessed the elements from the layout directly instead of those in the key. This mismatch could still cause crashes. (Besides, it is a very good idea to only use data stored in the key anyway.) v2: move null format check, remove now unnecessary function parameter, some minor prettify Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 01:48:22 +01:00
Brian Paul	f4dd3bde37	gallium/hud: call fflush() after printing error messages For Windows. Otherwise, we don't see the message until the program exits. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:23 -06:00
Brian Paul	260d951486	svga: move svga_mark_surfaces_dirty() prototype to svga_surface.h Trivial.	2016-11-03 14:29:23 -06:00
Brian Paul	c96f63cac2	svga: whitespace / formatting clean-up in svga_context.c Trivial.	2016-11-03 14:29:23 -06:00
Brian Paul	1691e29e62	svga: collect stats for time spent in svga_context_finish() This should have appeared with commit "svga: add guest statistic gathering interface" from August 4, but was somehow lost.	2016-11-03 14:29:23 -06:00
Charmaine Lee	8a195e2fd5	svga: invalidate new surface before it is bound to a render target view Invalidate a "new" surface before it is bound to a render target view or depth stencil view in order to avoid the unnecessary host side copy of the surface data before it is rendered to. Note that a recycled surface is already invalidated before it is reused. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:23 -06:00
Charmaine Lee	06bba2452f	Revert "svga: use untyped surface formats in most cases" Using untyped surface formats causes huge performance degradation on Fusion. This reverts commit `eb0ced74f6` until the backend has a better solution to address typeless surface formats.	2016-11-03 14:29:23 -06:00
Charmaine Lee	f2eec4e829	svga: allow quad blit for more formats Currently blitter will fail if the blit format is different and view-incompatible to the resource format. Instead of punting to software blit which will stall the pipeline, we will create temporary resource to allow blitter to work. Fixes piglit test arb_copy_image-formats. Also tested with MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	4bd5ce853b	svga: create BGRX render target view for BGRX_UNORM surface Currently we adjust the view format when we are asked to create a BGRA render target view for BGRX surface. But we only look for SVGA3D_B8G8R8X8_TYPELESS surface format. With this patch, we will also check for SVGA3D_B8G8R8X8_UNORM surface format, and use SVGA3D_B8G8R8X8_UNORM as the view format for that case. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	0d221fcd40	svga: add a helper function to check for typeless format This patch adds a helper function svga_format_is_typeless() which returns TRUE if the specified format is typeless. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Brian Paul	d451421bca	svga: add SVGA_NEW_FRAME_BUFFER to svga_hw_tss_binding state atom We may need to re-emit texture bindings when the framebuffer state changes. In particular, emitting the texture binding can also involve updating a texture from its backing copy during sampler view validation. The backing copy is made during framebuffer validation. This helps to fix an issue with Photoshop on VGPU9 (VMware bug 1723971). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	ec138d6237	svga: allow copy_region if sample counts match With this patch, we will allow blit with copy_region if the source and destination textures have the same sample counts. Fixes failures with piglit tests spec@arb_texture_float@multisample-formats 2 gl_arb_texture_float spec@arb_texture_rg@multisample-formats 2 gl_arb_texture_rg-float Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	a2d49c4b46	svga: set rendered-to flag after updating the texture using PredCopyRegion This patch sets the rendered-to flag for the subresource after it is updated using the PredCopyRegion command. This is to ensure that the GB surface will be synced up properly before it is directly mapped to. Tested with MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	59f14563a3	svga: add can_use_upload flag This patch adds a flag "can_use_upload" to svga_texture structure to avoid some checking of the upload availability at each transfer map time. Tested with Lightsmark2008, Tropics, MTT glretrace, piglit. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	3dfb4243bd	svga: fix texture upload path condition As Thomas suggested, we'll first try to map directly to a GB surface. If it is blocked, then we'll use texture upload buffer. Also if a texture is already "rendered to", that is, the GB surface is already out of sync, then we'll use the texture upload buffer to avoid syncing the GB surface. Tested with Lightsmark2008, Tropics, MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	4750c4e543	svga: set rendered_to flag with texture uploaded using TransferFromBuffer command This patch sets the rendered_to flag for the texture subresource that is uploaded using the TransferFromBuffer command. This is to ensure that the subresource will be read back or invalidated before it will be directly mapped to. This makes sure that the content of the GB surface will not be accidentally overwritten by the device at suspend/resume time. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Neha Bhende	03e1b7cacd	svga: Add render_condition boolean flag in struct svga_context Set the render_condition flag when the driver performs conditional rendering. Blits using the DXPredCopyRegion command are affected by conditional rendering, so we should check this flag while performing a blit operation. Tested with piglit tests. v2: As per Charmaine's comment, setting render_condition flag if svga_query is valid. Tested with piglit tests. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:22 -06:00
Neha Bhende	2cff6f4512	svga: Allow DXPredCopyRegion for depth_and_stencil formats. DXPredCopyRegion supports copy between src and dst for depth_and_stencil formats if src and dst have the same format. Tested with piglit. v2: As per Brian's comment, allow DXPredCopyRegion for depth+stencil buffers if the blit mask is PIPE_MASK_ZS. Tested with piglit tests and added new piglit test arb_framebuffer_object-depth-stencil-blit to test this particular testcase. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Neha Bhende	9a9627a791	svga: fix memory leak in svga_clear_texture() Piglit tests which use the arb_clear_texture extension have a memory leak. The pipe_surface created in svga_clear_texture() was not deleted, which is the cause of the leak. Tested all arb_clear_texture-* piglit tests with valgrind. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:22 -06:00
Thomas Hellstrom	d787ce7288	svga: Implement the pipe clear_render_target functionality v2 v2: Accounted for the fact that svga_try_clear_render_target also honors conditional rendering. Testing done: Exercised all functions in a separate feature branch. Forced emission of conditional rendering commands when necessary. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	76f5f76468	svga: add SVGA_3D_CMD_INVALIDATE_GB_SURFACE support This command will be used in a subsequent patch to invalidate a surface. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Francisco Jerez	f3d387867f	nir: Flip gl_SamplePosition in nir_lower_wpos_ytransform(). Assuming the hardware is set up to use a screen coordinate system flipped vertically with respect to the GL's window coordinate system, the SYSTEM_VALUE_SAMPLE_POS vector will also be flipped vertically with respect to the value expected by the GL, so we need to give it the same treatment as gl_FragCoord. Fixes the following CTS tests on i965: ES31-CTS.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer ES31-CTS.functional.shaders.sample_variables.sample_pos.correctness.default_framebuffer when run with any multisample configuration, e.g. rgba8888d24s8ms4. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-11-03 11:46:44 -07:00
Nanley Chery	faab6a0f18	isl: Only allow Y-tiling for ASTC textures Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-03 11:22:58 -07:00
Nanley Chery	1625d911d7	anv/blorp: Don't create linear ASTC surfaces for buffers Such a surface is not possible on our hardware. Without this change, ISL surface creation would fail with the next patch. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-03 11:22:58 -07:00
Nanley Chery	bb550e2977	anv/formats: Disallow linear ASTC textures Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-03 11:22:58 -07:00
Nanley Chery	80de528c7e	anv/formats: Disallow 1D compressed textures Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-03 11:22:58 -07:00
Chris Wilson	b4001af174	i965: Use rzalloc for cfg_t Valgrind reports that we use cfg.cycle_count uninitialised, so zero the cfg_t on construction. Fixes: `52d2b28f7f` ("ralloc: use rzalloc where it's necessary") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 11:16:05 -07:00
Nicolai Hähnle	27bd9c0f0a	pipe-loader: add libamd_common for radeonsi This fixes a build regression of commit `7115e56c21`. Sorry for the breakage, this second location for link dependencies escaped my build tests. Bugzilla: https://patchwork.freedesktop.org/patch/119816/ Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-11-03 16:54:55 +01:00
Andreas Boll	f792f0687f	glx/windows: Add wgl.h to the sources list Otherwise it won't be picked in the tarball and the build will fail. Fixes: `533b3530c1` ("direct-to-native-GL for GLX clients on Cygwin ("Windows-DRI")") Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-11-03 11:38:04 +01:00
Tapani Pälli	979ec2cf75	i965: use rzalloc instead of calloc in brwNewProgram commit `cc6aa1d161` changed to using rzalloc for gl_program creation but one instance for program creation was still using calloc. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-03 12:09:40 +02:00
Nicolai Hähnle	908f92ad1f	radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation Fixes GL45-CTS.geometry_shader.adjacency.adjacency_indiced_triangle_strip and others. This leaves the case of triangle strips with adjacency and primitive restarts open. It seems that the only thing that cares about that is a piglit test. Fixing this efficiently would be really involved, and I don't want to use the hammer of degrading to software handling of indices because there may well be software that uses this draw mode (without caring about the precise rotation of triangles). v2: - skip the GS prolog entirely if workaround is not needed - only check for TES (TES is always non-null when tessellation is used) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:11:24 +01:00
Nicolai Hähnle	ffe4e829b0	radeonsi: remove si_shader_context::is_gs_copy_shader It has become redundant. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:53 +01:00
Nicolai Hähnle	3b2516721b	radeonsi: make the GS copy shader owned by the GS selector The copy shader only depends on the selector. This change avoids creating separate code paths for monolithic vs. non-monolithic geometry shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:50 +01:00
Nicolai Hähnle	9c6f7d66dc	radeonsi: si_shader_vs only depends on the GS selector Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:48 +01:00
Nicolai Hähnle	693435d846	radeonsi: si_vgt_gs_mode only depends on the selector Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:45 +01:00
Nicolai Hähnle	2e1fb7e7fc	radeonsi: make si_generate_gs_copy_shader usable as a standalone function It really only depends on the shader selector. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:42 +01:00
Nicolai Hähnle	ba5de0d034	radeonsi: unify the si_compile_* functions for prologs and epilogs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:37 +01:00
Nicolai Hähnle	aa9583b0da	radeonsi: get rid of no_{prolog,epilog} Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:34 +01:00
Nicolai Hähnle	75503b1904	radeonsi: get rid of si_llvm_emit_fs_epilogue It is no longer used. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:31 +01:00
Nicolai Hähnle	611510038a	radeonsi: get rid of get_interp_param Replace by a simple LLVMGetParam, since ctx->no_prolog is always false. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:29 +01:00
Nicolai Hähnle	3f4439b6ba	radeonsi: get rid of select_interp_param The condition !ctx->no_prolog is now always true. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:26 +01:00
Nicolai Hähnle	858ac2228f	radeonsi: use TCS epilog for monolithic shaders For fixed function TCS, we keep the copying of VS outputs to TES inputs inside the main function; the call to si_copy_tcs_inputs is moved accordingly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:23 +01:00
Nicolai Hähnle	3f1be54e53	radeonsi: extract si_build_tcs_epilog_function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:20 +01:00
Nicolai Hähnle	be6e31c6a0	radeonsi: use VS epilog for monolithic TES Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:17 +01:00
Nicolai Hähnle	06dcb2d2a9	radeonsi: use VS prolog and epilog for monolithic shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:14 +01:00
Nicolai Hähnle	f9daa2f470	radeonsi: extract si_build_vs_{prolog,epilog}_function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:12 +01:00
Nicolai Hähnle	6f37e992a3	radeonsi: use PS prolog for monolithic shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:09 +01:00
Nicolai Hähnle	15dd332e6a	radeonsi: set num_input_vgprs for fragment shaders in create_function So that the prolog generated for monolithic fragment shaders will have the right signature. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:05 +01:00
Nicolai Hähnle	fec7ced211	radeonsi: extract si_build_ps_prolog_function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:02 +01:00
Nicolai Hähnle	7115e56c21	radeonsi: use PS epilog for monolithic shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:00 +01:00
Nicolai Hähnle	bf86c56594	radeonsi: extract si_build_ps_epilog_function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:57 +01:00
Nicolai Hähnle	0b9bba7f6c	radeonsi: pass the function name to si_llvm_create_func We will use multiple functions in one module, so they should have different names. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:54 +01:00
Nicolai Hähnle	96d60dd9ee	radeonsi: split is_monolithic into no_prolog and no_epilog This helps to achieve a gradual transition towards building monolithic shaders via inlining. no_prolog and no_epilog will be removed by the end of the series, separate_prolog remains in use to control the PS input mapping. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:50 +01:00
Nicolai Hähnle	8db9d915cd	radeonsi: free data structures when shader compiles fail Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:47 +01:00
Nicolai Hähnle	4c1504af6a	radeonsi: move main TGSI translation into its own function The idea is that adding prolog and epilog code will be pulled out into the caller. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:44 +01:00
Nicolai Hähnle	23dfb688ba	radeonsi: add always-inline pass to si_llvm_finalize_module Change the pass manager as well, since this is a module-level pass. No noticeable run-time difference on shader-db. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:42 +01:00
Nicolai Hähnle	4ada1dabc4	radeonsi: fix signature of export intrinsic in VS epilog The incompatible signature becomes an issue when the VS epilog gets merged with the main vertex shader at the IR level. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:33 +01:00
Nicolai Hähnle	899b2f24a4	radeonsi: link against amd_common Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:30 +01:00
Nicolai Hähnle	908100cfae	amd/common: add ac_is_sgpr_param helper Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:27 +01:00
Nicolai Hähnle	2ff5df8f50	amd/common: build also for gallium drivers At least when LLVM is used, which is basically always (unless you're only building r600 without OpenCL). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:06:24 +01:00
Nicolai Hähnle	8eabee9ec0	amd/common: move llvm helper prototype to ac_llvm_util.h Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:05:46 +01:00
Nicolai Hähnle	37d646c1b3	glsl: fix lowering of UBO references of named blocks When a UBO reference has the form block_name.foo where block_name refers to a block where the first member has a non-zero offset, the base offset was incorrectly added to the reference. Fixes an assertion triggered in debug builds by GL45-CTS.enhanced_layouts.uniform_block_layout_qualifier_conflict. That test doesn't properly check for correct execution in this case, so I am also going to send out a piglit test. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-11-03 09:57:25 +01:00
Kenneth Graunke	8df4aebc94	glsl: Update deref types when resizing implicitly sized arrays. At link time, we resolve the size of implicitly sized arrays. When doing so, we update the type of the ir_variables. However, we neglected to update the type of ir_dereference nodes which reference those variables. It turns out array_resize_visitor (for GS/TCS/TES interface array handling) already did 2/3 of the cases for this, so we can simply refactor the code and reuse it. This fixes: GL45-CTS.shader_storage_buffer_object.basic-syntax GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO which have an SSBO containing an implicitly sized array, followed by some other members. setup_buffer_access uses the dereference types to compute offsets to fields, and it had a stale type where the implicitly sized array's length was still 0 instead of the actual length. While we're here, we can also fix update_array_sizes to properly update deref types as well, fixing a FINISHME from 2010. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-03 01:42:37 -07:00
Timothy Arceri	d2861d682a	mesa/glsl: delete previously linked shaders earlier when linking This moves the delete linked shaders call to _mesa_clear_shader_program_data() which makes sure we delete them before returning due to any validation problems. It also reduces some code duplication. From the OpenGL 4.5 Core spec: "If LinkProgram failed, any information about a previous link of that program object is lost. Thus, a failed link does not restore the old state of program. ... If one of these commands is called with a program for which LinkProgram failed, no error is generated unless otherwise noted. Implementations may return information on variables and interface blocks that would have been active had the program been linked successfully. In cases where the link failed because the program required too many resources, these commands may help applications determine why limits were exceeded." Therefore it's expected that we shouldn't be able to query the program that failed to link and retrieve information about a previously successful link. Before this change the linker was doing validation before freeing the previously linked shaders and therefore could exit on failure before they were freed. This change also fixes an issue in compat profile where a program with no shaders attached is expected to fall back to fixed function but was instead trying to relink IR from a previous link. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97715 Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-03 11:58:53 +11:00
Timothy Arceri	903e5eae97	nir: fix nir_shader_clone() and nir_sweep() These were broken in `e1af20f18a` when the info field in nir_shader was turned into a pointer. Clone was copying the pointer rather than the data and nir_sweep was cleaning up shader_info rather than claiming it. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-11-03 10:39:13 +11:00
Timothy Arceri	f304aca542	mesa: move shader_info to the start of gl_program This will allow us to use ralloc_parent() on the info field and fix a regression in nir_sweep() caused by `e1af20f18a`. This is intended to be a temporary requirement that will be removed when we finish separating shader_info from nir_shader. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-11-03 10:39:13 +11:00
Timothy Arceri	cc6aa1d161	st/mesa/r200/i915/i965: use rzalloc() to create gl_program This allows us to use ralloc_parent() to see which data structure owns shader_info which allows us to fix a regression in nir_sweep(). This will also allow us to move some fields from gl_linked_shader to gl_program, which will allow us to do some clean-ups like storing gl_program directly in the CurrentProgram array in gl_pipeline_object enabling some small validation optimisations at draw time. Also it is error prone to depend on the gl_linked_shader for programs in current use because a failed linking attempt will free information about the current program. In i965 we could be trying to recompile a shader variant but may have lost some required fields. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-11-03 10:39:13 +11:00
Samuel Pitoiset	548b5fee6b	nv50,nvc0: stop limiting the number of active queries to 1 This limitation was initially here because AMD_performance_monitor doesn't allow exposing the real number of hardware counters. But this is actually really annoying when profiling with qapitrace. Anyways, performance counters are mostly for developers and failures are expected if you try to monitor more queries than supported. This breaks amd_performance_monitor_measure but it's expected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-11-02 23:42:09 +01:00
Samuel Pitoiset	b6137f226c	nvc0: add new warp_nonpred_execution_efficiency metric on SM35 Event not_predicated_off_thread_inst_executed is SM35+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 23:35:49 +01:00
Samuel Pitoiset	98a382d013	nvc0: add missing metric-issue_slot on SM35 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 23:35:46 +01:00
Samuel Pitoiset	c32d7175aa	nvc0: do not expose metric-inst_issued twice on SM35 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 23:35:44 +01:00
Samuel Pitoiset	524703da58	nvc0: add new warp_execution_efficiency metric on SM30+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 23:35:42 +01:00
Samuel Pitoiset	51fe48660a	nvc0: respect 80-chars for perf metrics descriptions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 23:35:39 +01:00
Samuel Pitoiset	b58d85bac8	nvc0: sort performance metrics alphabetically Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 23:35:28 +01:00
Fredrik Höglund	e7b9c5eb74	radv: add support for anisotropic filtering on VI+ Ported from radeonsi. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-03 08:27:21 +10:00
Dave Airlie	73592b9284	radv: fix dual source blending Dolphin tried to use this, but we hadn't had any tests for it properly. All that is required is the shader output format needs to be set for 0 and 1 exports. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-03 08:26:51 +10:00
Samuel Pitoiset	1d75d681d3	nv50: add missing draw_calls_indexed driver stat Spotted when glancing at the VBO push code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-02 21:11:57 +01:00
Adam Jackson	afaaf623d4	glx/glvnd: Use bsearch() in FindGLXFunction instead of open-coding it Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-11-02 14:52:43 -04:00
Adam Jackson	8bca8d89ef	glx/glvnd: Fix dispatch function names and indices As this array was not actually sorted, FindGLXFunction's binary search would only sometimes work. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-11-02 14:52:38 -04:00
Adam Jackson	deb0eb1660	glx/glvnd: Don't modify the dummy slot in the dispatch table Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-11-02 14:52:31 -04:00
Jason Ekstrand	71cc1e188d	anv/pipeline: Properly cache prog_data::param Before we were caching the prog data but we weren't doing anything with brw_stage_prog_data::param so anything with push constants wasn't getting cached properly. This commit fixes that. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:32:28 -07:00
Jason Ekstrand	ff3185e3ba	anv/pipeline: Put actual pointers in anv_shader_bin While we can simply calculate offsets to get to things such as the prog_data and the key, it's much more user-friendly if there are just pointers. Also, it's a bit more fool-proof. While we're at it, we rework the pipeline cache API to use the brw_stage_prog_data type directly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:32:22 -07:00
Jason Ekstrand	4306c10a88	intel/blorp: Pass a brw_stage_prog_data to upload_shader Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:32:19 -07:00
Jason Ekstrand	058304f081	intel/blorp: Use wm_prog_data instead of hand-rolling our own Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:32:15 -07:00
Jason Ekstrand	a5f8ff6ca1	anv: Better handle return codes from anv_physical_device_init The case where we just want the loop to continue is INCOMPATIBLE_DRIVER because that simply means that whatever FD we opened isn't a supported Intel chip. Other error codes such as OUT_OF_HOST_MEMORY are actual errors and we should be returning early in that case. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:26:41 -07:00
Jason Ekstrand	daeb21e478	vulkan/wsi/x11: Clean up connections in finish_wsi Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:26:36 -07:00
Jason Ekstrand	fc0e9e3e40	vulkan/wsi/x11: Better handle wsi_x11_connection_create failure Without this fix, the function would still end up returning NULL but it would put that NULL connection in the hash table which would be bad. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-02 09:25:57 -07:00
Chih-Wei Huang	e3e5b1a488	android: avoid using libdrm with host modules Note LOCAL_CFLAGS and LOCAL_SHARED_LIBRARIES in Android.common.mk are used by both host and target modules. However, commit `112e988` moved libdrm related flags to common. It causes the errors like: error: 'out/host/linux-x86/obj32/SHARED_LIBRARIES/libdrm_intermediates/export_includes', needed by 'out/host/linux-x86/obj32/EXECUTABLES/mesa_gen_matypes_intermediates/import_includes', missing and no known rule to make it No reason to use libdrm with host modules. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Fixes: `112e988329` ("Android: move libdrm settings to top-level Android.common.mk") Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-02 14:43:26 +00:00
Nicolai Hähnle	1ef505bb02	glsl: compute lvalues of [in]out parameters before inlined function body This is required when an out argument involves an array index that is either a global variable modified by the function or another out argument in the same function call. Fixes the shaders/out-parameter-indexing/vs-inout-index-inout-* tests. v2: - modify the ir_dereference_array nodes in place - use ir_hierarchical_visitor v3: use base_ir (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-11-02 12:32:47 +01:00
Nicolai Hähnle	5aef14932a	radeonsi: fix BFE/BFI lowering for GLSL semantics Fixes spec/arb_gpu_shader5/execution/built-in-functions/*-bitfield{Extract,Insert} Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-02 12:30:11 +01:00
Nicolai Hähnle	6526977306	tgsi: align the definition of BFI & [UI]BFE with GLSL As previously written, these opcodes use the SM5 semantics which is incompatible with GLSL when bits == 0, offset == 32. At some point we may want to add BFI_SM5 etc. opcodes, but all users currently either want (and expect!) the GLSL semantics or don't care. Bitfield inserts are generated by the GLSL lower_instructions and lower_packing_builtins passes with constant bits and offset arguments, so any workaround code that drivers may have to emit to follow GLSL semantics should be optimized away easily for those uses. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-02 12:30:07 +01:00
Dave Airlie	9f0726f3e5	radv: expose xlib platform extension I missed this when I added the xlib code, this allows dolphin emu to start and crash later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-02 10:00:38 +10:00
Lionel Landwerlin	a28db12e21	intel: aubinator: print field values if available Turning this : sampler state 0 Sampler Disable: false Texture Border Color Mode: 0 LOD PreClamp Enable: 1 Base Mip Level: 0.000000 Mip Mode Filter: 0 Mag Mode Filter: 1 Min Mode Filter: 1 Texture LOD Bias: foo Anisotropic Algorithm: 0 into this : sampler state 0 Sampler Disable: false Texture Border Color Mode: 0 (DX10/OGL) LOD PreClamp Enable: 1 (OGL) Base Mip Level: 0.000000 Mip Mode Filter: 0 (NONE) Mag Mode Filter: 1 (LINEAR) Min Mode Filter: 1 (LINEAR) Texture LOD Bias: foo Anisotropic Algorithm: 0 (LEGACY) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota<sirisha.gandikota@intel.com>	2016-11-01 22:37:56 +00:00
Lionel Landwerlin	74c4c84482	intel: aubinator: load fields values from xml data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota<sirisha.gandikota@intel.com>	2016-11-01 22:37:52 +00:00
Lionel Landwerlin	c8806eeefc	intel: aubinator: print boolean fields to true with colors Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota<sirisha.gandikota@intel.com>	2016-11-01 22:37:22 +00:00
Marek Olšák	d3244c47ce	amd: fix a typo in PIXEL_PIPE_STAT_RESET definition Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	7786f8c635	gallium/radeon: add enum radeon_micro_mode Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	1a4e0162fc	gallium/radeon: make it clear that DRM 2.x.x fast clear constraint is CIK-only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	e3697b4be6	gallium/radeon: remove r600_surface::level_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	bf4d102ea3	gallium/radeon: add radeon_surf::is_linear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	e9c76eeeaa	gallium/radeon: remove radeon_surf_level::pitch_bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	c66a550385	gallium/radeon: don't call u_format helpers if we have that info already Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	692f2640ab	gallium/radeon: replace radeon_surf_info::dcc_enabled with num_dcc_levels Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	315eb0acb4	radeonsi: add a driver query for counting CP DMA calls CP DMA calls are synchronous with regard to shaders, but can be made asynchronous if needed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	d268b7f95e	radeonsi: add a driver query for shader cache hits This is an 8-month old patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	6b309f7368	gbm: set up the interop extension for egl/drm breaking libgbm -> libEGL ABI? Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-01 22:33:13 +01:00
Samuel Pitoiset	8bfd65395e	nvc0: do not duplicate similar performance metrics Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2016-11-01 19:03:26 +01:00
Emil Velikov	bc4c09dc99	docs: add news item and link release notes for 13.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-01 16:09:13 +00:00
Emil Velikov	631fa587e1	docs: add sha256 checksums for 13.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `405dd26860`)	2016-11-01 16:07:26 +00:00
Emil Velikov	e205c265c8	docs: Update 13.0.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `df1b0a5a86`)	2016-11-01 16:07:24 +00:00
Jason Ekstrand	c41ec1679f	anv/device: Return DEVICE_LOST if execbuf2 fails This makes more sense than OUT_OF_HOST_MEMORY. Technically, you can recover from a failed execbuf2 but the batch you just submitted didn't fully execute so things are in an ill-defined state. The app doesn't want to continue from that point anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-01 07:54:52 -07:00
Antia Puentes	61a8a55f55	i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the <size> passed to the VertexAttribL* commands was <= 2. This was because we were using the vertex array size when emitting vertices to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits) for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems when mapping the input variables to registers because, for deciding which registers contain the values uploaded for a certain variable, we use the size and type given to the variable in the shader, so we will be assigning 256-bits to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened when the vertex array size was <= 2. The patch uses the shader information to only emit as 128-bits those 64-bit floating point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and dvec4 variables will be always uploaded as 256-bits, independently of the <size> given to the VertexAttribL* command. From the ARB_vertex_attrib_64bit specification: "For the 64-bit double precision types listed in Table X.1, no default attribute values are provided if the values of the vertex attribute variable are specified with fewer components than required for the attribute variable. For example, the fourth component of a variable of type dvec4 will be undefined if specified using VertexAttribL3dv or using a vertex array specified with VertexAttribLPointer and a size of three." We are filling these unspecified components with zeros, which coincidentally is also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects. v2: Do not use bitcount (Kenneth Graunke) Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-01 09:39:09 +01:00
Dave Airlie	f88ea8c72a	radv: drop some unused cmask info members. These were assigned but never used. Inspired by similar patch in radeonsi. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-01 15:11:35 +10:00
Lionel Landwerlin	1b88760f85	intel: aubinator: fix printing missing gen option Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-31 22:03:13 +00:00
Lionel Landwerlin	46d67799a6	intel: aubinator: fix assumptions on amount of required data We require 12 bytes of headers but in some cases we just need 4. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-31 22:03:09 +00:00
Lionel Landwerlin	6f05b69572	intel: aubinator: don't print out blocks twice Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-31 22:02:41 +00:00
Nanley Chery	e9a25e0247	i965: Move gen8_disable_stages to brw_upload_initial_gpu_state 3DSTATE_WM_CHROMAKEY isn't programmed anywhere else. 3DSTATE_WM_HZ_OP is programmed, then cleared by blorp during a HZ op, so repeatedly clearing it after every blorp execution is redundant. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-31 13:20:05 -07:00
Nanley Chery	477ea60b68	i965: Program 3DSTATE_AA_LINE_PARAMETERS in upload_invariant_state This packet is non-pipelined and doesn't ever change across emissions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-31 13:20:00 -07:00
Leo Liu	06e3cd6a45	st/omx/dec: disable tunnel for size different case When the video coded size differs from the frame size, the result buffers must match the coded size, which is not compatible with the size the encoder requires, so simply use no tunnel for this case instead of converting frame by frame. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-31 11:45:29 -04:00
Leo Liu	d9b2c4048d	st/omx/dec: result buffers size should match codec decoder size Otherwise the kernel's check that the decoder size matches the buffer size fails. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-31 11:45:14 -04:00
George Kyriazis	55fb874376	swr: [rasterizer] added EventHandlerFile constructor Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-10-31 09:06:29 -05:00
George Kyriazis	0a5811b0f3	swr: [rasterizer core] Frontend dependency work Add frontend dependency concept in the DRAW_CONTEXT, which allows serialization of frontend work if necessary. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-10-31 09:06:21 -05:00
George Kyriazis	06f93d0329	swr: [rasterizer core] Refactor/cleanup backends Used for common code reuse and simplification Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-10-31 09:06:15 -05:00
George Kyriazis	78a0a09e48	swr: [rasterizer core] Remove deprecated simd intrinsics Used in abandoned all-or-nothing approach to converting to AVX512 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-10-31 09:06:08 -05:00
George Kyriazis	1a3ed86348	swr: [rasterizer archrast] Add thread tags to event files. This allows the post-processor to easily detect the API thread and to process frame information. The frame information is needed to optimize how data is processed from worker threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-10-31 09:05:25 -05:00
Marek Olšák	7a2387c3e0	glsl: use a non-malloc'd storage for short ir_variable names Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	21e11b5282	glsl: use the linear allocator in opt_constant_propagation Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	565b2c4c4b	glsl: use the linear allocator in opt_copy_propagation Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	b6f50e4640	glsl: use the linear allocator in opt_copy_propagation_elements Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	9c19dedff0	glsl: use the linear allocator in opt_dead_code_local Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	23e373eb4f	glsl: use the linear allocator in glsl_symbol_table no ralloc_free occurrences Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	a4a93103fb	glsl: use the linear allocator for ast_node and derived classes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	2296bb0967	glsl/lexer: use the linear allocator Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	47e1758692	glcpp: use the linear allocator for most objects v2: cosmetic changes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-10-31 11:53:38 +01:00
Marek Olšák	6608dbf540	ralloc: add a linear allocator as a child node of ralloc v2: remove goto, cosmetic changes Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	acc23b04cf	ralloc: remove memset from ralloc_size only do it in rzalloc_size as it was supposed to be Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	52d2b28f7f	ralloc: use rzalloc where it's necessary No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though. Calls not switched to rzalloc_size: - ralloc_vasprintf - glsl_type::name allocation (it's filled with snprintf) - C++ classes where valgrind didn't show uninitialized values I switched most of non-glsl stuff to rzalloc without checking whether it's really needed. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	9454f7c0ef	ralloc: add DECLARE_RZALLOC_CXX_OPERATORS Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whiteacpe.org>	2016-10-31 11:53:38 +01:00
Juha-Pekka Heikkila	3bf6c6c3ad	nir: zero allocated memory where needed Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-31 11:53:38 +01:00
Juha-Pekka Heikkila	4d4335c81a	i965/fs: fill allocated memory with zeros where needed Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-31 11:53:38 +01:00
Juha-Pekka Heikkila	5fa41520e4	i965/vec4: zero allocated memory where needed Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-31 11:53:38 +01:00
Tapani Pälli	e40c5dab5e	glsl/glcpp: initialize all fields of glcpp_parser_t on creation this fixes some of the regressions with "ralloc: remove memset from ralloc_size" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-31 11:53:38 +01:00
Juha-Pekka Heikkila	6770b17b99	glsl: Fix reading of uninitialized memory Switch to use memory allocations which zero memory for places where needed. v2: modify and rebase on top of Marek's series (Tapani) Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-31 11:53:38 +01:00
Marek Olšák	f67c5a7ccd	glsl: initialize glsl_struct_field properly don't rely on ralloc doing memset Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whiteacpe.org>	2016-10-31 11:53:38 +01:00
Marek Olšák	330482177c	ralloc: don't memset ralloc_header, clear it manually time GALLIUM_NOOP=1 ./run shaders/private/alien_isolation/ >/dev/null Before (2 takes): real 0m8.734s 0m8.773s user 0m34.232s 0m34.348s sys 0m0.084s 0m0.056s After (2 takes): real 0m8.448s 0m8.463s user 0m33.104s 0m33.160s sys 0m0.088s 0m0.076s Average change in "real" time spent: -3.4% calloc should only do 2 things compared to malloc: - check for overflow of "n * size" - call memset I'm not sure if that explains the difference. v2: clear "parent" and "next" in the caller of add_child. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-10-31 11:53:38 +01:00
Serge Martin	cb0879985a	clover: Implement clGetExtensionFunctionAddressForPlatform. Add clGetExtensionFunctionAddressForPlatform (CL 1.2). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-30 12:53:03 -07:00
Vedran Miletić	2fba72046d	clover: Introduce CLOVER_EXTRA_*_OPTIONS environment variables The options specified in the CLOVER_EXTRA_BUILD_OPTIONS shell variable are appended to the options specified by the OpenCL program in the clBuildProgram function call, if any. Analogously, the options specified in the CLOVER_EXTRA_COMPILE_OPTIONS and CLOVER_EXTRA_LINK_OPTIONS variables are appended to the options specified in clCompileProgram and clLinkProgram function calls, respectively. v2: rename to CLOVER_EXTRA_COMPILER_OPTIONS * use debug_get_option * append to linker options as well v3: code cleanups v4: separate CLOVER_EXTRA_LINKER_OPTIONS options v5: * fix documentation typo * use CLOVER_EXTRA_COMPILER_OPTIONS in link stage v6: * separate in CLOVER_EXTRA_{BUILD,COMPILE,LINK}_OPTIONS * append options in cl{Build,Compile,Link}Program Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by[v1]: Edward O'Callaghan <funfunctor@folklore1984.net> v7 [Francisco Jerez]: Slight simplification. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-30 12:45:26 -07:00
Vedran Miletić	e3272865c2	clover: Pass unquoted compiler arguments to Clang OpenCL apps can quote arguments they pass to the OpenCL compiler, most commonly include paths containing spaces. If the Clang OpenCL compiler was called via a shell, the shell would split the arguments with respect to quotes and then remove quotes before passing the arguments to the compiler. Since we call Clang as a library, we have to split the argument with respect to quotes and then remove quotes before passing the arguments. v2: move to tokenize(), remove throwing of CL_INVALID_COMPILER_OPTIONS v3: simplify parsing logic, use more C++11 v4: restore error throwing, clarify a comment Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-30 12:14:59 -07:00
Jason Ekstrand	2a4a86862c	i965/fs/generator: Don't use the address immediate for MOV_INDIRECT The address immediate field is only 9 bits and, since the value is in bytes, the highest GRF we can point to with it is g15. This makes it pretty close to useless for MOV_INDIRECT. There were already piles of restrictions preventing us from using it prior to Broadwell, so let's get rid of the gen8+ code path entirely. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-10-28 17:11:16 -07:00
Marek Olšák	4bf45a6079	radeonsi: fix behavior of GLSL findLSB(0) 12.0 and older need the same fix but elsewhere. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-29 01:17:36 +02:00
Marek Olšák	e24dc43164	radeonsi: set VGT_GS_ONCHIP_CNTL on CIK and later Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-29 01:17:36 +02:00
Jason Ekstrand	cab3d46739	i965: Fix make check after `66fcfa6894` Commit `66fcfa6894` changed the vec4 version of offset() to have 3 parameters instead of 2 but the vec4_cmod_propagation test was never updated. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-28 14:55:38 -07:00
Kenneth Graunke	e6aeeace69	glsl: Improve accuracy of alpha scaling in advanced blend lowering. When blending with GL_COLORBURN_KHR and these colors: dst = <0.372549027, 0.372549027, 0.372549027, 0.372549027> src = <0.09375, 0.046875, 0.0, 0.375> the normalized dst value became 0.99999994 (due to precision problems in the floating point divide of rgb by alpha). This caused the color burn equation to fail the dst >= 1.0 comparison. The blue channel would then fall through to the dst < 1.0 && src >= 0 comparison, which was true, since src.b == 0. This produced a factor of 0.0 instead of 1.0. This is an inherent numerical instability in the color burn and dodge equations - depending on the precision of alpha scaling, the value can be either 0.0 or 1.0. Technically, GLSL floating point division doesn't even guarantee that 0.372549027 / 0.372549027 = 1.0. So arguably, the CTS should allow either value. I've filed a bug at Khronos for further discussion (linked below). In the meantime, this patch improves the precision of alpha scaling by replacing the division with (rgb == alpha ? 1.0 : rgb / alpha). We may not need this long term, but for now, it fixes the following CTS tests: ES31-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR ES31-CTS.blend_equation_advanced.blend_all.GL_COLORBURN_KHR_all_qualifier Cc: currojerez@riseup.net Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16042 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-28 10:40:53 -07:00
Brian Paul	c538846e31	mesa: rename gl_client_array -> gl_vertex_array The term "client array" is a legacy thing dating back to the pre-VBO era when _all_ vertex arrays lived in client memory. Nowadays, it only contains vertex array state which is derived from gl_array_attributes and gl_vertex_buffer_binding. It's used by the VBO module and some drivers. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-28 09:25:30 -07:00
Brian Paul	161db1335b	mesa: code clean-up in _mesa_update_vao_client_arrays() Init vars where declared, use const qualifiers. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-28 09:25:30 -07:00
Brian Paul	7a4ba9f16e	mesa: update comment on vertex_attrib_binding() Was missed in an earlier renaming patch. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-28 09:25:29 -07:00
Brian Paul	910bc4d12c	mesa: rename gl_vertex_array_object::VertexBinding to BufferBinding To be a little more understandable. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-28 09:25:29 -07:00
Eduardo Lima Mitev	129da27426	vulkan/wsi/x11: Simplify implementation of vkGetPhysicalDeviceSurfaceFormatsKHR This patch simplifies x11_surface_get_formats(). It is actually just a readability improvement over the patch I provided earlier this week (`750d8cad72`). Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-28 16:53:28 +02:00
Eduardo Lima Mitev	b677b99db5	vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfacePresentModesKHR x11_surface_get_present_modes() is currently asserting that the number of elements in pPresentModeCount must be greater than or equal to the number of present modes available. This is buggy because pPresentModeCount elements are later copied from the internal modes' array, so if pPresentModeCount is greater, it will overflow it. On top of that, this assertion violates the spec. From the Vulkan 1.0 (revision 32, with KHR extensions), page 581 of the PDF: "If the value of pPresentModeCount is less than the number of presentation modes supported, at most pPresentModeCount values will be written. If pPresentModeCount is smaller than the number of presentation modes supported for the given surface, VK_INCOMPLETE will be returned instead of VK_SUCCESS to indicate that not all the available values were returned." So, the correct behavior is: if pPresentModeCount is greater than the internal number of present modes, it is clamped to that number. But if it is less than that, then pPresentModeCount elements are copied, and the call returns VK_INCOMPLETE. This fix is similar to (but simpler and more readable than) the one I provided in `750d8cad72` for vkGetPhysicalDeviceSurfaceFormatsKHR, which was suffering from the same problem. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-28 16:53:06 +02:00
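The clamp-or-incomplete behavior described in the commit above might look roughly like the sketch below. It uses stand-in types instead of `vulkan.h` so that it compiles on its own; `sketch_result`, `sketch_present_mode`, `supported_modes` and `get_present_modes` are hypothetical names and this is not the code of x11_surface_get_present_modes().

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for VkResult and VkPresentModeKHR (assumptions for this sketch). */
typedef enum { SKETCH_SUCCESS = 0, SKETCH_INCOMPLETE = 5 } sketch_result;
typedef int sketch_present_mode;

/* Hypothetical internal table of supported present modes. */
static const sketch_present_mode supported_modes[] = { 0, 2, 3 };
#define NUM_SUPPORTED (sizeof(supported_modes) / sizeof(supported_modes[0]))

static sketch_result
get_present_modes(uint32_t *count, sketch_present_mode *modes)
{
   /* Query-only call: report how many modes exist. */
   if (modes == NULL) {
      *count = (uint32_t)NUM_SUPPORTED;
      return SKETCH_SUCCESS;
   }

   /* Never copy more entries than the internal table holds. */
   if (*count > NUM_SUPPORTED)
      *count = (uint32_t)NUM_SUPPORTED;

   memcpy(modes, supported_modes, *count * sizeof(modes[0]));

   /* A caller buffer that is too small gets a partial result. */
   return *count < NUM_SUPPORTED ? SKETCH_INCOMPLETE : SKETCH_SUCCESS;
}
```

The caller's count is both an input (buffer capacity) and an output (entries written), which is why it is clamped before the copy rather than asserted on.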
Timothy Arceri	7d059bdfb9	i965: use memory context when creating passthrough tcs Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-28 19:57:15 +11:00
Timothy Arceri	5857c3082e	intel/blorp: remove stale comment Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-28 19:51:08 +11:00
Eduardo Lima Mitev	c06480390b	drivers/meta: Accept GL_TEXTURE_3D as target for tex image decompression An assert is currently raised, preventing decompression of a texture image into a GL_TEXTURE_3D target. I have not found any spec wording that would explain this, or implementation detail that would prevent it. And in any case, the driver should not cause a crash upon user input arguments. Fixes most failing subcases in CTS tests: * GL44-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore * GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore These tests were crashing the driver before. Now they just fail, but due to an unrelated issue affecting 2 out of the 45 test subcases. No regressions observed against piglit or CTS-GL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-28 00:20:32 -07:00
Jason Ekstrand	43dadb6edd	intel/blorp: Rework our usage of ralloc when compiling shaders Previously, we were creating the shader with a NULL ralloc context and then trusting in blorp_compile_fs to clean it up. The only problem was that blorp_compile_fs didn't clean up its context properly so we were leaking. When I went to fix that, I realized that it couldn't because it has to return the shader binary which is allocated off of that context and used by the caller. The solution is to make blorp_compile_fs take a ralloc context, allocate the nir_shaders directly off that context, and clean it all up in whatever function creates the shader and calls blorp_compile_fs. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>	2016-10-27 22:46:13 -07:00
Jason Ekstrand	ab92480272	intel/blorp: Rename compile_nir_shader to compile_fs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-27 22:46:13 -07:00
Fredrik Höglund	044ef54d65	radv: split the device local memory heap into two Advertise two device local memory heaps; one that is host visible and one that is not. This makes it possible for clients to tell how much host visible vs. non-host visible memory is available. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-28 12:27:49 +10:00
Fredrik Höglund	c9675b4e17	radv: add a write-combining host-local memory type Add the new memory type between the two device-local types. This makes the list of supported memory types look like this: 1) DEVICE_LOCAL; 2) HOST_VISIBLE | HOST_COHERENT; 3) DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT; 4) HOST_VISIBLE | HOST_COHERENT | HOST_CACHED. With this order, a client that searches for a HOST_VISIBLE and HOST_COHERENT memory type using the algorithm described in section 10.2 of the Vulkan specification (revision 32) will find the host-local memory type first. A client that requires the memory type to be HOST_VISIBLE and HOST_COHERENT, but not DEVICE_LOCAL is most likely searching for a memory type suitable for staging buffers / images. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-28 12:27:46 +10:00
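To see why the ordering in the commit above matters, here is a minimal sketch of a first-match memory type search in the style of the Vulkan specification's example. The flag names are local stand-ins rather than the real `VK_MEMORY_PROPERTY_*` enums, and `find_memory_type` is a hypothetical helper, not radv code.

```c
#include <stdint.h>

/* Stand-in property flags (assumptions for this sketch). */
enum {
   DEVICE_LOCAL  = 0x1,
   HOST_VISIBLE  = 0x2,
   HOST_COHERENT = 0x4,
   HOST_CACHED   = 0x8,
};

/* The four memory types in the order listed by the commit message. */
static const uint32_t memory_types[] = {
   DEVICE_LOCAL,
   HOST_VISIBLE | HOST_COHERENT,
   DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT,
   HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,
};
#define NUM_TYPES (sizeof(memory_types) / sizeof(memory_types[0]))

/* Return the first type index allowed by type_bits whose properties
 * include every required flag, or -1 if there is none. */
static int
find_memory_type(uint32_t type_bits, uint32_t required)
{
   for (uint32_t i = 0; i < NUM_TYPES; i++) {
      const int allowed = (type_bits & (1u << i)) != 0;
      if (allowed && (memory_types[i] & required) == required)
         return (int)i;
   }
   return -1;
}
```

Because the search stops at the first match, a request for `HOST_VISIBLE | HOST_COHERENT` lands on the second entry (index 1), the host-local type, before it reaches the device-local host-visible one.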
Jason Ekstrand	44760c100c	i965/miptree: Remove the width/height < 32768 restrictions These restrictions existed because intel_miptree_blit couldn't handle surfaces bigger than 32k. Now that we're chopping blits up into chunks, it can handle any size we throw at it so we can get rid of this restriction. This improves the terrain tests in synmark by 25-30% on my Sky Lake gt3. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-27 14:44:59 -07:00
Jason Ekstrand	80d3af8129	i965/blit: Break blits into chunks in intel_miptree_blit This allows us to blit much larger images than if we use the blitter directly. In particular, it gives us an almost infinite image height compared to the fairly limiting 32k. We do, however, still have a restriction on stride of the image because handling larger strides, while possible, is fairly difficult. v2: Properly handle linear blit alignment restrictions Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-27 14:44:54 -07:00
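The chunking idea in the commit above can be illustrated with a generic tiling loop. This is a sketch under stated assumptions, not the intel_miptree_blit code: `MAX_CHUNK`, `chunk_fn` and `for_each_chunk` are hypothetical names, the 32768 limit is taken from the "32k" wording, and the stride and linear-alignment restrictions the commit mentions are ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CHUNK 32768u /* assumed per-blit size limit ("32k") */

/* Hypothetical callback invoked once per chunk. */
typedef void (*chunk_fn)(uint32_t x, uint32_t y,
                         uint32_t w, uint32_t h, void *data);

/* Split a width x height rectangle into chunks no larger than
 * MAX_CHUNK on either axis and return how many chunks were visited. */
static uint32_t
for_each_chunk(uint32_t width, uint32_t height, chunk_fn fn, void *data)
{
   uint32_t count = 0;
   uint32_t w, h;

   for (uint32_t y = 0; y < height; y += h) {
      h = height - y < MAX_CHUNK ? height - y : MAX_CHUNK;
      for (uint32_t x = 0; x < width; x += w) {
         w = width - x < MAX_CHUNK ? width - x : MAX_CHUNK;
         if (fn != NULL)
            fn(x, y, w, h, data);
         count++;
      }
   }
   return count;
}
```

Advancing by the clamped chunk size rather than by `MAX_CHUNK` keeps the loop counters from wrapping around for very large dimensions.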
Jason Ekstrand	b7979a849b	i965/blit: Break blits into chunks in set_alpha_to_one v2: Properly handle linear blit alignment restrictions Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-27 14:43:59 -07:00
Jason Ekstrand	174f4900b2	i965/blit: Remove a bogus assertion This assertion, while valid for linear buffers, doesn't work properly for tiled memory. It used to work most of the time because the offset provided was always to the left-hand edge of the image. However, if you use a byte offset to get to the inside of the image, the height * stride calculation may actually end up being too large. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-27 14:43:24 -07:00
Jason Ekstrand	6da8149601	i965/miptree: Break miptree -> ISL tiling conversion into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-27 14:43:21 -07:00
Jason Ekstrand	c30b7164b7	i965/miptree: Remove the stencil_as_y_tiled parameter from get_aligned_offset The only actual user of this parameter was blorp and, since the conversion to ISL, it no longer uses this function. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-27 14:43:17 -07:00
Jason Ekstrand	4964a5149b	intel/blorp: Fix a couple asserts around image copy rectangles When dealing with rectangles in compressed images, you can have a width or height that isn't a multiple of the corresponding compression block dimension but only if that edge of your rectangle is on the edge of the image. When we call convert_to_single_slice, it creates a 2-D image and a set of tile offsets into that image. When detecting the right-edge and bottom-edge cases, we weren't including the tile offsets so the assert would misfire. This caused crashes in a few UE4 demos. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: "Eero Tamminen" <eero.t.tamminen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98431 Cc: "13.0" <mesa-stable@lists.freedesktop.org> Tested-by: "Eero Tamminen" <eero.t.tamminen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-27 13:45:39 -07:00
Jason Ekstrand	caf67bb12f	anv/allocator: Assert that we have a valid gem handle in bo_pool_alloc	2016-10-27 13:45:39 -07:00
Samuel Pitoiset	84e946380b	nvc0/ir: fix emission of IMAD with NEG modifiers The emitter tried to emit sub instead of subr when src0 has actually a NEG modifier. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-10-27 19:29:56 +02:00
Juan A. Suarez Romero	5d83820a1d	glsl: inspect interfaces in contains_foo() When checking if a type contains doubles, integers, samples, etc. we check if the current type is a record or array, but not if it is an interface. This commit also inspects if the type is an interface. It fixes spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-block-with-double.vert piglit test. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-27 12:36:09 +02:00
Iago Toral Quiroga	66fcfa6894	i965/vec4: make offset() work in terms of a simd width and scalar components So that it has the same semantics as the scalar backend implementation. The helper will now take a simd width (which is always 8 in vec4 mode) and step as many scalar components as specified by that width, respecting the size of the scalar channels. v2 (Curro): - Remove the assertion in offset(), byte_offset() has the same checks. - Use byte_offset() directly instead of add_byte_offset(). - Make things more clear by explicitly including the vertical stride in the byte offset expression. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-27 10:59:31 +02:00
Iago Toral Quiroga	ba63db1f2e	i965/vec4: use byte_offset() instead of offset() In a later patch we want to change the semantics of offset() to be in terms of SIMD width and scalar channels so it is consistent with the definition of the same helper in the scalar backend. However, some uses of offset() in the vec4 backend do not operate naturally in terms of these semantics. In these cases it is more natural to use the byte_offset() helper instead. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-27 10:59:31 +02:00
Iago Toral Quiroga	5a4ce9f9a7	i965/vec4: add a byte_offset helper v2: wrap the helper in a namespace to make clear that it is an implementation detail of byte_offset() and is not intended to be used independently (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-27 10:59:31 +02:00
Kenneth Graunke	173558445d	glsl: Size TCS->TES unsized arrays to gl_MaxPatchVertices for queries. SSO validation and other program interface queries want to see that unsized (non-patch) TCS output/TES input arrays are implicitly sized to gl_MaxPatchVertices. By the time we create the program resource lists, we've sized the arrays to their actual size. (We try to create TCS output arrays to match the output patch size right away, and at this point, we should have shrunk TES input arrays.) One option would be to keep them sized to gl_MaxPatchVertices, and defer shrinking them. But that's a big change, and I don't think it's a good idea. Instead, this patch introduces a new ir_variable flag which indicates the variable is implicitly sized to gl_MaxPatchVertices. Then, the linker munges the types when creating the resource list, ignoring the size in the IR's types. Basically, lie about it for resource queries. It's ugly, but I think it ought to work. We probably could use var->data.implicit_sized_array for this, but I opted for a separate bit to try and avoid convoluting the existing SSBO handling. They're similar in concept, but share none of the same code... Fixes: ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage and the ES32-CTS and ESEXT-CTS variants. v2: Add a comment (requested by Timothy, written by me). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-10-27 00:56:51 -07:00
Kenneth Graunke	34fd2ffed8	glsl: Pass ctx to program interface query helper functions. The next commit will use this in add_shader_variable - this just separates out some of the mechanical changes for easier review. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-10-27 00:56:34 -07:00
Tapani Pälli	2035930966	egl: set preserved behavior for surface only if config supports it Otherwise we can end up with mismatching behavior between config and surface when client queries surface attributes. As example, configs for DRI3 do not support preserved behavior but here we were setting preserved behavior for pixmap and pbuffer. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-10-27 07:12:51 +03:00
Tapani Pälli	671da8d8ba	mesa: expose GL_EXT_robustness Fixes 8 failing dEQP tests: dEQP-EGL.functional.create_context_ext.robust_gles* (now 42 tests pass in dEQP-EGLrobust, 0 fail and rest are skipped) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98343 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-27 07:06:41 +03:00
Tapani Pälli	44482d5a3e	st/mesa: set RobustAccess true when it is supported Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-27 07:06:41 +03:00
Tapani Pälli	fe764477b0	i965: set RobustAccess true when it is supported Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-27 07:06:41 +03:00
Tapani Pälli	ef16003320	mesa: add missing CONTEXT_ROBUST_ACCESS enum commit `85008db1d5` missed this enum for GL_KHR_robustness implementation Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-27 07:06:41 +03:00
Tapani Pälli	6bf6fcfcd9	egl: fix error handling in _eglCreateSync EGL specification requires context to be current only when sync type matches EGL_SYNC_FENCE_KHR. Fixes 25 failing dEQP tests: dEQP-EGL.functional.reusable_sync.* Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98339 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-27 07:06:41 +03:00
Dave Airlie	ca035006c8	vulkan/wsi/x11: add support for IMMEDIATE present mode We shouldn't be using ASYNC here, that would be used for immediate mode, so let's implement that. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-27 11:43:15 +10:00
Dave Airlie	1cdca1eb16	vulkan/wsi: store present mode in swapchain base class This just moves this up a level as x11 will need it to implement things properly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-27 11:43:00 +10:00
Dave Airlie	787c172aed	vulkan/wsi/x11: handle timeouts properly in next image acquire (v1.1) For 0 timeout, just poll for an event, and if none, return For UINT64_MAX timeout, just wait for special event blocked For other timeouts get the xcb fd and block on it, decreasing the timeout if we get woken up for non-special events. v1.1: return VK_TIMEOUT for poll timeouts. handle timeout going negative. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-27 11:42:26 +10:00
Dave Airlie	d548fa882b	radv/ac/llvm: trim texture return values The intrinsic engine asserts in llvm due to this, as we put a vec4 into a vec1, and the next instruction isn't expecting it. So trim the vector at the end before inserting it. Reported-by: Christoph Haag <haagch+mesadev@frickel.club> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-27 11:42:03 +10:00
Rhys Kidd	5c73ecaac4	glsl: Add pthread libs to cache_test Fixes the following compile error, present when the SHA1 library is libgcrypt: CCLD glsl/tests/cache-test glsl/.libs/libglsl.a(libmesautil_la-mesa-sha1.o): In function `call_once': /mesa/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once' Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-27 09:49:35 +11:00
Matt Turner	8e5aed5b56	genxml: Handle failure of Python codegen scripts.	2016-10-26 14:06:45 -07:00
Samuel Pitoiset	1ec7227d44	nvc0/ir: fix emission of SHLADD with NEG modifiers This affects GF100:GK110 chipsets, but not GM107+ where the logic is a bit different. The emitters tried to emit sub instead of subr when src0 has a NEG modifier. This fixes the following piglit tests glsl-fs-loop-nested and glsl-vs-loop-nested. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-10-26 22:18:04 +02:00
Erik Faye-Lund	aca491341b	compiler: avoid warning about redefinition of PYTHON_GEN PYTHON_GEN is defined to the exact same thing in both Makefile.glsl.am and Makefile.nir.am. This makes automake complain, so let's lift the definition up to Makefile.am, the same way as MKDIR_GEN. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-26 14:54:26 +01:00
Eric Engestrom	4fa799ae04	egl/dri2: swap_buffers_with_damage falls back to swap_buffers Since commit `0a606a400f` ("egl: add eglSwapBuffersWithDamageKHR"), Android has been broken because the function eglSwapBuffersWithDamageKHR is provided regardless of the extension being present. Also, the Android meta-EGL always advertises the extension regardless of the underlying EGL implementation. As there doesn't seem to be a simple way to conditionally make the EGL function ptr NULL, just implement a brain dead version of eglSwapBuffersWithDamage{KHR,EXT}. Cc: 13.0 <mesa-stable@lists.freedesktop.org> CC: Rob Clark <robdclark@gmail.com> Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Rob Herring <robh@kernel.org> [Emil Velikov: copy the original commit message from Rob's patch] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-26 12:04:21 +01:00
Emil Velikov	294b5f5f71	compiler: automake: add shader_info.h to the sources list Otherwise it'll be missing from the tarball. Fixes: `094fe3a959` ("nir: move nir_shader_info to a common compiler header") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-26 12:04:02 +01:00
Marek Olšák	1ac40173c2	configure.ac: simplify EGL requirements for drivers dependent on EGL Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	b687f766fd	st/mesa: allow multiple concurrent waiters in ClientWaitSync so->fence can be unreferenced by one thread while another thread is somewhere in ClientWaitSync and expecting so->fence to be non-NULL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98172 Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	f240ad98bc	st/mesa: unduplicate st_check_sync code It's the same as st_client_wait_sync. Discovered by Michel. This is needed to make the following fix simpler. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	ad45dce4a2	radeonsi: remove si_resource_create_custom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	29144d0f34	gallium/radeon: stop using PIPE_BIND_CUSTOM it has no effect whatsoever Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	e3c3a7fada	r600g: remove a redundant buffer_create helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	3dc78c33a9	gallium/radeon: remove unused r600_cmask_info members Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	d18bf0b944	gallium/radeon: don't force the same tiling parameters for FMASK GCN can use a completely different tile mode for FMASK. FMASK allocation now skips one unrelated amdgpu_surface_init codepath as hinted by the assertion. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	ecf045b4f7	winsys/amdgpu: allocate FMASK properly I expect no change in behavior, because r600_texture.c forces the same tile mode as the base texture has. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	24faeb94be	gallium/radeon: print tiling index when printing texture info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	37659071b8	gallium/radeon: don't do (fmask.size && cmask.size) fmask implies that cmask is present too. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	2664351dfe	gallium/radeon: re-order radeon_surf::dcc and htile members Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	2a2e537577	gallium/radeon: rename bo_size -> surf_size, bo_alignment -> surf_alignment these names were misleading. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	67a44c97af	gallium/radeon: remove flags specific to libdrm_radeon from winsys interface These just say whether libdrm can assume that the latest radeon_surface definition is used by Mesa. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	7a706ad25c	gallium/radeon: remove r600_htile_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	7e73ff87c0	gallium/radeon: remove unnecessary fields from radeon_surf_level Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	d5c7ea3b83	gallium/radeon: decrease the size of radeon_surf Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	e9590d9092	gallium/radeon: pass pipe_resource and other params to surface_init directly This removes input-only parameters from the radeon_surf structure. Some of the translation logic from pipe_resource to radeon_surf is moved to winsys/radeon. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	8b94976df9	radeon/vce: use nblk_y instead of npix_y npix_y will be removed. level[0].npix_y will be removed too. nblk_y should be the same as npix_y if the block height == 1. However, nblk_y is aligned to the tile size, so it can be greater than npix_y. If that's a problem, we'll have to save the input height of surface_init and use that. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	ba174b8dff	gallium/radeon: define RADEON_SURF_MODE_* as enums Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	b5118fe054	gallium/radeon: stop using some input fields from radeon_surface Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	28d237d63d	gallium/radeon: fold r600_setup_surface into r600_init_surface Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	b0d8a717a7	winsys/amdgpu: remove unused definitions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	81a95946da	gallium/radeon: fold radeon_winsys::surface_best into radeon/winsys Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	dc6bbe2dd0	gallium/radeon: use r600_gfx_write_event_eop everywhere Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	462e3cdf3b	gallium/radeon: make r600_gfx_write_fence more generic Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	edf56fb428	gallium/radeon: fix a ZPASS comment, EVENT_WRITE_EOP fixups Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	d883c83ba9	radeonsi: enable SDMA on Carrizo and all CIK chips again SDMA might be fixed by: "winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	6ec3b2a4b1	winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures Maybe this is why SDMA has been broken for many amdgpu users? SDMA is the only block which is used with imported textures and relies on this variable. DB also uses it, but it doesn't get imported textures, so it's unaffected. I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma is changed to use imported textures. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	dce05b3423	gallium/radeon: make sure the address of separate CMASK is aligned properly This should fix random GPU hangs on Hawaii and Fiji. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	8a21f52d73	gallium/radeon: fix incorrect bpe use in si_set_optimal_micro_tile_mode Oh my god, I wonder what catastrophic issues this was causing on SI. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Samuel Iglesias Gonsálvez	0e742926c6	glsl: update default precision qualifier when it is set in the shader Default precision qualifier for a data type could be set several times inside a shader. This patch allows to update the default precision qualifier for the given type that is saved in the symbol table. If it is not in the symbol table, just add it. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97804 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-26 11:57:07 +02:00
Samuel Iglesias Gonsálvez	dfbdb2c0b3	mesa/program: Add _mesa_symbol_table_replace_symbol() This function allows to modify an existing symbol. v2: - Remove namespace usage now that it was deleted. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-26 11:57:02 +02:00
Timothy Arceri	2e423ca147	nir: stop adjusting driver location for varying packing As of `59864e8e02` we just use the location assigned by the front-end and no longer need this for i965. Since there were some issues in the logic with assigning arrays the same driver location if they didn't start at the same location just remove it and let other drivers implement a solution if needed when they add ARB_enhanced_layouts support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-26 14:29:36 +11:00
Timothy Arceri	4ac6686165	compiler: remove copy_shader_info() This temporary helper is no longer needed now that we have finished refactoring common shader metadata. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	9972c591e7	glsl: set uses texture gather directly in shader_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	4016f08854	glsl/st/mesa: use common system values read field And set system values read directly in shader_info. st/mesa changes were: Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	2f59f3eee5	glsl: set patch outputs written directly in shader_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	f79d37f1ec	st/mesa: use common patch outputs written field Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-26 14:29:36 +11:00
Timothy Arceri	419de307dc	glsl: set patch inputs read directly in shader_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	3d2a503998	st/mesa: use common patch inputs read field Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-26 14:29:36 +11:00
Timothy Arceri	fdf42d3abc	glsl: set outputs read directly in shader_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	5346630593	r200/glsl/st/mesa: use common outputs written field And set outputs written directly in shader_info. st/mesa changes were: Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	b4b450a5cb	mesa/glsl: set double inputs read directly in shader_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	24093975e8	st/mesa: use common double inputs read field Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	e81aaeba37	r200/i915/st/mesa/compiler: use common inputs read field And set inputs_read directly in shader_info. To avoid regressions between changes this change is a squashed version of the following patches. st/mesa changes were: Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	dfcbdba471	mesa/compiler: copy early fragment tests to shader_info in _mesa_copy_linked_program_data() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	6c2fcf6a8a	meta: remove remaining tabs in meta.c Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	3d8947824f	i965: replace brw_compute_program with brw_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	13d0cf57bf	i965: replace brw_fragment_program with brw_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	46a4e4257e	i965: replace brw_tess_{eval,ctrl}_program with brw_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	649bdb1f03	i965: replace brw_geometry_program with brw_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	cb2d181944	i965: replace brw_vertex_program with new generic brw_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	3423488d55	st/mesa/r200/i915/i965: eliminate gl_fragment_program Here we move OriginUpperLeft and PixelCenterInteger into gl_program; all other fields have been replaced by shader_info. V2: Don't use anonymous union/structs to hold vertex/fragment fields (suggested by Ian). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	17e28a1571	i965/mesa/st/swrast: set fs shader_info directly and switch to using it Note we access shader_info from the program struct rather than the nir_shader pointer because shader cache won't create a nir_shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	91d5b0eda9	mesa: remove now unused IsCentroid from gl_fragment_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	d9d04373c1	st/mesa: get interpolation location at translation time Rather than messing around creating bitfields and arrays to store the interpolation location, just translate it on the fly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	a317b40d1d	i965: remove unused debug param This was accidentally disabled in `832bcc3613` not long after it was added. Since it's only for gen5 and lower we might as well just remove it rather than fix it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	89512987c6	compiler: update the comment for enum glsl_interp_mode We no longer store the interp mode with the program metadata. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	aa881e4dc0	glsl: remove now unused InterpQualifier Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	c596f47b80	i965: remove unused BRW_STATE_INTERPOLATION_MAP flag Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	91d61fbf7c	i965: rewrite brw_setup_vue_interpolation() Here brw_setup_vue_interpolation() is rewritten not to use the InterpQualifier array in gl_fragment_program which will allow us to remove it. This change also makes the code which is only used by gen4/5 more self-contained as it now has its own gen5_fragment_program struct rather than storing the map in brw_context. This means the interpolation map will only get processed once and will get stored in the in-memory cache rather than being processed every time the fs changes. Also by calling this from the fs compile code rather than from the upload code and using the interpolation assigned there we can get rid of the BRW_NEW_INTERPOLATION_MAP flag. It might not seem ideal to add a gen5_fragment_program struct; however, by the end of this series we will have gotten rid of all the brw_{shader_stage}_program structs and replaced them with a generic brw_program struct so there will only be two program structs which is better than what we have now. V2: Don't remove BRW_NEW_INTERPOLATION_MAP from dirty_bit_map until the following patch to fix build error. V3 - Suggestions by Jason: - name struct gen4_fragment_program rather than gen5_fragment_program - don't use enum with memset() - create interp mode set helper and simplify logic to call it - add assert when calling function to show prog will never be NULL for gen4/5 i.e. no Vulkan Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-26 14:29:36 +11:00
Timothy Arceri	20c0e67501	st/mesa: stop making use of InterpQualifier array A following patch is going to merge the gl_fragment_program struct into a common gl_program and we want to avoid all stages having this array. V2: use TGSI_INTERPOLATE_COUNT as the temporary placeholder. Suggested by Marek. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	cdafd499cc	mesa: remove unrequired code InterpQualifier is never set for ARB programs so this will do nothing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	9605b98a07	i965/mesa/st: eliminate gl_compute_program Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	5a228c0aae	mesa: set cs shader_info metadata directly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	4ca71a1175	st/mesa: switch cs over to shared shader_info Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	54095ed8b9	compiler: add additional cs metadata fields to shader info And copy values from GLSL. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	81faead818	mesa/i965/i915/r200: eliminate gl_vertex_program Here we move the only field in gl_vertex_program to the ARB program fields in gl_program. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	0ab51f8e16	i965: switch vs over to shared shader_info Note we access shader_info from the program struct rather than the nir_shader pointer because shader cache won't create a nir_shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	92f77e9c01	i965/mesa/st: eliminate gl_geometry_program We now get all the gs metadata from shader_info. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	9045ddcfe4	mesa: set gs shader_info metadata directly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	8d7b25ee58	st/mesa: switch gs over to shared shader_info Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	288c96afce	i965: switch gs over to shared shader_info Note we access shader_info from the program struct rather than the nir_shader pointer because shader cache won't create a nir_shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	b99ecaf872	compiler: add input primitive field for gs in shader info And copy the value from GLSL. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	67c2d80a83	i965/mesa/st: eliminate gl_tess_eval_program We now get all the tes metadata from shader_info. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	65225c20c6	mesa: copy tes metadata directly to shared shader info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	2be3dbd90b	st/mesa: switch tes over to shared shader_info Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	4f1c415cc4	i965: switch tes over to shared shader_info Note we access shader_info from the program struct rather than the nir_shader pointer because shader cache won't create a nir_shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	088c25bfb7	compiler: add fields for tes metadata to shader info And copy the values from gl_tess_eval_program struct. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	64d9773cfe	i965/mesa/st: eliminate gl_tess_ctrl_program We now get all the tcs metadata from shader_info. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	750b14ed8e	mesa: set tcs shader_info metadata directly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	7a4bbfa90d	st/mesa: switch tcs over to shared shader_info Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	e6cecb837d	i965: switch tcs over to shared shader_info Note we access shader_info from the program struct rather than the nir_shader pointer because shader cache won't create a nir_shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	9d2b391165	glsl: add temporary copy_shader_info() function This function is added here to ease refactoring towards using the new shared shader_info. Once refactoring is complete and values are set directly it will be removed. We call it from _mesa_copy_linked_program_data() rather than glsl_to_nir() so that the values will be set for all drivers. In order to do this some calls need to be moved around so that we make sure to call do_set_program_inouts() before _mesa_copy_linked_program_data() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	debed12fdd	glsl: add a shader info field to the gl_program type And use this field as the source for shader info in the nir_shader. This will allow us to set some of these fields from GLSL directly. It will also simplify restoring from shader cache and allow the removal of duplicate fields from GLSL. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	e1af20f18a	nir/i965/anv/radv/gallium: make shader info a pointer When restoring something from shader cache we won't have, and don't want to create, a nir_shader; this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	094fe3a959	nir: move nir_shader_info to a common compiler header This will allow us to stop copying values between structs and will also simplify handling these values in the shader cache. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Timothy Arceri	e40d32b3ec	mesa: modify _mesa_copy_linked_program_data() to take gl_linked_shader This allows us to do some small tidy ups, but will also allow us to call a new function that copies values to a shared shader info from here. In order to make this change this function now requires _mesa_reference_program() to have previously been called. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-26 14:29:36 +11:00
Fredrik Höglund	68db0fe034	vulkan/wsi/wayland: fix ARGB window support Use an ARGB format for the DRM buffer when the compositeAlpha field in VkSwapchainCreateInfoKHR is set to VK_COMPOSITE_ALPHA_PRE_MULTIPLIED_BIT_KHR. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-26 12:40:39 +10:00
Fredrik Höglund	972670c200	vulkan/wsi/x11: fix ARGB window support Pass the correct depth to xcb_dri3_pixmap_from_buffer_checked(). Otherwise xcb_present_pixmap() fails with a BadMatch error. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-26 12:40:39 +10:00
Fredrik Höglund	0a153f4ee4	radv: mark the fence as submitted and signalled in vkAcquireNextImageKHR This stops the debug layers from complaining when fences are used to throttle image acquisition. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-26 12:25:35 +10:00
Vinson Lee	f2770fb3d5	scons: Require libdrm >= 2.4.66 for DRM. configure.ac already requires 2.4.66. Fix SCons build. drmDevicePtr is not available until libdrm 2.4.65. Compiling src/loader/loader.c ... src/loader/loader.c:111:40: error: unknown type name ‘drmDevicePtr’ static char *drm_construct_id_path_tag(drmDevicePtr device) ^ Fixes: `4a183f4d06` ("scons: loader: use libdrm when available") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98421 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-10-25 15:48:12 -07:00
Matt Turner	14aac061e9	radv: Replace "abi_versions" with correct "api_version". git history shows "abi_versions" was used from the outset. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415 Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-25 12:55:39 -07:00
Matt Turner	07755237d3	anv: Replace "abi_versions" with correct "api_version". git history shows "abi_versions" was used from the outset. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415 Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-25 12:55:39 -07:00
Karol Herbst	0404678c5f	nv50/ir: start LocalCSE with getFirst to merge PHI instructions total instructions in shared programs : 3499888 -> 3499445 (-0.01%) total gprs used in shared programs : 453866 -> 453803 (-0.01%) total local used in shared programs : 21621 -> 21621 (0.00%) total bytes used in shared programs : 32078952 -> 32074936 (-0.01%) local gpr inst bytes helped 0 39 119 119 hurt 0 0 0 0 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-25 20:55:07 +02:00
Samuel Pitoiset	7b2712c367	nvc0: use correct bufctx when invalidating CP textures Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-10-25 20:22:05 +02:00
Eduardo Lima Mitev	750d8cad72	vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfaceFormatsKHR x11_surface_get_formats() is currently asserting that the number of elements in pSurfaceFormats must be greater than or equal to the number of formats available. This is buggy because pSurfaceFormatCount elements are later copied from the internal formats' array, so if pSurfaceFormatCount is greater, it will overflow it. On top of that, this assertion violates the spec. From the Vulkan 1.0 (revision 32, with KHR extensions), page 579 of the PDF: "If pSurfaceFormats is NULL, then the number of format pairs supported for the given surface is returned in pSurfaceFormatCount. Otherwise, pSurfaceFormatCount must point to a variable set by the user to the number of elements in the pSurfaceFormats array, and on return the variable is overwritten with the number of structures actually written to pSurfaceFormats. If the value of pSurfaceFormatCount is less than the number of format pairs supported, at most pSurfaceFormatCount structures will be written. If pSurfaceFormatCount is smaller than the number of format pairs supported for the given surface, VK_INCOMPLETE will be returned instead of VK_SUCCESS to indicate that not all the available values were returned." So, the correct behavior is: if pSurfaceFormatCount is greater than the internal number of formats, it is clamped to that many formats. But if it is less than that, then pSurfaceFormatCount elements are copied, and the call returns VK_INCOMPLETE. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-25 13:22:38 +02:00
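The count-clamping behavior described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual x11_surface_get_formats() code: the names sketch_result, supported_formats and sketch_get_formats are hypothetical, and plain ints stand in for the Vulkan format structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for VK_SUCCESS / VK_INCOMPLETE; the real Vulkan
 * headers are deliberately not used in this sketch. */
enum sketch_result { SKETCH_SUCCESS = 0, SKETCH_INCOMPLETE = 5 };

/* Hypothetical internal format table (values are arbitrary placeholders). */
static const int supported_formats[] = { 44, 50 };
#define NUM_SUPPORTED 2u

static enum sketch_result
sketch_get_formats(uint32_t *count, int *formats)
{
   assert(count != NULL);

   /* Query mode: only report how many formats exist. */
   if (formats == NULL) {
      *count = NUM_SUPPORTED;
      return SKETCH_SUCCESS;
   }

   /* Clamp the caller's capacity to the table size so the copy can never
    * read past the end of the internal array. */
   uint32_t n = *count < NUM_SUPPORTED ? *count : NUM_SUPPORTED;
   memcpy(formats, supported_formats, n * sizeof(formats[0]));
   *count = n;

   /* Fewer entries written than available: signal a partial result. */
   return n < NUM_SUPPORTED ? SKETCH_INCOMPLETE : SKETCH_SUCCESS;
}
```

A caller would typically call once with a NULL array to learn the count, then again with a buffer of that size.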
Tapani Pälli	a1652a059e	mesa: fix error handling in DrawBuffers Patch rearranges error checking so that enum checking provided via destmask happens before other checks. It needs to be done in this order because other error checks do not work properly if there were invalid enums passed. Patch also refines one existing check and its documentation to match GLES 3.0 spec (also in later specs). This was somewhat mysteriously referring to desktop GL but had a check for gles3. Fixes following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers no CI regressions observed. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98134 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-25 08:04:11 +03:00
Tapani Pälli	5876f3c85a	egl: add check that eglCreateContext gets a valid config Fixes following dEQP test: dEQP-EGL.functional.negative_api.create_context v2: don't break EGL_KHR_no_config_context (Eric Engestrom) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-10-25 07:24:11 +03:00
Tapani Pälli	58b4fef8bb	mesa: add missing formats to driGLFormatToImageFormat Fixes following dEQP tests: dEQP-EGL.functional.image.api.create_image_gles2_tex2d_luminance dEQP-EGL.functional.image.api.create_image_gles2_tex2d_luminance_alpha Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98328	2016-10-25 07:24:11 +03:00
Tapani Pälli	282b87dd03	egl: fix type mismatch error type in _eglInitSurface EGL spec defines EGL_BAD_MATCH for windows, pixmaps and pbuffers in case where user creates a surface but config does not support rendering to such surface type. Following quotes are from EGL 1.5 spec 3.5 "Rendering Surfaces" : for eglCreatePlatformWindowSurface, eglCreateWindowSurface: "If config does not support rendering to windows (the EGL_SURFACE_TYPE attribute does not contain EGL_WINDOW_BIT ), an EGL_BAD_MATCH error is generated." for eglCreatePbufferSurface: "If config does not support pbuffers, an EGL_BAD_MATCH error is generated." for eglCreatePlatformPixmapSurface, eglCreatePixmapSurface: "If config does not support rendering to pixmaps (the EGL_SURFACE_TYPE attribute does not contain EGL_PIXMAP_BIT ), an EGL_BAD_MATCH error is generated." Fixes following dEQP test: dEQP-EGL.functional.negative_api.create_pbuffer_surface Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-25 07:24:11 +03:00
Tapani Pälli	1ef7873397	Revert "egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT" This reverts commit `b1d636aa00`, previous commit sets these values for all egl configs. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Suggested-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-25 07:24:11 +03:00
Tapani Pälli	b91e1e38e8	egl/dri2: set max values for pbuffer width and height While these max values were previously fixed for pbuffer creation, this change makes also eglGetConfigAttrib() return correct values. Fixes following dEQP tests: dEQP-EGL.functional.create_surface.pbuffer.rgb888_no_depth_no_stencil dEQP-EGL.functional.create_surface.pbuffer.rgb888_depth_stencil dEQP-EGL.functional.create_surface.pbuffer.rgba8888_no_depth_no_stencil dEQP-EGL.functional.create_surface.pbuffer.rgba8888_depth_stencil Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-10-25 07:24:11 +03:00
Brian Paul	76c3f1bbbe	gallium/stapi: fix comment for st_visual::buffer_mask Trivial.	2016-10-24 17:22:00 -07:00
Nanley Chery	59385da39d	isl/format: Correct ASTC entries of format info table With the isl_format_supports* helpers, we can now conveniently report support for this format on Cherry View. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92925 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-24 15:51:57 -07:00
Kenneth Graunke	41034abfe6	i965: Drop nir_inputs from fs_visitor. It's unused. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-24 14:33:38 -07:00
Kenneth Graunke	59864e8e02	i965: Don't use nir_assign_var_locations for VS/TES/GS outputs. Fixes spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3. v2: Remove nir_outputs field from fs_visitor (caught by Tim and Iago). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-24 14:33:38 -07:00
Kenneth Graunke	27715c73ff	i965: Make split_virtual_grfs() call compact_virtual_grfs(). Post-splitting, VGRFs have a maximum size (MAX_VGRF_SIZE). This is required by the register allocator, as we have to create classes for each size of VGRF. We can (and do) allocate virtual registers larger than MAX_VGRF_SIZE, but we must ensure that they are splittable. split_virtual_grfs() asserts that the post-splitting register size is in range. Unfortunately, these trip for completely dead registers which are too large - we only set split points for live registers. So dead ones are never split, and if they happened to be too large, they'd trip asserts. To fix this, call compact_virtual_grfs() to eliminate dead registers before splitting. v2: Add a comment written by Iago. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-24 14:33:38 -07:00
Kenneth Graunke	3728ee000a	i965: Drop unnecessary switch statement in nir_setup_outputs() TCS and FS are skipped above. CS has no output variables. All remaining cases take the same path. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-24 14:33:38 -07:00
Brian Paul	88a618ce86	tgsi: trivial build fix for MSVC Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-24 14:16:07 -07:00
Samuel Pitoiset	6dbb8d12a8	nv50/ir: do not perform global membar for shared memory Shared memory is local to CTA, thus we should only wait for prior memory writes which are visible to other threads in the same CTA, and not at global level. This should speedup compute shaders which use shared memory. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-24 22:51:54 +02:00
Axel Davy	eed605a473	st/nine: Fix locking CubeTexture surfaces. Only one face of Cubetextures was locked when in DEFAULT Pool. Fixes: https://github.com/iXit/Mesa-3D/issues/129 CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-24 21:56:44 +02:00
Axel Davy	fe7bb46134	st/nine: Fix mistake in Volume9 UnlockBox In the format fallback path, the height was used instead of the depth. CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-24 21:56:44 +02:00
Axel Davy	942778099e	st/nine: Use align_calloc instead of align_malloc We are not sure exactly what needs to be 0 initialized, but we are missing some cases. 0 initialize all our current aligned allocation. Fixes Tree of Savior visual issues. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-24 21:56:44 +02:00
Axel Davy	54010cf8b6	gallium/util: Add align_calloc Add implementation for align_calloc, which is align_malloc + memset. v2: add if (ptr) before memset. Fix indentation. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:56:44 +02:00
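The "align_malloc + memset" idea in the commit above, including the v2 NULL check before the memset, might look roughly like this. It is a sketch under stated assumptions rather than the gallium/util implementation: sketch_align_malloc, sketch_align_free and sketch_align_calloc are hypothetical names, and the aligned allocator is a simple over-allocating stand-in that requires a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical aligned allocator: over-allocates, rounds the address up to
 * the requested power-of-two alignment, and stores the original malloc()
 * pointer just before the returned block so it can be freed later. */
static void *sketch_align_malloc(size_t size, size_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

   void *base = malloc(size + alignment + sizeof(void *));
   if (!base)
      return NULL;

   uintptr_t addr = (uintptr_t)base + sizeof(void *);
   addr = (addr + alignment - 1) & ~(uintptr_t)(alignment - 1);
   ((void **)addr)[-1] = base;
   return (void *)addr;
}

static void sketch_align_free(void *ptr)
{
   if (ptr)
      free(((void **)ptr)[-1]);
}

/* The calloc variant: allocate, then zero the block only if the
 * allocation succeeded. */
static void *sketch_align_calloc(size_t size, size_t alignment)
{
   void *ptr = sketch_align_malloc(size, alignment);
   if (ptr)
      memset(ptr, 0, size);
   return ptr;
}
```

Zeroing every aligned allocation trades a little time per allocation for not having to track which fields must start at 0.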
Axel Davy	25beccb379	st/nine: Fix leak with integer and boolean constants Leak introduced by: `a83dce0128` The patch also moves the part to release changed.vs_const_i and changed.vs_const_b before the if (!cb.buffer_size) check, to avoid reuploading every draw call if integer or boolean constants are dirty, but the shaders use no constants. Signed-off-by: Axel Davy <axel.davy@ens.fr> CC: "13.0" <mesa-stable@lists.freedesktop.org>	2016-10-24 21:56:44 +02:00
Marek Olšák	f35b1d156b	tgsi/scan: scan texture offset operands This seems important considering how much we depend on some of the flags. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:38 +02:00
Marek Olšák	a2f98dff14	tgsi/scan: move src operand processing into a separate function the next commit will need this Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:36 +02:00
Marek Olšák	72267a25db	tgsi/scan: get information about shader buffer usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:35 +02:00
Marek Olšák	d89890d000	tgsi/scan: handle indirect image indexing correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:33 +02:00
Marek Olšák	ac37720f51	tgsi/scan: don't treat RESQ etc. as memory instructions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:30 +02:00
Marek Olšák	f095a4eb17	tgsi/scan: get information about indirect 2D file access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:28 +02:00
Marek Olšák	965a5f1810	tgsi/scan: get information about indirect CONST access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-24 21:41:26 +02:00
Anuj Phogat	35010718bc	i965/gen8: Don't enable alpha test and alpha to coverage if draw buffer zero is integer type We follow this rule at multiple places in i965 driver. This patch doesn't fix any testcase. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-24 11:07:39 -07:00
Anuj Phogat	93b84cae54	i965/gen8: Use DrawBuffer->_IntegerBuffers in gen8_upload_ps_blend() No functional changes in this patch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-24 11:07:39 -07:00
Anuj Phogat	e2dd582de8	i965/gen8: Use DrawBuffer->_IntegerBuffers in gen8_upload_blend_state() No functional changes in this patch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-24 11:07:39 -07:00
Samuel Pitoiset	d588e4f192	nv50/ir: display OP_BAR subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-24 18:53:45 +02:00
Iago Toral Quiroga	537dce06ec	glsl: add matrix layout information to interface block types So far we have been checking that interface block definitions had matching matrix layouts by comparing the definitions of their fields, however, this does not cover the case where the interface blocks are defined with mismatching matrix layouts but don't define any field with a matrix type. In this case Mesa will not fail to link because none of the fields will inherit the mismatching layout qualifier. This patch fixes the problem in the same way we fixed it for packing layout information: we add the layout information to the interface type and then we check it matches during the uniform block linking process. v2: Fix unit tests so they pass the new parameter to glsl_type::get_interface_instance() Fixes: dEQP-GLES31.functional.shaders.linkage.uniform.block.layout_qualifier_mismatch_3 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98245 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-10-24 15:49:53 +02:00
Nicolai Hähnle	3d6b5dee3a	st/mesa: cleanup and fix primitive restart for indirect draws There are three intended functional changes here: 1. OpenGL 4.5 clarifies that primitive restart should only apply with index buffers, so make that change explicit in the indirect draw path. 2. Make PrimitiveRestartFixedIndex work with indirect draws. 3. The change where primitive_restart is only set when the restart index can actually have an effect (based on the size of indices) is also applied for indirect draws. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-24 15:24:23 +02:00
Timothy Arceri	6dbe8a1b9f	glsl/mesa: remove unused namespace support from the symbol table Namespace support seems to have been unused for a very long time. Previously the hash table entry was never removed and the symbol name wasn't freed until the symbol table was destroyed. In theory this could reduce the number of times we need to copy a string as duplicate names are reused. However in practice there is likely only a limited number of symbols that are the same and this is likely to cause other less than optimal behaviour such as the hash_table continuously growing. Along with dropping namespace support this change removes entries from the hash table as they become unused. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-10-24 21:40:39 +11:00
Jonathan Gray	907ace5798	mapi: automake: set VISIBILITY_CFLAGS for shared glapi shared glapi was previously built without setting CFLAGS for AM_CFLAGS and VISIBILITY_CFLAGS. This resulted in symbols being exported that shouldn't be. The x86 and sparc assembly versions of the dispatch table partially mitigated this by using .hidden. Otherwise shared_dispatch_stub_* were being exported. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "11.2 12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-24 11:29:23 +01:00
Emil Velikov	8df581520a	anv: automake: cleanup the generated json file during make clean Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-24 11:29:12 +01:00
Stencel, Joanna	2e0ab61e29	egl/wayland: add missing destroy_window callback The original patch by Joanna added the function pointer and callback yet things got only partially applied - the infra was added, but the implementation was missing. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Fixes: `690ead4a13` ("egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-10-24 09:50:53 +01:00
Emil Velikov	3511a86111	automake: don't forget to pick wglext.h in the tarball Earlier commit reworked the header install rules, to ensure that the correct ones are installed only as needed. By doing so it dropped a wildcard which was effectively including the wglext.h header in the tarball. Add the header to the top-level noinst_HEADERS, since it is not meant to be installed (autoconf is not used on Windows platforms). Fixes: `a89faa2022` ("autoconf: Make header install distinct for various APIs (v2)") Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Cc: Chuck Atkins <chuck.atkins@kitware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-10-24 09:44:26 +01:00
Samuel Iglesias Gonsálvez	b50b82b8a5	glsl/es31: precision qualifier doesn't need to match in shader interface block members It is specific only to GLSL ES 3.1. From the spec, section 4.3.9 "Interface Blocks": "Matched block names within a shader interface (as defined above) must match in terms of having the same number of declarations with the same sequence of types and the same sequence of member names, as well as having the same qualification as specified in section 9.2 (“Matching of Qualifiers“)." But in GLSL ES 3.0 and 3.2, it is the opposite: "Matched block names within a shader interface (as defined above) must match in terms of having the same number of declarations with the same sequence of types, precisions and the same sequence of member names, as well as having the matching member-wise layout qualification as defined in section 9.2 (“Matching of Qualifiers”)." Fixes: dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98243 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-10-24 07:04:38 +02:00
Samuel Iglesias Gonsálvez	849390a61a	glsl: move intrastage_match() after interstage_member_mismatch() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-10-24 07:04:32 +02:00
Dave Airlie	a969548f59	radv: allow cmask transitions without fast clear This fixes dEQP-VK.pipeline.multisample.sampled_image* These all render to multisampled image, and then sample from it, so we must transition it correctly, since we have a cmask and fmask this will cause the correct transition. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-24 11:03:09 +10:00
Ilia Mirkin	7b7eb7170d	nv50/ir: it appears that OP_DISCARD can't take a join modifier nvdisasm does not print a .S even though the bit is set. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-22 12:02:35 -04:00
Ilia Mirkin	adad576bfc	nv50/ir: use levelZero for non-frag tex/txp ops radeonsi also does the same thing. I suspect that this is likely to be a no-op in reality, but it brings nouveau code closer to what the blob produces. Plus it makes sense to not try to do auto-derivatives on this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-22 12:02:35 -04:00
Ilia Mirkin	3fdeb7c983	gallium: add PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS This allows the driver to signal that it can't handle random interleaving of attributes across buffers. This is required for ARB_transform_feedback3, and it's initialized to whatever the previous value of PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME was except for nv50 where it is disabled. Note that the proprietary drivers never expose ARB_transform_feedback3 on any GT21x's (where nouveau previously did), and after some effort I was unable to get it to work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-22 12:02:35 -04:00
Samuel Pitoiset	6e08f3e96c	nvc0/ir: remove outdated comment about SHLADD Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-22 14:50:17 +02:00
Eric Anholt	8ff4182876	vc4: Avoid making temporaries for assignments to NIR registers. Getting stores to NIR regs to not generate new MOVs is tricky, since the result we're trying to store into the NIR reg may have been from a conditional update of a temp, or a series of packed writes. The easiest solution seems to be to require that nir_store_dest()'s arg comes from an SSA temp. This causes us to put in a few more temporary MOVs in the NIR SSA dest case, but copy propagation successfully cleans those up. The shader-db change is modest: total instructions in shared programs: 93774 -> 93598 (-0.19%) instructions in affected programs: 14760 -> 14584 (-1.19%) total estimated cycles in shared programs: 212135 -> 211946 (-0.09%) estimated cycles in affected programs: 27005 -> 26816 (-0.70%) but I was seeing patterns in some register-allocation failures in DEQP tests that looked like the extra MOVs would increase maximum register pressure in loops. Some debug code indicates that that's not the case, though I'm still a bit confused by that result.	2016-10-21 14:12:22 -07:00
Eric Anholt	a689b8b9df	vc4: Add a comment with discussion of how simulation works.	2016-10-21 14:12:22 -07:00
Eric Anholt	83ffb607b7	vc4: Move simulator winsys mapping and tracking to the simulator. One tiny hack is left in vc4_bufmgr.c for what kind of mapping we got so that we can free it.	2016-10-21 14:12:22 -07:00
Eric Anholt	1c38ee380d	vc4: Move simulator memory management to a u_mm.h heap. Now we aren't limited to 256MB total allocated across a driver instance, just 256MB at one time. We're still copying in and out, which should get fixed.	2016-10-21 14:12:22 -07:00
Eric Anholt	9f75522382	vc4: Move simulator globals into a struct. I would like to put a couple more things in here, so it's time to package it up.	2016-10-21 14:12:22 -07:00
Eric Anholt	78087676c9	vc4: Restructure the simulator mode. Rather than having simulator mode changes scattered around vc4_bufmgr.c and vc4_screen.c, make vc4_bufmgr.c just call a vc4_simulator_ioctl, which then dispatches to a corresponding implementation. This will give the simulator support a centralized place to do tricks like storing most BOs directly in simulator memory rather than copying in and out. This leaves special casing of mmaping BOs and execution, because of the winsys mapping.	2016-10-21 14:12:22 -07:00
Eric Anholt	1d7874fa7b	vc4: Fix termination of the initial scan for branch targets. The loop is scanning until the original max_ip (size of the BO), but we want to not examine any code after the PROG_END's delay slots. There was a block trying to do that, except that we had some early continue statements if the signal wasn't a PROG_END or a BRANCH. The failure mode would be that a valid shader is rejected because some undefined memory after the PROG_END slots is parsed as a branch and the rest of its setup is illegal. I haven't seen this in the wild, but valgrind was complaining and the new userland simulator code started triggering it.	2016-10-21 14:12:06 -07:00
Jason Ekstrand	3f05fc62f9	configure: Get rid of the --disable-vulkan-icd-full-driver-path flag Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-10-21 09:30:24 -07:00
Jason Ekstrand	7ea4ef8849	anv: Always use the full driver path in the intel_icd.*.json Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-10-21 09:30:23 -07:00
Jason Ekstrand	d96345de98	anv: Suffix the intel_icd file with the host CPU Vulkan has a multi-arch problem... The idea behind the Vulkan loader is that you have a little json file on your disk that tells the loader where to find drivers. The loader looks for these json files in standard locations, and then goes and loads the my_driver.so's that they specify. This allows you as a driver implementer to put your driver wherever on the disk you want so long as the ICD points in the right place. For a multi-arch system, however, you may have multiple libvulkan_intel.so files installed that the loader needs to pick depending on architecture. Since the ICD file format does not specify any architecture information, you can't tell the loader where to find the 32-bit version vs. the 64-bit version. The way that packagers have been dealing with this is to place libvulkan_intel.so in the top level lib directory and provide just a name (and no path) to the loader. It will then use the regular system search paths and find the correct driver. While this solution works fine for distro-installed Vulkan drivers, it doesn't work so well for user-installed drivers because they may put it in /opt or $HOME/.local or some other more exotic location. In this case, you can't use an ICD json file with just a library name because it doesn't know where to find it; you also have to add that to your library lookup path via LD_LIBRARY_PATH or similar. This patch handles both use-cases by taking advantage of the fact that the loader dlopen()s each of the drivers and, if one dlopen() call fails, it silently continues on to open other drivers. By suffixing the icd file, we can provide two different json files: intel_icd.x86_64.json and intel_icd.i686.json with different paths. Since dlopen() will only succeed on the libvulkan_intel.so of the right arch, the loader will happily ignore the others and load that one. 
This allows us to properly handle multi-arch while still providing a full path so user installs will work fine. I tested this on my Fedora 25 machine with 32 and 64-bit builds of our Vulkan driver installed and 32 and 64-bit builds of crucible. It seems to work just fine. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-10-21 09:30:20 -07:00
Nicolai Hähnle	17353ef043	radeonsi: fix a regression in si_eliminate_const_output A constant value of float type is not necessarily a ConstantFP: it could also be a constant expression that for some reason hasn't been folded. This fixes a regression in GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 that was introduced by commit `3ec9975555`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-21 09:59:26 +02:00
Ilia Mirkin	8cf0f05713	nv50,nvc0: don't keep track of whether fb rt0 is integer-only This reverts commits `1af0641db3` and `a6ad49cbbd`. st/mesa adjusts the rasterizer state for us now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-21 02:28:26 -04:00
Francisco Jerez	811eb7f178	Revert "Revert "mapi: export all GLES 3.2 functions in libGLESv2.so"" This reverts commit `85e9bbc14d`. The previous commit should help with the scons build failure caused by the original commit. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2016-10-20 15:57:18 -07:00
Francisco Jerez	15a084a039	glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category. These two GLES 3.2 entry points were being defined in the category of the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions respectively instead of in the ES3.2 category. Defining them in the ES3.2 category makes sure that the gl_procs.py generator emits declarations in the glprocs.h header file for the unsuffixed GLES-only entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR respectively alias. This should avoid a compilation failure during scons builds in combination with "mapi: export all GLES 3.2 functions in libGLESv2.so". Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2016-10-20 15:55:21 -07:00
Vinson Lee	889ee4da05	util: Include string.h in bitscan.h. Fix build error with clang. Compiling src/compiler/glsl/link_varyings.cpp ... In file included from src/compiler/glsl/link_varyings.cpp:33: In file included from src/compiler/glsl/glsl_symbol_table.h:34: In file included from src/compiler/glsl/ir.h:33: In file included from src/compiler/glsl_types.h:29: /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration extern int ffs (int __i) __THROW __attribute__ ((__const__)); ^ src/util/bitscan.h:51:13: note: expanded from macro 'ffs' ^ src/util/bitscan.h:96:18: note: previous declaration is here const int i = ffs(*mask) - 1; ^ src/util/bitscan.h:51:13: note: expanded from macro 'ffs' ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97952 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-20 14:54:28 -07:00
Samuel Pitoiset	42273edf79	nvc0: do not break 3D state by pushing MS coordinates on Fermi Long story short, 3D and CP are aliased on Fermi and initializing compute after pushing the MS sample coordinate offsets seems to corrupt 3D state for weird reasons. I still don't have the faintest clue what is going on, but this seems to only affect Fermi generation. A possible fix could be to use two different channels, one for 3D and one for CP. This fixes a bunch of regressions pinpointed by piglit. Fixes: "nvc0: fix up image support for allowing multiple samples" Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-20 19:59:08 +02:00
Samuel Pitoiset	24e15aa198	nvc0: translate compute shaders at program creation This makes shader-db report results for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-20 19:46:18 +02:00
Ben Widawsky	ffd9060b23	i965: Reorder PCI ID list to match release order I have some OCD... Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2016-10-20 08:54:03 -07:00
Ben Widawsky	b8509c8936	i965: Add some APL and KBL SKU strings We got a couple for products that exist on ark.intel.com, so let's just put them in now. Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2016-10-20 08:54:03 -07:00
Brian Paul	bd60fb49ba	vbo: clean up with 'indent', whitespace fixes, etc in vbo_exec_array.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-20 09:47:21 -06:00
Brian Paul	8b9965442a	vbo: whitespace fixes and reformatting in vbo_exec_api.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-20 09:47:21 -06:00
Brian Paul	8320bf1a7e	vbo: minor clean-up in vbo_exec_api.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-20 09:47:21 -06:00
Brian Paul	1098e6957c	vbo: move attribute type assignment If the attribute type is changing, we would have found that earlier in the ATTR_UNION() macro and would have called vbo_exec_fixup_vertex(). So move the assignment into that function so we don't do it every time. No Piglit regressions. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-20 09:47:21 -06:00
Brian Paul	4c3c9f1441	vbo: rename reset_attrfv() to vbo_reset_all_attr() Use a better name. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-20 09:47:21 -06:00
Brian Paul	7693bcde28	vbo: make vbo_reset_attr() static Not called from any other file. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-20 09:47:21 -06:00
Brian Paul	9d6d9b28f7	vbo: trivial indentation fix in vbo_exec_api.c	2016-10-20 09:47:21 -06:00
Marek Olšák	c2a602d21a	gallivm: try to fix build with LLVM <= 3.4 due to missing CallSite.h Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-10-20 17:45:23 +02:00
Marek Olšák	f19f71830b	radeonsi: fix build of si_eliminate_const_vs_outputs on LLVM <= 3.8 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-20 11:07:50 +02:00
Marek Olšák	2db56434d4	gallivm: add wrappers for missing functions in LLVM <= 3.8 radeonsi needs these. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-20 11:07:50 +02:00
Nicolai Hähnle	4a2dbfff05	radeonsi: fix 64-bit loads from LDS Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among others. Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-20 10:37:07 +02:00
Nicolai Hähnle	bfa50f88ce	st/mesa: only set primitive_restart when the restart index is in range Even when enabled, primitive restart has no effect when the restart index is larger than the representable values in the index buffer. Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert for radeonsi VI. v2: add an explanatory comment Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-10-20 10:37:06 +02:00
Nicolai Hähnle	3d9b57e493	st/glsl_to_tgsi: sort input and output decls by TGSI index Fixes a regression introduced by commit `777dcf81b`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98307 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-20 10:37:06 +02:00
Nicolai Hähnle	a1895685f8	st/glsl_to_tgsi: fix block copies of arrays of structs Use a full writemask in this case. This is relevant e.g. when a function has an inout argument which is an array of structs. v2: use C-style comment (Timothy Arceri) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-20 10:37:01 +02:00
Nicolai Hähnle	ca592af880	st/glsl_to_tgsi: fix block copies of arrays of doubles Set the type of the left-hand side to the same as the right-hand side, so that when the base type is double, the writemask of the MOV instruction is properly fixed up. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-20 10:30:00 +02:00
Iago Toral Quiroga	3da08e1664	glsl: Indirect array indexing on non-last SSBO member must fail compilation After the changes in commit `5b2675093e`, we moved this check to the linker, but the spec expects this to be checked at compile-time. There are dEQP tests that expect an error at compile time and the spec seems to confirm that expectation: "Except for the last declared member of a shader storage block (section 4.3.9 “Interface Blocks”), the size of an array must be declared (explicitly sized) before it is indexed with anything other than an integral constant expression. The size of any array must be declared before passing it as an argument to a function. Violation of any of these rules result in compile-time errors. It is legal to declare an array without a size (unsized) and then later redeclare the same name as an array of the same type and specify a size, or index it only with integral constant expressions (implicitly sized)." Commit `5b2675093e` tries to take care of the case where we have implicitly sized arrays in SSBOs and it does so by checking the max_array_access field in ir_variable during linking. In this patch we change the approach: we look for indirect access on SSBO arrays, and when we find one, we emit a compile-time error if the accessed member is not the last in the SSBO definition. There is a corner case that the specs do not address directly though and that dEQP checks for: the case of an unsized array in an SSBO definition that is not defined last but is never used in the shader code either. 
The following dEQP tests expect a compile-time error in this scenario: dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader However, since the unsized array is never used it is never indexed with a non-constant expression, so by the spec quotation above, it should be valid and the tests are probably incorrect. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-20 08:26:51 +02:00
Ilia Mirkin	cd45d758ff	nv50/ir: process texture offset sources as regular sources With ARB_gpu_shader5, texture offsets can be any source, including TEMPs and IN's. Make sure to process them as regular sources so that we pick up masks, etc. This should fix some CTS tests that feed offsets directly to textureGatherOffset, and we were not picking up the input use, thus not advertising it in the shader header. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dave Airlie <airlied@redhat.com> Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-19 21:02:01 -04:00
Ilia Mirkin	313fba5ee1	nv50,nvc0: avoid reading out of bounds when getting bogus so info The state tracker tries to attach the info to the wrong shader. This is easy enough to protect against. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>	2016-10-19 21:02:01 -04:00
Eric Engestrom	8bf7717e1f	wsi/wayland: fix error path Fixes: `1720bbd353` ("anv/wsi: split image alloc/free out to separate fns.") Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-20 10:53:59 +10:00
Dave Airlie	b0f131b0bf	anv: drop unused zero macro. I can't see this being used anywhere. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-20 10:53:37 +10:00
Dave Airlie	d842546ad1	radv: use emit_icmp for samples_identical On a debug llvm build we'd assert on the next compare when the return from samples_identical was i1 instead of i32. Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-20 01:43:55 +01:00
Jordan Justen	64c3d73535	i965/cs: Don't use a thread channel ID for small local sizes When the local group size is 8 or less, we will execute the program at most 1 time. Therefore, the local channel ID will always be 0. By using a constant 0 in this case we can prevent using push constant data. This is not expected to be a common occurrence in real applications, but it has been seen in tests. We could extend this optimization to 16 and 32 for SIMD16 and SIMD32, but it gets a bit more complicated, because this optimization is currently being done early on, before we have decided the SIMD size. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-19 16:51:45 -07:00
Jordan Justen	1fa000a33b	i965/cs: Use udiv/umod for local IDs This allows for more optimizations relating to power-of-two divisions. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-19 16:51:45 -07:00
Timothy Arceri	740a8fa1e2	mesa: remove unused LocalSizeVariable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-20 10:30:32 +11:00
Samuel Pitoiset	2b6e04e91f	nvc0/ir: simplify predicate logic for GK104 atomic operations The predicate is always CC_NOT_P as defined in processSurfaceCoordsNVE4(), so we only want to emit OR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-19 23:53:57 +02:00
Samuel Pitoiset	974ab614d3	nvc0/ir: remove useless NVC0LoweringPass::gMemBase Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-19 23:53:48 +02:00
Samuel Pitoiset	03dc87caab	nv50/ir: print CCTL subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-19 23:53:39 +02:00
Ian Romanick	4d35683d91	nir: Optimize integer division and modulus with 1 The previous power-of-two rules didn't catch idiv (because i965 doesn't set lower_idiv) and imod cases. The udiv and umod cases should have been caught, but I included them for orthogonality. This fixes silly code observed from compute shaders with local_size_[xy] = 1. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98299 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 14:25:10 -07:00
Marek Olšák	baed5eab82	configure.ac: enable EGL platform DRM if GBM is enabled since GBM is enabled by default, this is also enabled by default the whitespace changes remove tabs Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-19 23:19:16 +02:00
Marek Olšák	4650a27ba1	configure.ac: enable GBM by default Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-19 23:19:16 +02:00
Marek Olšák	0e075700fa	configure.ac: print whether GBM is enabled Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-19 23:19:16 +02:00
Marek Olšák	3ec9975555	radeonsi: eliminate trivial constant VS outputs These constant value VS PARAM exports: - 0,0,0,0 - 0,0,0,1 - 1,1,1,0 - 1,1,1,1 can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports can be removed from the IR to save export & parameter memory. After LLVM optimizations, analyze the IR to see which exports are equal to the ones listed above (or undef) and remove them if they are. Targeted use cases: - All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them are unused by PS (such as Witcher 2 below). - VS output arrays with unused elements that the GLSL compiler can't eliminate (such as Batman below). The shader-db deltas are quite interesting: (not from upstream si-report.py, it won't be upstreamed) PERCENTAGE DELTAS Shaders PARAM exports (affected only) batman_arkham_origins 589 -67.17 % bioshock-infinite 1769 -0.47 % dirt-showdown 548 -2.68 % dota2 1747 -3.36 % f1-2015 776 -4.94 % left_4_dead_2 1762 -0.07 % metro_2033_redux 2670 -0.43 % portal 474 -0.22 % talos_principle 324 -3.63 % warsow 176 -2.20 % witcher2 1040 -73.78 % ---------------------------------------- All affected 991 -65.37 % ... 9681 -> 3353 ---------------------------------------- Total 26725 -10.82 % ... 58490 -> 52162 v2: treat Undef as both 0 and 1 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)	2016-10-19 22:21:46 +02:00
Samuel Pitoiset	041da0ae81	nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT Found that information message while replaying a trace from Metro 2033 Redux. Mark that property as useless for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-19 21:02:50 +02:00
Emil Velikov	1a9b0221bc	docs: add 13.1.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-19 19:10:16 +01:00
Emil Velikov	3ef8d4288a	docs: rename release notes to 13.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-19 19:10:16 +01:00
Marek Olšák	a2ea653a49	radeonsi: remove cb0_is_integer handling st/mesa does this for us. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	54f8efeb02	st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs v2: rebased Reviewed-by: Brian Paul <brianp@vmware.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	c64da9d499	mesa: remove gl_shader_compiler_options::EmitNoNoise it's always true Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	2897cb3dba	glsl_to_tgsi: remove code for fixing up TGSI labels I don't know what this was supposed to do, but all TGSI labels were always 0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	ec35ff4e2b	glsl_to_tgsi: remove subroutine support Never used. The GLSL compiler doesn't even look at EmitNoFunctions. v2: add back "return" support in "main" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	eacda2c080	mesa_to_tgsi: remove remnants of flow control and subroutine support Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	82f4c0126d	mesa_to_tgsi: drop support for instructions that can't occur here Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	4e42898d9d	glsl_to_tgsi: allocate glsl_to_tgsi_instruction::tex_offsets on demand sizeof(glsl_to_tgsi_instruction): 384 -> 264 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	4d3d620f26	glsl_to_tgsi: merge buffer and sampler fields in glsl_to_tgsi_instruction sizeof(glsl_to_tgsi_instruction): 416 -> 384 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	dbf64ea28b	glsl_to_tgsi: reduce the size of glsl_to_tgsi_instruction using bitfields sizeof(glsl_to_tgsi_instruction): 464 -> 416 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	9015cbb3a3	glsl_to_tgsi: reduce the size of st_dst_reg and st_src_reg I noticed that glsl_to_tgsi_instruction is too huge. sizeof(glsl_to_tgsi_instruction): 752 -> 464 (-38%) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	222c599b61	glsl_to_tgsi: remove unused st_translate::tex_offsets Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	0d95eeb79c	glsl_to_tgsi: remove unused parameters from calc_deref_offsets Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	6980480052	glsl_to_tgsi: use array_id for temp arrays instead of hacking high bits Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Adam Jackson	4276b5c16a	reviewers: Throw myself on the GLX grenade Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-19 12:37:22 -04:00
Eric Engestrom	8acb79dfac	egl: bring back the default glapi.so name Earlier commit replaced the default platform specific libglapi.so name with an #error. This may have been overzealous since the name is correct for the BSD platforms, at least. Reinstate the hunk - bringing back OpenBSD, et al. to a successful build state. Fixes: `7a9c92d071` ("egl/dri2: non-shared glapi cleanups") [Emil Velikov: format the patch from Eric, add commit message and tag.] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-10-19 15:09:26 +01:00
Iago Toral Quiroga	66d8bd3b7e	i965: fix subnr overflow in suboffset() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-19 11:48:21 +02:00
Dave Airlie	86c4575a81	radv: decompress fmask before reading using texture unit Before we can read the fmask using the compute shader, we need to decompress the fmask in place. This fixes a bunch of remaining failures and hopefully multisampling in Talos.	2016-10-19 17:39:47 +10:00
Dave Airlie	67c91ef2a2	radv: fix samples_identical return value. This was returning an inversion, so not doing as it should have. We need to compare the fmask value with 0, and return the result from that.	2016-10-19 17:39:01 +10:00
Dave Airlie	93ba86c307	radv: fix wsi porting regression in swapchain destroy. The code in anv is right, there's a pending patch to fix this up differently, but I'll sync the code for now.	2016-10-19 13:54:49 +10:00
Dave Airlie	63406b669e	radv: fix fmask ptr issue We were using the wrong descriptor in the fmask picking code.	2016-10-19 13:16:25 +10:00
Dave Airlie	db7ae14b60	radv: simplify fast clear shaders There is no need for anything but a noop shader here.	2016-10-19 13:16:14 +10:00
Dave Airlie	1ec5e6e702	vulkan/wsi: fix out of tree build.	2016-10-19 10:54:42 +10:00
Dave Airlie	b0e11a153c	radv: start using defines for the user sgpr offsets This adds some comments and adds defines for the user sgprs, so that we can move them around easier later and not have to change/revalidate every one of these. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 10:17:48 +10:00
Dave Airlie	6c3bd1cdb3	radv: port to common wsi codebase This drops all the radv WSI code in favour of using the new shared code that was ported from anv This regresses Talos for now, Jason has pointed out the bug is in Talos and we should wait for them to fix it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	3f7ef24889	anv: move to using shared wsi code This moves the shared code to a common subdirectory and makes anv linked to that code instead of the copy it was using. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	ec0bc14a70	anv/wsi: remove all anv references from WSI common code The WSI code should now be clean for sharing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	971523410f	anv: move common wsi code to x11/wayland common files. Next task is to rename all the anv_ out of this, and move to a common location Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	e0d15fbe1d	anv/wsi/wayland: add callback to get device format properties. This avoids having to know the toplevel API name. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	4392de6771	anv/wsi/wl: stop using device in more places Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	507722b882	anv/wsi: split out surface creation to avoid instance API Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	954cd09e66	anv/wsi: move further away from passing anv displays around Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	1720bbd353	anv/wsi: split image alloc/free out to separate fns. This moves these outside the wsi platform code, so we can reuse that code Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	828b8dbce4	anv/wsi: switch to using VkDevice in swapchain Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	6542001345	anv/wsi/x11: more refactoring to use generic handles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	340e72f056	anv/wsi/x11: start refactoring out the image allocation/free functionality Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	c264c272a5	anv/wsi: drop device from get format Just use the wsi_device instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	467d161e6a	anv/wsi: remove device from get_support interface replace with wsi_device and allocator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	b8e7460563	anv/wsi/x11: abstract WSI interface from internals. This allows the API and the internals to be split, and the internals shared. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	36e6be2e0d	anv/wsi/x11: push anv_device out of the init/finish routines Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	7c10258567	anv/wsi: abstract wsi interfaces away from device a bit more. This is a step towards separating out the wsi code for sharing Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	be61fff6da	anv/wsi/x11: push device out of x11 connection fns. just pass the allocator/wsi_interface instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	e9cf7c4460	anv/wsi: drop device from get caps Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	0e4abc3e10	anv/wsi: drop get present modes device arg Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	32d70c0d66	radv/anv/wsi: drop unneeded parameter Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Roland Scheidegger	aeceec54a8	draw: improve vertex fetch (v2) The per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	0942fe548e	draw: improved handling of undefined inputs Previous attempts to zero initialize all inputs were not really optimal (though no performance impact was measurable). In fact this is not really necessary, since we know the max number of inputs used. Instead, just generate fetch for up to max inputs used by the shader, directly replacing inputs for which there was no vertex element by zero. This also cleans up key generation, which previously would have stored some garbage for these elements. And also drop the assertion which indicates such bogus usage by a debug_printf (the whole point of initializing the undefined inputs was to make this case safe to handle). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	d1b4a3451e	gallivm: print out time for jitting functions with GALLIVM_DEBUG=perf Compilation to actual machine code can easily take as much time as the optimization passes on the IR if not more, so print this out too. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	6f2f0daeb4	gallivm: Use native packs and unpacks for the lerps For the texturing packs, things looked pretty terrible. For every lerp, we were repacking the values, and while those look sort of cheap with 128bit, with 256bit we end up with 2 of them instead of just 1 but worse, plus 2 extracts too (the unpack, however, works fine with a single instruction, albeit only with llvm 3.8 - the vpmovzxbw). Ideally we'd use more clever pack for llvmpipe backend conversion too since we actually use the "wrong" shuffle (which is more work) when doing the fs twiddle just so we end up with the wrong order for being able to do native pack when converting from 2x8f -> 1x16b. But this requires some refactoring, since the untwiddle is separate from conversion. This is only used for avx2 256bit pack/unpack for now. Improves openarena scores by 8% or so, though overall it's still pretty disappointing how much faster 256bit vectors are even with avx2 (or rather, aren't...). And, of course, eliminating the needless packs/unpacks in the first place would eliminate most of that advantage (not quite all) from this patch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Dave Airlie	7e1e06bc75	anv: drop pointless struct decl. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	e4df1830e4	radv: drop pointless struct decl. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	4450f40519	radv: move to using shared vk_alloc inlines. This moves to the shared vk_alloc inlines for vulkan memory allocations. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	1ae6ece980	anv: move to using vk_alloc helpers. This moves all the alloc/free in anv to the generic helpers. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	0cfd428aef	vulkan: add vk_alloc.h shared allocation inlines. vulkan allocation allows for overriding the allocator used, add some macros for anv/radv to share for this. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	2c6d8bff03	anv: drop local MIN/MAX macros. Use the ones from mesa, most places already did. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	c6f1077e0d	radv: drop local MIN/MAX macros. Use the ones in macros.h instead. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	78bce52f9a	util: move min/max/clamp macros to util macros.h Although the vulkan drivers include mesa macros.h, for radv I'd like to move away from that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	f5daaba0fd	radv: make use of shared vector helper. This removes the vector code from radv in favour of sharing code with anv. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	8df014c01a	anv: port to using new u_vector shared helper. This just removes the anv vector code and uses the new helper. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	008f54f63a	util: add vector util code. This is ported from anv, both anv and radv can share this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Brian Paul	8b731b8b03	svga: minor code improvements in svga_validate_pipe_sampler_view() Use the 'texture' local var in more places. Rename 'pFormat' to 'viewFormat'. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-18 16:16:26 -06:00
Lionel Landwerlin	0ca134aa9f	intel: genxml: add SAMPLER_BORDER_COLOR_STATE structures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-18 22:43:41 +01:00
Boyuan Zhang	5567145d59	st/va: force to flush the last p frame in idr period During dual instance encoding submission, if the second encode task and first encode task have no reference dependency, e.g. p following with idr-frame, there is a chance the second task will use for its reconstructed picture buffer the same buffer used by first task for its reference/reconstructed picture. In this case, buffer corruption may occur depending on encoding speed. Fix is to force flush these two tasks separately to avoid race condition Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-18 15:16:34 -04:00
Chad Versace	52a6483e8a	egl/surfaceless: Fix segfault in eglSwapBuffers Since commit `63c5d5c6c4`, the surfaceless platform has allowed creation of pbuffer surfaces. But the vtable entry for eglSwapBuffers has remained NULL. Discovered by running a little pbuffer test. Cc: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-18 11:12:22 -07:00
Marek Olšák	21af69e753	radeonsi: rename prefixes from radeon to si Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:08 +02:00
Marek Olšák	6e475fefa1	radeonsi: merge radeon_llvm_context and si_shader_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:06 +02:00
Marek Olšák	5ab25bb4ba	radeonsi: import all TGSI->LLVM code from gallium/radeon Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:04 +02:00
Marek Olšák	4967cacdfa	gallium/radeon: simplify initialization of 64-bit gallivm builders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:03 +02:00
Marek Olšák	502dad4dca	gallium/radeon: remove unused radeon_llvm_reg_index_soa Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:01 +02:00
Marek Olšák	4e5d076fcf	radeonsi: move LLVM ALU codegen into radeonsi Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:40:59 +02:00
Jonathan Gray	41754f743f	genxml: add generated headers to EXTRA_DIST Building the Mesa 12.0.3 distfile failed on a system without python as generated files were not included in the distfile. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Jonathan Gray	23392abf50	mesa: automake: include mesa_glinterop.h in distfile Add mesa_glinterop.h to the list of headers that will get included in the distfile as it is required to build Mesa itself. Corrects a regression introduced in `a89faa2022`. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Jonathan Gray	2fc1374be6	egl: remove docs directory from EXTRA_DIST The egl docs directory no longer exists as of `88b5c36fe1`. Remove it from EXTRA_DIST to unbreak 'make dist' Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Jonathan Gray	27572db46d	genxml: avoid using a GNU make pattern rule % pattern rules are a GNU extension. Convert the use of one to an inference rule to allow this to build on OpenBSD. This is a related change to the one made in `e3d43dc5ea` Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Emil Velikov	9898c60745	configure.ac: use a single require_libdrm helper Rather than having 4-5 places which do the explicit check/message just polish the gallium helper and use it everywhere. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:38 +01:00
Emil Velikov	3e079c3f86	configure.ac: remove no longer needed *_pci_id logic Previously it was used to differentiate between the different codepaths in the loader. Although strictly speaking the (core) of the loader is only used when a hardware device is available. The latter of which in itself requires libdrm (one of the codepaths available). That said, all the configure toggles which relate to enabling/using hw device should attribute and require libdrm, so there's no need to keep this code around. With this gallium_require_drm_loader becomes an empty stub, so nuke that one as well. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:35 +01:00
Emil Velikov	47b5925d9b	loader: cleanup copyright section With previous patches nearly all the original code (as seen in the various loaders) is gone. Update the copyright/license section to reflect that. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:32 +01:00
Emil Velikov	af7abc512c	loader: remove loader_get_driver_for_fd() driver_type Reminiscent from the pre-loader days, when we had multiple instances of the loader logic in separate places and one could build a "GALLIUM_ONLY" version. Since that is no longer the case and the loaders (glx/egl/gbm) do not (and should not) require to know any classic/gallium specifics we can drop the argument and the related code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:29 +01:00
Emil Velikov	f9f7e44c94	loader: remove final sysfs codepath in loader_get_device_name_for_fd() Effectively everyone with actual hardware and/or requesting the "device_name" requires a working libdrm. Thus they could/should already be using the (now only) codepath. Apart from the code simplification, we can slim down our configure.ac even further. But that will be done in separate patch(es). Cc: Gary Wong <gtw@gnu.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:26 +01:00
Emil Velikov	4f1c33fd9d	travis: remove no longer needed libudev-dev dependency Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:24 +01:00
Emil Velikov	cb23fba3f3	scons: remove all libudev references Analogous to previous automake/autoconf commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:21 +01:00
Emil Velikov	4a183f4d06	scons: loader: use libdrm when available Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:18 +01:00
Emil Velikov	0607c5b1b0	gbm: remove superfluous/incorrect udev comment The gbm_device_get_backend_name() provides a (somewhat) internal name of the implementation/backend used. It has nothing to do with udev, one cannot and should not attempt to derive the name from it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:15 +01:00
Emil Velikov	6b21fdaa8f	automake: remove all the libudev references As of last commit nothing in mesa depends on libudev. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:12 +01:00
Emil Velikov	1e2e625e30	loader: remove libudev_get_device_name_for_fd and related code With this all the libudev related code is now gone. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:10 +01:00
Emil Velikov	fcdc02f512	loader: reimplement loader_get_user_preferred_fd via libdrm Currently not everyone has libudev and with follow-up patches we'll completely remove the divergent codepaths. Use the libdrm drm device API to construct the required ID_PATH_TAG-like string, to preserve the current functionality for libudev users and allow others to benefit from it as well. v2: Drop ranty comments, pick the correct device v3: \n -> \0 in PCI_ID_PATH_TAG_LENGTH comment (Axel). v4: Use snprintf (Nicolai) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-18 17:06:10 +01:00
Emil Velikov	8222100631	loader: annotate __driConfigOptionsLoader as static Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:07 +01:00
Emil Velikov	d561e064a8	loader: separate USE_DRICONF code into separate function Improves readability and allows us to do further cleanups a lot easier. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:04 +01:00
Emil Velikov	be239326aa	loader: slim down loader_get_pci_id_for_fd implementation(s) Currently mesa has three code paths in the loader - libudev, manual sysfs and drm ioctl one. Considering the issues we had with libudev - strip those down in favour of the libdrm drm device API. The latter can be implemented in any way depending on the platform and can be reused by others. v2: Use correct message on drmGetDevice failure. (Nicolai) Cc: Jonathan Gray <jsg@jsg.id.au> Cc: Jean-Sébastien Pédron <dumbbell@FreeBSD.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-18 17:06:04 +01:00
Emil Velikov	fd00aba5f4	configure.ac: mark libdrm as have_pci_id provider With follow on work, we'll untangle and simplify all the different codepaths in loader. Then again, we forget to set have_pci_id when libdrm is present (one of the codepaths available). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:01 +01:00
Ilia Mirkin	8c78fdb328	gm107/ir: fix bit offset of tex lod setting for indirect texturing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-18 09:56:14 -04:00
Ilia Mirkin	ecea2f69ef	gm107/ir: fix texturing with indirect samplers The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted by one argument position. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-18 09:56:14 -04:00
Marek Olšák	34099894c3	gallium/tgsi: add missing #include Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 11:20:57 +02:00
Julien Isorce	dbc8e18116	st/va: set default rt formats when calling vaCreateConfig As specified in va.h, default value should be set on attributes not present in the input list. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2016-10-18 08:44:14 +01:00
Kenneth Graunke	9f677d6541	i965: Fix gl_InvocationID in dual object GS where invocations == 1. dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations draws using a geometry shader that specifies layout(points, invocations = 1) in; and then uses gl_InvocationID. According to the Haswell PRM, the "GS Instance ID 0" (and 1) thread payload fields are undefined in dual object mode: "If 'dispatch mode' is DUAL_OBJECT this field is not valid." But there's no point in using them - if there's only one invocation, the ID will be 0. So just load a constant. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 20:22:02 -07:00
Jason Ekstrand	52904ba85c	anv: Get rid of anv_cmd_buffer_emit_state_base_address All code that would have once called this can now call the gen-specific version. The switching version is no longer needed. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	7998e37774	anv/cmd_buffer: Move descriptor flushing into genX_cmd_buffer.c It really should have gone here all along. We were trying a bit too hard to make it gen-agnostic just because it didn't have any #if's. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	eddaa237c0	anv/cmd_buffer: Expose ensure_push_constant_* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	1f3e6468d2	anv/cmd_buffer: Unify flush_compute_state across gens With one small genxml change, the two versions were basically identical. The only differences were one #define for HSW+ and a field that is missing on Haswell but exists everywhere else. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	2314c9ed2e	anv/cmd_buffer: Move Begin/End/Execute to genX_cmd_buffer.c vkBeginCommandBuffer and vkCmdExecuteCommands both call into the gen-specific emit_state_base_address function and vkEndCommandBuffer belongs with begin. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	ac0ca066de	anv/cmd_buffer: Move state base address re-emit into ExecuteCommands This has two primary advantages. First, it means that the batch_chain code knows less about the actual command buffer contents which is good because it improves separation. Second, it means that it only gets re-emitted once after all of the secondaries instead of once after each secondary which is just wasteful. It also has the advantage of cleaning the code up a bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Edward O'Callaghan	1c05f92590	doc/features.txt: factor out radeonsi as GL45 complete V2. add i965/hsw+ to list V3. rebased on master. V4. 'DONE' -> 'DONE ()'. V5. remove i965/hsw+ from list :/ Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-18 09:55:45 +11:00
Ian Romanick	89e1436e2d	i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Ian Romanick	7c0c3740f0	glsl: Remove unused function import_prototypes Once upon a time, this was used to extract prototypes from the shader containing GLSL built-in functions. This was removed by `f5692f45` in November 2010 for Mesa 7.10. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Ian Romanick	5c025ea6fc	glsl: Remove prototypes for nonexistent functions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Ian Romanick	fde48c1262	glsl: Replace assert with unreachable Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Lionel Landwerlin	696f5c1853	anv: replace , with ; in anv_batch_emit() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-17 18:16:38 +01:00
Lionel Landwerlin	6b17e3a6da	intel: aubinator: use different colors to signal batch start/end This makes the stream of commands a bit easier to read. v2 (Ken): Use bold text on green headers for easier readability; swap the green and blue headers so the majority stay blue. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-17 18:16:38 +01:00
Nicolai Hähnle	c3ce0d22b4	st/glsl_to_tgsi: fix [ui]vec[34] conversion to double The corresponding opcodes for integers need to be treated the same as F2D. Fixes GL45-CTS.gpu_shader_fp64.conversions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:45 +02:00
Nicolai Hähnle	1dd99a15a4	st/glsl_to_tgsi: fix atomic counter addressing When more than one atomic counter buffer is in use, UniformStorage[n].opaque is set up to contain indices that are contiguous across all used buffers. This appears to be used by i965 via NIR, but for TGSI we do not treat atomic counter buffers as opaque, so using the data in the opaque array is incorrect. Fixes GL45-CTS.compute_shader.resource-atomic-counter. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:42 +02:00
Nicolai Hähnle	9d6f82320c	st/glsl_to_tgsi: fix a corner case of std140 layout in uniform buffers See the comment in the code for an explanation. This fixes GL45-CTS.buffer_storage.map_persistent_draw. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:39 +02:00
Nicolai Hähnle	57a1514203	st/mesa: fix fragment shader output mapping Properly handle the case where there is a gap in the assigned output locations, e.g. a fragment shader writes to color buffer 2 but not to color buffers 0 & 1. Fixes GL45-CTS.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:37 +02:00
Nicolai Hähnle	e0213f36bb	glsl: print non-zero bindings of variables Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:33 +02:00
Nicolai Hähnle	9160b4d981	radeonsi: unify the constant load paths Remove the split between direct and indirect. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:08:45 +02:00
Nicolai Hähnle	51f9b38ce8	radeonsi: fix indirect loads of 64 bit constants This fixes GL45-CTS.compute_shader.fp64-case3. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:08:36 +02:00
Eric Engestrom	e9864f93c6	gbm: add a couple missing includes Needed for memset() and drmIoctl(). Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-17 08:47:38 -07:00
Iago Toral Quiroga	8785a8ff89	glsl: fail compilation of compute shaders when unsupported Generally, we only check for the presence of compute shaders during parsing when we find any language (like layout qualifiers) that is specific to compute shaders. However, it is possible to define an empty compute shader that does not use any language specific to compute shaders at all, and we should fail the compilation anyway. dEQP checks this. This patch adds a check for compute shader availability after we have parsed the source code. At this point we know the effective GLSL version and also extensions enabled in the shader. Fixes a subcase of the following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader The tests still fail because there is one more subcase that fails that needs another fix. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-17 15:14:12 +02:00
Tapani Pälli	3d48353e29	egl/android: fix error in droid_add_configs_for_visuals() This was some kind of leftover in commit `acd35c8` and format_count array variable (declared in outer scope) should be used instead. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `acd35c8758` ("egl/android: tweak droid_add_configs_for_visuals()") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-17 11:51:15 +01:00
Marek Olšák	74d145f4a8	radeonsi: shorten "shader->selector" to "sel" in si_shader_create Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-17 12:13:00 +02:00
Marek Olšák	2e74e8ead9	radeonsi: clear DB_RENDER_OVERRIDE Vulkan doesn't set these fields even though it doesn't use HiS. HiS is disabled by programming DB_SRESULTS_COMPARE_STATEn to 0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-17 12:13:00 +02:00
Kenneth Graunke	f30f48476f	glsl: Disable textureOffset(sampler2DArrayShadow, ...) in GLSL ES. This has apparently never existed in GLSL ES. Fixes dEQP-GLES3.functional.shaders.texture_functions.invalid .textureoffset_sampler2darrayshadow_vec4_ivec2_vertex and .textureoffset_sampler2darrayshadow_vec4_ivec2_fragment Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98244 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-16 15:05:00 -07:00
Axel Davy	9baf4505fb	st/nine: Fix multisample limit check Fixes regression introduced by `b560305687`. The regression prevents some apps from starting. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-17 00:02:52 +02:00
Eric Anholt	c61eb3c91c	vc4: Fix fast clear color packing for 565. Piglit didn't manage to cover this because fbo-clear-formats uses scissors, so we don't get fast clearing.	2016-10-16 11:22:50 -07:00
Eric Anholt	46cd3bab93	state_tracker: Fix check for scissor enabled when < 0. DEQP's clear tests like to give us x + w < 0 or y + h < 0. Since we were comparing to an unsigned, it would get promoted to unsigned and come out as bignum >= width or height and we would clear the whole fb instead of none of the fb. Fixes 10 tests under deqp-gles2/functional/color_clear. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-16 11:22:50 -07:00
Chad Versace	07422bf32b	egl/surfaceless: Fix comparison between pointer and integer Fixes GCC warning: drivers/dri2/platform_surfaceless.c:196:18: warning: comparison between pointer and integer Fixes: `4b8a55809e` ("egl/surfaceless: tweak surfaceless_add_configs_for_visuals()") Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-10-16 09:03:31 -07:00
Emil Velikov	d19b014b77	egl/surfaceless: use correct index when accessing the visual i is used for the driver_configs, while j is for the visuals. Fixes: `4b8a55809e` ("egl/surfaceless: tweak surfaceless_add_configs_for_visuals()") Reported-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-16 09:03:27 -07:00
Gustaw Smolarczyk	36cb5508e8	radv/winsys: Fail early on overgrown cs. When !use_ib_bos, we can't easily chain ibs one to another. If the required cs size grows over 1Mi - 8 dwords just fail the cs so that we won't assert-fail in radv_amdgpu_winsys_cs_submit later on. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-16 12:38:53 +02:00
Kenneth Graunke	493237d4ee	glsl: Drop the ES requirement that VS outputs must be flat qualified. Several conformance tests violate this requirement: ES31-CTS.core.tessellation_shader.max_patch_vertices ES31-CTS.core.tessellation_shader.tessellation_control_to_tessellation_evaluation.data_pass_through I submitted a merge request to fix the conformance tests, but Khronos opted to drop this GLSL ES specific requirement in favor of making flat qualification of VS outputs optional, matching modern desktop GL. Note that there were 7 Piglit tests which enforce this rule: tests/spec/glsl-es-3.00/compiler/interpolation/qualifiers/nonflat but these were deleted in Piglit commit acc0a2fabbd714bc704c16f1675e7c0. Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15465#c7 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-15 13:47:47 -07:00
Jason Ekstrand	6ef5a44a43	intel/genxml: Make some PIPE_CONTROL fields booleans Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:50 -07:00
Jason Ekstrand	f34de3e8b0	intel/genxml: Make "Predication enable" a boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:46 -07:00
Jason Ekstrand	468e1042cb	intel/genxml: Make "Use Global GTT" a boolean We also remove the redundant zero defaults since everything without an explicit default gets zeroed automatically. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:43 -07:00
Jason Ekstrand	ce86227175	intel/genxml: Make "Tiled Surface" a boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:39 -07:00
Jason Ekstrand	e6f9637d8a	intel/genxml: Make "SO Buffer Enable" fields boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:36 -07:00
Jason Ekstrand	fa0285eaac	intel/genxml: Make "Stencil Buffer Enable" a boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:30 -07:00
Jason Ekstrand	34826078f6	intel/genxml: Make a couple of STREAMOUT fields booleans Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:26 -07:00
Jason Ekstrand	6a064ad01d	intel/genxml: Make "Include Vertex Handles" and "Include Primitive ID" booleans Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:23 -07:00
Jason Ekstrand	f21d3b4d01	intel/genxml: Make "Vector Mask Enable" a boolean We also get rid of the "(VME)" in a few places. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:19 -07:00
Jason Ekstrand	aee501c87e	intel/genxml: Make "Single Program Flow" a boolean We also get rid of the "(SPF)" in a few places. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:14 -07:00
Tobias Klausmann	b7d9677de8	nv50/ir: constant fold OP_SPLIT Split the source immediate value into new values and move them into the original defs set by the split. Since we can only have up to 64-bit immediates, this is largely beneficial for F64 (and, in the future, U64) operations. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: always use U32, set newi for foldCount tracking] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-14 23:23:57 -04:00
Kenneth Graunke	75128d6ffd	i965: Enable OpenGL 4.5. Everything is in place. There are still conformance issues to sort out, but we may as well turn it on in master. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 17:35:13 -07:00
Jason Ekstrand	9d65595c06	anv/pipeline: Remove a meta hack from emit_ds_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	69b2e931d4	anv/image: Create views directly in VkCreate*View Without meta, we no longer need the _init helpers and the ability to back an image view with surface states allocated out of the command buffer. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	0a2c375af9	anv/image: Get rid of the usage hacks for meta Now that meta is gone and we're using blorp, we don't need all of the usage hacks. Instead, the usage provided by the app is exactly the usage that we want because the app is the only thing creating image views. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	8e1a8dd47e	anv: Move CreatePipelines into genX_cmd_buffer.c Now that we don't have meta, we have no need for a gen-agnostic pipeline create path. We can, instead, just generate one CreatePipelines function per gen and be done with it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	7df46b7533	anv/pipeline: Remove support for direct-from-nir shaders Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	6d557ae403	anv: Make entrypoint resolution take a gen_device_info In order for things such as the ANV_CALL and the ifuncs to work, we used to have a singleton gen_device_info structure that got assigned the first time you create a device. Given that the driver will never be used simultaneously on two different generations of hardware, this was fairly safe to do. However, it has caused a few hiccups and isn't, in general, a good plan. Now that the two primary reasons for this singleton are gone, we can get rid of it and make things quite a bit safer. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	4c9dec80ed	anv: Get rid of the ANV_CALL macro This macro was needed by meta in order to make gen-specific calls from gen-agnostic code. Now that we don't have meta, the remaining two uses are fairly trivial to get rid of. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	ac77528f7d	anv: Get rid of graphics_pipeline_create_info_extra Now that we no longer have meta, all pipelines get created via the normal Vulkan pipeline creation mechanics. There is no more need for this bit of extra magic data that we've been passing around. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	dedc406ec8	anv: Get rid of meta Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	d823f92970	anv: Use blorp for subpass clears Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	51faab487f	anv: Use blorp for ClearAttachments Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	c9eaf12de2	anv/hiz: Perform HiZ resolves for all partial renders If we don't, we can end up with corruption in the portion of the depth buffer that lies outside the render area when we do a HiZ resolve at the end. The only reason we weren't seeing this before was that all of the meta-based clears such as VkCmdClearDepthStencilImage were internally using HiZ so the HiZ buffer never truly got out-of-sync. If the CTS ever tested a depth upload (which doesn't care about HiZ) and then a partial render we would have seen problems. Soon, we will be using blorp to do depth clears and it won't bother with HiZ so we would get CTS regressions without this. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	58f2315c38	anv: Use blorp for ClearDepthStencilImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	29e289fa65	anv/image: Add an isl_view to anv_image_view Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	0340548c8e	anv/image: Rework our handling of 3-D image array ranges Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	146ee31159	anv/blorp: Don't hand-roll flush_pipeline_select_3d When I initially brought up Vulkan blorp, I completely missed that this was already factored out. There's no good reason for us to hand-roll it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	d80c0307ea	intel/blorp: Add a flag to make blorp not re-emit depth/stencil buffers In Vulkan, we want to be able to use blorp to perform clears inside of a render pass. If blorp stomps the depth/stencil buffers packets then we'll have to re-emit them. This gets tricky when secondary command buffers get involved. Instead, we'll simply guarantee that the depth and stencil buffers we pass to blorp (if any) match those already set in the hardware. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	0cabf93b80	intel/blorp: Add an entrypoint for clearing depth and stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	82a2c49c5f	intel/blorp: Emit a NULL render target for depth/stencil-only operations This never mattered before because the only time we used blorp depth/stencil only was to do HiZ operations on gen6-7. It may have worked in that case (and maybe it didn't) but slow depth clears actually do depth rendering so they need a valid render target. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	b324c38ae3	intel/blorp: Allow for running without a PS on gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	81be7be119	intel/blorp: Add an "enabled" bit to surface_info This gives a slightly smarter way to check whether or not a particular surface exists than looking at the address. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	bc4bb5a7e3	intel/blorp: Emit more complete DEPTH_STENCIL state This should now set the pipeline up properly for doing depth and/or stencil clears by plumbing through depth/stencil test values. We are now also emitting color calculator state for blorp operations without an actual shader because that is where the stencil reference value goes pre-SKL. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	7017742ad7	intel/blorp: Unify the DEPTH_STENCIL emit code across gens Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	cf2e3c3163	intel/blorp: Simplify depth/stencil config The newly reworked depth/stencil config code can properly handle having depth, stencil, both, or neither. We no longer need to predicate it on having depth or stencil. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	0414aaa133	intel/blorp: Set QPitch for depth and HiZ on gen8+	2016-10-14 15:39:41 -07:00
Jason Ekstrand	563fa63bf2	intel/blorp: Add support for binding an actual stencil buffer While we're here, we also make depth without HiZ work. v2: - Use the correct surface type for 1-D on SKL+ - Set QPitch on BDW+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	f180faab79	intel/blorp: Move CLEAR_PARAMS setup into emit_depth_stencil_config Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	c1fcf1a957	intel/genxml: Add a uint MOCS field to 3DSTATE_STENCIL_BUFFER Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	5dacd3caee	intel/blorp: Make the Z component of the primitive adjustable We want to be able to start doing slow depth clears with blorp. This allows us to adjust the depth we're clearing to. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Emil Velikov	7cb197c3a8	i915: workaround multiple intelFenceExtension definitions Due to conflicting symbol names (between i915 and i965) in the megadriver, we use a set of defines in i915/intel_screen.h. With a recent commit we've introduced a symbol intelFenceExtension which has different implementation for each driver, yet we forgot to add the define. Fixes: `d11515ff1b` ("i915/sync: Implement DRI2_Fence extension") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98264 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 19:22:16 +01:00
Chad Versace	cb836b673c	docs/specs: Update allocated EGL enum values Document the EGL enum ranges for Mesa and those values allocated by the following extensions: EGL_MESA_drm_image EGL_MESA_platform_gbm EGL_MESA_platform_surfaceless EGL_WL_bind_wayland_display Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:41 -07:00
Chad Versace	0cfa34c102	doc/specs: Reference the Khronos registry XML Years ago Khronos replaced the registry's spec files with newfangled XML files. Update the reference in doc/specs/enum.txt accordingly. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:40 -07:00
Chad Versace	88b5c36fe1	egl: Move old EGL_MESA_screen_surface spec It was the lone file in src/egl/docs. Move it to where the other specs live, in $MESA_TOP/docs/specs. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:40 -07:00
Chad Versace	a597c8ad5b	egl: Implement EGL_MESA_platform_surfaceless Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:40 -07:00
Chad Versace	c177ef9d47	egl: Don't advertise unsupported platform extensions Mesa's set of supported platform extensions depends on the autoconf option --with-egl-platforms=foo,bar,baz. If --with-egl-platforms lacks foo, then eglGetPlatformDisplay(EGL_PLATFORM_FOO, ...) unconditionally fails. So, if --with-egl-platforms lacks foo, then remove EGL_VENDOR_platform_foo from the EGL client extension string. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:27 -07:00
Chad Versace	27f4e38173	docs: Add EGL_MESA_platform_surfaceless.txt (v2) v2: - Assign enum values. - Define interactions with EGL_EXT_platform_base and EGL 1.4. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:13 -07:00
Ian Romanick	4246986dec	i965: Sort some extension names Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-10-14 11:16:59 -07:00
Jose Fonseca	b12606b693	scons: Fix the Python dependency scanner. modulefinder wasn't searching for dependencies in the script dir. It's not capable of detecting the sys.path manipulations scripts do internally either. This change fixes the first issue, and hacks around the second. Honestly, I've come to the conclusion that automatic Python dependency scanning will always be too brittle. I think we should start manually typing the dependencies like we do in automake. At the very least it will enable any person to eyeball and spot/fix missing dependencies, without digging into SCons internals.	2016-10-14 16:52:13 +01:00
Jose Fonseca	c6d17701c8	pipe_loader_sw: Don't invoke Unix close() on Windows. Trivial.	2016-10-14 16:29:04 +01:00
Emil Velikov	ebffa7b6af	Revert "egl/dri2: rework dri2_make_current code flow" This reverts commit `675719817e`.	2016-10-14 16:07:33 +01:00
Mauro Rossi	6eacd69b6f	i915: store reference to the context within struct intel_fence (v2) Porting of the corresponding patch for i965. Here follows the original commit message by Tomasz Figa: "As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad." NOTE: in i915 all references to 'brw' are replaced by 'intel' Marshmallow-x86 boots ok with the following results of Android CTS. Android CTS 6.0_r7 build:2906653 Session Pass Fail Not Executed 0(EGL) 1410 24 0 1(GLES2) 13832 82 0 I get the same results as per i965GM. [Emil Velikov: Include Mauro's test results] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 15:43:57 +01:00
Mauro Rossi	d11515ff1b	i915/sync: Implement DRI2_Fence extension Here is the porting of corresponding patch for i965, i.e. commit `c636284` i965/sync: Implement DRI2_Fence extension Here follows part of original commit message by Chad Versace: "This enables EGL_KHR_fence_sync and EGL_KHR_wait_sync." Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 15:43:53 +01:00
Mauro Rossi	19fa29a592	i915/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync' This is the porting of corresponding patch for i965, i.e. commit `2516d83` i965/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync' The only difference compared to i965 one is that intel_check_sync() was renamed to intel_gl_check_sync() here, as it is more appropriate. Here follows original commit message by Chad Versace: "I'm about to implement DRI2_Fence in intel_syncobj.c. To prevent madness, we need to prefix functions for GL_ARB_sync with 'gl' and functions for DRI2_Fence with 'dri'. Otherwise, the file will become a jumble of similarly named functions. For example: old-name: intel_client_wait_sync() new-name: intel_gl_client_wait_sync() soon-to-come: intel_dri_client_wait_sync() I wrote this renaming commit separately from the commit that implements DRI2_Fence because I wanted the latter diff to be reviewable." [Emil Velikov: rename the outstanding intel_sync instances] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 15:43:22 +01:00
Emil Velikov	284795616a	egl/drm: set eglError and provide an error message on failure v2: Remove gratuitous newline/semicolon (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	d81ba763e3	egl/x11: attribute for dri2_add_config failure ... in dri2_x11_add_configs_for_visuals(). Currently the latter does not consider that, thus in such cases it adds "empty" configs in the list. Properly account for things and as we do that we can reuse count, instead of calling _eglGetArraySize to determine if we've added any configs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	0b2b719121	egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper Analogous to previous commits - with an extra bonus. Current code, apart from not attributing the lack of 'per visual' and overall configs, also overwrites the newly added config. Namely, if the dpy supports two or more of the supported formats (XRGB8888, ARGB8888 and RGB565), earlier configs will be overwritten and the final one will be stored, since we use the same index for all three in our dri2_add_config call. v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	4b8a55809e	egl/surfaceless: tweak surfaceless_add_configs_for_visuals() Analogous to previous commit. v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	acd35c8758	egl/android: tweak droid_add_configs_for_visuals() Iterate over the driver_configs first in order to cut down the number of getConfigAttrib() calls by a factor of 5. While we're here, also drop the sentinel of the visuals array. We already know its size so we can use that and save a few bytes. v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	36fe5900a4	egl/drm: introduce drm_add_configs_for_visuals() helper Factor out and rework the existing code so that it prints a debug message if we have zero configs for any visual. As a nice side effect we now provide a correct (sequential ID) when creating a config (via dri2_add_config). v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	23ed073aa4	egl/surfaceless: print out a message on zero configs for given format Currently we print a debug message if the total configs is non-zero only to do the same (at an error level) as we return from the function. Rework the message to print if we're missing a config for the given format. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	98f5d0106a	egl/dri2: set WL_bind_wayland_display in a consistent way Introduce a helper and use it throughout the platform code. This allows us to reduce the amount of ifdef(s) and (potentially) use kms_swrast_dri.so for !drm platforms (namely wayland and x11). Note: in the future as other platforms (android, surfaceless) support the extension they can reuse the helper. v2: Rebase, check for device_name. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	637d001a97	egl/android: remove duplicate KHR_image_base set The core egl/dri2 already sets the extension bit _only_ when possible - which in Android's case is always. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:53:38 +01:00
Emil Velikov	9caacb39b9	loader/dri3: constify the loader_dri3_vtable Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:35 +01:00
Emil Velikov	fdd373acca	egl/dri2: micro optimise dri2_bind_extensions() Do not loop over all matches if we've already found one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:46:09 +01:00
Emil Velikov	665cad1658	egl/dri2: annotate dri2_extension_match instances as const data v2: Rebase. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:46:05 +01:00
Emil Velikov	3948ad82ce	egl/dri2: use dri2_bind_extensions to manage the optional extensions v2: dri2_bind_extensions() now takes optional as an argument. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:46:03 +01:00
Emil Velikov	d5342c6ff2	gbm: rename gbm_dri_device::{,loader_}extensions To align with the name used in the EGL and GLX loaders. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:45:54 +01:00
Emil Velikov	38526bd468	egl/dri2: add support for optional extensions in dri2_bind_extensions() Will allow us to reuse the function for optional extensions and fold a bit of code. v2: Make dri2_bind_extensions::optional flag an argument to dri2_bind_extensions (Kristian). Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:45:24 +01:00
Emil Velikov	ebc68e3849	egl/dri2: coding style cleanup Consistently indent with space rather than a mix of tab and spaces. v2: Keep the structs properly aligned (Eric). Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:43:57 +01:00
Emil Velikov	b10c05d4ff	egl/x11: don't crash if dri2_dpy->conn is NULL The dri3 version of commits `60e9c35b3a` and `6de9a03bed`. While using xcb_connect() guarantees that we always get a non NULL return value, XGetXCBConnection() does/can not. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:37 +01:00
Emil Velikov	f871946594	egl/dri2: rework dri2_egl_display::extensions storage Remove the error prone fixed size array. While we're here also rename to loader_extensions like in the GLX code. v2: Rebase. Keep image_loader_extension within the wayland_drm dri2_loader_extensions list. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:22 +01:00
Emil Velikov	f7b8108289	egl/dri2: remove unused dri2_egl_display::{dri2,swrast}_loader_extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:18 +01:00
Emil Velikov	e7fcf1b09b	egl/x11: don't populate dri2_dpy->swrast_loader_extension Analogous to earlier commits. Note: the actual version of the extension is 1, since it does not implement .putImage2. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:02 +01:00
Emil Velikov	2dbe14af1e	egl/wayland: don't populate dri2_dpy->swrast_loader_extension Similar to the dri2 one - the extension stored in struct dri2_egl_display is unused. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:00 +01:00
Emil Velikov	3963a5fc94	egl/x11: don't populate dri2_dpy->dri2_loader_extension Analogous to the earlier android and wayland patches. As we're here we can drop exposing the old version of the extension. Any dri loader/driver interface use lower bound checking thus exposing dri2 loader v3 to a v2 capable driver is perfectly normal. v2: Preserve compat with dri2_minor < 1. The driver does not know if there is a protocol to manage getBuffersWithFormat(). It's up-to the loader to expose the vfunc if there is one. (Kristian) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:56 +01:00
Emil Velikov	d2d579da7e	egl/wayland: don't populate dri2_dpy->dri2_loader_extension Analogous to the earlier android patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:51 +01:00
Emil Velikov	31ef5d4452	egl/surfaceless: trivial coding style fixes Remove a few gratuitous blank lines and use the correct level of indentation. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:48 +01:00
Emil Velikov	d0155bcbe8	egl/surfaceless: don't check the mask(s) prior to calling dri2_add_config The latter already does it for us. As we're here annotate the masks as const and use unsigned for the index(es). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:43 +01:00
Emil Velikov	ff700f8c22	egl/surfaceless: remove unused dri2_loader_extension implementation Earlier commit introduced support for image_loader and left the dri2_loader code dangling/unused. Let's remove it. Fixes: `63c5d5c6c4` ("Added pbuffer hooks for surfaceless platform") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2016-10-14 12:17:18 +01:00
Emil Velikov	6a8fe32430	egl/android: don't populate dri2_dpy->dri2_loader_extension The extension stored in struct dri2_egl_display isn't used, thus we can create a static const instance of the extension and point extensions[] to it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:17:18 +01:00
Emil Velikov	675719817e	egl/dri2: rework dri2_make_current code flow Fold duplicate conditional blocks and add a few extra comments ;-) v2: Bring back the explicit "unbind" logic (Eric), remove NULL derefs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:17:18 +01:00
Emil Velikov	07690a289a	egl/dri2: drop NULL checks prior to dri2_destroy_surface The function already has the respective check within. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:17:18 +01:00
Emil Velikov	8cf83f9c08	egl/dri2: call static functions directly, not via _EGLDriver::API The indirection is meant to be used by the core EGL implementation in main. Not in the drivers themselves. Move the dri2_destroy_surface definition to avoid forward declaration of the static function. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:08 +01:00
Emil Velikov	532ec2edd8	egl/dri2: use dri2_egl_display inline wrapper where possible This way the only places that reference DriverData are the ones that manipulate it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:07 +01:00
Emil Velikov	d6dcf3b4ca	egl/dri2: bail out on NULL dpy in dri2_display_release() Currently all callers are careful enough not to do that, yet that will not be the case in the future. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:06 +01:00
Emil Velikov	8fb9ea413d	egl/dri2: move surface refcounting out of the platform code All the platforms are duplicating what should be a driver/dri2 thing - refcounting. Just fold it accordingly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:05 +01:00
Emil Velikov	02f1158746	egl/dri2: coding style fix Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:04 +01:00
Emil Velikov	7a9c92d071	egl/dri2: non-shared glapi cleanups For a while now we require shared glapi for EGL, thus we can drop a few bits from the olden days. Namely - dlopen(NULL...) is not possible, error out at build stage if so and drop the guard around dlclose(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:03 +01:00
Emil Velikov	b349c11098	egl/dri2: glFlush is not optional, treat it as such The documentation is clear - one must glFlush the old context on eglMakeCurrent. Thus keeping it optional is not something we should be doing. Furthermore if we cannot get the entry point we're likely having a broken setup/stack. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:00 +01:00
Emil Velikov	13bf390657	aubinator: replace pragma once with ifndef guard Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Sirisha Gandikota<sirisha.gandikota@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:45 +01:00
Emil Velikov	ae6fb9c922	anv: error out if anv_genX.h is included by !anv_private.h Update the comment to reflect the correct filename and add a guard to catch incorrect inclusion of the header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:43 +01:00
Emil Velikov	08efa6a19f	anv: use correct header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:41 +01:00
Emil Velikov	76ae842366	intel/genxml: use correct header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:39 +01:00
Emil Velikov	72e70c00f3	intel/common: use correct header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:37 +01:00
Emil Velikov	0d86c92dcb	intel/blorp: use correct header guards Avoid the discouraged use of pragma once and a missing guard for blorp_genX_exec.h. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:34 +01:00
Emil Velikov	3a98bffa59	isl: use ifndef header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:32 +01:00
Emil Velikov	4c1c9d62a9	isl: make locally used functions static Signed-off-by: Emil Velikov <emil.velikov@collabra.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:30 +01:00
Emil Velikov	4fe6e7f2bd	isl: trivial include-what-you-want cleanups Noticed while skimming through the files. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:28 +01:00
Emil Velikov	eac752e54b	isl/gen7: remove unneeded ISL_DEV_GEN check The function gen7_format_needs_valign2 has two callers - the gen7 only gen7_choose_valign_el() and isl_gen6_filter_tiling(). The latter already guards the invocation appropriately. To be extra cautious add a couple of asserts alongside the removal of the runtime check. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:25 +01:00
Emil Velikov	5b1efb65ce	isl: prefix non-static API with isl_ The rest of ISL already follows this approach. Be consistent and resolve the final references. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:22 +01:00
Emil Velikov	84f9ef1de4	isl/gen6: correctly check msaa layout samples count Samples == 1 is a valid value, so returning false is plain wrong. Seemingly a copy/paste typo introduced since day 1. Fixes: `afdadec77f` ("isl: Implement isl_surf_init() for gen4-gen9") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:15 +01:00
Emil Velikov	c572360c30	automake: add radv to the `make distcheck' hooks Will allow us to catch issues (as fixed with previous patches) rather than release a broken tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-14 11:09:00 +01:00
Emil Velikov	3fd0cafc1c	radv: move AMDGPU_LIBS later in the link chain At the moment (albeit unlikely) one could get link-time issues, since libdrm_amdgpu.so is before its users in the link chain. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-14 11:09:00 +01:00
Emil Velikov	a8a5f0a025	radv: correct variable name VISIBILITY_{, C}FLAGS The letter C was missing, thus in turn all the internal symbols were exported. As a result we hide ~150 symbols and cut ~36K from libvulkan_radeon.so. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-14 11:09:00 +01:00
Emil Velikov	753a9c989f	amd/addrlib: hide private symbols via VISIBILITY_CXXFLAGS Private/internal symbols should not be exported. Using the CXXFLAGS cuts ~300 exported symbols and ~23K from libvulkan_radeon.so. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	72fa5ca06d	intel: automake: replace direct basename $@ invocation with $(@F) Use the shorthand make variable(s) as elsewhere in the build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	48267b730c	gallium: annotate sw_driver_descriptor instance as const data Already treated and handled as such. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	792148f16a	gallium: annotate drm_driver_descriptor instance as const data Already treated and handled as such. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	c079a206ad	gallium: rename drm_driver_descriptor::{, driver_}name Historically we use "device name" for the name of the kernel module and "driver name" for the dri/other driver. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	9837cf13b1	gallium: remove unused drm_driver_descriptor::driver_name Likely unused since day 1, although I've only checked back until the st/dri unification with commit `29ca7d2c94` ("st/dri: merge dri/drm and dri/sw backends") Based on the comment, referencing drmOpenByName it's not something we want to bring back. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	0f031dcf11	gallium: fix drm_driver_descriptor::name comment Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	c85b34ffd0	mesa_glinterop: allow building without X and related headers This commit effectively reverts `c10dcb2ce8` and fixes the typedef redefinition which inspired it. In order to prevent requiring X packages at build time earlier commit forward declared the required X/GLX typedefs. Since that approach introduced typedef redefinition (a C11 feature) it was reverted. To avoid the redefinition while _not_ mandating X and related headers forward declare the structs and use those through the header. As anyone uses the mesa interop header they ensure that the X (or others in terms of EGL) headers are included, which ensures that everything is resolved within the compilation unit. Cc: Vinson Lee <vlee@freedesktop.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Chih-Wei Huang <cwhuang@android-x86.org> Fixes: `c10dcb2ce8` ("Revert "mesa_glinterop: remove inclusion of GLX header"") Fixes: `8472045b16` ("mesa_glinterop: remove inclusion of GLX header") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-10-14 11:08:59 +01:00
Mark Thompson	0b241b7717	st/va: Fix H.264 PicOrderCnt value TopFieldPicOrderCnt is exactly the PicOrderCnt value for a frame - see H.264 section 8.2.1. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:52 +02:00
Mark Thompson	1edaa33135	st/va: Baseline profile is not supported Constrained baseline profile is supported, so use that instead. This matches what the encoder already does (constraint_set1_flag is always set in the output bitstream). Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:48 +02:00
Mark Thompson	e0604eed9f	st/va: Return surface formats depending on config chroma format This makes the supported format actually match the configuration, and allows the user to observe that NV12 is supported for video processing where previously they couldn't (though it did always work if they blindly tried to use it anyway). Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:44 +02:00
Mark Thompson	e7c7ef3625	st/va: Save surface chroma format in config Both YUV420 and RGB32 configurations are supported, so we need to be able to distinguish which is being used. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:40 +02:00
Mark Thompson	8a931c83ba	st/va: Return more useful config attributes The encoder attributes are needed for a user of the encoder to be able to configure it sensibly without internal knowledge. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:25 +02:00
Mario Kleiner	0c94ed0987	glx: Perform check for valid fbconfig against proper X-Screen. Commit `cf804b4455` ('glx: fix crash with bad fbconfig') introduced a check in glXCreateNewContext() if the given config is a valid fbconfig. Unfortunately the check always checks the given config against the fbconfigs of the DefaultScreen(dpy), instead of the actual X-Screen specified in the config config->screen. This leads to failure whenever a GL context is created on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of a multi-x-screen setup, where the default screen is typically 0. Fix this by using config->screen instead of DefaultScreen(dpy). Tested to fix context creation failure on a dual-x-screen setup. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 10:11:25 +01:00
Tim Rowley	a42c22fdbf	swr: [rasterizer core] don't construct pArContext on non-ar builds Stops debug directory being created on non-ar builds. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	29d07480b8	swr: [rasterizer core] remove WorkerWaitForThreadEvent bucket Cause of bucket stop capture hang, as threads get stuck in level 1. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	ada27b503e	swr: [rasterizer core] move binner functionality to separate file Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	f0a66c1da2	swr: [rasterizer scripts] add DEBUG_OUTPUT_DIR knob Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	ffd0224303	swr: [rasterizer core] fix comment typo Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	4889922210	swr: [rasterizer core/sim] 8x2 backend + 16-wide tile clear/load/store Work in progress (disabled). USE_8x2_TILE_BACKEND define in knobs.h enables AVX512 code paths (emulated on non-AVX512 HW). Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	bf1f46216c	swr: [rasterizer archrast] fix event file issue with saving data Also, tagging stats with draw id to correlate these events with draw/dispatch events. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:13 -05:00
Eric Engestrom	827e038062	swr: [rasterizer common] fix assert index Fixes: `b3bd8bb611` ("swr: [rasterizer core] add support for "RAW" surface format") CovID: 1373647 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 21:37:20 -05:00
Ilia Mirkin	5f885225cf	docs: mark GL 4.4/4.5 extension groups as DONE for nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-13 21:45:21 -04:00
Ilia Mirkin	afb6dc53bf	nv50: enable ARB_enhanced_layouts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-13 21:45:21 -04:00
Ilia Mirkin	a6d6eff2e6	nvc0/ir: be more careful about preserving modifiers in SHLADD creation src2 was being given the wrong modifier, and we were not properly managing the modifier on the SHL source either. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-13 21:44:03 -04:00
Brian Paul	3a2869aaca	mesa: fix indentation in vertex_attrib_binding() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	743a526372	mesa: add sanity check assertion in update_array_format At most, one of the normalized, integer, doubles bools can be true. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	d6b0002195	mesa: remove needless cast in update_array() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	74745dcfa4	mesa: simplify update_array() with a vao local var Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	0de9265b1f	vbo: simplify some code in check_draw_elements_data() Use the 'vao' local var in more places. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	15fb88e912	mesa: rename gl_vertex_attrib_array gl_array_attributes The structure contains the attributes of a vertex array. The old name was kind of confusing. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	c89802aeea	mesa: rename gl_vertex_attrib_array::VertexBinding Rename to gl_vertex_attrib_array::BufferBindingIndex because this field is an index into the array of buffer binding points. This makes some code a little easier to follow since there's also a "VertexBinding" field in gl_vertex_array_object. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	c328268b92	mesa: rename some vars in arrayobj.c Use 'vao' instead of 'obj' to be consistent with other code. Plus, add a comment. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	b81546d43c	tgsi: fix comment typo in tgsi_ureg.c Trivial.	2016-10-13 17:38:49 -06:00
Brian Paul	ff00ab745c	mesa: replace gl_framebuffer::_IntegerColor with _IntegerBuffers Use a bitmask to indicate which color buffers are integer-valued, rather than a bool. Also, the old field was mis-computed. If an integer buffer was followed by a non-integer buffer, the _IntegerColor field was wrongly set to false. This fixes the new piglit gl-3.1-mixed-int-float-fbo test. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-13 17:38:49 -06:00
Brian Paul	a710c21ac2	mesa: remove 'params' parameter from ctx->Driver.TexParameter() None of the drivers which implement this hook do anything with the texture parameter value. Drivers just look at the pname and set a dirty flag if needed. We were doing some ugly casting and type conversion to setup the argument so that all goes away. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-13 17:38:49 -06:00
Eric Anholt	99d790538d	vc4: Avoid loading from the texture during non-utile-aligned glTexImage(). Previously, the plan was "if the width/height we have to load/store isn't the size the user is planning on writing, then we need to load the old contents out beforehand to prevent writing back undefined". However, when we're doing glTexImage() we often end up aligning the width/height into the padding of the texture, and we don't actually need to read out that padding. Improves x11perf -aatrapezoid100 performance from ~460/sec to ~700/sec.	2016-10-13 14:27:30 -07:00
Axel Davy	0717cd975d	st/nine: Fix possible segfault in surface ctor Regression introduced by `ba0274c7d6`. Check the resource exists before assigning it a flag (and use This->base.resource instead of pResource, since the former may have a newly allocated resource, while the latter would be NULL). This should reintroduce the behaviour of previous code. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-13 21:16:35 +02:00
Axel Davy	98b8ad61c6	st/nine: Remove useless code in nine_shader Since `1604efa6fd`, lconsti and lconstb don't need to be initialized. Remove some leftovers from the previous code (which has now invalid use of ARRAY_SIZE on a pointer instead of an array). Reported by Coverity. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-13 21:16:35 +02:00
Axel Davy	197cdd1bbd	gallium/os: Use unsigned integers for size computation Use uint64_t instead of int64_t in the calculation, as the result is uint64_t. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 21:16:35 +02:00
Samuel Pitoiset	4527222169	nvc0: enable ARB_enhanced_layouts All ARB_enhanced_layouts piglit tests pass without any changes in our compiler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-13 21:13:34 +02:00
Dave Airlie	47a7d86fe9	radv: fix the wayland wsi busy bit Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 05:10:02 +10:00
Dave Airlie	a3834ebaf9	anv: fix the wayland wsi busy flag setting Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 05:10:02 +10:00
Tom Stellard	5c66d46d6a	radv: Use new image load/store intrinsic signatures v2 These were changed in LLVM r284024. v2: - Only use float types for vdata of llvm.amdgcn.image.store. LLVM doesn't support integer types for this intrinsic. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:48:11 +10:00
Tom Stellard	30e63fb0e4	radv: Fix incorrect comment Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:48:11 +10:00
Dave Airlie	060e6f468a	radv: fix identity swizzle handling The identity swizzle should operate exactly like an .r = R, .g = G, .b = B, .a = A swizzle. This fixes a bunch of the 16-bit BGRA blit tests dEQP-VK.api.copy_and_blit.blit_image.all_formats.b4g4r4a4* Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:57 +10:00
Dave Airlie	8980ac0411	anv/wsi: fix apps that acquire multiple images up front This fix was found in the radv codebase when running dota2, no idea if anyone has reported it on anv, but the same problem occurs. Once an image is acquired we need to mark it busy. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:11 +10:00
Dave Airlie	8bdac874e6	radv/wsi: fix apps that acquire multiple images up front dota2 does multiple acquires followed by multiple queues, this bug manifested itself as a hang in the xshmfence code randomly when dota2 was doing its menus. It also occurred when running dota2 under phoronix-test-suite. The fix is once the image is acquired to mark it busy then so nobody else can acquire. We have to trust vulkan apps that they will eventually submit it. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:11 +10:00
Dave Airlie	dfe74fd1a9	anv: initialise and increment send_sbc At least set this to not be uninitialised memory. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:00 +10:00
Marek Olšák	7dddf0b7ab	radeonsi: adjust and clean up Z_ORDER and EXEC_ON_x settings The table was copied from the Vulkan driver. The comment lines are as long as the table for cosmetic reasons. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	e12c1cab5d	radeonsi: disable ReZ This is a serious performance fix. Discovered by luck. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	d4d9ec55c5	radeonsi: implement TC-compatible HTILE so that decompress blits aren't needed and depth texturing needs less memory bandwidth. Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16. The format promotion is not visible to state trackers. This is part of TC-compatible renderbuffer compression, which has 3 parts: DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now. I don't see a measurable increase in performance though. (I tested Talos Principle and DiRT: Showdown, the latter is improved by 0.5%, which is almost noise, and it originally used layered Z16, so at least we know that Z16 promoted to Z32F isn't slower now) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	a077185ea9	gallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY For performance tuning in drivers. It filters out window system framebuffers and OpenGL renderbuffers. radeonsi will use this to guess whether a depth buffer will be read by a shader. There is no guarantee about what will actually happen. This is a departure from PIPE_BIND flags which are defined to be strict but they are useless in practice. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Nicolai Hähnle	761388a0eb	radeonsi: fix regression in image atomics Caused by a bad rebase when pushing commit `76a940893`.	2016-10-13 16:04:16 +02:00
Nicolai Hähnle	d413fbb159	st/mesa: fix vertex elements setup for doubles Whether one or two slots are taken up by one API array depends on the vertex shader, not on how the array is configured. When an array is set up with fewer components than the shader expects, the high components are undefined. Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:36 +02:00
Nicolai Hähnle	15fc74905b	st/glsl_to_tgsi: remove unnecessary ir_instruction argument from get_opcode Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:33 +02:00
Nicolai Hähnle	1d7685e52c	st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:29 +02:00
Nicolai Hähnle	b234e37765	st/glsl_to_tgsi: simplify translate_tex_offset This fixes a bug with offsets from uniforms which seems to have only been noticed as a crash in piglit's arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag on radeonsi. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:11 +02:00
Nicolai Hähnle	76a940893d	radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.* Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic* Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-13 10:17:42 +02:00
Nicolas Koch	35e2bfa6d9	radv: Return correct result in EnumeratePhysicalDevices If pPhysicalDevices is too small for all physical devices, the driver must return VK_INCOMPLETE. Since only a single physical device is supported, this is only the case when pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-13 09:11:13 +10:00
Ilia Mirkin	e6a693c447	st/mesa: only flip stipple pattern for winsys fbo's Gallium is completely oblivious to whether the fbo is flipped or not. Only flip the stipple pattern when the fbo is flipped as well. Otherwise the driver has no idea when to unflip the pattern. Fixes bin/gl-2.1-polygon-stipple-fs -fbo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-12 17:04:16 -04:00
Emil Velikov	a4622305e6	swr: automake: add ar_eventhandlerfile_h.template to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-12 18:55:22 +01:00
Emil Velikov	3c419a941a	radv: add all headers to the sources list Otherwise they'll be missing from the tarball and the build will fail. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-12 18:55:20 +01:00
Ilia Mirkin	a48a343c29	nvc0/ir: fix textureGather with a single offset Recent fix for non-const offsets broke the case of a single offset (vs 4 offsets). The later code relies on the offs array to contain null values to tell whether they should be added onto the srcs list. Fixes: `5239bd592` ("nvc0/ir: fix overwriting of value backing non-constant gather offset") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-12 13:18:14 -04:00
Ilia Mirkin	300b5ad023	nv50/ir: copy over value's register id when resolving merge of a phi The offset needs to be properly copied over to the phi value, otherwise it will get assigned to the base of the merge instead of the proper location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-12 13:18:14 -04:00
Nicolai Hähnle	789119d212	st/mesa: enable ARB_enhanced_layouts and turn the cap on v2: mark llvmpipe & softpipe properly as well (Jason Wood) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	b5b4aa42ba	st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	777dcf81b9	st/glsl_to_tgsi: explicitly track all input and output declaration In order to be able to emit overlapping input and output array declarations, we flip the logic of emitting those declarations on its head: rather than iterating over slots and emitting the corresponding declarations, we iterate over the declarations from GLSL and emit those. v2: fix some regressions related to structs v3: fix a regression in geometry and tessellation shader array handling Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v2) Reviewed-by: Dave Airlie <airlied@redhat.com> (v2)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	2299a9940c	st/glsl_to_tgsi: mark "gaps" in input/output arrays as used In some cases, a shader may have an input/output array but not use some entries in the middle. This happens with eON games, for example. We emit declarations that cover the entire array range even if there are some unused gaps. This patch now reflects that in the InputsRead etc. fields to ensure the various input/outputMapping arrays are actually correct, which will be important when we re-jiggle the way declarations are emitted. v2: fix a typo (Edward O'Callaghan) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	63193b9cde	st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations This optimization is incorrect with 64-bit operations, because the channel-splitting logic in emit_asm ends up being applied twice to the source operands. A lucky coincidence of how the writemask test works resulted in this optimization basically never being applied anyway. As far as I can tell, the only case where it would (incorrectly) have been applied is something like dvec2 d; float x = (float)d.y; which nobody seems to have ever done. But the moral equivalent does occur in one of the component layout piglit tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	f5f3cadca3	st/glsl_to_tgsi: simpler fixup of empty writemasks Empty writemasks mean "copy everything", so we can always just use the number of vector elements (which uses the GLSL meaning here, i.e. each double is a single element/writemask bit). Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	957d541089	st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	14aaaa1b4b	glsl: dump explicit location when printing IR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	2b460c750a	tgsi/ureg: add ureg_DECL_output_layout For specifying an exact location/component. v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	047a7c7a0b	tgsi/ureg: add layout/component input declarations v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	f9a01f3872	tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays v2: remove a tautological left-over assert (Marek) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	700a571f89	gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS This is a screen cap because drivers are expected to support it either for all shader types or for none of them. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Tom Stellard	b33cb709fd	radeonsi: Use the new image load/store intrinsic signatures This patch requires LLVM r284024 or newer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 16:42:43 +00:00
Tom Stellard	ff0df66e10	radeonsi: Add function for converting LLVM type to intrinsic string The existing function only worked for integer types. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 16:42:07 +00:00
Tom Stellard	a96a7eae04	radeonsi: Refactor image store/load intrinsic name creation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 16:42:07 +00:00
Marek Olšák	d7e74b52bb	winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	e4bbab9022	radeonsi: fix R600_DEBUG=precompile for shader-db radeonsi no longer supports pixel shaders without interpolation optimizations, which led to assertion failures in si_shader_ps when running shader-db. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	40e1f7e09b	radeonsi: use TC write-back instead of full cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	8cdce30cc2	radeonsi: implement TC L2 write-back (flush) without cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	65a4d55a9f	radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Samuel Pitoiset	87b06cab14	nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c) total instructions in shared programs :2286901 -> 2284473 (-0.11%) total gprs used in shared programs :335256 -> 335273 (0.01%) total local used in shared programs :31968 -> 31968 (0.00%) local gpr inst bytes helped 0 41 852 852 hurt 0 44 23 23 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-12 17:46:03 +02:00
Nicolai Hähnle	85ba409967	mapi: fix out-of-tree build dependencies We shouldn't be using wildcard here in the first place, but changing that is some effort. As it stands, make -p confirms that glapi_gen_mapi_deps only contains mapi_abi.py when building outside the Mesa tree. As a result, only some of the tables were updated when XML files change, but not the tables for shared glapi. This change ensures that we pick up the XML files and scripts from the source tree as dependencies also for shared glapi. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-12 17:36:35 +02:00
Roland Scheidegger	7e86b2ddae	draw: initialize shader inputs This should make the code more robust if a shader tries to use inputs which aren't defined by the vertex element layout (which usually shouldn't happen). No piglit change. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-10-12 15:05:44 +02:00
Edward O'Callaghan	cfbf956dfd	radv: trivial case stmt style fixups Relocate a 'default:' to the end of a case stmt and fix an indent issue. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-10-12 20:12:43 +11:00
Nicolas Koch	fd27d5fd92	anv: Return correct result in EnumeratePhysicalDevices If pPhysicalDevices is too small for all physical devices, the driver must return VK_INCOMPLETE. Since only a single physical device is supported, this is only the case when pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 22:58:27 -07:00
Kenneth Graunke	2871d4d687	anv: Allow vp_info to be NULL in 3DSTATE_CLIP code. pViewportState may be NULL if rasterization is disabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 22:50:19 -07:00
Kenneth Graunke	ba38a9d380	anv: Fix anv_pipeline_validate_create_info assertions. Many of these can be "NULL if the pipeline has rasterization disabled." Also, we should assert that pMultisampleState exists. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 22:50:09 -07:00
Ilia Mirkin	389d6dedbe	trace: add invalidate_resource callback Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-11 20:47:54 -04:00
Gustaw Smolarczyk	c3f3c6b0e8	radv/winsys: Fix radv_amdgpu_cs_grow min_size argument. (v2) It's supposed to be how much at least we want to grow the cs, not the minimum size of the cs after growth. v2: Unbreak use_ib_bos. Don't mask the ib_size when !use_ib_bos, since it's not needed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:06:30 +10:00
Grigori Goronzy	a22b5f28fb	radv: fix strict aliasing violation Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:00:22 +10:00
Grigori Goronzy	0b539abcf4	radv: fix uninitialized variables This gets rid of "may be used uninitialized" compiler warnings. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:00:22 +10:00
Grigori Goronzy	7ca44f8a33	radv: add missing unreachable Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:00:22 +10:00
Dave Airlie	8cc9f89d26	radv: remove the validation layer and some related bits. As pointed out by Emil this isn't used in anv anymore, and it was totally unused in radv anyways. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:57:09 +10:00
Dave Airlie	014ec78fb2	radv: drop entrypoint split out. radv really doesn't need different dispatch per gen yet, there really aren't that many differences yet. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:56:41 +10:00
Dave Airlie	12301c5418	radv: drop the RADV_CALL macro. This is leftover from anv, and we really never needed it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:56:41 +10:00
Dave Airlie	fc28f89157	radv: check driver name before calling amdgpu. This checks the kernel driver name is amdgpu before calling libdrm_amdgpu. This avoids the following error: amdgpu_device_initialize: DRM version is 1.6.0 but this driver is only compatible with 3.x.x when run on a machine with i915 graphics as well as amdgpu. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:56:41 +10:00
Dave Airlie	6215b47648	radv: fix memory leak from physical device if wsi fails Inspired by patch from Edward O'Callaghan <funfunctor@folklore1984.net> which didn't do it right. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:53:44 +10:00
Edward O'Callaghan	e0641c61ca	radv/winsys: Fix mem leak at failed do_winsys_init() call site Probably unlikely however ensure we don't leak a heap allocation on the fail path. V.2: also fix missing 'amdgpu_device_deinitialize()' calls (Emil Velikov). Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:46:10 +10:00
Edward O'Callaghan	4a0db58f14	radv/winsys: Trivial style and readability fixups Drop/add a few newlines where appropriate and drop a couple of unnecessary braces. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:24:50 +10:00
Marek Olšák	b425b57d1e	radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-11 20:04:57 +02:00
Tim Rowley	9db9c61d26	swr: [rasterizer archrast] update proto file Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:23 -05:00
Tim Rowley	3805e40f32	swr: [rasterizer archrast] add support for stats files Only stat and counter events are saved to the event files. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:23 -05:00
Tim Rowley	f4684cdb5f	swr: [rasterizer jitter] remove architecture override Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:23 -05:00
Tim Rowley	185a531206	swr: [rasterizer jitter] adjust jitmanager assert Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:17 -05:00
Tim Rowley	eaec263427	swr: [rasterizer] eliminate unused label warnings on gcc Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	12e6f4c879	swr: [rasterizer core] implement depth bounds test Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	1b86c050ad	swr: [rasterizer core] update/add formats Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	a907b7a5f7	swr: [rasterizer core] SwrStoreTiles api change SwrStoreTiles now takes a mask of surfaces to store. Reduces overhead when storing multiple render targets. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	5d5179a6c2	swr: [rasterizer scripts] add ENABLE_ASSERT_DIALOGS knob for windows Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	07326d4006	swr: [rasterizer archrast] add mako template Add template for generating code to save events to a file. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	e845eeb0be	swr: [rasterizer core] disable cull for rect_list Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	b3bd8bb611	swr: [rasterizer core] add support for "RAW" surface format Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	2966d9c691	swr: [rasterizer core] align Macrotile FIFO memory to SIMD size Align and use streaming store instructions for BE fifo queues. Provides slightly faster enqueue and doesn't pollute the caches. Add appropriate memory fences to ensure streaming writes are globally visible. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	6b3691c876	swr: [rasterizer common] remove threadviz code Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	2550b04179	swr: [rasterizer memory] split load/store for compile speed Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Eric Engestrom	0a606a400f	egl: add eglSwapBuffersWithDamageKHR EGL_KHR_swap_buffers_with_damage is actually already supported, as it is technically nothing but a rename of EGL_EXT_swap_buffers_with_damage. To that effect, both extensions are advertised depending on the same condition, and the new entrypoint simply redirects to the previous one. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 14:04:26 +01:00
Mauro Rossi	b9e639589d	intel/genxml: fix building rules for aubinator required headers New generated headers were introduced by commit `63a366a` "intel: aubinator: generate a standalone binary" Android does not need aubinator yet, so in order to avoid building error, aubinator required new genxml headers are defined in a separate list. If required, building rules for Android will be added later. [Emil Velikov: don't use a _HEADERS variable name (causes warnings)] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:53:19 +01:00
Emil Velikov	0b54c022a8	radv: automake: move libamdgpu_addrlib.la to VULKAN_LIB_DEPS The static library is analogous to the intel ISL, which is required for both hardware and (to be added) testing library. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:51:09 +01:00
Emil Velikov	4882476eca	radv: automake: remove unused variables Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:51:08 +01:00
Emil Velikov	e2cb253346	radv: automake: include the python scripts/formats table in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:51:06 +01:00
Tapani Pälli	fc8b358bd6	mesa: fix error handling in _mesa_TransformFeedbackVaryings Patch changes function to use _mesa_lookup_shader_program_err both in TransformFeedbackVaryings and GetTransformFeedbackVarying that handles errors correctly for invalid values of shader program. Fixes following dEQP test: dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98135 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-11 07:44:33 +03:00
Xu,Randy	d11a63d6e6	i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP. Add the miptree level/slice x/y_offset when computing the surface offset in brw_emit_surface_state. The surface offset has two parts: one is from mt->offset, which should be 32 aligned in width/height for tiled buffer; the other is from mt->level[current_level].slice[current_slice].x/y_offset. This fix solves 12 dEQP failures: dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture Signed-off-by: Xu,Randy <randy.xu@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 07:44:18 +03:00
Nicholas Bishop	64435fd888	i915g: fix incorrect gl_FragCoord value On Intel Pineview M hardware, the i915 gallium driver doesn't output the correct gl_FragCoord. It seems to always have an X coord of 0.0 and a Y coord of the window's height in pixels, e.g. 600.0f or such. I believe this is a regression caused in part by this commit: `afa035031f` The old behavior used the output at index zero, while the new behavior uses actual zeroes. In the case of gl_FragCoord the output at index zero happened to be the correct one, so the behavior appeared correct although the code already had a bug. Fixed by checking for I915_SEMANTIC_POS when setting up texCoords. If the generic_mapping is I915_SEMANTIC_POS, look for the TGSI_SEMANTIC_POSITION instead of a TGSI_SEMANTIC_GENERIC output. https://bugs.freedesktop.org/show_bug.cgi?id=97477 Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Tested-by: Stéphane Marchesin <marcheu@chromium.org>	2016-10-10 18:32:36 -07:00
Vinson Lee	c10dcb2ce8	Revert "mesa_glinterop: remove inclusion of GLX header" This reverts commit `8472045b16`. Conflicts: include/GL/mesa_glinterop.h This patch fixes this build error with GCC 4.4. Compiling src/glx/dri_common_interop.c ... In file included from src/glx/dri_common_interop.c:33: include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’ include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here Fixes: `8472045b16` ("mesa_glinterop: remove inclusion of GLX header") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2016-10-10 15:09:44 -07:00
Axel Davy	eef0744d43	st/nine: More checks for GetRenderTargetData Fixes a wine test crash Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	a52e700169	st/nine: Add debug output for lost devices Add debug output to ease debugging. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	5d85253dc3	st/nine: Prevent crash in GetRenderTargetData Return error instead of crashing on source surfaces with format D3DFMT_NULL. Fix for issue #236. Tested on Windows 7. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	09edc0555f	st/nine: Set CLAMP_TO_EDGE on cubetextures Wine tests show that cubetextures always use PIPE_TEX_WRAP_CLAMP_TO_EDGE regardless of set sampler states. Fixes failing d3d9 wine test test_cube_wrap. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	fa2574497b	st/nine: handle possible failure of D3DWindowBuffer_create Check for errors and pass them to the callers. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	f04fa0a62c	st/nine: Assert on buffer creation failure Add an assert to make sure buffer creation doesn't fail. Add error handling in calling functions. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	f8c01e7a96	st/nine: Use NineDevice9_CreateDepthStencilSurface in swapchain9 Replace custom code with NineDevice9_CreateDepthStencilSurface. All functionality is given now.	2016-10-10 23:43:51 +02:00
Axel Davy	63367e6c95	st/nine: Fix check and remove useless code in swapchain9 The removed code was there for two reasons: 1) Allow DF16, DF24, INTZ to be used as depth buffer for swapchain, if the driver doesn't support PIPE_BIND_SAMPLER_VIEW for the underlying format 2) Set PIPE_BIND_SAMPLER_VIEW if possible, such that if StretchRect is called on the depth texture, it is happy. 1) The reason these formats needed a workaround is because the check flags for them in CheckDeviceFormat were incorrect, which led applications to think the formats were valid for swapchains, even if they weren't supported. 2) StretchRect limitations for depth buffers force the resource_copy_region path, which should be fine without PIPE_BIND_SAMPLER_VIEW. Thus fix the check for 1), and remove the code. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	60624be203	st/nine: Implement MSAA quality levels Advertise quality levels: Each supported multisample count matches to one quality level. The application doesn't know how many samples each quality level has. For that reason it's not possible to set the multisample mask. Return errors on quality level mismatch. Fixes several old games not having multisample support until now. Fix for issue #73. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	8a50b1244f	st/nine: Prepare update_framebuffer for MS quality levels Compare resource's nr_samples instead of D3D multisample level. Required for multisample quality levels to work correctly. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	b560305687	st/nine: Add additional error handling in CheckDeviceMultiSampleType Return one supported quality level in error cases. Return error on invalid multisample count. Fixes failing wine tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	7afab4ad39	st/nine: Fix compiler warning Use strict aliasing in SetPrivateData and struct pheader. Casting char[1] to IUnknown** isn't allowed in strict aliasing. Compute pointer to body by adding size of header to header pointer. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	b9f31111ac	st/nine: Remove resource9 {Set/Get/Free}PrivateData functions Remove {Set/Get/Free}PrivateData in resource9. Functionality has been implemented in IUnknown interface. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	03888e8a46	st/nine: Remove volume9 {Set/Get/Free}PrivateData functions Remove {Set/Get/Free}PrivateData in volume9. Functionality has been implemented in IUnknown interface. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	485cba7eb4	st/nine: Switch {Set/Get/Free}PrivateData functions Switch {Set/Get/Free}PrivateData function to introduced IUnknown functions. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	4117f5e1ab	st/nine: Implement {Set/Get/Free}PrivateData in iunknown Implement {Set/Get/Free}PrivateData in iunknown to get rid of duplicated code in resource9 and volume9. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	c1c8e852c1	st/nine: Return device in NineSurface9_GetContainer According to MSDN the device is returned for surfaces that do not have a regular container. Such surfaces are: OffscreenPlainSurface, DepthStencilSurface and RenderTarget Tested and verified on Windows. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	ba0274c7d6	st/nine: Allocate surface resources in surface ctor Allocate resources in surface ctor. Allows to use statetracker internal memory accounting. Fix for issue #231. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Axel Davy	1f65f67b21	st/nine: Fix D3DFMT_NULL size D3DFMT_NULL is mapped to PIPE_FORMAT_NONE. Instead of relying on PIPE_FORMAT_NONE to return a size, pick one. The one picked is the same as Wine's. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	9dc792b95b	st/nine: Add debugging output Add DBG calls to NineTexture9_GetLevelDesc and NineTexture9_GetSurfaceLevel to ease debugging. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	8ceb2264c5	st/nine: Fix assert in NineUnknown_QueryInterface Tests showed that it is allowed to call this method on objects that have a zero refcount. Required for issue #230. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	f2eacef33d	st/nine: Print interface id in NineVolume9_GetContainer To ease debugging print interface id. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Patrick Rudolph	489dbc51ae	st/nine: Print interface id in NineSurface9_GetContainer To ease debugging print interface id. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Patrick Rudolph	e63a38832b	st/nine: Print interface id in NineUnknown_QueryInterface To ease debugging print interface id. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Patrick Rudolph	6a1cce20b6	st/nine: Move assert in NineSurface9_ctor Move assert to function entry. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	851e4b8d8a	st/nine: Properly declare sampler states for ff Fixes a softpipe assertion failure with wine tests Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	5ce23c1689	st/nine: Handle user clipping planes properly for ff Found reading msdn and checking Wine. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	d2fd296648	st/nine: Fix the calculation of the number of vs inputs Fixes hangs on radeonsi, and assert on llvmpipe. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-10 23:43:50 +02:00
Axel Davy	71e7292a85	st/nine: Fix specular w coordinate Found looking at Wine formulas. Fixes a few visual issues. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	732cea09cd	st/nine: Disable parts of lighting calculation if no normal provided Behaviour found in Wine sources, and checked with some test apps. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	fc9bb19dce	st/nine: Fix condition for specular lighting Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	c56c7c1fc8	st/nine: Do always accumulate diffuse According to spec. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	c5bce80f50	st/nine: Initialize ps ff registers Found with wine tests for the rTmp register. Not sure for the other ones. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	4ed3d5ee57	st/nine: Do not pollute rTmp in ff ps Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	d9b8b3196e	st/nine: Allocate temporaries on demand for ps ff Same change as for vs ff. This makes it easier to avoid mistakes when reusing temporaries whose result shouldn't be erased. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	f7dd27aed3	st/nine: Fix texbem Error found with wine tests. nine_shader was expecting another order than the one device9 was using. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	7afcbb49ba	st/nine: Fix ff computation for inverse Thanks to wine tests. Apparently 4x4 inverse is to be used, and if the inverse can't be calculated, the input matrix is to be used. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	36399f9a7f	st/nine: Used normed Vtx for reflectionvector Fix deduced from the spec. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	eda1e6ece7	st/nine: Implement SPHEREMAP Behaviour checked with a test app. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	a3ddc80ec8	st/nine: Enable passthrough only if positiont is used Wine tests for the passthrough feature are for positiont. Nothing seems to indicate passthrough happens when positiont is not used. However having passthrough with positiont makes sense (to be used with ProcessVertices outputs). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	0b5bed774b	st/nine: Fix wrong mask in ff vs Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	028dab95f6	st/nine: Fix tweening factor computation The computation was reversed. Deduced by tests on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	1fe055338d	st/nine: Disable ff vertex blending if required inputs are missing This behaviour has been partially tested on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	aa69bb6848	st/nine: Use materials if source is not given. Deduced by test on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	ab068a78d3	st/nine: Fix ff SPECULARENABLE We were (wrongly) adding specular to diffuse in vertex shaders when SPECULARENABLE was set. However the spec says specular has to be added after texture processing (which is in ps). Besides SPECULARENABLE is flagged as a pixel state. There was unused support for SPECULARENABLE in the ps ff code. Remove the vs code, and use the ps code. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	1d7890a441	st/nine: Undefined specular should be full of zeros Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	d9330f9348	st/nine: Implement normal transformation with vertex blending The formula is different from the one of the spec, but otherwise nothing particular. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	305e8106ab	st/nine: Increase MaxVertexBlendMatrixIndex Modern cards do advertise 8. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	567be40de9	st/nine: Compact ff vs constants a bit There are several holes. This patch reduces the holes a bit, which reduces the size of the constant buffer uploaded. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	07d1f32e0f	st/nine: Fix vertex blending aVtx computation There was a multiplication by the world matrix 0 which had nothing to do there. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	d9d8cb9f19	st/nine: Reorganize ff vtx processing The new order simplified the code a bit for next patches. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	cde74cba71	st/nine: Small simplification for position_t and fog position_t disables fog computation. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	5d2a8e8a36	st/nine: Cleaning code for vs temporaries This has been a real mess up to now: the temporaries were allocated once, and shared after that between the different parts of the code. To help maintaining the code, the temporaries are now allocated and released on need. As surprising as it could be, this patch, which was supposed to introduce no behaviour change, actually solved a visual bug observed on a sample program. This was due to ureg_normalize3 polluting a temporary variable. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	1f18b6f351	st/nine: No need for the local flag for temporaries in ff Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	eb9ad8f969	st/nine: Handle D3DRS_NORMALIZENORMALS When this state is set, the normals computed in the vs ff shader should be normalized. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	b9639c661f	st/nine: Initial ProcessVertices support For now only VS 3 support is implemented. This enables The Sims 2 to work. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	3bf02d383f	st/nine: Partial software vertex processing support Software Vertex Processing allows: . Fewer limitations for shaders (more loops, etc) . Fewer limitations for ff (more enabled lights, 255 matrices for VertexBlend) In particular shaders can get more constants. This patch implements support for this (not using software rendering, but hardware rendering, as llvmpipe and dx10+ hw have the same limits...) This is considered a second class path. Even apps asking for "Mixed Vertex processing" (ie the ability to switch to swvp on demand) do not use the feature much. Some just initialize more constants than the normal limit at the start of the application, but never use more than the normal limit. When the apps do not need the software vertex processing features, they do not seem to turn it on. This means it is ok if that path is slow. Thus no care has been taken to make the path optimized. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f8c8f44244	st/nine: Rework vs int and bool constants buffer This will help to support swvp constants. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	a83dce0128	st/nine: Change dirty tracking for vs int and bool constants This change makes it easier to introduce tracking for swvp constants. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f78089b962	st/nine: Drop unused constant upload path This path has been disabled for some time because of some bugs with it. It hasn't been updated to the new features, and is not faster. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:49 +02:00
Axel Davy	1604efa6fd	st/nine: Add support for swvp constants in shaders swvp has relaxed limits (more nested loops, etc). In particular it enables more constants. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	56ea3df7d4	st/nine: Initial mixed vertex processing support In mixed vertex processing, the user can enable or disable software vertex processing. It is on hardware by default. This feature is not a state, and thus the setting doesn't need to be recorded by stateblocks. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	747f1ef8b6	st/nine: Implement SetNPatchMode Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	ded7a73eb3	st/nine: Implement D3DUSAGE_SOFTWAREPROCESSING Buffers with this flag must be usable with both software and hardware vertex processing. Use Staging for fast cpu access. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:49 +02:00
Patrick Rudolph	19703f2a36	st/nine: Allocate more space for ATI1 ATIx are "unknown" formats that do not follow block format conventions. Tests showed that pitch*height bytes are allocated. apitrace used to depend on this behaviour. It used to copy more bytes than it has to for the ATI1 block format, but it didn't crash on Windows. Increase buffersize for ATI1 to fix this crash. The same issue was present in WINE but a patch has been sent by me. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Patrick Rudolph	ec6c636722	st/nine: Add missing break Add missing break instruction. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	03f60a3357	st/nine: Implement relative addressing for ps inputs To implement the feature we copy the ps inputs to a temp array. This is not optimal for performance, but it is the simplest solution. This is a feature that is very very rarely used. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	a5d308e51a	st/nine: Wait for pending tasks to execute in swapchain Fixes crash after Reset() when using thread_submit=true Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f090705075	st/nine: Use fixed size arrays for swapchain buffers Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Patrick Rudolph	a719800cb8	st/nine: Fix buffer count check for Ex devices Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	9ff0dc3129	st/nine: Disable seamless cubemap for d3d d3d9 doesn't have seamless cubemap. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f0ec54ee32	st/nine: Fix some check flags Uses the new defines introduced in previous commit. See comment in the commit for more explanation. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	39e98d351f	st/nine: Unify some check flags The new defines will be reused in a later patch. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:48 +02:00
Axel Davy	2290eac84e	gallium/util: Really allow aliasing of dst for u_box_union_* Gallium nine relies on aliasing to work with this function. Without this patch, dirty region tracking was incorrect, which could lead to incorrect textures or vertex buffers. Fixes several game bugs with nine. Fixes https://github.com/iXit/Mesa-3D/issues/234 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-10 23:43:48 +02:00
Axel Davy	5e7f0ebe29	softpipe: Cap to 2 GB on 32 bits On 32-bit systems, application memory is quite limited. softpipe uses application memory. To help prevent memory exhaustion, limit reported memory availability to 2GB. Some gallium nine apps do check reported memory by allocating resources until memory is full. Gallium nine refuses allocations when 80% of the reported memory limit is used. This change helps some apps to start. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-10 23:43:48 +02:00
Axel Davy	814ca96d0d	llvmpipe: Cap to 2 GB on 32 bits On 32-bit systems, application memory is quite limited. llvmpipe uses application memory. To help prevent memory exhaustion, limit reported memory availability to 2GB. Some gallium nine apps do check reported memory by allocating resources until memory is full. Gallium nine refuses allocations when 80% of the reported memory limit is used. This change helps some apps to start. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-10 23:43:48 +02:00
Axel Davy	218459771a	gallium/os: Fix overflow on 32 bits On systems with more than 4GB of ram, os_get_total_physical_memory was triggering an integer overflow for the linux and haiku path, when on 32 bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94561 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 23:43:48 +02:00
Axel Davy	9904581dc6	st/nine: Memset pipe_resource templates Fixes regression introduced by `ecd6fce261` and is more future proof than just clearing the next field. Other nine usages did already zero out the templates. Signed-off-by: Axel Davy <axel.davy@ens.fr> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-10 23:43:48 +02:00
Samuel Pitoiset	d43151318a	nvc0: fix valid range for shader buffers When offset != 0, the valid range was wrong because the second argument of util_range_add() is end, not size. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-10 21:32:16 +02:00
Ilia Mirkin	5239bd5920	nvc0/ir: fix overwriting of value backing non-constant gather offset Normally the value is an immediate, which is moved to some temporary, so there's no problem. In the case of a non-constant offset (as allowed by ARB_gpu_shader5), we have to take care to copy it first before using it to build up the bits. This fixes a compilation error observed in F1 2015. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-10 14:28:32 -04:00
Vinson Lee	0a898ec28b	glsl: Add missing cache_destroy stub function. CC glsl/tests/cache_test.o glsl/tests/cache_test.c: In function ‘test_cache_create’: glsl/tests/cache_test.c:160:4: error: implicit declaration of function ‘cache_destroy’ [-Werror=implicit-function-declaration] cache_destroy(cache); ^ Fixes: `87ab26b2ab` ("glsl: Add initial functions to implement an on-disk cache") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-10 11:17:31 -07:00
Anuj Phogat	f8f6f60a36	docs: Mark GL_OES_viewport_array done on i965 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2016-10-10 10:48:38 -07:00
Chad Versace	8044885182	egl: Unify the EGLint/EGLAttrib paths in eglCreateSync* (v3) Pre-patch, there were two code paths for parsing EGLSync attribute lists: one path for old-style EGLint lists, used by eglCreateSyncKHR, and another for new-style EGLAttrib lists, used by eglCreateSync (1.5) and eglCreateSync64 (EGL_KHR_cl_event2). There were two attrib_list parsing functions, _eglParseSyncAttribList(_EGLSync sync, const EGLint attrib_list) _eglParseSyncAttribList64(_EGLSync sync, const EGLAttrib attrib_list) This patch unifies the two attrib_list parsing functions into one, _eglParseSyncAttribList(_EGLSync sync, const EGLAttrib attrib_list) Many internal EGLSync function signatures had two attrib_list parameters to accommodate both code paths: one parameter was an EGLint list and the other an EGLAttrib list. At most one of the parameters was allowed to be non-null. This patch removes the `EGLint attrib_list` parameter, leaving only the `EGLAttrib attrib_list` parameter, for all internal EGLSync functions. v2: - Consistently use condition (sizeof(int_list[0]) == sizeof(attrib_list[0])). [for emil] v3: - Don't double-unlock the display in eglCreateSyncKHR. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2)	2016-10-10 09:54:11 -07:00
Eric Anholt	0f99c0686e	intel: Fix bash-specific redirection. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-10 09:50:05 -07:00
Eric Anholt	ec9ed1c4d8	gallium: Fix install-gallium-links.mk on non-bash /bin/sh Debian uses dash by default, which doesn't do '+='. Fixes servo's osmesa-based headless testing system, which was looking for libOSMesa in the lib/ directory. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-10 08:56:12 -07:00
Ilia Mirkin	ec05331a7b	nv50/ir: only stick one preret per function A function with multiple returns would have had multiple preret settings at the top of the function. While this is unlikely to have caused issues since we don't use functions in earnest, it could have in some cases overflowed the call stack, in case a function had a lot of early returns. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-10 10:45:06 -04:00
Nicolai Hähnle	1f95121626	radeonsi: make more use of si_have_tgsi_compute Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:38:33 +02:00
Nicolai Hähnle	38cfd5160a	gallium/radeon: assign a name to LLVM output variables in debug builds This can be helpful with R600_DEBUG=preoptir. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:38:30 +02:00
Nicolai Hähnle	39a29c2431	gallium/radeon: avoid redundant work with overlapping in/out arrays Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:37:50 +02:00
Nicolai Hähnle	77c81164bc	radeonsi: support ARB_compute_variable_group_size Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch). Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:36:42 +02:00
Lionel Landwerlin	014bd4acb8	anv: turn on samplerAnisotropy in VkPhysicalDeviceFeatures According to the Vulkan spec 5.63.4 : samplerAnisotropy indicates whether anisotropic filtering is supported. If this feature is not enabled, the maxAnisotropy member of the VkSamplerCreateInfo structure must be 1.0. Since we already set maxAnisotropy to 16 and program the hardware according to the VkSamplerCreateInfo.maxAnisotropy, it seems we can turn this on. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-10 09:25:38 +01:00
Edward O'Callaghan	ba43768a1e	radv: Use proper header guards over 'pragma once' directives Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-10 16:10:56 +11:00
Tapani Pälli	2d7e0f35c5	mesa: throw error if bufSize negative in GetSynciv on OpenGL ES Fixes following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.callbacks.state.get_synciv dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_synciv dEQP-GLES31.functional.debug.negative_coverage.log.state.get_synciv v2: drop _mesa_is_gles check (Kenneth) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98133 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-10 07:29:31 +03:00
Tapani Pälli	d997d5c0c9	glsl: prohibit lowp, mediump precision on atomic_uint Fixes following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.callbacks.atomic_counter.atomic_precision dEQP-GLES31.functional.debug.negative_coverage.get_error.atomic_counter.atomic_precision dEQP-GLES31.functional.debug.negative_coverage.log.atomic_counter.atomic_precision Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98131 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-10 07:29:31 +03:00
Tapani Pälli	c64093e7d5	glsl: optimize copy_propagation_elements pass Changes make copy_propagation_elements pass faster, reducing link time spent in test case of bug 94477. Does not fix the actual issue but brings down the total time. No regressions seen in CI. v2 (idr): Formatting / whitespace fixes. Embed the acp_ref in the acp_entry. v3 (idr): Delete unused copy constructor. Use while(pop_head) instead of foreach() { remove }. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-10 07:29:31 +03:00
Dave Airlie	db5d278541	radv: don't build without SHA1. Just copy the section from anv above this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98167 Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-10 10:08:47 +10:00
Edward O'Callaghan	185be15d9d	docs/features.txt: Add GL_KHR_robustness supported on ES 3.2 Both radeonsi and nvc0 should also support ES so fixup doc. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-09 01:06:38 +11:00
Lionel Landwerlin	4682abdaa8	intel: aubinator: enable loading dumps from standard input In conjunction with an intel_aubdump change, you can now look at your application's output like this : $ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app v2: Add print_help() comment about standard input handling (Eero) Remove shrinked gtt space debug workaround (Eero) v3: Use realloc rather than memcpy/free (Ben) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>	2016-10-08 02:18:47 +01:00
Lionel Landwerlin	619c8de522	intel: aubinator: enable loading xml files from a given directory This might be useful for people who debug with out of tree descriptions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>	2016-10-08 02:17:35 +01:00
Lionel Landwerlin	63a366a881	intel: aubinator: generate a standalone binary Embed the xml files into the binary, so aubinator can be used from any location. v2: Split generation packing into another patch (Jason) Check for xxd (Jason) v3: Fix out of tree builds (Jason) Generate custom variable name rather than names generated by xxd (Lionel) v4: Move generated _xml.h files to genxml/ (Sirisha) v5: Remove newline from makefile (Jason) v6: Add comment on gen*_xml.h creation (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-08 02:17:03 +01:00
Nanley Chery	4d7d9825f3	anv/TODO: Update the HiZ task Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Nanley Chery	d8aacc24cc	anv: Enable fast depth clears Provides an FPS increase of ~30% on the Sascha triangle and multisampling demos. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Chad Versace	78d074b87a	anv/cmd_buffer: Enable rendering to HiZ Nanley Chery: (rebase) - Resolve conflicts with new anv_batch_emit macro (amend) - Handle a QPitch TODO - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems - Only use HiZ for single-subpass renderpasses - Emit the HiZ instruction before the stencil instruction to follow the optimized clear sequence specified in the PRMs - Don't modify clear params - Enable resolves when a HiZ buffer is used to ensure depth buffer validity Provides an FPS increase of ~15% on the Sascha triangle and multisampling demos. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Nanley Chery	134d181be1	anv/cmd_buffer: Add code for performing HZ operations Create a function that performs one of three HiZ operations - depth/stencil clears, HiZ resolve, and depth resolves. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Jason Ekstrand	9919a2d34d	anv/image: Memset hiz surfaces to 0 when binding memory Nanley Chery (amend): - Change memset value from 0xff to 0 (a defined value for HiZ). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Jason Ekstrand	b4bbabf21b	anv: Move BindImageMemory to anv_image.c Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Chad Versace	917814dccd	anv: Allocate hiz surface Nanley Chery: (rebase) - Use isl_surf_get_hiz_surf() (amend) - Only add a HiZ surface onto a depth/stencil attachment - Add comment above HiZ surface addition - Hide HiZ behind INTEL_VK_HIZ prior to BDW - Disable HiZ for untested cases - Remove DISABLE_AUX_BIT instead of preventing it from being added Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Chad Versace	3aec432ed3	anv: Add func anv_image_has_hiz() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:17 -07:00
Chad Versace	fe40d026a1	anv: Add anv_image::hiz_surface Unused. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:17 -07:00
Nanley Chery	814fa12379	isl: Correct a comment in the isl_format enum HiZ is not a color surface, but an auxiliary depth surface. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:17 -07:00
Rob Clark	495ba8884a	gallium: add missing zero-init for resource templates Mostly test code, plus one spot I noticed in r600. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 15:50:46 -04:00
Rob Clark	3ebfc44b42	freedreno: don't try to shadow layered textures We will only hit this with multi-planar YUV external images, so we would probably never hit this code path in the first place. But if we did, it wouldn't do the right thing so just bail. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-10-07 15:50:46 -04:00
Rob Clark	f88f025e8c	freedreno/a3xx+a4xx: fix clip-plane lowering state If enabled clip-planes have changed, we need to mark program state dirty. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-10-07 15:50:46 -04:00
Ian Romanick	f546b41f6a	glsl: Let cache_test build when the shader cache is not enabled Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Aaron Watry <awatry@gmail.com>	2016-10-07 11:19:37 -07:00
Lionel Landwerlin	eb23de6116	anv: pipeline cache: fix return value of vkGetPipelineCacheData According to the spec - 9.6. Pipeline Cache : If pDataSize is less than the maximum size that can be retrieved by the pipeline cache, at most pDataSize bytes will be written to pData, and vkGetPipelineCacheData will return VK_INCOMPLETE. Fixes the following test from Vulkan CTS : dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_fragment_stage dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_geometry_stage_fragment_stage dEQP-VK.pipeline.cache.misc_tests.invalid_size_test Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 18:46:12 +01:00
Timothy Arceri	965ebc8b28	util: remove unused variable Also initialise page at declaration. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 21:24:50 +11:00
Martin Peres	a599b1c203	loader/dri3: import prime buffers in the currently-bound screen This tries to mirror the codepath taken by DRI2 in IntelSetTexBuffer2() and fixes many applications when using DRI3: - Totem with libva on hw-accelerated decoding - obs-studio, using Window Capture (Xcomposite) as a Source - gstreamer with VAAPI v2: - introduce get_dri_screen() in the dri3 loader's vtable (krh) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Tested-by: Ionut Biru <biru.ionut@gmail.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2016-10-07 11:11:55 +03:00
Martin Peres	0247e5ee3e	loader/dri3: add get_dri_screen() to the vtable This allows querying the current active screen from the loader's common code. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2016-10-07 11:11:44 +03:00
Jason Ekstrand	82b4f1c47b	anv/entrypoints: Save off the entire devinfo rather than a pointer Since the gen_device_info structs are no longer just constant memory, a pointer to one is not a pointer to something in the .data section so we shouldn't be storing it in a static variable. Instead, we should just store the entire device_info structure. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 21:13:52 -07:00
Dave Airlie	85a47f647e	radv: drop all uint for unsigned. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 12:09:13 +10:00
Eric Anholt	20d91e5ce9	vc4: Don't worry about partial Z/S clear if the other is already cleared. We have to be careful to not smash the value they're clearing to, but other than that we're fine. Avoids quad clears in Processing, which likes to do glClear(Z|S); glClear(Z). Improves performance of Processing's QuadRendering demo at 5000 quads by 5.46507% +/- 1.35576% (n=15 before, 32 after)	2016-10-06 18:29:16 -07:00
Eric Anholt	cb328123fe	vc4: Try to fix the HW-2116 workaround. We were incrementing the count at the end of vc4_start_draw(), except that that function returns immediately if we've already started drawing on this batch. It also failed to count the statechanges from the GFXH-515 workaround. This incidentally allows repeated glClear() to be coalesced, because the fast clears aren't counted in draw_calls_queued any more. Fixes most of the extra flushes in Processing, which emits glClear(Z|S); glClear(Z); glClear(C) during its frame setup. Improves performance of Processing's QuadRendering demo at 5000 quads by 3.33538% +/- 2.05846% (n=21 before, 15 after)	2016-10-06 18:29:12 -07:00
Eric Anholt	bca9a58d04	vc4: Drop dead argument from vc4_start_draw().	2016-10-06 18:09:24 -07:00
Eric Anholt	9421a6065c	vc4: Fix fallback to quad clears of depth in GLX. The fix in the vc4-jobs series ended up triggering the fallback path on GLX apps that use depth but not stencil.	2016-10-06 18:09:24 -07:00
Eric Anholt	8810270d06	vc4: Add the format name in miptree_debug. I was curious if my Z/S buffer was actually ZS or ZX, and the vc4 format of "0" didn't tell me much.	2016-10-06 18:09:24 -07:00
Eric Anholt	ee577e7fa7	vc4: Fix perf debug formatting on partial Z/S clear.	2016-10-06 18:09:24 -07:00
Eric Anholt	7c7bcbbc7d	vc4: Drop destination register when it's unused. This slightly reduces instructions on shader-db, but I think it's just perturbing register allocation -- the allocator should have always trivially colored these nodes, before. This commit is just to make QIR code failing more intelligible when register allocation fails.	2016-10-06 18:09:24 -07:00
Eric Anholt	d4ae5ca823	vc4: Fix live intervals analysis for screening defs in if statements. If a conditional assignment is only conditioned on the exec mask, that's still screening off the value in the executed channels (and, since we're not storing to the unexecuted channels, we don't care what's in there). Fixes a bunch of extra register pressure on Processing's Ribbons demo, which is failing to allocate.	2016-10-06 18:09:24 -07:00
Eric Anholt	06cc3dfda4	vc4: Fix simulator when more than one vc4_screen is opened. We would assertion fail in setting up the simulator the second time around. This at least postpones the assertion failure until we've closed all of the first set of screens and started opening a new set.	2016-10-06 18:09:24 -07:00
Eric Anholt	b30205b112	vc4: Fix assertion fails from trying to cast non-ALU instrs to ALU. Fixes 100 piglit tests since the assertions were added to nir.h. What's amazing is that these tests used to pass, even when casting garbage.	2016-10-06 18:09:24 -07:00
Jason Ekstrand	c81ec84c1e	anv/cmd_buffer: Move the clear_subpasses calls to set_subpass Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	b548fdbed5	anv/cmd_buffer: Don't call set_subpass in a secondary Initially, we had intended set_subpass to be an interesting function that did whatever (presumably a lot) setup we needed for a subpass. In reality, it just sets a pointer and a dirty bit and then emits depth and stencil state. When we call BeginCommandBuffer on a secondary, there's no point in setting depth and stencil state since it will already be set by the primary. Instead, the only thing we need to do at the start of a secondary is set the subpass pointer and the dirty bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	fe4e276b02	anv/cmd_buffer: Rework descriptor dirtying in set_subpass We have a DIRTY_RENDER_TARGETS flag and that makes a lot more sense than just dirtying fragment descriptors. We're checking for it in some of the gen7 code but unfortunately, nothing was setting it and it didn't do what it was supposed to do in cmd_buffer_flush_state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	a1db0e87ff	anv/wsi: Advertise UNORM formats as well as sRGB Because WSI images are created with VkImageCreateInfo::flags explicitly set to 0, they don't ever have the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT set. This means that you can't create an image view of it with a different format so applications can't render directly in sRGB (without automatic encoding) unless we actually advertise UNORM formats. There are a lot of applications that want to do their own sRGB conversion, so we should allow for that. We do, however, make UNORM come after sRGB in the list so that the default for dumb apps that just grab the first thing is to render in linear and let the sRGB conversion happen automatically. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 16:52:31 -07:00
Dave Airlie	5267124648	radv: fix configure.ac check This should be positive test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:28:03 +10:00
Gustaw Smolarczyk	24815bd7b3	radv: Skip already signalled fences. If the user created a fence with VK_FENCE_CREATE_SIGNALED_BIT set, we shouldn't fail to wait for a fence if it was not submitted since that is not necessary. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:24:09 +10:00
Dave Airlie	f4e499ec79	radv: add initial non-conformant radv vulkan driver This squashes all the radv development up until now into one for merging. History can be found: https://github.com/airlied/mesa/tree/semi-interesting This requires llvm 3.9 and is in no way considered a conformant vulkan implementation. It can run a number of vulkan applications, and supports all GPUs using the amdgpu kernel driver. Thanks to Intel for providing anv and spirv->nir, and Emil Velikov for reviewing build integration. Parts of this are: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Authors: Bas Nieuwenhuizen and Dave Airlie Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:16:09 +10:00
Samuel Pitoiset	28ecd3eac2	nv50/ir: fix wrong check when optimizing MAD to SHLADD Checking if MAD is supported is definitely wrong, and it's more likely a typo I introduced a few days ago which breaks NV50 because SHLADD is not supported there. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-07 01:13:06 +02:00
Lionel Landwerlin	0b10152b80	intel: aubinator: use getopt to parse arguments Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>	2016-10-07 00:05:56 +01:00
Samuel Pitoiset	a198883bf7	nvc0: dump program binary only when NV50_PROG_DEBUG is set When the chipset is forced with NV50_PROG_CHIPSET, we actually only want to output the binary if NV50_PROG_DEBUG is also enabled. Otherwise, this pollutes the shader-db output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-07 01:01:17 +02:00
Jason Ekstrand	325b3fd668	nir: Fix the control flow tests for nir_loop_first_block changes Commit `2ed17d46de` changed nir_loop_first_cf_node and friends to return a nir_block instead of a nir_cf_node. This broke one of the NIR control flow tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98128	2016-10-06 15:48:30 -07:00
Samuel Pitoiset	e3f586c98d	docs: mark ARB_compute_variable_group_size as done for nvc0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	56a0bed2c1	nvc0: expose ARB_compute_variable_group_size Only expose 512 threads/block on Fermi to not be limited by 32 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	11e75fffeb	nv50/ir: set number of threads/block for variable local size When a variable local size is defined as specified by ARB_compute_variable_group_size, the fixed local size is set to 0 and a SIGFPE occurs when we compute the maximum number of regs. This allows using 64 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	590734fa0d	st/mesa: expose ARB_compute_variable_group_size This extension is only exposed if the underlying driver supports ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK is set. v3: - initialize max_variable_threads_per_block to 0 v2: - expose the ext based on that new cap Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	dfd7734cb7	st/mesa: add support for dispatching a variable local size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	e78bd48b9c	st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE which represents the block size in threads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	07bb4513c6	gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK v3: - use a new case statement in r600_pipe_common.c - fix compilation of softpipe... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	48de9aaa72	glsl: add gl_LocalGroupSizeARB as a system value v2: - only add it if the ext is enabled (Ilia) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	dee627a16e	glsl/linker: handle errors when a variable local size is used Compute shaders can now include a fixed local size as defined by ARB_compute_shader or a variable size as defined by ARB_compute_variable_group_size. v2: - update formatting spec quotations (Ian) - various cosmetic changes (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	008e785f74	glsl: reject compute shaders with fixed and variable local size The ARB_compute_variable_group_size specification explains that when a compute shader includes both a fixed and a variable local size, a compile-time error occurs. v2: - update formatting spec quotations (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	dd2bda7002	glsl: process local_size_variable input qualifier This is the new layout qualifier introduced by ARB_compute_variable_group_size which allows using a variable work group size. v4: - add missing '%s' in the monster format string Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	d5c8481d57	glsl: add enable flags for ARB_compute_variable_group_size This also initializes the default values for the standalone compiler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	45ab63c0cb	mesa/main: add support for ARB_compute_variable_group_size v5: - replace fixed_local_size by !LocalSizeVariable (Nicolai) v4: - slightly indent spec quotes (Nicolai) - drop useless _mesa_has_compute_shaders() check (Nicolai) - move the fixed local size outside of the loop (Nicolai) - add missing check for invalid use of work group count v2: - update formatting spec quotations (Ian) - move the total_invocations check outside of the loop (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	a063f3084a	glapi: add entry points for GL_ARB_compute_variable_group_size v2: - correctly sort that new extension (Ian) - fix up the comment (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Karol Herbst	f96945c5b5	nv50/ir: optimize sub(a, 0) to a helped some ue4 demos and divinity OS shaders total instructions in shared programs : 2818674 -> 2818606 (-0.00%) total gprs used in shared programs : 379273 -> 379273 (0.00%) total local used in shared programs : 9505 -> 9505 (0.00%) total bytes used in shared programs : 25837792 -> 25837192 (-0.00%) local gpr inst bytes helped 0 0 33 33 hurt 0 0 0 0 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2016-10-06 19:39:51 +02:00
Brian Paul	6963f94e98	st/mesa: move all sampler view code into new st_sampler_view.[ch] files Previously, the sampler view code was scattered across several different files. Note, the previous REALLOC(), FREE() for st_texture_object::sampler_views are replaced by realloc(), free() to avoid conflicting macros in Mesa vs. Gallium. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	e5cc84dd43	st/mesa: optimize pipe_sampler_view validation Before, st_get_texture_sampler_view_from_stobj() did a lot of work to check if the texture parameters matched the sampler view (format, swizzle, min/max lod, first/last layer, etc). We did this every time we validated the texture state. Now, we use a ctx->Driver.TexParameter() callback and a couple other checks to proactively release texture views when we know that view-related parameters have changed. Then, the validation step is simplified: - Search the texture's list of sampler views (just match the context). - If found, we're done. - Else, create a new sampler view. There will never be old, out-of-date sampler views attached to texture objects that we have to test. Most apps create textures and set the texture parameters once. This makes sampler view validation much cheaper for that case. Note that the old texture/sampler comparison code has been converted into a set of assertions to verify that the sampler view is in fact consistent with the texture parameters. This should help to spot any potential regressions. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	0f3aee888e	mesa: call ctx->Driver.TexParameter() in texture_buffer_range() To inform drivers of texture buffer offset/size changes, as we do for other texture object parameters. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	b3127a96a9	st/mesa: consolidate view format setup code Before, we had code to compute the sampler view's format spread across two different functions: in update_single_texture() and st_get_texture_sampler_view_from_stobj(). Now it's all in one new function. Also, use _mesa_texture_base_format() to simplify the code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	628e651f64	st/mesa: add some const qualifiers in st_atom_texture.c And minor code reformatting. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	b3c8935165	st/mesa: simplify some code in get_texture_format_swizzle() There's no need to cast to st_texture_image. Just use gl_texture_image. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:31 -06:00
Brian Paul	9add37b100	mesa: make _mesa_texture_buffer_range() static Not called from any other file. Also, add a comment. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-06 11:29:31 -06:00
Brian Paul	92188c207e	mesa: add const qualifier, comment on can_avoid_reallocation() Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-06 11:29:31 -06:00
Brian Paul	57279c5454	mesa: add comment/assertion on get_tex_level_parameter_buffer() Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-06 11:29:31 -06:00
Jason Ekstrand	ae032e5ea6	nir: Remove some no longer needed asserts Now that the NIR casting functions have type assertions, we have a bunch of assertions that aren't needed anymore. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-10-06 09:16:39 -07:00
Jason Ekstrand	2ed17d46de	nir: Make nir_foo_first/last_cf_node return a block instead One of NIR's invariants is that control flow lists always start and end with blocks. There's no good reason why we should return a cf_node from these functions since we know that it's always a block. Making it a block lets us remove a bunch of code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-10-06 09:16:37 -07:00
Jason Ekstrand	7a3bcadf4e	nir: Add asserts to the casting functions This makes calling nir_foo_as_bar a bit safer because we're no longer 100% trusting in the caller to ensure that it's safe. The caller still needs to do the right thing but this ensures that we catch invalid casts with an assert rather than by reading garbage data. The one downside is that we do use the casts a bit in nir_validate and it's not a validate_assert. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-10-06 09:16:24 -07:00
Steven Toth	e00fdd643b	gallium/hud: Remove superfluous debug No longer required. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:37:06 +01:00
Emil Velikov	03350c9708	amd: add amd_kernel_code_t.h to the sources list Otherwise it won't be picked in the tarball and the build will fail. Fixes: `91ec6e5664` ("radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:17:51 +01:00
Emil Velikov	b634be0e69	svga: add svga_mksstats.h to the sources list Otherwise it won't be picked in the tarball and the build will fail. Fixes: `0035f7f136` ("svga: add guest statistic gathering interface") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:17:09 +01:00
Emil Velikov	78a7415f0b	glx: rename choose_visual(), drop const argument The function deals with fb (style) configs, thus using "visual" in the name is misleading. Which in itself had led to the use of fbconfig_style_tags argument. Rename the function to reflect what it does and drop the unneeded argument. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-06 15:03:47 +01:00
Emil Velikov	2e9e05dfca	glx: return GL_FALSE from glx_screen_init where applicable. Return GL_FALSE if we fail to find any fb/visual configs, otherwise we end up with all sorts of chaos further down the GLX stack. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-06 15:03:47 +01:00
Emil Velikov	e542ed463d	glx: correctly mask the drawableType for GLX_ARB_fbconfig_float The comment/spec says - only for pbuffer drawables, while the code clears the window/pixmap bit. Practise what you preach and apply the trivial tweak. In practice this should not cause a functional change. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-06 15:03:46 +01:00
Chuck Atkins	a89faa2022	autoconf: Make header install distinct for various APIs (v2) This fixes a problem where GL headers would only get installed if glx was enabled. So if osmesa was enabled but not glx, then the GL headers required by osmesa would be missing from the install. v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	0216a16819	mesa: annotate AttribFuncsARB[] as const It's read-only data, so annotate it accordingly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	0728e2bb17	mapi/glapi: remove unused _glapi_check_table() Similar to earlier commit - symbol was never part of the public API so we're safe to remove it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	96b9ec1ea3	glapi/hgl: remove the final user of _glapi_check_table() The symbol is a no-op, since the EXTRA_DEBUG macro is not set in the build. Unused by !Haiku people/platforms since 2010 (commit `a73c6540d9`) while the Haiku C++ wrapper has no obvious users. Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	79835565c3	mapi/glapi: remove unused _glapi_check_table_not_null Function was never part of the API/ABI and the final user was removed with commit `a73c6540d9`, back in 2010. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	9b7fd4080a	st/xvmc/tests: force enable assertions Similar to the other 'tests', enable assertions in xvmc_bench. This silences the GCC warnings about unused-variable(s) and makes the program actually useful, as the XvMC API is actually called. Atm the function calls are omitted, since they're called within the assert. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	0b6837a643	anv: automake: ship intel_icd.json.in in the tarball Otherwise we'll fail to (re)generate intel_icd.json. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	a42115d6e2	intel: automake: reference the correct header The header was renamed with earlier commit, so update the Makefile.sources respectively. {vulkan/genX_multisample.h => common/gen_sample_positions.h} Fixes: c779ad3e661("intel: Move Vulkan sample positions to common code") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Lionel Landwerlin	b84234fd28	intel: aubinator: add missing return characters Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 10:39:53 +01:00
Kenneth Graunke	f7659e02c3	nir: Delete open coded type printing. glsl_print_type() prints arrays of arrays incorrectly. For example, a type with name float[3][7] would be printed as float[7][3]. (This is an array of length 3 containing arrays of 7 floats.) cdecl says that the type name is correct. glsl_print_type() doesn't really do anything above and beyond printing type->name, and glsl_print_struct() wasn't used at all. So, drop them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-06 02:13:36 -07:00
Philipp Zabel	0408d50f43	anv: fix GetPhysicalDeviceProperties to return timestampPeriod in ns According to chapters 16.5. (Timestamp Queries) and 30.2 (Limits) of the Vulkan Specification 1.0.29, the .limits.timestampPeriod field returned by vkGetPhysicalDeviceProperties is measured in nanoseconds, not in seconds. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 02:02:35 -07:00
Timothy Arceri	88428fbe41	i965: remove remaining tabs in brw_draw.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:04:16 +11:00
Timothy Arceri	7627fbd9b0	i965: get inputs read from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisation passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:04:09 +11:00
Timothy Arceri	7ef8286487	i965: get outputs written from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisation passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:04:03 +11:00
Timothy Arceri	b526a9b708	i965: get outputs read from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisation passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:57 +11:00
Timothy Arceri	a38c809f6e	i965: remove remaining tabs in brw_wm.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:52 +11:00
Timothy Arceri	201f940d2e	mesa: remove the UsesDFdy flag Seems the last user of this was removed in `08bc74e69`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:46 +11:00
Timothy Arceri	556335eb99	i965: get uses discard from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisation passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:40 +11:00
Timothy Arceri	ee829cba8e	i965: get uses texture gather from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisation passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:00 +11:00
Kenneth Graunke	a85a8ecd32	i965: Eliminate brw->cs.prog_data pointer. Just say no to: - brw->cs.base.prog_data = &brw->cs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_cs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:35 -07:00
Kenneth Graunke	16d5536e55	i965: Eliminate brw->wm.prog_data pointer. Just say no to: - brw->wm.base.prog_data = &brw->wm.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_wm_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:35 -07:00
Kenneth Graunke	ff366f3db4	i965: Eliminate brw->gs.prog_data pointer. Just say no to: - brw->gs.base.prog_data = &brw->gs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_gs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:33 -07:00
Kenneth Graunke	e512941537	i965: Eliminate brw->tes.prog_data pointer. Just say no to: - brw->tes.base.prog_data = &brw->tes.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tes_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:09 -07:00
Kenneth Graunke	82c97ac710	i965: Eliminate brw->tcs.prog_data pointer. Just say no to: - brw->tcs.base.prog_data = &brw->tcs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tcs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:09 -07:00
Kenneth Graunke	40258a13d5	i965: Eliminate brw->vs.prog_data pointer. Just say no to: - brw->vs.base.prog_data = &brw->vs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_vs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:06 -07:00
Kenneth Graunke	e51e055fcd	i965: Introduce downcast helpers for prog_data structures. Similar to brw_context(...), intel_texture_object(...), and so on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:20:42 -07:00
Chad Versace	74b02a7449	i965/sync: Rename awkward variable What is the difference between a 'driver_fence' and a 'fence'? Do the characters 'driver_' add anything helpful? Nope. They do, though, add an extra 7 chars and pull your eyeballs away to ask "huh? what's that?" one microsecond too many. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:25 -07:00
Chad Versace	a99ff82714	i965/sync: Rename intel_syncobj.c -> brw_sync.c Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:25 -07:00
Chad Versace	9ea48fc877	i965/sync: Replace 'intel' prefix with 'brw' This is yet another patch for the great renaming begun long ago. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:24 -07:00
Chad Versace	ce1d67c2e5	i965/sync: Fix uninitialized usage and leak of mutex We locked an uninitialized mutex in the callstack glClientWaitSync intel_gl_client_wait_sync brw_fence_client_wait_sync because we forgot to initialize it in intel_gl_fence_sync. (The EGLSync codepath didn't have this bug. It initialized the mutex in intel_dri_create_sync). We also forgot to tear down (mtx_destroy) the mutex when destroying the sync object. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:24 -07:00
Jason Ekstrand	28ab2570c8	nir: Use the correct infos structure for copying atomic sources Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Cc: "12.0" <mesa-dev@lists.freedestkop.org>	2016-10-05 13:04:54 -07:00
Samuel Pitoiset	a41cfbbf2b	nvc0: dump program binary when chipset has been forced Currently, program binaries are only dumped at upload time, but when the chipset has been forced via NV50_PROG_CHIPSET we might want to show the generated code, especially with shaderdb. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-05 21:15:44 +02:00
Marek Olšák	cc4a19c4ad	radeonsi: fix texture border colors for compute shaders There are VM faults without this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:54 +02:00
Marek Olšák	844f8268e1	gallium/radeon/winsyses: set reasonable max_alloc_size which is returned for GL_MAX_TEXTURE_BUFFER_SIZE. It doesn't have any other use at the moment. Bigger allocations are not rejected. This fixes GL45-CTS.texture_buffer.texture_buffer_max_size on Bonaire. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:54 +02:00
Marek Olšák	1b37e5541c	radeonsi: fix interpolateAt opcodes for .zw components Not returning garbage in .zw seems pretty important. This fixes: GL45-CTS.shader_multisample_interpolation.render.interpolate_at__check. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	300a8221e9	radeonsi: add assertions to validate interpolation flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	d4a8bf89ce	radeonsi: interpolate colors after interpolation weight shuffling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	faee2d6dda	tgsi/scan: don't set interp flags for inputs only used by INTERP (v2) (v1 pushed, then reverted) This fixes 9 randomly failing tests on radeonsi: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* v2: use input_interpolate[input] (correct) instead of input_interpolate[index] (incorrect) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	10e5f126dd	ddebug: dump most driver information with GALLIUM_DDEBUG=always Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Karol Herbst	d8bcd3ef37	nv50/ra: let simplify return an error and handle that fixes a crash in the case simplify reports an error Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-05 19:11:42 +02:00
Nanley Chery	f315c4f189	intel/blorp: Use documented RECTLIST vertex positions Use the vertex positions described in the PRMs. This has no effect on rendering but quiets the simulator warnings seen when the vertices appear out of order. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-10-05 09:41:21 -07:00
Jason Ekstrand	e3a1d33077	anv/meta: Roll clear_image into CmdClearDepthStencilImage It is now the only caller so there's no sense in keeping things split out. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-05 09:33:44 -07:00
Jason Ekstrand	f027609a64	anv: Use blorp for VkCmdFillBuffer Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-05 09:33:44 -07:00
Kyle Brenneman	ca9f26ac6f	egl: Implement EGL_KHR_debug (v2) Wire up the debug entrypoints to EGL dispatch, and add the extension string to the client extension list. v2: - Lots of style fixes - Fix missing EGLAPIENTRYs - Factor out valid attribute check - Lock display in eglLabelObjectKHR as needed, and use RETURN_EGL_* - Move "EGL_KHR_debug" into asciibetical order in client extension string Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.veliko@collabora.com>	2016-10-05 11:41:26 -04:00
Kyle Brenneman	6a5545d3ba	egl: Track EGL_KHR_debug state when going through EGL API calls (v3) This decorates every EGL entrypoint with _EGL_FUNC_START, which records the function name and primary dispatch object label in the current thread state. It also adds debug report functions and calls them when appropriate. This would be useful enough for debugging on its own, if the user set a breakpoint when the report function was called. We will also need this state tracked in order to expose EGL_KHR_debug. v2: - Clear the object label in more cases in _eglSetFuncName - Pass draw surface (if any) to _EGL_FUNC_START in eglSwapInterval v3: - Set dummy thread's CurrentAPI to EGL_OPENGL_ES_API not zero - Less ?: in _eglSetFuncName Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.veliko@collabora.com>	2016-10-05 11:40:51 -04:00
Lionel Landwerlin	f8b861a867	intel: aubinator: pack supported generations into an array Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-05 16:23:28 +01:00
Ben Widawsky	2dc06e2324	i965/l3: Add explicit way size calculation for bxt There should be no functional change here because Broxton and CHV are both gt1. Without this code however, it might seem like broxton support is missing. While here, put the gt1 check in front to hopefully short-circuit the condition for the mobile cases. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-05 07:57:58 -07:00
Nicolai Hähnle	11cc59afca	virgl: Fix build regression of commit `8a943564`	2016-10-05 16:27:29 +02:00
Nicolai Hähnle	0cba7b771a	st/mesa: enable GL_KHR_robustness The difference to the virtually identical ARB_robustness (which is already enabled unconditionally) is minuscule and handled elsewhere, but this cap seems like the right thing to require for this extension. v2: drop the device reset cap requirement (Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-05 15:51:59 +02:00
Nicolai Hähnle	b5cd7dfe3e	gallium/radeon: implement set_device_reset_callback Check for device reset on flush. It would be nicer if the kernel just reported this as an error on the submit ioctl (and similarly for fences), but this will do for now. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:56 +02:00
Nicolai Hähnle	a1fa8b731f	st/mesa: set a device reset callback when available Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:53 +02:00
Nicolai Hähnle	d856130025	st/mesa: extract conversion from pipe_reset_status to GLenum Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:49 +02:00
Nicolai Hähnle	07bea09c64	ddebug: add pass-through of set_device_reset_callback Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:47 +02:00
Nicolai Hähnle	1a3c75e30e	gallium: add pipe_context::set_device_reset_callback Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:34 +02:00
Nicolai Hähnle	8a943564fd	virgl: use the new parent/child pools for transfers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:22 +02:00
Nicolai Hähnle	2a83036fe2	vc4: use the new parent/child pools for transfers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:20 +02:00
Nicolai Hähnle	0334ba150f	freedreno: use the new parent/child pools for transfers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:17 +02:00
Nicolai Hähnle	616e36674a	r300: use the new parent/child pools for transfers (v2) v2: slab_alloc_st -> slab_alloc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:13 +02:00
Nicolai Hähnle	e56e1f8119	gallium/radeon: use the new parent/child pools for transfers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:07 +02:00
Nicolai Hähnle	d8cff811df	util/slab: re-design to allow migration between pools (v3) This is basically a re-write of the slab allocator into a design where multiple child pools are linked to a parent pool. The intention is that every (GL, pipe) context has its own child pool, while the corresponding parent pool is held by the winsys or screen, or possibly the GL share group. The fast path is still used when objects are freed by the same child pool that allocated them. However, it is now also possible to free an object in a different pool, as long as they belong to the same parent. Objects also survive the destruction of the (child) pool from which they were allocated. The slow path will return freed objects to the child pool from which they were originally allocated. If that child pool was destroyed, the corresponding page is considered an orphan and will be freed once all objects in it have been freed. This allocation pattern is required for pipe_transfers that correspond to (GL) buffer object mappings when the mapping is created in one context which is later destroyed while other contexts of the same share group live on -- see the bug report referenced below. Note that individual drivers do need to migrate to the new interface in order to benefit and fix the bug. v2: use singly-linked lists everywhere v3: use p_atomic_set for page->u.num_remaining Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894	2016-10-05 15:40:40 +02:00
Nicolai Hähnle	8915f0c0de	util: use GCC atomic intrinsics with explicit memory model This is motivated by the fact that p_atomic_read and p_atomic_set may somewhat surprisingly not do the right thing in the old version: while stores and loads are de facto atomic at least on x86, the compiler may apply re-ordering and speculation quite liberally. Basically, the old version uses the "relaxed" memory ordering. The new ordering always uses acquire/release ordering. This is the strongest possible memory ordering that doesn't require additional fence instructions on x86. (And the only stronger ordering is "sequentially consistent", which is usually more than you need anyway.) I would feel more comfortable if p_atomic_set/read in the old implementation were at least using volatile loads and stores, but I don't see a way to get there without typeof (which we cannot use here since the code is compiled with -std=c99). Eventually, we should really just move to something that is based on the atomics in C11 / C++11. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-05 15:39:39 +02:00
Lionel Landwerlin	d51c1f9d51	i965: use L3 data cache for SSBOs Anv programs the hardware to use L3 data cache if we use either SSBOs or images in the shaders, we can program i965 the same way. gl_shader_program has a bit of a confusing named field with 'NumAtomicBuffers'. It doesn't tell how many buffers are accessed by the shader in an atomic way but instead the number of atomic counters manipulated by the shader. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-05 12:24:04 +01:00
Kenneth Graunke	a40640f530	mesa: Raise INVALID_ENUM in FramebufferTextureD for unknown textargets. ES3-CTS.functional.negative_api.buffer.framebuffer_texture2d expects glFramebufferTexture[123]D to raise GL_INVALID_ENUM when supplied a completely bogus textarget parameter (i.e. 0xffffffff). This is at odds with the spec. GLES 3.1 says: "An INVALID_OPERATION error is generated if texture is not zero and textarget is not one of TEXTURE_2D, TEXTURE_2D_MULTISAMPLE, or one of the cube map face targets from table 8.21." (and GLES 3.0 and GL 4.5 both have similar text). However, GL has a general guideline that says: "If a command that requires an enumerated value is passed a symbolic constant that is not one of those specified as allowable for that command, an INVALID_ENUM error is generated." Apparently other vendors reconcile these two rules as follows: GL should raise INVALID_OPERATION for actual texture target enumeration values which are not allowed for this particular glFramebufferTextureD call. Any value that is not a texture target should result in GL_INVALID_ENUM. For example, glFramebufferTexture2D with GL_TEXTURE_1D would result in INVALID_OPERATION because it is a real texture target, but not allowed for the 2D version of the function. But calling it with GL_FRONT would result in INVALID_ENUM, as that isn't even a texture target. Fixes: - {ES3-CTS,dEQP-GLES3}.functional.negative_api.buffer.framebuffer_texture2d - {ES31-CTS,ES32-CTS,dEQP-GLES31}.functional.debug.negative_coverage.get_error.buffer.framebuffer_texture2d References: https://gitlab.khronos.org/opengl/cts/merge_requests/387 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-04 21:10:24 -07:00
Kenneth Graunke	aecdb21be8	mesa: Reorganize check_textarget(). Having one top-level switch statement covering all known texture targets will make the next change easier to implement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-04 21:10:05 -07:00
Kenneth Graunke	53b8f6374f	aubinator: use the correct format specifier for printing ptrdiff_t. Fixes more warnings in 32-bit builds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-04 17:28:01 -07:00
Kenneth Graunke	af41e1a500	aubinator: Use less -RS instead of -r for the implicit pager. From the less man page: "Warning: when the -r option is used, less cannot keep track of the actual appearance of the screen (since this depends on how the screen responds to each type of control character). Thus, various display problems may result, such as long lines being split in the wrong place." Lines which are too long to fit in the terminal would be word wrapped, but unfortunately less would get confused about which line it was on, and text would be drawn on top of other text. The most noticeable case was shader assembly, which is frequently too wide for an 80 character terminal, and thus would be drawn on top of the following state packets, making them completely unreadable. Using -R instead of -r fixes this problem by only allowing color escape sequences. (Notably, Git's implicit pager invocation uses -R.) Unfortunately, it means our "clear to the end of the line" hack for extending the blue bar headers won't work anymore. Word wrapping usually isn't terribly readable, anyway, so we also add the -S option (chop long lines) to restrict it to the terminal width. (You can hit the left and right arrow keys to scroll sideways.) Then, for a new blue bar hack, we can use a printf specifier to pad the command packet names to be 80 characters long (arbitrarily), which extends them "far enough" to look good, and doesn't require us to use ioctls to determine the terminal width. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>	2016-10-04 17:25:46 -07:00
Kenneth Graunke	8a484a63f8	i965: Drop _NEW_TRANSFORM from 3DSTATE_VS atom on Gen7. The atom that uploads push constants listens to _NEW_TRANSFORM for legacy clip plane handling. On Sandybridge, the gen6_vs_state atom emits 3DSTATE_CONSTANT_VS as well as 3DSTATE_VS, so it needs to listen to the same set of conditions. However, it looks like Gen7 doesn't need this. The push constant atom emits 3DSTATE_CONSTANT_VS directly, and the gen7_vs_state atom that emits 3DSTATE_VS doesn't have a dependency on ctx->Transform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:21:40 -07:00
Kenneth Graunke	d3cc3d28bd	i965: Fix brw_clear_cache to clean up TCS/TES shaders. We need to free prog_data for TCS/TES too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-04 17:09:08 -07:00
Kenneth Graunke	bab1c05634	i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	ce6c80ebbb	i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom. CACHE_NEW_CS_PROG hasn't existed in quite a long time...the old comment was there, but not the actual bit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	f2b9b0c730	i965: Add missing BRW_NEW_FS_PROG_DATA to render target reads. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	0047d600af	i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA. 3DSTATE_PS doesn't need this. 3DSTATE_PS_EXTRA however does, for brw_color_buffer_write_enabled(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	28e1538be7	i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	78df96256b	i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom. Needed for user clip plane enables. Broken since this code was introduced. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Ian Romanick	40dd45d0c6	i965: Enable ARB_shader_atomic_counter_ops Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-04 16:53:32 -07:00
Ian Romanick	3d2011cb33	i965: Refactor emission of atomic counter operations This will make it easier to add more operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-04 16:53:32 -07:00
Ian Romanick	7cd0b3084c	nir/intrinsics: Add more atomic_counter ops v2: Delete some stray debug code noticed by Iago. v3: Massive rebase on new ir_function_signature::intrinsic_id mechanism. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:32 -07:00
Ian Romanick	2c9a17ac79	nir/intrinsics: Include atomic_counter_ in the names used in macro invocations Otherwise grepping for where atomic_counter_inc and friends are defined is a very frustrating experience. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:32 -07:00
Ian Romanick	c42fe30c86	glsl: Kill __intrinsic_atomic_sub Just generate an __intrinsic_atomic_add with a negated parameter. Some background on the non-obvious reasons for the the big change to builtin_builder::call()... this is cribbed from some discussion with Ilia on mesa-dev. Why change builtin_builder::call() to allow taking dereferences and create them here rather than just feeding in the ir_variables directly? The problem is the neg_data ir_variable node would have to be in two lists at the same time: the instruction stream and parameters. The ir_variable node is automatically added to the instruction stream by the call to make_temp. Restructuring the code so that the ir_variables could be in parameters then move them to the instruction stream would have been pretty terrible. ir_call in the instruction stream has an exec_list that contains ir_dereference_variable nodes. The builtin_builder::call method previously took an exec_list of ir_variables and created a list of ir_dereference_variable. All of the original users of that method wanted to make a function call using exactly the set of parameters passed to the built-in function (i.e., call __intrinsic_atomic_add using the parameters to atomicAdd). For these users, the list of ir_variables already existed: the list of parameters in the built-in function signature. This new caller doesn't do that. It wants to call a function with a parameter from the function and a value calculated in the function. So, I changed builtin_builder::call to take a list that could either be a list of ir_variable or a list of ir_dereference_variable. In the former case it behaves just as it previously did. In the latter case, it uses (and removes from the input list) the ir_dereference_variable nodes instead of creating new ones. 
text data bss dec hex filename 6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so before 6036923 283160 28608 6348691 60df93 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:32 -07:00
Ian Romanick	bb290b5679	glsl: Remove ir_function_signature::_is_intrinsic field text data bss dec hex filename 6036491 283160 28608 6348259 60dde3 lib64/i965_dri.so before 6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	acfcc7bbfa	glsl: Add ir_function_signature::is_intrinsic() method This necessitated renaming the is_intrinsic field to _is_intrinsic. The next commit will remove the field. text data bss dec hex filename 6036507 283160 28608 6348275 60ddf3 lib64/i965_dri.so before 6036491 283160 28608 6348259 60dde3 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	b7df52b106	glsl: Use the ir_intrinsic_* enums instead of the __intrinsic_* name strings text data bss dec hex filename 6038043 283160 28608 6349811 60e3f3 lib64/i965_dri.so before 6036507 283160 28608 6348275 60ddf3 lib64/i965_dri.so after v2: s/ir_intrinsic_atomic_sub/ir_intrinsic_atomic_counter_sub/. Noticed by Ilia. v3: Silence unhandled enum in switch warnings in st_glsl_to_tgsi. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	5854de99b2	glsl: Track a unique intrinsic ID with each intrinsic function text data bss dec hex filename 6037483 283160 28608 6349251 60e1c3 lib64/i965_dri.so before 6038043 283160 28608 6349811 60e3f3 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	c01f2bfc6c	glsl: Don't emit ir_binop_carry during ir_binop_imul_high lowering st_glsl_to_tgsi only calls lower_instructions once (instead of in a loop), so the ir_binop_carry generated would not get lowered. Fixes assertion failure state_tracker/st_glsl_to_tgsi.cpp:2265: void glsl_to_tgsi_visitor::visit_expression(ir_expression, st_src_reg): Assertion `!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"' failed. on softpipe in 16 piglit tests: mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb.shader_test 
mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Timothy Arceri	0e8f1eaf41	i965: fix unused variable warning in brw_emit_gpgpu_walker() Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:14:05 +11:00
Timothy Arceri	6fdfcd4d1c	i965: add MAYBE_UNUSED to assert param Fixes unused variable warning in release build. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:58 +11:00
Timothy Arceri	4340294af8	i965: wrap unused function in #ifndef NDEBUG This function is only ever used by an assert(); this fixes an unused function warning in release builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:58 +11:00
Timothy Arceri	c9f1767903	i965: fix unused variable warning in gen7_block_read_scratch() Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:58 +11:00
Timothy Arceri	df4ff31d3c	i965: add MAYBE_UNUSED to assert param This fixes an unused variable warning on release builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:52 +11:00
Jose Fonseca	437d7e1baf	gallivm: Use AVX2 gather intrinsics. v2: Use AVX2 gather for non aligned loads too. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-04 23:36:20 +01:00
Roland Scheidegger	bc80741d7a	gallivm: Use 8 wide AoS sampling on AVX2. v2: Make sure that with num_lods > 1 and min_filter != mag_filter we still enter the splitting path. So this case would still use 4-wide aos path (as a side note, the 4-wide aos sampling path could actually be improved quite a bit if we have avx2, by just doing the filtering with 256bit vectors). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-04 23:36:20 +01:00
José Fonseca	e088390c7d	gallivm: Basic AVX2 support. v2: pblendb -> pblendvb Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-04 23:36:20 +01:00
Chad Versace	add01add1b	egl: Drop duplicate check on EGLSync type _eglInitSync checked that the display supported the sync type (such as EGL_SYNC_FENCE), and did it wrong. When the check failed it emitted EGL_BAD_ATTRIBUTE, but sometimes EGL_BAD_PARAMETER is needed. _eglCreateSync already does the error checking, and it does it right. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:29 -07:00
Chad Versace	02e4f1cb43	egl: Cleanup control flow in _eglParseSyncAttribList When the function encountered an error, it effectively returned immediately. However, it did so indirectly by breaking out of a loop. Replace the loop breakout with an explicit 'return'. Do the same for _eglParseSyncAttribList64 too. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:29 -07:00
Chad Versace	3e0d575a6d	egl: Add _eglConvertIntsToAttribs() This function converts an attribute list from EGLint[] to EGLAttrib[]. Will be used in following patches to cleanup EGLSync attribute parsing. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:29 -07:00
Chad Versace	f2c2f43d4e	egl: Fix an error path in eglCreateSync* When the user called eglCreateSync64KHR on a display without EGL_KHR_cl_event2 (the only extension that exposes it), we returned EGL_NO_SYNC but did not update the error code. We also did the same for eglCreateSync on a display without EGL 1.5. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:28 -07:00
Chad Versace	69adb9a778	egl: Fix truncation error in _eglParseSyncAttribList64 The function stores EGLAttrib values in EGLint variables. On 64-bit systems, this truncated the values. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:28 -07:00
Chad Versace	17084b6f93	egl: Fix missing unlock in eglGetSyncAttribKHR On the error path, eglGetSyncAttribKHR neglected to unlock the EGLDisplay before returning. Fixes deadlock in dEQP-EGL.functional.fence_sync.invalid.get_invalid_value. Cc: mesa-stable@lists.freedesktop.org Cc: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:22 -07:00
Anuj Phogat	d2112fc8d9	anv/gen7_pipeline: Fix typo in semicolon Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:35 -07:00
Anuj Phogat	1ffcf95fc4	anv/gen7_pipeline: Set sample mask field in 3DSTATE_PS Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:35 -07:00
Anuj Phogat	deeb1e95d0	anv/gen7_pipeline: Move ksp{1,2} state setting next to ksp0 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:35 -07:00
Anuj Phogat	517b1bf499	anv/gen7: Make use of local variable prog_data Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	2abb7486f5	anv/gen8_pipeline: Add an assert to ensure use_alt_mode is not set in prog_data Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-04 13:20:34 -07:00
Anuj Phogat	fa04b57c15	anv/gen8_pipeline: Fix typo in semicolon Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	7daafad9ac	intel/genxml: Keep the value name 'Alternate' uniform across gen75.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	c0f02bbc57	intel/genxml: Fix typo in gen75.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	cd69d3f929	i965/gen8+: Enable GL_OES_viewport_array This patch causes 2 regressions in khronos' gles cts tests on various intel platforms. Failing tests: ES3-CTS.functional.state_query.integers.viewport_getinteger ES3-CTS.functional.state_query.integers.viewport_getfloat Here is an explanation of what's causing the failures: CTS tests are not clamping the x, y location of the viewport's bottom-left corner as recommended by ARB_viewport_array and OES_viewport_array: "The location of the viewport's bottom-left corner, given by (x,y), are clamped to be within the implementation-dependent viewport bounds range. The viewport bounds range [min, max] tuple may be determined by calling GetFloatv with the symbolic constant VIEWPORT_BOUNDS_RANGE_OES" Khronos CTS merge request to fix the test case: https://gitlab.khronos.org/opengl/cts/merge_requests/399 V2: Initialize the relevant variables for GL_OES_viewport_array on gen8+ Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 13:20:34 -07:00
Anuj Phogat	239ff64173	mesa: Add a check for OES_viewport_array Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 13:20:34 -07:00
Anuj Phogat	0a7691ee62	mesa: Enable enums for OES_viewport_array Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 13:20:34 -07:00
Anuj Phogat	2c7e1165fa	anv/gen7_pipeline: Use MSDISPMODE_PERSAMPLE for non-multisampled fbo Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	f75a93f610	anv/blorp: Handle zero width/height blits in blorp_copy() V2: Move the check from copy_buffer_to_image() to blorp_copy(). (Nanley) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-04 13:20:34 -07:00
Anuj Phogat	2c78b2ec90	intel/isl: Add an assert to check zero width/height surface Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Leo Liu	0e85ff3355	st/omx/dec/h265: add scaling list data Specified by subclause 7.3.4 v2: get the loop optimized Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-04 11:09:59 -04:00
Leo Liu	ffb863fd2c	st/omx/dec/h265: fix the skip for before and after list For reference picture sets, there are cases where an rps will not always be used. Once we detect the unused flag in the encoded bitstream, we should not add this rps to any list; otherwise we pass the incorrect reference and skip the correct rps. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 11:09:59 -04:00
Leo Liu	c50b68e6a8	st/omx/dec/h265: set the default reference picture set for reference This fixes corruption in frames that have only one short term ref picture set. We previously set a NULL rps for this case, which caused an incorrect reference to be taken. Instead we should take that only short term set as reference. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 11:09:59 -04:00
Leo Liu	091aae0265	st/omx/dec/h265: decoder size should follow from sps The video size from format container is not always compatible with the size from codec bitstream, the HW decoder should take the size information from bitstream, otherwise the corruption appears with clip that has different size info between bitstream and format container So we are passing width(height)_in_samples from sequence parameter set to video decoder. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 11:09:59 -04:00
Leo Liu	2371119db9	st/omx/dec/h265: increase dpb max size to 32 For clip with frame delta poc over 16 Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-10-04 11:09:59 -04:00
Eric Engestrom	66f85c3824	nir/spirv: Remove a duplicate spirv2nir from .gitignore This reverts commit `fc03ecfeaf`. Chad had already pushed the same change between me posting the patch and Jason pushing it: `44bcf1ffcc` (".gitignore: Ignore src/compiler/spirv2nir") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 07:43:15 -07:00
Nicolai Hähnle	8b1f9fd3b3	radeonsi: optionally run the LLVM IR verifier pass This is enabled automatically if shader printing is enabled, or separately by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:33 +02:00
Nicolai Hähnle	1e9476e8c5	gallium/radeon: fix argument type of llvm.{cttz,ctlz}.i32 intrinsics Caught by R600_DEBUG=checkir (next commit). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:28 +02:00
Nicolai Hähnle	1b6fb88ab2	gallium/radeon: unify the creation of basic blocks This changes the order of basic blocks to be equal to the order of code in the original TGSI, which is nice for making sense of shader dumps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:25 +02:00
Nicolai Hähnle	d377f4c1ca	gallium/radeon: merge branch and loop flow control stacks Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:21 +02:00
Nicolai Hähnle	b0d50e157d	gallium/radeon: simplify if/else/endif blocks In particular, we no longer emit an else block when there is no ELSE instruction. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:18 +02:00
Nicolai Hähnle	89e9de2ea6	gallium/radeon: label basic blocks by the corresponding TGSI pc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:15 +02:00
Nicolai Hähnle	6f87d7a146	gallium/radeon: cleanup and fix branch emits Some of the existing code is needlessly complicated. The basic principle should be: control-flow opcodes emit branches to properly terminate the current block, _unless_ the current block already has a terminator (which happens if and only if there was a BRK or CONT). This also fixes a bug where multiple terminators were created in a block. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:10 +02:00
Nicolai Hähnle	dfc1afda83	winsys/radeon: add buffer_get_reloc_offset Really fix the bug that was supposed to be fixed by commits `3e7cced4b` and `a48bf02d`: even when virtual addresses are used, the legacy relocation-based method with offsets relative to the kernel's buffer object is used for video submissions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:37:44 +02:00
Marek Olšák	71a5cf6f3b	radeonsi: don't declare LDS in PS when ds_bpermute is used I guess this is not needed because dead code elimination removes the declaration. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:16 +02:00
Marek Olšák	b2a694f079	radeonsi: use DDX/DDY directly in si_llvm_emit_ddxy_interp We can finally do this, because the opcodes are scalar now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:14 +02:00
Marek Olšák	b57aef8033	radeonsi: simplify si_llvm_emit_ddxy si_llvm_emit_ddxy is called once per element, so we don't have to generate code for 4 elements at once. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:12 +02:00
Marek Olšák	046c199c3a	radeonsi: don't call build_gep0 in si_llvm_emit_ddxy on VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:11 +02:00
Marek Olšák	bcc55e1f32	radeonsi: use a helper function for BuildGEP(0, x) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:10 +02:00
Marek Olšák	e20f7142a3	radeonsi: remove obsolete shader definitions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:09 +02:00
Marek Olšák	8c6ea5a6ff	radeonsi: remove unnecessary #includes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:07 +02:00
Marek Olšák	3388f27d84	radeonsi: clean up lucky #include dependencies Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:06 +02:00
Marek Olšák	53d2c8f00f	radeonsi: don't re-create shader PM4 states after scratch buffer update Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:05 +02:00
Marek Olšák	6c01684393	gallium/radeon: move r600_common_context::texture_buffers to r600g Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:03 +02:00
Marek Olšák	7ce19d9014	radeonsi: don't set sampler buffer offsets in create_sampler_view do it at bind time, so that pipe_sampler_view is immutable with regard to buffer reallocations and we don't have to remember all existing buffer views. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:01 +02:00
Marek Olšák	7e6428e0a8	radeonsi: optimize si_invalidate_buffer based on bind_history Just enclose each section with: if (rbuffer->bind_history & PIPE_BIND_...) Bioshock Infinite: +1% performance Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:00 +02:00
Marek Olšák	e43bd861e8	radeonsi: track buffer bind history similar to gl_buffer_object::UsageHistory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:58 +02:00
Marek Olšák	b523a9ddc5	radeonsi: drop support for NULL sampler views not used anymore. It was used when the polygon stipple texture was constant. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:57 +02:00
Marek Olšák	82e51e8188	radeonsi: separate IA_MULTI_VGT_PARAM and VGT_PRIMITIVE_TYPE emission We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:56 +02:00
Marek Olšák	3ee9be42ac	radeonsi: move VGT_LS_HS_CONFIG to derived tess_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:53 +02:00
Marek Olšák	f92113c5a1	radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFER Caches are always flushed at IB boundary. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:51 +02:00
Marek Olšák	ca1d1e0e19	radeonsi: parse SURFACE_SYNC correctly on CIK-VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:49 +02:00
Marek Olšák	37065b0583	gallium/radeon: inline r600_context_add_resource_size Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:47 +02:00
James Legg	e33f31d61f	radeonsi: Fix primitive restart when index changes If primitive restart is enabled for two consecutive draws which use different primitive restart indices, then the first draw's primitive restart index was incorrectly used for the second draw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 15:57:37 +02:00
Timothy Arceri	338d3c0b0f	spirv: replace assert() with unreachable() This fixes an uninitialized warning for is_vertex_input. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 22:33:51 +11:00
Timothy Arceri	298c2e03d7	intel: use the correct format specifier for printing uint64_t Fixes a bunch of warnings in 32-bit builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-04 22:32:57 +11:00
Matt Whitlock	42ed8a6c9c	gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:09:03 +02:00
Matt Whitlock	ac6064f918	st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:09:01 +02:00
Matt Whitlock	0c060f691c	st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:58 +02:00
Matt Whitlock	5d0069eca2	gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:55 +02:00
Matt Whitlock	c8fd7d060d	egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:50 +02:00
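The change made by the five dup(2) commits above can be sketched as follows. This is a minimal illustration, not the code from those patches: the helper name dup_cloexec and the minimum descriptor number 3 are assumptions chosen for the example. Unlike plain dup(2), fcntl(F_DUPFD_CLOEXEC) sets the close-on-exec flag on the new descriptor in the same call, so the copy is closed automatically when a child process calls exec.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes F_DUPFD_CLOEXEC on strict builds */
#endif
#include <fcntl.h>

/* Hypothetical helper (the name is illustrative, not taken from Mesa).
 * Duplicates fd onto the lowest free descriptor numbered 3 or higher and
 * sets FD_CLOEXEC on the copy atomically. Returns -1 on error, like dup(2). */
static int
dup_cloexec(int fd)
{
   return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}
```

A caller would use dup_cloexec(fd) wherever it previously called dup(fd); the returned descriptor refers to the same open file but is not inherited across exec.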
Tapani Pälli	387e0af0b4	intel: fix compilation warning on gen_get_device_info (warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-04 07:38:45 +03:00
Kenneth Graunke	9d6ca7c3d0	i965: Only emit 1 viewport when possible. In core profile, we support up to 16 viewports. However, in the majority of cases, only 1 of them is actually used - we only need the others if the last shader stage prior to the rasterizer writes gl_ViewportIndex. Processing all 16 viewports adds additional CPU overhead, which hurts CPU-intensive workloads such as Glamor. This meant that switching to core profile actually penalized Glamor to an extent, which is unfortunate. This patch tracks the number of relevant viewports, switching between 1 and ctx->Const.MaxViewports if gl_ViewportIndex is written. A new BRW_NEW_VIEWPORT_COUNT flag tracks this. This could mean re-emitting viewport state when switching, but hopefully this is offset by doing 1/16th of the work in the common case. The new flag is also lighter weight than BRW_NEW_VUE_MAP_GEOM_OUT, which we were using in one case. According to Eric Anholt, x11perf -copypixwin10 performance improves by 11.5094% +/- 3.10841% (n=10) on his Skylake. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-03 18:41:10 -07:00
Dave Airlie	7eb7684818	spirv: translate cull distance semantic. This just translates to the correct cull distance slot. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-04 10:16:23 +10:00
Dave Airlie	bd0157d542	compiler: add printable values for cull distance varyings. We need these for spir-v/nir shaders. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-04 10:15:23 +10:00
Jason Ekstrand	6ffbfc760d	nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks Previously, we were saving off the last nir_block in a vtn_block before moving on so that we could find the nir_block again when it came time to handle phi sources. Unfortunately, NIR's control flow modification code is inconsistent when it comes to how it splits blocks so the block pointer we saved off may point to a block somewhere else in the shader by the time we get around to handling phi sources. In order to get around this, we insert a nop instruction and use that as the logical end of our block. Since the control flow manipulation code respects instructions, the nop will keep its place like any other instruction and we can easily find the end of our block when we need it. This fixes a bug triggered by a couple of vkQuake shaders. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Tested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-03 16:17:12 -07:00
Jason Ekstrand	7697b4b98b	nir: Add a nop intrinsic This intrinsic has no destination, no sources, no variables, and can be eliminated. In other words, it does nothing and will always get deleted by dead code elimination. However, it does provide a quick-and-easy way to temporarily tag a particular location in a NIR shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-03 16:17:12 -07:00
Jason Ekstrand	0176c6a692	intel/isl: Allow non-2D HiZ surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	4e397c6c75	intel/isl: Add a detailed comment about multisampling with HiZ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	c3bd711411	intel/isl: Remove tiling checks from choose_msaa_layout We already do those checks in filter_tiling. There's no good reason to repeat them in choose_msaa_layout. If anything they should have been asserts and not "return false" checks. Also, this check was causing us to outright reject multisampled HiZ surfaces which wasn't intended. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	69d3bb9915	intel/isl: Handle HiZ and CCS tiling more directly The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces respectively. There's no reason why we should go through filter_tiling and it's much easier to always get HiZ and CCS right if we just handle them directly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	b1311a48e0	intel/isl: Allow multisampling with ISL_FORMAT_HiZ HiZ buffers can be multisampled and, on Broadwell and earlier, simply using interleaved multisampling with a compression block size of 8x4 samples yields the correct HiZ surface size calculations. Unfortunately, choose_msaa_layout was rejecting multisampled HiZ buffers because of format checks. Now that we have a simple helper for determining if a format supports multisampling, that's an easy enough issue to fix. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	baade41a5c	intel/isl: Allow creation of 1-D compressed textures Compressed 1-D textures are not a well-defined thing in either GL or Vulkan. However, auxiliary surfaces are treated as compressed textures in ISL and we can do HiZ and CCS with 1-D so we need to be able to create them. In order to prevent actually using them (the docs say no), we assert in the state setup code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	f82166578f	intel/isl: Fix up asserts in calc_phys_level0_extent_sa The assertion that a format is uncompressed in the multisample layouts isn't quite right. What we really want to assert is that the format supports multisampling which is a bit more complicated query. We also want to assert that it has a block size of 1x1 since we do nothing with the block size in the phys_level0_sa assignment. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	5637f3f120	intel/isl: Add a format_supports_multisampling helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Nayan Deshmukh	b7a0f2e1f7	vl/dri3: fix warning about incompatible pointer type Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-10-03 12:51:30 -04:00
Bruce Cherniak	903d00cd32	swr: Removed stalling SwrWaitForIdle from queries. Previous fundamental change in stats gathering added a temporary SwrWaitForIdle to begin_query and end_query. Code has been reworked to remove stall. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-10-03 09:57:45 -05:00
Tim Rowley	cdac042733	swr: [rasterizer core] refactor thread creation Create worker pool now computes number of worker threads based on things like topologies, etc. and creates the pool but doesn't actually launch the threads. Instead there is a separate start thread pool function. This allows thread resources to be constructed first before threads start. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:38 -05:00
Tim Rowley	114f7a92c6	swr: [rasterizer jitter] canonicalize blend compile state Canonicalize to prevent unnecessary JIT compiles. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:31 -05:00
Tim Rowley	4198520a82	swr: [rasterizer core] archrast fixes - Immediately sleep threads until thread data is initialized - Fix some compile bugs with AR enabled Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:25 -05:00
Tim Rowley	aaeb07989e	swr: [rasterizer jitter] fixes for icc in vs2015 compat mode - Move most jitter functionality into SwrJit namespace - Avoid global "using namespace llvm" in headers Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:19 -05:00
Tim Rowley	b8a6f06c85	swr: [rasterizer core] generalize compute dispatch mechanism Generalize compute dispatch mechanism to support other types of dispatches. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:13 -05:00
Tim Rowley	33a1a09eb0	swr: [rasterizer common] os.h portability header changes - Fix conflict between windows MemoryFence and llvm::sys::MemoryFence - Declare gettid() Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:56:47 -05:00
Ville Syrjälä	2fef0d108a	anv/formats: Fix build on gcc-4 and earlier gcc-4 and earlier don't allow compound literals where a constant is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE() when populating the anv_formats[] array. There are a few ways around it: First one would be -std=c89/gnu89, but the rest of the code depends on c99 so it's not really an option. The second option would be to upgrade to gcc-5+ where the compiler behaviour was relaxed a bit [1]. And the third option is just to avoid using compound literals. I chose the last option since it keeps gcc-4 and earlier working. [1] https://gcc.gnu.org/gcc-5/porting_to.html Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Topi Pohjolainen <topi.pohjolainen@intel.com> Fixes: `7ddb21708c` ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-03 15:45:28 +03:00
Tapani Pälli	4d6d55deef	egl: stop claiming support for pbuffer + msaa This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test and same crash in many dEQP EGL tests. I also found that some Qt example did a workaround because of this crash: https://bugreports.qt.io/browse/QTBUG-47509 v2: Ian pointed out that v1 removed support for all multisample configs, including window ones. This one removes pbuffer bit when adding configs, now only pbuffer+msaa gets rejected and window+msaa continues to work. Fixed also comment (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-03 07:56:44 +03:00
Timothy Arceri	eaf147cb46	i965: rename max_ds_* variable to max_tes_* Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-03 15:29:58 +11:00
Timothy Arceri	b67633ce5e	i965: rename max_hs_* variables to max_tcs_* Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-03 15:29:51 +11:00
Kenneth Graunke	da274ba5f8	i965: Drop pointless stage == MESA_SHADER_FRAGMENT checks. There's an assert right above this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-02 14:49:20 -07:00
Timothy Arceri	024c207319	glsl: add missing headers to blob.h Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-10-02 13:48:06 +11:00
Jason Ekstrand	ef3c5ac7fb	nir/spirv/cfg: Detect switch_break after loop_break/continue While the current CFG code is valid in the case where a switch break also happens to be a loop continue, it's a bit suboptimal. Since hardware is capable of handling the continue as a direct jump, it's better to use a continue instruction when we can than to bother with all of the nasty switch break lowering. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-01 15:40:34 -07:00
Jason Ekstrand	4d02faede5	nir/spirv/cfg: Handle switches whose break block is a loop continue It is possible that the break block of a switch is actually the continue of the loop containing the switch. In this case, we need to identify the break block as a continue and break out of current level of CFG handling. If we don't, the continue portion of the loop will get handled twice, once by following after the break and a second time by the loop handling code handling it explicitly. This fixes 6 of the new Vulkan CTS tests: - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order* - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-01 15:40:14 -07:00
Eric Engestrom	fc03ecfeaf	nir/spirv: add spirv2nir binary to .gitignore Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:48 -07:00
Eric Engestrom	c867938044	nir/spirv: improve mmap() error handling Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:46 -07:00
Eric Engestrom	65c8cbe89d	nir/spirv: improve lseek() error handling Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:44 -07:00
Eric Engestrom	23519a9de2	nir/spirv: add some error checking to open() CovID: 1373369 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:31 -07:00
Timothy Arceri	913e0296f2	mesa: use uint32_t rather than unsigned for xfb struct members These structs will be written to disk as part of the shader cache so use uint32_t just to be safe. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-10-01 11:26:25 +10:00
Timothy Arceri	7064f8674a	i915/i965: remove commented out warning The warning was also in the wrong location, it should have been in the else. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-01 09:24:33 +10:00
Brian Paul	951bf44a56	mesa: move _mesa_valid_to_render() to api_validate.c Almost all of the other drawing validation code is in api_validate.c so put this function there as well. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-30 16:28:00 -06:00
Steven Toth	e99b9395be	gallium/hud: Add support for CPU frequency monitoring Detect all of the CPUs in the system. Expose metrics for min, max and current frequency in Hz. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-30 15:18:46 -06:00
Marek Olšák	7b87190d2b	Revert "gallium/hud: automatically print % if max_value == 100" This reverts commit `dbfeb0ec12`. With max_value being rounded to 100, it's often wrong. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-30 22:07:12 +02:00
Brian Paul	1d07552ba5	docs: update the list of Mesa major versions and API support Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-30 09:17:33 -06:00
Nicolai Hähnle	7bac5bf032	gallium/radeon: fix crash/regression in performance counters Regression introduced by "gallium/radeon: zero all query buffers". Cc: Michel Dänzer <michel@daenzer.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-30 12:41:45 +02:00
Nicolai Hähnle	cfd870de70	gallium/radeon: update documentation of buffer_get_virtual_address Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-30 12:41:41 +02:00
Nicolai Hähnle	fd9f54223d	gallium/radeon: emit relocations for query fences This is only needed for r600 which doesn't have ARB_query_buffer_object and therefore wouldn't really need the fences, but let's be optimistic about filling in this feature gap eventually. Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-30 12:38:57 +02:00
Nicolai Hähnle	3e7cced4b9	radeon/uvd: adjust the buffer offset when relocation is used We don't plan to use sub-allocated buffers with UVD, but just in case one slips through, this increases the chances of things working out anyway. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-30 12:38:52 +02:00
Nicolai Hähnle	a48bf02d05	radeon/vce: adjust the buffer offset when relocation is used We don't plan to use sub-allocated buffers with VCE, but just in case one slips through, this increases the chances of things working out anyway. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-30 12:38:48 +02:00
Nicolai Hähnle	13cb41f666	radeon/video: don't use sub-allocated buffers Cc: Christian König <christian.koenig@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969 Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-30 12:38:29 +02:00
Steven Toth	1d466b9b04	gallium/hud: Add power sensor support Implement support for power based sensors, reporting units in milli-watts and watts. Also, minor cleanup - change the related if block to a switch. Tested with two different power sensors, including the nouveau 'power1' sensors on a GTX950 card. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-29 17:51:15 -06:00
Samuel Pitoiset	3abe68b828	nv50/ir: teach insnCanLoad() about SHLADD Commutativity is not allowed with SHLADD, but src2 can accept loads. To allow the load propagation pass to do its job, add a special case like for SUCLAMP because src1 is always an immediate. This IMAD to SHLADD optimization helps a bunch of shaders from Tomb Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow Warrior. GF100/GK104: total instructions in shared programs :2838045 -> 2834712 (-0.12%) total gprs used in shared programs :396684 -> 396386 (-0.08%) total local used in shared programs :34416 -> 34416 (0.00%) local gpr inst bytes helped 0 326 1105 1105 hurt 0 55 3 3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:50 +02:00
Samuel Pitoiset	115c79be10	nv50/ir: optimize SHLADD(a, b, c) to MOV((a << b) + c) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:47 +02:00
Samuel Pitoiset	2e008be9a9	nv50/ir: optimize SHLADD(a, b, 0x0) to SHL(a, b) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:44 +02:00
Samuel Pitoiset	e4eb0fca02	nv50/ir: optimize IMAD to SHLADD in presence of power of 2 We can replace IMAD by SHLADD if and only if src1 is a power of 2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:41 +02:00
Samuel Pitoiset	31545b64b8	nvc0/ir: add emission for SHLADD Unfortunately, we can't use the emit helpers for GF100/GK110 because src1 and src2 are swapped. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:36 +02:00
Samuel Pitoiset	85132c7453	nv50/ir: add preliminary support for SHLADD This instruction has been available since SM20 (Fermi) and allows computing (a << b) + c in one shot. In some situations, IMAD should be replaced by SHLADD when b is a power of 2, and ADD+SHL should be replaced by SHLADD as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:30 +02:00
Samuel Pitoiset	652874754a	nvc0: update GM107 sched control codes format envyas now uses a much better representation for those control codes and it displays the different flags instead of an unreadable hex number. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 20:13:05 +02:00
Nicolai Hähnle	e4b585f009	gallium/radeon: use smaller buffers for query results Most of the time, even the 512 bytes that we now get is more than sufficient (pipeline stats queries are the largest at 184 bytes per shot). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:24:56 +02:00
Nicolai Hähnle	de84e99e45	gallium/radeon/winsyses: add radeon_winsys::min_alloc_size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:24:52 +02:00
Nicolai Hähnle	7a0e543836	radeonsi: enable ARB_query_buffer_object (v2) v2: enable only when compute is available Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:15:00 +02:00
Nicolai Hähnle	15e2661137	gallium/radeon: implement get_query_result_resource (v2) v2: fix a comment (Gustaw Smolarczyk) Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:54 +02:00
Nicolai Hähnle	2c9d546402	gallium/radeon: zero all query buffers To ensure that fences are properly initialized. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:51 +02:00
Nicolai Hähnle	daeab0171d	gallium/radeon: cleanup getting PIPE_QUERY_TIMESTAMP result Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:45 +02:00
Nicolai Hähnle	631c47384c	gallium/radeon: add query fences and r600_get_hw_query_params We will support the waiting option in ARB_query_buffer_object using WAIT_REG_MEM on an appropriate fence-like dword. Some queries conveniently write their results with the highest bit set, and we can just use that; for others, we have to write a fence explicitly. ZPASS_DONE for occlusion queries writes its results with the high bit set, but it writes up to 8 pairs of results (one for each DB). We have to wait for all of these results, so let's just add an explicit fence. The new function provides summary information to be used by subsequent patches. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:41 +02:00
Nicolai Hähnle	51b57a9b5a	radeonsi: add save_qbo_state Save compute shader state that will be used for the ARB_query_buffer_object implementation. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:37 +02:00
Nicolai Hähnle	70f9ca2468	radeonsi: add si_get_shader_buffers/get_pipe_constant_buffers (v2) These functions extract the pipe state structure from the current descriptors, for state saving. v2: correctly dereference *buf (Bas) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:33 +02:00
Nicolai Hähnle	8d45243e40	gallium/radeon: add r600_gfx_{write,wait}_fence For bottom-of-pipe fences inside the gfx command stream. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:29 +02:00
Nicolai Hähnle	8e4de00930	gallium/radeon: add barrier_flags to r600_common_screen There are driver-specific context flags for barriers that are not covered by the Gallium barrier interfaces. The R600 settings of these flags may not be optimal, but we're not going to use them yet anyway. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:11 +02:00
Timothy Arceri	577e06095b	glsl: remove remaining tabs from ast_type.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	222f66a812	glsl: remove remaining tabs from ast_to_hir.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	fc1d200bc7	glsl: remove remaining tabs from ast_array_index.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	b193c4d75b	glsl: remove tabs from ast_expr.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	386045a3df	glsl: remove tabs from linker.{cpp,h} Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Steven Toth	8c60bcb4c3	gallium/hud: Add support for block I/O, network I/O and lmsensor stats V8: Feedback based on peer review convert if block into a switch Constify some func args V7: Increase precision when measuring lmsensors volts Flatten patch series. V6: Feedback based on peer review Simplify sensor initialization (arg passing). Constify some func args V5: Feedback based on peer review Convert sprintf to snprintf Convert char * to const char * int arg converted to bool Func changes to take a filename vs a larger struct. Omit the space between '*' and the param name. V4: Merged with master as of 2016/9/27 6pm V3: Flatten the entire patchset ready for the ML V2: Additional separate patches based on feedback a) configure.ac: Add a comment related to libsensors b) HUD: Disable Block/NIC I/O stats by default. Implement configuration option --enable-gallium-extra-hud=yes and enable both statistics when this option is enabled. c) Configure.ac: Minor cleanup to user visible configuration settings d) Configure.ac: HUD stats - build system improvements Move the -lsensors out of a deeper Makefile, bring it into the configure.ac. Also, rename a compiler directive to more closely follow the standard. V1: Initial release to the ML Three new features: 1. Disk/block I/O device read/write stats MB/ps. 2. Network Interface RX/TX transfer statistics as a percentage of the overall NIC speed. 3. lmsensor power, voltage and temperature sensors. The lmsensor changes add a dependency on libsensors so support for the change is opt out by default. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-28 16:18:05 -06:00
Ben Widawsky	29783c0887	i965: Remove useless (harmful) assertion The code already skips doing the depth stall on gen >= 8, and as we enable new platforms this assertion will fail needlessly. Instead of changing the caller, make this simple change. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-28 09:42:53 -07:00
Eric Anholt	2a721b1b79	vc4: Emit perf debug when we fall back to quad clears.	2016-09-28 08:31:14 -07:00
Eric Anholt	1aa8a0392f	nir: Optimize out discard_ifs with a constant 0 argument. I found this in a shader that was doing an alpha test when alpha is fixed at 1.0. v2: Rebase on master (now the const value is "u32" not "u"). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-09-28 08:31:14 -07:00
Michel Dänzer	8d8c440ebf	gallium/radeon: Initialize pipe_resource::next to NULL Fixes lots of piglit tests crashing due to using uninitialized memory. Fixes: `ecd6fce261` ("mesa/st: support lowering multi-planar YUV") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-28 10:39:22 +09:00
Timothy Arceri	3eb0baeecf	glsl: don't crash when dumping shaders if some come from cache Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-28 10:43:15 +10:00
Timothy Arceri	87ab26b2ab	glsl: Add initial functions to implement an on-disk cache This code provides for an on-disk cache of objects. Objects are stored and retrieved via names that are arbitrary 20-byte sequences, (intended to be SHA-1 hashes of something identifying for the content). The directory used for the cache can be specified by means of environment variables in the following priority order: $MESA_GLSL_CACHE_DIR $XDG_CACHE_HOME/mesa <user-home-directory>/.cache/mesa By default the cache will be limited to a maximum size of 1GB. The environment variable: $MESA_GLSL_CACHE_MAX_SIZE can be set (at the time of GL context creation) to choose some other size. This variable is a number that can optionally be followed by 'K', 'M', or 'G' to select a size in kilobytes, megabytes, or gigabytes. By default, an unadorned value will be interpreted as gigabytes. The cache will be entirely disabled at runtime if the variable MESA_GLSL_CACHE_DISABLE is set at the time of GL context creation. Many thanks to Kristian Høgsberg <krh@bitplanet.net> for the initial implementation of code that led to this patch. In particular, the idea of using an mmapped file, (indexed by a portion of the SHA-1), for the efficient implementation of cache_has_key was entirely his idea. Kristian also provided some very helpful advice in discussions regarding various race conditions to be avoided in this code. Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-28 09:16:31 +10:00
Chad Versace	44bcf1ffcc	.gitignore: Ignore src/compiler/spirv2nir	2016-09-27 13:22:44 -07:00
Ian Romanick	ea6ed2379d	glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept At this point in the code, s must be visit_continue. If the child returned visit_stop, visit_stop is the only correct thing to return. Found by inspection. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	7f64041cee	glsl: Add bit_xor builder Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	5f7f7d582b	glsl/standalone: Enable GLSL 4.00 through 4.50 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	798d1b8816	glsl/standalone: Use API_OPENGL_CORE if the GLSL version is >= 1.40 Otherwise extensions to 1.40 that are only for core profile won't work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	afd99734db	glsl: Update function parameter documentation for do_common_optimization max_unroll_iterations was moved into options a long, long time ago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Tim Rowley	bacdd9ef4c	configure.ac: add llvm inteljitevents component if enabled Needed to successfully link llvmpipe or swr when using shared llvm libs built with inteljitevents enabled. v2: Make adding inteljitevents component global rather than just llvmpipe/swr, since libgallium will have a symbol dependency. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 12:56:47 -05:00
Tim Rowley	50842e8a93	swr: replace gallium->swr format enum conversion Replace old string comparison with a mapping table. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-09-27 12:55:26 -05:00
Nicolai Hähnle	4421c0fb0d	gallium/radeon/winsyses: reduce the number of pb_cache buckets Small buffers are now handled via the slabs code, so separate buckets in pb_cache have become redundant. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:41 +02:00
Nicolai Hähnle	fb827c055c	winsys/radeon: enable buffer allocation from slabs Only enable for chips with GPUVM, because older driver paths do not take the required offset into account. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:37 +02:00
Nicolai Hähnle	a1e391e39d	winsys/radeon: add fine-grained fences for slab buffers Note the logic for adding fences is somewhat different than for amdgpu, because radeon has no scheduler and we therefore have no guarantee about the order in which submissions from multiple threads are processed. (Ironically, this is only an issue when "multi-threaded submission" is disabled, because "multi-threaded submission" actually means that all submissions happen from a single thread that happens to be separate from the application's threads. If we only supported "multi-threaded submission", the fence handling could be simplified by adding the fences in that thread where everything is serialized.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:34 +02:00
Nicolai Hähnle	0edebde9a4	winsys/radeon: add slab buffer list Introducing radeon_bo::hash will reduce collisions between "real" buffers and buffers from slabs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:32 +02:00
Nicolai Hähnle	cbb9c2f170	winsys/radeon: separate adding a buffer from updating its reloc data Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:29 +02:00
Nicolai Hähnle	a9e8672585	winsys/radeon: add slab entry structures to radeon_bo Already adjust the map/unmap logic accordingly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:25 +02:00
Nicolai Hähnle	ffa1c669dd	winsys/amdgpu: enable buffer allocation from slabs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:23 +02:00
Nicolai Hähnle	a3832590c6	winsys/amdgpu: add fence and buffer list logic for slab allocated buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:20 +02:00
Nicolai Hähnle	a987e4377a	winsys/amdgpu: add slab entry structures to amdgpu_winsys_bo Already adjust amdgpu_bo_map/unmap accordingly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:15 +02:00
Nicolai Hähnle	5af9eef719	winsys/amdgpu: do not synchronize unsynchronized buffers When a buffer is added to a CS without the SYNCHRONIZED usage flag, we now no longer add a dependency on the buffer's fence(s). However, we still need to add a fence to the buffer during flush, so that cache reclaim works correctly (and in the hypothetical case that the buffer is later added to a CS _with_ the SYNCHRONIZED flag). It is now possible that the submissions referring to a buffer are no longer linearly ordered, and so we may have to keep multiple fences around. We keep the fences in a FIFO. It should usually stay quite short (# of contexts * 2, for gfx + dma rings). While we're at it, extract amdgpu_add_fence_dependency for a single buffer, which will make adding the distinction between real buffer and slab cases easier. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:11 +02:00
Nicolai Hähnle	6d89a40676	gallium/radeon: add RADEON_FLAG_HANDLE When passed to winsys->buffer_create, this flag will indicate that we require a buffer that maps 1:1 with a kernel buffer handle. This is currently set for all textures, since textures can potentially be exported to other processes. This is not a huge loss, since the main purpose of this patch series is to deal with applications that allocate many small buffers. A hypothetical application with tons of tiny textures might still benefit from not setting this flag, but that's not a use case I'm worried about just now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:05 +02:00
Nicolai Hähnle	e703f71ebd	gallium/radeon: add RADEON_USAGE_SYNCHRONIZED This is really the behavior we want most of the time, but having a SYNCHRONIZED flag instead of an UNSYNCHRONIZED one has the advantage that OR'ing different flags together always results in stronger guarantees. The parent BOs of sub-allocated buffers will be added unsynchronized. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:02 +02:00
Nicolai Hähnle	84f156c0cb	gallium/pipebuffer: add pb_slab utility This is a simple framework for slab allocation from buffers that fits into the buffer management scheme of the radeon and amdgpu winsyses where bufmgrs aren't used. The utility knows about different sized allocations and explicitly manages reclaim of allocations that have pending fences. It manages all the free lists but does not actually touch buffer objects directly, relying on callbacks for that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:44:42 +02:00
Nicolai Hähnle	b3ebc229dc	gallium/u_math: add util_logbase2_ceil For finding the exponent of the next power of two. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:44:38 +02:00
Nicholas Bishop	c060f291c2	i915g: add dma-buf support to i915_drm_buffer_get_handle The implementation of i915_drm_buffer_get_handle now handles DRM_API_HANDLE_TYPE_FD in the same way that intel_winsys_import_handle does, by calling drm_intel_bo_gem_create_from_prime. Tested by successfully running Chrome's ozone_demo [1] with the ozone-gbm backend on an Intel Pineview M machine. Without this change it fails while trying to create a DMA-BUF. [1] https://chromium.googlesource.com/chromium/src.git/+/master/ui/ozone/demo/ozone_demo.cc Signed-off-by: Nicholas Bishop <nbishop@neverware.com> [Emil Velikov: Fix coding style] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 13:37:21 +01:00
Nicholas Bishop	aa560e8e63	st/dri: check pipe_screen->resource_get_handle() return value Change dri2_query_image to check the return value of resource_get_handle and return GL_FALSE if an error occurs. For reference this is an example callstack that should propagate the error back to the user: i915_drm_buffer_get_handle i915_texture_get_handle u_resource_get_handle_vtbl dri2_query_image gbm_dri_bo_get_fd gbm_bo_get_fd Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nicholas Bishop <nbishop@neverware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) [Emil Velikov: Split from larger patch, polish coding style, cc stable] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 13:37:21 +01:00
Nicholas Bishop	2d05ba2ca0	gbm: return appropriate error when queryImage() fails Change gbm_dri_bo_get_fd to check the return value of queryImage and return -1 (an invalid file descriptor) if an error occurs. Update the comment for gbm_bo_get_fd to return -1, since (apart from the above) we already return -1 on error. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nicholas Bishop <nbishop@neverware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) [Emil Velikov: Split from larger patch, polish coding style, cc stable] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 13:37:21 +01:00
Andy Furniss	a599302227	st/va Avoid VBR bitrate calculation overflow v2 VBR bitrate calc needs 64 bits at high rates. v2: use float. Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-27 14:21:45 +02:00
Mark Thompson	a543f231d7	st/va: Fix vaSyncSurface with no outstanding operation Fixes crash if the application doesn't do what the state tracker expects. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-27 14:21:44 +02:00
Timothy Arceri	df920367bf	glsl: remove remaining tabs in glsl_parser_extras.h Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-27 20:32:47 +10:00
Ilia Mirkin	477cc0e085	st/mesa: enable ARB_ES3_2_compatibility when enough available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 00:20:44 -04:00
Ilia Mirkin	67fbaa5873	st/mesa: enable GL_ANDROID_extension_pack_es31a when available For now that's never since advanced blend hasn't been piped through. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 00:20:41 -04:00
Timothy Arceri	63e8221574	glsl: move some uniform linking code to new link_assign_uniform_storage() This makes link_assign_uniform_locations() easier to follow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:29:05 +10:00
Timothy Arceri	ab67b6afdf	glsl: move some uniform linking code to new link_setup_uniform_remap_tables() This makes link_assign_uniform_locations() easier to follow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:29:05 +10:00
Timothy Arceri	856e0bd707	i965: create populate key functions for tcs and tes These will be used by the on disk shader cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:11:15 +10:00
Timothy Arceri	ec75570415	i965: make gs key generation helper available to shader cache Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:11:15 +10:00
Timothy Arceri	481d8ec291	glsl: use reproducible name for lowered const arrays Otherwise we can end up with mismatching names between the cached binary and the cached metadata. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:11:15 +10:00
Carl Worth	017081a3e5	i965: make vs and fs key generation helpers available to shader cache Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>	2016-09-27 11:11:15 +10:00
Carl Worth	f61669f997	glsl: Prepare standalone compiler to be able to use parameter lists As part of the shader-cache work an upcoming change will add new references to _mesa_add_parameter and _mesa_new_parameter_list from the glsl code. To prepare for that, and to allow the standalone glsl_compiler to still link, here we add mesa/program/prog_parameter.c to the libglsl_util sources. Then, in order to get that to work, we also add two stubs to standalone_scaffolding: _mesa_program_state_flags _mesa_program_state_string These functions aren't actually used by the two functions in prog_parameter.c that we are actually calling. They are used in other functions in the same file. So we don't care what the implementation of these stubs is, (they won't be called by glsl_compiler). We just need the stubs present so that it can link. Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-27 11:11:15 +10:00
Samuel Pitoiset	f24b517858	nv50/ir: fix comments about instructions info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-26 21:59:37 +02:00
Rob Clark	ecd6fce261	mesa/st: support lowering multi-planar YUV Support multi-planar YUV for external EGLImage's (currently just in the dma-buf import path) by lowering to multiple texture fetch's for each plane and CSC in shader. There was some discussion of alternative approaches for tracking the additional UV or U/V planes: https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html They all seemed worse than pipe_resource::next Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Rob Clark	e0ec1c3134	mesa/st: add nir pass to lower tex_src_plane Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Rob Clark	c2a60cacd4	mesa/st: add lowering pass for YUV samplers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Sirisha Gandikota	8e3e9d74b5	aubinator: Fix the decoding of values that span two Dwords Fixed the way the values that span two Dwords are decoded. Based on the start and end indices of the field, the Dwords are fetched and decoded accordingly. v2: rename dw to qw in gen_field_iterator_next and remove extra white space (Anuj) v3: change all instances of dw to qw (Anuj) Earlier, 64-bit fields (such as most pointers on Gen8+) weren't decoded correctly. gen_field_iterator_next seemed to walk one DWord at a time, sets v.dw, and then passes it to field(). So, even though field() takes a uint64_t, we're passing it a uint32_t (which gets promoted, so the top 32 bits will always be zero). This seems pretty bogus... (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-26 11:18:52 -07:00
Samuel Pitoiset	ac859d68f4	nvc0: allow to force compiling programs in debug build This adds a new envvar called NV50_PROG_CHIPSET which allows to compile shaders with a different target, especially useful for shader-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-26 19:39:04 +02:00
Samuel Pitoiset	e05042b367	nv50/ir: drop unused NVISA_XXX_CHIPSET constants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-26 19:39:04 +02:00
Samuel Pitoiset	be0535b8c7	gallium/util: make use of strtol() in debug_get_num_option() This allows to use hexadecimal numbers which are automatically detected by strtol() when the base is 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-09-26 19:39:04 +02:00
Glenn Kennard	5da24242b3	r600g: Add support for PK2H/UP2H Based off of Ilia's original patch, but with output values replicated so that it matches the TGSI semantics. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-09-26 17:08:49 +02:00
Timothy Arceri	eb2dc04127	i965: stop passing stage as a function parameter We already pass the shader so we can just get the stage from this. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-09-26 09:59:24 +10:00
Nayan Deshmukh	b3827819aa	aubinator: fix resource leak CovID: 1373370 Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-25 12:32:48 -07:00
Emilio Cobos Álvarez	cb7c2c9d65	osmesa: Unbind the current context when given a null context and buffer. This is needed to be consistent with other drivers. Signed-off-by: Emilio Cobos Álvarez <me@emiliocobos.me> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-23 19:55:50 -06:00
Brian Paul	07d1f8faf9	st/mesa: small optimization in swizzle_swizzle() Usually, there's no user-specified texture swizzle so we can optimize the swizzle_swizzle() function and skip the loop/switch. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	1cdc232e1a	st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj() Some demos, like Heaven, were creating and destroying a large number of sampler views because of a swizzle issue. Basically, we compute the sampler view's swizzle by examining the texture format, user swizzle, depth mode, etc. Later, during validation we recompute that swizzle (in case something like depth mode changes) and see if it matches the view's swizzle. In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle returned SWIZZLE_XYZW but the u_sampler_view_default_template() function was setting the sampler view's swizzle to SWIZZLE_XY01. This mismatch caused the validation step to always "fail" so we'd destroy the old sampler view and create a new one. By removing the conditional, the sampler view's swizzle and the computed texture swizzle match and validation "passes". When creating a new sampler view, we always want to use the texture swizzle which we just computed. Fixes VMware issue 1733389. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	c0d7b6073d	svga: set PIPE_BIND_DEPTH_STENCIL flag for new resources when possible When we create a depth/stencil texture, also check if we can render to it and set the PIPE_BIND_DEPTH_STENCIL flag. We were previously doing this for color textures (PIPE_BIND_RENDER_TARGET). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	f942a70340	svga: don't special case caps for SVGA3D_R32_FLOAT This may have been needed years ago during development, but not now. Prevents some regressions after introducing the next patch. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	14639cdf8f	svga: use new adjust_z_layer() helper in svga_pipe_blit.c To handle z/layer fix-ups for blitting and copying. Note that we weren't doing this properly in svga_blit() before. Also, remove redundant stex, dtex assignments. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	c42000545d	svga: simplify/improve the format compatibility check for region copies The util_is_format_compatible() function didn't quite do what we wanted for vgpu10. This check is more flexible and allows copies between formats such as R32G32B32A32_FLOAT and R32G32B32A32_INT. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	2ad4ba0727	svga: add const qualifier on svga_translate_format() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	4d04696524	svga: eliminate unneeded gotos in svga_validate_surface_view() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Neha Bhende	47f16f5e7f	svga: disable srgb format related code from svga_blit() With latest mesa and latest piglit tests srgb<->linear conversion is not required as per GL4.4 rules See commit `b662c70aea`. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:53:51 -06:00
Timothy Arceri	29c174a3e5	Revert "glsl: move xfb BufferStride into gl_transform_feedback_info" This reverts commit `f5a6aab403`. This broke some tests. It seems gl_transform_feedback_info gets memset to 0 so we were losing the values in BufferStride before we used them.	2016-09-24 10:17:26 +10:00
Kenneth Graunke	943b69cddd	glsl: Delete linker stuff relating to built-in functions. Now that we generate built-in functions inline, there's no need to link against the built-in shader, and no built-in prototypes to consider. This lets us delete a bunch of code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Kenneth Graunke	f7a5c714b3	glsl: Delete ftransform support from builtin_functions.cpp. This is now handled directly by ast_function.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Kenneth Graunke	b04ef3c08a	glsl: Immediately inline built-ins rather than generating calls. In the past, we imported the prototypes of built-in functions, generated calls to those, and waited until link time to resolve the calls and import the actual code for the built-in functions. This severely limited our compile-time optimization opportunities: even trivial functions like dot() were represented as function calls. We also had no way of reasoning about those calls; they could have been 1,000 line functions with side-effects for all we knew. Practically all built-in functions are trivial translations to ir_expression opcodes, so it makes sense to just generate those inline. Since we eventually inline all functions anyway, we may as well just do it for all built-in functions. There's only one snag: built-in functions that refer to built-in global variables need those remapped to the variables in the shader being compiled, rather than the ones in the built-in shader. Currently, ftransform() is the only function matching those criteria, so it seemed easier to just make it a special case. On Skylake: total instructions in shared programs: 12023491 -> 12024010 (0.00%) instructions in affected programs: 77595 -> 78114 (0.67%) helped: 97 HURT: 309 total cycles in shared programs: 137239044 -> 137295498 (0.04%) cycles in affected programs: 16714026 -> 16770480 (0.34%) helped: 4663 HURT: 4923 While these statistics are in the wrong direction, the number of hurt programs is small (309 / 41282 = 0.75%), and I don't think anything can be done about it. A change like this significantly alters the order in which optimizations are performed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Kenneth Graunke	1617f59bc6	glsl: Check TCS barrier restrictions at ast_to_hir time, not link time. We want to check prior to optimization - otherwise we might fail to detect cases where barrier() is in control flow which is always taken (and therefore gets optimized away). We don't currently loop unroll if there are function calls inside; otherwise we might have a problem detecting barrier() in loops that get unrolled as well. Tapani's switch handling code adds a loop around switch statements, so even with the mess of if ladders, we'll properly reject it. Enforcing these rules at compile time makes more sense than link time. Doing it at ast-to-hir time (rather than as an IR pass) allows us to emit an error message with proper line numbers. (Otherwise, I would have preferred the IR pass...) Fixes spec/arb_tessellation_shader/compiler/barrier-switch-always.tesc. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Timothy Arceri	f5a6aab403	glsl: move xfb BufferStride into gl_transform_feedback_info It makes more sense to have this here where we store the other values from xfb qualifiers. The struct it was previously part of is now only used to store values that come from the api. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-09-24 09:18:29 +10:00
Dylan Baker	85e9bbc14d	Revert "mapi: export all GLES 3.2 functions in libGLESv2.so" This reverts commit `e66a2b879b`. Which breaks the scons build in an interesting way, particularly when BlendBarrier and PrimitiveBoundingBox are added to static_data.py's functions list. This seems to be related to the fact that the unsuffixed names are only in GLES3.2, but Desktop GL only has suffixed versions. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2016-09-23 12:13:13 -07:00
Adam Jackson	8ce2afe776	i965: Enable EGL_KHR_gl_texture_3D_image Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-23 06:53:21 -04:00
Adam Jackson	5981366b9f	i915: Enable EGL_KHR_gl_texture_3D_image Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-23 06:53:17 -04:00
Nicolas Koch	f17948a30a	anv: Check for VK_WHOLE_SIZE in anv_CmdFillBuffer From the Vulkan spec: Size is the number of bytes to fill, and must be either a multiple of 4, or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer. If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a multiple of 4, then the nearest smaller multiple is used. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-23 00:20:16 -07:00
Lionel Landwerlin	6b21728c4a	anv: get rid of duplicated values from gen_device_info Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-23 10:12:06 +03:00
Lionel Landwerlin	94d0e7dc08	i965: get rid of duplicated values from gen_device_info Now that we have gen_device_info mutable, we can update its values and drop all copies we had in brw_context. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-23 10:12:06 +03:00
Lionel Landwerlin	bc24590f0c	intel/i965: make gen_device_info mutable Make gen_device_info a mutable structure so we can update the fields that can be refined by querying the kernel (like subslices and EU numbers). This patch does not make any functional change, it just makes gen_get_device_info() fill a structure rather than returning a const pointer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-23 10:11:59 +03:00
Timothy Arceri	e60928f4c4	gallium: remove unused PIPE_CC_GCC_VERSION Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-23 16:18:21 +10:00
Timothy Arceri	4eb0e90c6b	util: remove Sun C Compiler support Support for this compiler was dropped in `51564f04b7` Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-23 16:17:16 +10:00
Ilia Mirkin	c0a7e931e3	st/mesa: turn on OES_viewport_array when dependencies are met Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	0f01aa8033	mesa: add implementations for new float depth functions This just up-converts them to doubles. Not great, but this is what all the other variants also do. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	381b15dc20	mesa: move ARB_viewport_array params to a GLES 3.1-accessible section This is needed for GL_OES_viewport_array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	5644a90801	mesa: add GL_OES_viewport_array to the extension string The expectation is that drivers will set this based on OES_geometry_shader and ARB_viewport_array support. This is a separate enable on the same reasoning as for OES_texture_cube_map_array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	70aef97f9e	glsl: add OES_viewport_array enables and use them to expose gl_ViewportIndex Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	411a72d4a2	mesa: add new entrypoints for GL_OES_viewport_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Dylan Baker	e66a2b879b	mapi: export all GLES 3.2 functions in libGLESv2.so See commit `5921f372c8` for the rationale of this commit. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-22 16:01:40 -07:00
Dylan Baker	ce83e36ec0	mapi: sort static_data.py functions Sorted by vim's builtin "sort i" (keeping the sorting case insensitive) v2: - uses case insensitive sorting (Ken) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-22 15:29:27 -07:00
Dylan Baker	2fd51cf8ca	mapi: retab static_data.py to be consistent This file currently uses a mixture of 3 and 4 space indent. I have changed it all to 4 space indent, matching the settings in $ROOT/.editorconfig. This was generated with sed: sed -i -e 's@^ "@ "@g' Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-22 15:28:44 -07:00
Lionel Landwerlin	9adfa695ac	spirv: fix AtomicLoad/Store on images OpAtomicLoad/Store should have pointer to images just like the rest of the atomic operators. These couple of lines were poorly copied from the ssbo/shared_vars cases (the only ones currently tested by the CTS). Fixes `2afb950161` ("spirv/nir: Add support for OpAtomicLoad/Store") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-22 14:08:21 +03:00
Eric Anholt	36f0f03182	nir: Allow opt_peephole_sel to be more aggressive in flattening IFs. VC4 was running into a major performance regression from enabling control flow in the glmark2 conditionals test, because of short if statements containing an ffract. This pass seems like it was trying to ensure that we only flattened IFs that should be entirely a win by guaranteeing that there would be fewer bcsels than there were MOVs otherwise. However, if the number of ALU ops is small, we can avoid the overhead of branching (which itself costs cycles) and still get a win, even if it means moving real instructions out of the THEN/ELSE blocks. For now, just turn on aggressive flattening on vc4. i965 will need some tuning to avoid regressions. It does look like this may be useful to replace freedreno code. Improves glmark2 -b conditionals:fragment-steps=5:vertex-steps=0 from 47 fps to 95 fps on vc4. vc4 shader-db: total instructions in shared programs: 101282 -> 99543 (-1.72%) instructions in affected programs: 17365 -> 15626 (-10.01%) total uniforms in shared programs: 31295 -> 31172 (-0.39%) uniforms in affected programs: 3580 -> 3457 (-3.44%) total estimated cycles in shared programs: 225182 -> 223746 (-0.64%) estimated cycles in affected programs: 26085 -> 24649 (-5.51%) v2: Update shader-db output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2016-09-22 11:10:21 +03:00
Kenneth Graunke	6c648cdac8	docs: Mark ES 3.2 "all done" for i965/gen9+.	2016-09-21 11:52:59 -07:00
Kenneth Graunke	a4fbc73ee8	docs: Add ES 3.2 to release notes.	2016-09-21 11:52:59 -07:00
Brian Paul	b35684543e	gallium/util: add comment on util_is_format_compatible() From reading the code, it's not obvious what is src/dest compatible. The list of a->b copy-compatible formats comes from Jose's original check-in message, with some format name updates. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-21 12:26:17 -06:00
Brian Paul	99d9f764b2	svga: minor simplification in svga_validate_surface_view() Get rid of unneeded local var. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-21 12:23:45 -06:00
Brian Paul	1cc7a76d73	svga: remove disable_shader debug variable Never used, AFAIK. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-21 12:23:45 -06:00
Kenneth Graunke	a53da57d5a	i965: Enable ES 3.2 on Skylake. It's already advertised because the version.c extension checks are fulfilled, but we didn't actually claim support, so trying to create a ES 3.2 context would fail. It's all done, and the CTS results look good, so let's turn it on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-09-21 10:51:58 -07:00
Jason Ekstrand	d2f42a945e	nir/spirv/glsl450: Add support for the InterpolateAt opcodes Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-21 05:39:06 -07:00
Jason Ekstrand	a529644889	nir/spirv: Claim support for SampleRateShading We already support all of the decorations that require this capability. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-21 05:39:06 -07:00
Jason Ekstrand	7c48622581	nir/spirv: Bring back the spirv2nir helper binary This was something that I wrote in the early days of the spirv_to_nir code but deleted once we had a real driver. However, in the absence of a shader_runner equivalent, it's extremely useful for debugging the spirv_to_nir code so let's bring it back. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 05:38:26 -07:00
Chuanbo Weng	e4648ba8dd	i965: implement querying __DRI_IMAGE_ATTRIB_OFFSET. Implement querying this attribute in intelImageExtension and bump version of intelImageExtension. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 12:19:19 +01:00
Chuanbo Weng	9e8de866f7	egl: return corresponding offset of EGLImage instead of 0. The offset should not always be 0. For example, if EGLImage is created from a 2D texture with EGL_GL_TEXTURE_LEVEL=1, then the offset should be the actual start of miplevel 1 in bo. v2: Add version check of __DRIimageExtension implementation (Suggested by Axel Davy). v3: Don't add version check of __DRIimageExtension implementation. Set the offset only when queryImage() succeeds. (Suggested by Emil Velikov) Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [Emil Velikov: coding style fixes] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-21 12:19:19 +01:00
Chuanbo Weng	1ceb775d57	dri: add offset attribute and bump version of EGLImage extensions. Offset is useful for buffer sharing with other components, so add it to queryImage attributes. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 12:19:19 +01:00
Francisco Jerez	e5311ba1ac	i965/ir: Test thread dispatch packing assumptions. Not [originally] intended for upstream. Should cause a GPU hang if some thread is executed with a non-contiguous dispatch mask breaking assumptions of brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or Piglit regressions, while replacing brw_stage_has_packed_dispatch() with a dummy implementation that unconditionally returns true on top of this patch causes multiple GPU hangs. v2: Refactor into a separate function instead of emitting the test code directly from emit_nir_code(), drop VEC4 test and clean up slightly for upstream. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 13:45:46 +03:00
Francisco Jerez	c05a4f11a0	i965/ir: Pass identity mask to brw_find_live_channel() in the packed dispatch case. This avoids emitting a few extra instructions required to take the dispatch mask into account when it's known to be tightly packed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 13:45:46 +03:00
Francisco Jerez	f57f526fc5	i965/ir: Skip eliminate_find_live_channel() for stages with sparse thread dispatch. The eliminate_find_live_channel optimization eliminates FIND_LIVE_CHANNEL instructions in cases where control flow is known to be uniform, and replaces them with 'MOV 0', which in turn unblocks subsequent elimination of the BROADCAST instruction frequently used on the result of FIND_LIVE_CHANNEL. This is however not correct in per-sample fragment shader dispatch because the PSD can dispatch a fully unlit sample under certain conditions. Disable the optimization in that case. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> v2: Add devinfo argument to brw_stage_has_packed_dispatch() to implement hardware generation check.	2016-09-21 13:45:46 +03:00
Jason Ekstrand	8a468d186e	i965/fs: Take Dispatch/Vector mask into account in FIND_LIVE_CHANNEL On at least Sky Lake, ce0 does not contain the full story as far as enabled channels goes. It is possible to have completely disabled channels where the corresponding bits in ce0 are 1. In order to get the correct execution mask, you have to mask off those channels which were disabled from the beginning by taking the AND of ce0 with either sr0.2 or sr0.3 depending on the shader stage. Failure to do so can result in FIND_LIVE_CHANNEL returning a completely dead channel. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Francisco Jerez <currojerez@riseup.net> [ Francisco Jerez: Fix a couple of typos, add mask register type assertion, clarify reason why ce0 can have bits set for disabled channels, clarify that this may only be a problem when thread dispatch doesn't pack channels tightly in the SIMD thread. Apply same treatment to Align16 path. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-21 13:45:45 +03:00
Jason Ekstrand	a2392cee48	i965/reg: Make brw_sr0_reg take a subnr and return a vec1 reg The state register sr0 is really a collection of dwords not a SIMD8 anything. It's much more convenient for brw_sr0_reg to return the particular dword you're looking for rather than a giant blob you have to massage into what you want. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> [ Francisco Jerez: Trivial simplification of brw_ud1_reg(). ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-21 13:45:45 +03:00
Lionel Landwerlin	b8162d6b6e	anv: pipeline: use correct number of thread for compute Reproduces this commit : commit `0fb85ac08d` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Use the correct number of threads for compute shaders. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 12:01:06 +03:00
Lionel Landwerlin	f2d43b44d7	anv: allocator: correct scratch space for haswell This reproduces this commit : commit `2213ffdb4b` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Allocate scratch space for the maximum number of compute threads. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 12:01:06 +03:00
Lionel Landwerlin	09394ee6cf	anv: device: calculate compute thread numbers using subslices numbers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 12:01:06 +03:00
Nicolai Hähnle	1f291369e4	gallivm: support negation on 64-bit integers This should be analogous to 32-bit integers. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:50 +02:00
Dave Airlie	4207612f9c	radeonsi: prepare 64-bit integer support. (v2) v2: - no PIPE_CAP_INT64 yet - emit DIV/MOD without the divide-by-zero workaround Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:38 +02:00
Dave Airlie	5561a37710	gallivm/llvmpipe: prepare support for ARB_gpu_shader_int64. This enables 64-bit integer support in gallivm and llvmpipe. v2: add conversion opcodes. v3: - PIPE_CAP_INT64 is not there yet - restrict DIV/MOD defaults to the CPU, as for 32 bits - TGSI_OPCODE_I2U64 becomes TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:30 +02:00
Dave Airlie	6b26039da3	tgsi/softpipe: prepare ARB_gpu_shader_int64 support. (v3) This adds all the opcodes to tgsi_exec for softpipe to use. v2: add conversion opcodes. v3: - no PIPE_CAP_INT64 yet - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:11 +02:00
Dave Airlie	3985e6c044	gallium/tgsi: add support for 64-bit integer immediates. This adds support to TGSI for 64-bit integer immediates. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-21 10:23:55 +02:00
Dave Airlie	6e1a34d545	gallium: add opcode and types for 64-bit integers. (v3) This just adds the basic support for 64-bit opcodes, and the new types. v2: add conversion opcodes. add documentation. v3: - make docs more consistent - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:23:05 +02:00
Kenneth Graunke	9694b23f66	i965: Rename intelScreen to screen. "intelScreen" is wordy and also doesn't fit our style guidelines. "screen" is shorter, which is nice, because we use it fairly often. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-20 20:08:20 -07:00
Kenneth Graunke	8fec9fbb9f	i965: Rename __DRIScreen pointers to "dri_screen". I want to use "screen" as the variable name for a struct intel_screen pointer. This means that we can't use it for __DRIscreen pointers. Sometimes we called it "screen", sometimes "sPriv", sometimes "driScrnPriv", and sometimes "psp" (Pointer to Screen Private?). The last one is particularly confusing because we use "psp" to refer to the Gen4 PIPELINED_STATE_POINTERS packet as well. Let's be consistent. "dri_screen" is clear, and it's not used often enough that I'm worried about the verbosity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-20 20:08:12 -07:00
Dylan Baker	d4bf9baa43	mesa: Implement ARB_shader_viewport_layer_array for i965 This extension is a combination of AMD_vertex_shader_viewport_index and AMD_vertex_shader_layer, making it rather trivial to implement. For gallium I think this needs a new cap because of the addition of support in tessellation evaluation shaders, and since I don't have any hardware to test it on, I've left that for someone else to wire up. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-20 16:23:04 -07:00
Leo Liu	956f3e3bcd	radeon/vce: add firmware support for version 52.8.3 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-20 15:58:56 -04:00
Indrajit Das	f9311265bf	st/omx/dec/h265: Correct the timestamping (derived from commit `3b6bda665a`) v2: fix the tabs(Leo) Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com> Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-09-20 15:58:56 -04:00
Lionel Landwerlin	792d77165b	aubinator: add a custom handler for immediate register load Transforming this : 0x00c77084: 0x11000001: MI_LOAD_REGISTER_IMM 0x00c77088: 0x0000b020 : Dword 1 Register Offset: 0x0000b020 0x00c7708c: 0x00880038 : Dword 2 Data DWord: 8912952 Into this: 0x007880f0: 0x11000001: MI_LOAD_REGISTER_IMM 0x007880f4: 0x0000b020 : Dword 1 Register Offset: 0x0000b020 0x007880f8: 0x00080040 : Dword 2 Data DWord: 524352 register L3CNTLREG2 (0xb020) : 0x80040 SLM Enable: 0 URB Allocation: 32 URB Low Bandwidth: 0 RO Allocation: 32 RO Low Bandwidth: 0 DC Allocation: 0 DC Low Bandwidth: 0 v2: Drop unused arguments (Sirisha) Print out register name Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-09-20 10:47:21 +01:00
Nayan Deshmukh	0301858a31	st/va: flush the context before calling flush_frontbuffer(v2) so that the texture is rendered to back buffer before calling flush_frontbuffer and can be copied to a different buffer in the function v2: change comment style Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:18:29 +02:00
Nayan Deshmukh	e4cc2276c1	st/vdpau: flush the context before calling flush_frontbuffer so that the texture is rendered to back buffer before calling flush_frontbuffer and can be copied to a different buffer in the function Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:18:07 +02:00
Nayan Deshmukh	853e80f5a0	vl/dri3: handle the case of different GPU(v4.2) In case of prime when rendering is done on GPU other than the server GPU, use a separate linear buffer for each back buffer which will be displayed using present extension. v2: Use a separate linear buffer for each back buffer (Michel) v3: Change variable names and fix coding style (Leo and Emil) v4: Use PIPE_BIND_SAMPLER_VIEW for back buffer in case when a separate linear buffer is used (Michel) v4.1: remove empty line v4.2: destroy the context and handle the case when create_context fails (Emil) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:17:02 +02:00
Ilia Mirkin	40d787ab05	st/vdpau: fix argument type to vlVdpOutputSurfaceDMABuf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:13:05 +02:00
Tim Rowley	92ec820244	swr: [rasterizer core] Better thread destruction Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-19 20:10:19 -05:00
Tim Rowley	fdf2890423	swr: [rasterizer jitter] Fix missing end-of-file newline Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-19 20:10:19 -05:00
Tim Rowley	2f86a9577a	swr: [rasterizer core] Add macros for mapping ArchRast to buckets Switch all RDTSC_START/STOP macros to use AR_BEGIN/END macros. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-19 20:10:19 -05:00
Kenneth Graunke	04026b43c8	glsl: Skip "unsized arrays aren't allowed" check for TCS/TES/GS vars. Fixes ESEXT-CTS.draw_elements_base_vertex_tests.AEP_shader_stages and ESEXT-CTS.texture_cube_map_array.texture_size_tesselation_con_sh. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-19 12:01:11 -07:00
Samuel Pitoiset	6ed05fa4cb	nvc0: get rid of nvc0_stage_sampler_states_bind_range() Same thing as nvc0_stage_set_sampler_views_range(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-19 20:03:24 +02:00
Samuel Pitoiset	407948df1b	nvc0: get rid of nvc0_stage_set_sampler_views_range() This function was quite similar to nvc0_stage_set_sampler_views() and I don't see any reasons to not remove it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-19 20:03:20 +02:00
Samuel Pitoiset	557a29b51f	nv50/ir: optimize SUB(a, b) to MOV(a - b) This helps shaders in UE4 demos, especially with Elemental (+1% perf). This optimization reduces spilling usage in one shader which explains the little gain. GF100/GK104: total instructions in shared programs :2838551 -> 2838045 (-0.02%) total gprs used in shared programs :396706 -> 396684 (-0.01%) total local used in shared programs :34432 -> 34416 (-0.05%) local gpr inst bytes helped 1 19 112 112 hurt 0 0 0 0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-18 16:42:39 +02:00
Samuel Pitoiset	d8b4f5fcca	gk110/ir: fix wrong emission of OP_NOT This should emit src0 instead of src1. Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-18 16:42:33 +02:00
Martina Kollarova	15804c4b90	r600g/sb: fix struct/class declaration conflicts A couple of forward-declarations were causing warnings in clang: 'value' defined as a class here but previously declared as a struct [-Wmismatched-tags] Signed-off-by: Martina Kollarova <martina.kollarova@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-09-18 09:23:42 +02:00
Eric Anholt	073129c7af	i965: Drop assertion about buffer offset at draw time. Given robust access, we should just be returning zeroes if the user gives us a base pointer that's too big, which is what happens on a release build. This was caught by a webgl conformance test for out-of-bounds draws on servo. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-17 17:48:16 +01:00
Lars Hamre	ddd6116e32	tgsi: Enable returns from within loops Fixes the following piglit test (for softpipe): /spec/glsl-1.10/execution/fs-loop-return Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Charmaine Lee	8a6391477e	svga: relax restriction of compressed formats for texture upload This patch relaxes the restriction of compressed formats for texture upload buffer. For now, 3D texture with compressed format is still not supported in the texture upload buffer path. As Brian noted, ETQW does many texture updates with glCompressedTexSubImage. This patch greatly improves the performance of the ETQW trace. Tested with ETQW, MTT piglit, glretrace, conform, viewperf v2: Per Brian's suggestion, removed the subregion boundary check. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Brian Paul	15dee0fc1d	svga: skip query flush if we already have the query result This reduces the number of times we flush in some situations (the arbocclude demo is one trivial example). Tested with Piglit, ETQW, Sauerbraten, arbocclude. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:24:13 -06:00
Brian Paul	c71e82b8e9	svga: remove unneeded svga_context_flush() in svga_end_query() Since commit `99d8fe20ab` we don't have to flush the command buffer when we end a query. Tested with Piglit, Sauerbraten, arbocclude, ETQW (noticeably faster now). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-09-17 10:24:13 -06:00
Charmaine Lee	f1b3374d28	svga: use upload buffer for upload texture. With this patch, when running with vgpu10, instead of mapping directly to the guest backed memory for texture update, we'll use the texture upload buffer and use the transfer from buffer command to update the host side texture memory. This optimization yields about 20% performance improvement with Lightsmark2008 and about 40% with Tropics. Tested with Lightsmark2008, Tropics, Heaven, MTT piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Charmaine Lee	a9c4a861d5	svga: refactor svga_texture_transfer_map/unmap functions Split the functions into separate functions for dma and direct map to make the code more readable. Tested with MTT piglit, glretrace, viewperf, conform, various OpenGL apps Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Charmaine Lee	c8ef82d65a	svga: add SVGA3d_vgpu10_TransferFromBuffer() Also add the corresponding dump function to dump the TransferFromBuffer command. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Charmaine Lee	2a4b019239	svga: single sample surface can be created as non-multisamples surface With this patch, single sample surface will be created as non-multisamples surface. Tested with piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Charmaine Lee	5947d90830	svga: fix memory leak with sampler state This patch fixes a memory leak with sampler state when piglit is run with HW version 11. Sampler state clean up was incorrectly skipped in svga_cleanup_sampler_state() for vgpu9. Tested with piglit. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Brian Paul	12689efbbe	svga: fix prim type check/assignment in translate_indices() Left over test code spotted by Sinclair. Tested with piglit, Google Earth, Lightsmark, Heaven4, glretraces, etc. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-09-17 10:09:00 -06:00
Charmaine Lee	50359ddb5d	svga: use SVGA3D_QUERYTYPE_MAX for svga query type check Use SVGA3D_QUERYTYPE_MAX instead of SVGA_QUERY_MAX for svga query type check. Tested with various OpenGL apps with GALLIUM_HUD set. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:09:00 -06:00
Charmaine Lee	ee39814d90	svga: split the num-resources-mapped hud to textures & buffers Replace the num-resources-mapped hud with num-textures-mapped and num-buffers-mapped, so we can differentiate the map counts for these two different resources. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:09:00 -06:00
Charmaine Lee	f168c886c9	svga: change svga hud defines to enums This will make it easier to add new hud types. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	4f74b379aa	svga: implement an index buffer translation cache Some OpenGL apps, like Cinebench R15, have many glDrawElements(GL_QUADS) calls. Since we don't directly support quads we have to convert these calls into GL_TRIANGLES which involves generating a new index buffer. This patch saves the new/translated index buffer in the hope that it can be reused for a later draw call. Cinebench R15 increases by about 20% with this change. The NobelClinician Viewer app also hits this code. Tested with full piglit run. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	581292a78c	svga: try to emit fewer buffer rebind commands If a consecutive sequence of drawing commands references the same vertex/index buffers, there should be no need to rebind the surfaces for the second and subsequent drawing commands. Apps that use multiple display lists benefit from this since the vertex data for several display lists is often stored in one buffer. In the case of the legacy E&S Glaze demo, this reduces the size of our command buffers from 91KB to 44KB. One WSI Fusion trace shows a 33% reduction in command buffer sizes. Tested with full piglit run. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	ee5f5e2269	svga: reduce unmapping/remapping of the default constant buffer Previously, every time we put shader constants into the default constant buffer we called u_upload_alloc(), which mapped the buffer, and u_upload_unmap(). We had to unmap the buffer before calling svga_buffer_handle() to get the winsys handle for the buffer. But we really only need to do that the first time we reference the const buffer. Now we try to keep the upload manager's buffer mapped until we fill it or flush the command buffer. v2: add additional comment on the buffer unmapping code in svga_context_flush(), per Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	ce3b34b727	svga: optimize memcpy() in svga_buffer_update_hw() When we migrate a buffer from sw/malloc storage to a hardware buffer, don't memcpy the whole buffer, just copy the part we've written to. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:08:59 -06:00
Neha Bhende	b7bee25052	svga: Use comparison between svga texture types to use PredCopyRegion command PredCopyRegion supports copies between textures of the same type. Instead of comparing the src and dst pipe texture types, compare the svga texture types, which can avoid some software fallbacks. For example, it avoids a software blit with the Redway3D Aston demo. Tested piglit tests on VGPU9 and VGPU10 on GL/DX11Renderer, Redway3D Aston demo v2: some nitpicks suggested by Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:08:59 -06:00
Neha Bhende	b9f333cc81	svga: Add function svga_resource_type() This function returns svga texture type for corresponding pipe texture. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:08:59 -06:00
Samuel Pitoiset	50baaf6bc6	nvc0/ir: fix subops for IMAD Offset was wrong, it's at bit 8, not 4. Also, uses subr instead of sub when src2 has neg. Similar to GK110 now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-17 17:42:45 +02:00
Samuel Pitoiset	9b8b69b3c4	nvc0/ir: fix comments about instructions info The comment for the commutative flags was wrong because OP_MUL is before OP_MAD. While we are at it add missing opcodes, and fix the comment about the short forms. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-17 17:42:40 +02:00
Kenneth Graunke	eaacb27812	mesa: Move buffers-unmapped earlier in check_valid_to_render(). This needs to be above the switch on API, as that can return true (valid to render) before this error check even had a chance to run. Fixes ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos, which worked before commit `72f1566f90`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-09-16 19:42:56 -07:00
Kenneth Graunke	6b0ba02cae	mesa: Expose GL_CONTEXT_FLAGS in ES 3.2. Fixes four ES32-CTS.context_flags.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-16 18:55:38 -07:00
Tom Stellard	91ec6e5664	radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3 This patch switches non-TGSI compute shaders over to using the HSA ABI described here: https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md The HSA ABI provides a much cleaner interface for compute shaders and allows us to share more code in the compiler with the HSA stack. The main changes in this patch are: - We now pass the scratch buffer resource into the shader via user sgprs rather than using relocations. - Grid/Block sizes are now passed to the shader via the dispatch packet rather than at the beginning of the kernel arguments. Typically for HSA, the CP firmware will create the dispatch packet and set up the user sgprs automatically. However, in Mesa we let the driver do this work. The main reason for this is that I haven't researched how to get the CP to do all these things, and I'm not sure if it is supported for all GPUs. v2: - Add comments explaining why we are setting certain bits of the scratch resource descriptor. v3: - Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 23:07:10 +00:00
Tom Stellard	a2b8346fa6	radeonsi/compute: Add some more debug printfs	2016-09-16 22:51:06 +00:00
Marek Olšák	ae0a4a1299	glsl: remove interpolateAt* instructions for demoted inputs This fixes 8 fs-interpolateat* piglit crashes on radeonsi, because it can't handle non-input operands in interpolateAt*. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 22:35:08 +02:00
Marek Olšák	d58a3906cb	mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc This fixes 66 CTS tests on st/mesa. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 22:35:08 +02:00
Serge Martin	1c8d4c694b	clover: fix getting scalar args api size This fixes getting the size of a struct arg. vec3 types still work ok. Only built-in args need to have power-of-two alignment; getTypeAllocSize reports the correct size in all cases. Acked-by: Francisco Jerez <currojerez@riseup.net>	2016-09-16 22:09:47 +02:00
Ilia Mirkin	f65187bb93	docs: add GL_ARB_gl_spirv to features list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-16 12:04:12 -04:00
Rob Clark	ba8a50955d	ttn: fix warning after `7bf76563e` Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-16 11:55:26 -04:00
Brian Paul	702ff0b9a0	gallium/docs: document alpha_to_coverage and alpha_to_one blend state The gallium interface defines these like DX10. Note that OpenGL ignores these options if MSAA is disabled or the dest buffer doesn't support MSAA. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-16 08:44:26 -06:00
Brian Paul	187c278121	st/mesa: update comment in st_atom_msaa.c The old comment was a copy and paste mistake. Indent another comment. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-16 08:44:26 -06:00
Brian Paul	a01872f808	st/mesa: only enable MSAA coverage options when we have a MSAA buffer Regardless of whether GL_MULTISAMPLE is enabled (it's enabled by default) we should not set the alpha_to_coverage or alpha_to_one flags if the current drawing buffer does not do MSAA. This fixes the new piglit gl-1.3-alpha_to_coverage_nop test. ETQW is a game that enables GL_SAMPLE_ALPHA_TO_COVERAGE without MSAA. Shrubs along the side of roads were invisible because fragments with alpha < 0.5 were being discarded (zero coverage). v2: remove ctx->DrawBuffer != NULL check. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-16 08:44:12 -06:00
Dave Airlie	e1ea36ae71	spirv: use subpass image type (v1.1) This adds support for the input attachments subpass type to the SPIRV->NIR pass. v1.1: drop handling from vtn_handle_texture Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-16 15:16:31 +10:00
Dave Airlie	7bf76563e2	glsl: add subpass image type (v2) SPIR-V/Vulkan have a special image type for input attachments called the subpass type. It has different characteristics than other image types, the main one being that it can only be an input image to fragment shaders and loads from it are relative to the frag coord. This adds support for it to the GLSL types. Unfortunately we've run out of space in the sampler dim in types, so we need to use another bit. v2: Fixup subpass input name (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-16 15:16:31 +10:00
Kenneth Graunke	081f21f29b	isl: Finish tiling filtering for Gen6. Gen6 only has one additional restriction over Gen7+, so we just add it to the existing gen7 function (which actually covers later gens too). This should stop FINISHME spew when running GL on Sandybridge. v2: Fix bytes per block vs. bits per block confusion (Jason) and rename function to gen6_filter_tiling (Jason and Chad). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-15 21:21:50 -07:00
Ilia Mirkin	9fec15a7e0	i965: enable ARB_ES3_2_compatibility on gen8+ Note that ASTC support is not actually mandated for this extension to be exposed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 19:29:41 -04:00
Jason Ekstrand	111f6b250d	i965/nir: Roll set_default_interpolation into lower_fs_inputs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:43 -07:00
Jason Ekstrand	246db0063e	i965/fs: Use NIR for handling forced per-sample interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:43 -07:00
Jason Ekstrand	ed65e6ef49	nir: Add a flag to lower_io to force "sample" interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:43 -07:00
Jason Ekstrand	114874b22b	i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode From the ARB_gpu_shader5 spec: The built-in functions interpolateAtCentroid() and interpolateAtSample() will sample variables as though they were declared with the "centroid" or "sample" qualifiers, respectively. When running with persample dispatch forced by the API, we interpolate anything that isn't flat as if it's qualified by "sample". In order to keep interpolateAtCentroid() consistent with the "centroid" qualifier, we need to make interpolateAtCentroid() do sample interpolation instead. Nothing in the GLSL spec guarantees that the result of interpolateAtCentroid is uniform across samples in any way, so this is a perfectly fine thing to do. Fixes 8 of the new dEQP-VK.pipeline.multisample_interpolation.* Vulkan CTS tests that specifically validate consistency between the "sample" qualifier and interpolateAtSample() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:27 -07:00
Brian Paul	0d2eb8c14d	mesa: check for no matrix change in _mesa_LoadMatrixf() Some apps issue redundant glLoadMatrixf() calls with the same matrix. Try to avoid setting dirty state in that situation. This reduces the number of constant buffer updates by about half in ET Quake Wars. Tested with Piglit, ETQW, Sauerbraten, Google Earth, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 12:00:12 -06:00
Jon Turney	533b3530c1	direct-to-native-GL for GLX clients on Cygwin ("Windows-DRI") Structurally, this is very similar to the existing Apple-DRI code, except I have chosen to implement this using the __GLXDRIdisplay, etc. vtables (as suggested originally in [1]), rather than a maze of ifdefs. This also means that LIBGL_ALWAYS_SOFTWARE and LIBGL_ALWAYS_INDIRECT work as expected. [1] https://lists.freedesktop.org/archives/mesa-dev/2010-May/000756.html This adds: * the Windows-DRI extension protocol headers and the windowsdriproto.pc file, for use in building the Windows-DRI extension for the X server * a Windows-DRI extension helper client library * a Windows-specific DRI implementation for GLX clients The server is queried for Windows-DRI extension support on the screen before using it (to detect the case where WGL is disabled or can't be activated). The server is queried for fbconfigID to pixelformatindex mapping, which is used to augment glx_config. The server is queried for a native handle for the drawable (which is of a different type for windows, pixmaps and pbuffers), which is used to augment __GLXDRIdrawable. Various GLX extensions are enabled depending on if the equivalent WGL extension is available.	2016-09-15 13:14:43 +01:00
Emil Velikov	2ac09ac5a5	docs: add news item and link release notes for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-15 11:31:06 +01:00
Emil Velikov	219a2f5f9f	docs: add sha256 checksums for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `09460b8cf7`)	2016-09-15 11:30:00 +01:00
Emil Velikov	06f83a5548	docs: add release notes for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `d79b2e7bf3`)	2016-09-15 11:29:59 +01:00
Kenneth Graunke	3bcdc2e3db	mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness. This is supposed to be exposed with the GL_KHR_robustness extension, which we support on ES 2.0 and later. On desktop GL, it's also exposed by GL_ARB_robustness, which is supported by all drivers ("dummy_true"), so we also allow desktop GL. Fixes: - ES32-CTS.robust.robustness.noResetNotification - ES32-CTS.robust.robustness.loseContextOnReset Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-15 00:58:47 -07:00
Jason Ekstrand	89a96c8f43	anv/cmd_buffer: Set the L3 atomic disable mask bit in CHICKEN3 on HSW Without this bit set, the value in "L3 Atomic Disable" won't get applied by the hardware so we won't properly get L3 atomic caching. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of the dEQP-VK.image.atomic_operations.* tests on HSW Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-14 17:53:16 -07:00
Jason Ekstrand	a814e18c96	intel/blorp: Stop setting 3DSTATE_DRAWING_RECTANGLE The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT at GPU initialization time and never sets it again. The GL driver sets it every time the framebuffer changes. Originally, blorp set it to the size of the drawing area, but that meant we had to set it back in the Vulkan driver. Instead, we can easily just do that in the GL driver's blorp_exec implementation and not set it in blorp core. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-14 17:51:16 -07:00
Jason Ekstrand	b56f509ee0	intel/blorp: Emit 3DSTATE_MULTISAMPLE directly Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However, now that Vulkan and GL use the same sample positions, we can set up 3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-14 17:51:16 -07:00
Jason Ekstrand	c779ad3e66	intel: Move Vulkan sample positions to common code Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-14 17:51:16 -07:00
Marek Olšák	f019255acf	Revert "tgsi/scan: don't set interp flags for inputs only used by INTERP instructions" This reverts commit `524fd55d2d`. Reason: https://bugs.freedesktop.org/show_bug.cgi?id=97808	2016-09-15 00:47:24 +02:00
Francisco Jerez	6d861968ca	i965/vec4: Assert that pull constant load offsets are 16B-aligned. Non-16B-aligned pull constant loads are unlikely to be particularly useful given that you can get roughly the same effect by using swizzles on the result. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	5ca35c6367	i965/vec4: Assert that ATTR regions are register-aligned. It might be useful to actually handle this once copy propagation becomes smarter about register-misaligned offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	f33a8f8fcf	i965/vec4: Don't spill non-GRF-aligned register regions. A better fix would be to do something along the lines of the FS back-end spilling code and emit a scratch read before any instruction that overwrites the register to spill partially due to a non-zero sub-register offset. In the meantime mark registers used with a non-zero sub-register offset as no-spill to prevent the spilling code from miscompiling the program. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	8531f943d9	i965/vec4: Fix copy propagation for non-register-aligned regions. This prevents it from trying to propagate a copy through a register-misaligned region. MOV instructions with a misaligned destination shouldn't be treated as a direct GRF copy, because they only define the destination GRFs partially. Also fix the interference check implemented with is_channel_updated() to consider overlapping regions with different register offset to interfere, since the writemask check implemented in the function is only valid under the assumption that the source and destination regions are aligned component by component. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	0e657b7b55	i965/vec4: Compare full register offsets in cmod propagation. Cmod propagation would misoptimize the program if the destination offset of the generating instruction wasn't exactly the same as the source region offset of the copy instruction. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	8bed1adfc1	i965/vec4: Assign correct destination offset to rewritten instruction in register coalesce. Because the pass already checks that the destination offset of each 'scan_inst' that needs to be rewritten matches 'inst->src[0].offset' exactly, the final offset of the rewritten instruction is just the original destination offset of the copy. This is in preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	3a74e437fd	i965/vec4: Don't coalesce registers with overlapping writes not matching the MOV source. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	1bb5074474	i965/vec4: Compare full register offsets in opt_register_coalesce nop move check. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	3be0d6d040	i965/vec4: Check that the write offsets match when setting dependency controls. For simplicity just assume that two writes to the same GRF with different sub-GRF offsets will potentially interfere and break the dependency control chain. This is in preparation for adding sub-GRF offset support to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	b52fefc4d5	i965/vec4: Change opt_vector_float to keep track of the last offset seen in bytes. This simplifies things slightly and makes the pass more correct in presence of sub-GRF offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	230615e228	i965/vec4: Simplify src/dst_reg to brw_reg conversion by using byte_offset(). This should also have the side effect of fixing convert_to_hw_regs() to handle sub-GRF register offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	eb746a80e5	i965/ir: Update several stale comments. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	47784e2346	i965/ir: Don't print ARF subnr values twice. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	5d65d51e78	i965/vec4: Print src/dst_reg::offset field consistently for all register files. C.f. 'i965/fs: Print fs_reg::offset field consistently for all register files.'. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	ec259f5307	i965/fs: Print fs_reg::offset field consistently for all register files. The offset printing code in fs_visitor::dump_instruction() was doing things differently for sources and destinations and for each register file -- In some cases it would be added to the base register number fs_reg::nr, in other cases it would follow the base register separated with a plus sign, in other cases (uniforms) it would do both (!). The sub-register offset was also being printed or not rather inconsistently. Fix it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	950af5ed40	i965/fs: Misc simplification. Get rid of some leftover redundant arithmetic introduced during the conversion to byte offsets and sizes that can be simplified easily. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	80e1d670b4	i965/fs: Get rid of fs_inst::set_smear(). component() was generally a better alternative because of several issues set_smear() had: - It wouldn't take the original stride and offset of the register into account, which means that set_smear() on the result of e.g. another set_smear() call or an offset() call would give a bogus region as result. - It was an inherently destructive operation. See the 'nir_intrinsic_shader_clock' hunk below for how this could lead to subtle bugs in cases where set_smear() was called multiple times on the same register like 'r.set_smear(0), r.set_smear(1)' with the expectation that each call would return a separate value instead of a reference to the same subsequently mutated object. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	8e58e4412f	i965/fs: Use region_contained_in() in compute-to-mrf coalescing pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	f2d2156ba2	i965/fs: Move region_contained_in to the IR header and fix for non-VGRF files. Also changed the argument names since 'src' and 'dst' don't make that much sense outside of the context of copy propagation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	645261c4b2	i965/fs: Change region_contained_in() to use byte units. This makes the function less annoying to use and more accurate -- We shouldn't propagate a copy into a register region that wasn't fully contained in the destination of the copy (IOW, a source region that wasn't fully defined by the copy) just because the number of registers written and read by each instruction happened to get rounded up to the same GRF multiple. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	1c67e27247	i965/fs: Simplify copy propagation LOAD_PAYLOAD ACP setup. By keeping track of 'offset' in byte units. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	2d7d4a7910	i965/fs: Simplify a bunch of fs_inst::size_written calculations by using component_size(). Using component_size() is easier and generally more correct because it takes into account the register type and stride for you. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	0bc46cc961	i965/fs: Simplify result_live calculation in dead_code_eliminate(). No need to unroll the first iteration of the loop manually. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	62aaef6c83	i965/fs: Simplify and fix buggy stride/offset calculations using subscript(). These were bashing the 'offset' and 'stride' values of several registers without taking the previous value into account, which probably didn't matter in practice for optimize_frontfacing_ternary() because the 'tmp' register already had a known region, but it would have given the wrong region as result in the other cases in lower_integer_multiplication(). subscript(..., i) is a more straightforward way to take the i-th field of a given type from each channel of a register which should give the right answer as result regardless of the original 'offset' and 'stride' parameters of the register region. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	3b7b908787	i965/fs: Simplify get_fpu_lowered_simd_width() by using inequalities instead of rounding. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	ee930c0435	i965/fs: Simplify byte_offset(). In the most common case this can now be implemented as a simple addition because the offset is already encoded as a single scalar value in bytes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	bae3a41171	i965/fs: Fix signedness of the return value of fs_inst::size_read(). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	a384503c15	i965/fs: Switch mask_relative_to() used in compute-to-mrf to byte units. This makes the helper function less annoying to use and somewhat more accurate. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	401fc228fd	i965/fs: Fix bogus sub-MRF offset calculation in compute-to-mrf. The 'scan_inst->dst.offset % REG_SIZE' term in the final 'scan_inst->dst.offset' calculation is obviously bogus. The offset from the start of the copy destination register 'inst->dst' where the destination of the generating instruction 'scan_inst' would be written to (before compute-to-mrf runs) is just the offset of 'scan_inst->dst' relative to the source of the copy instruction (AKA rel_offset in the code below). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	cd0134072a	i965/fs: Take into account copy register offset during compute-to-mrf. This was dropping 'inst->dst.offset' on the floor. Nothing in the code above seems to guarantee that it's zero and in that case the offset of the register being coalesced into wouldn't be taken into account while rewriting the generating instruction. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	fcd9d1badc	i965/vec4: Drop backend_reg::in_range() in favor of regions_overlap(). This makes sure that overlap checks are done correctly throughout the back-end when the '*this' register starts before the register/size pair provided as argument, and is actually less annoying to use than in_range() at this point since regions_overlap() takes its size arguments in bytes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	56bcb2230f	i965/vec4: Port regions_overlap() to the vec4 IR. This is copy-pasted almost line by line from the FS back-end. The only reason it cannot be implemented in terms of backend_reg is that the backend_reg::nr field doesn't have the same meaning for uniforms on both back-ends. It could be easily deduplicated by using a template function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	c057278c06	i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap(). Its only use left in the FS back-end should be using regions_overlap() instead to avoid getting a false negative result in cases where source and destination overlap but the former starts before the latter in the VGRF file. v2: Put back lost components factor (Iago). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	b42c13a5b8	i965/fs: Drop fs_inst::overwrites_reg() in favor of regions_overlap(). fs_inst::overwrites_reg is rather easy to misuse because it cannot tell how large the register region starting at 'reg' is, so in cases where the destination region starts after 'reg' it may give a misleading result. regions_overlap() is somewhat more verbose to use but handles arbitrary overlap correctly so it should generally be used instead. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	32d67923b2	i965/fs: Fix LOAD_PAYLOAD handling in register coalesce is_nop_mov(). is_nop_mov() was broken for LOAD_PAYLOAD instructions in two ways: On the one hand the original destination register offset wasn't being taken into account which would give incorrect results if it was already non-zero, and on the other hand all source registers were being treated as if they had a size of 32B, which is almost never the case in SIMD16 programs for non-header sources. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	5cc6425d70	i965/fs: Fix can_propagate_from() source/destination overlap check. The previous overlap condition only made sure that the VGRF numbers or GRF-aligned offsets were different without taking the amount of data written and read by the instruction into consideration. Use the regions_overlap() helper instead. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	9ae77d7020	i965/fs: Compare full register offsets in cmod propagation pass. This could potentially have misoptimized a program in cases where inst->src[0] had a non-zero sub-GRF offset. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	3a4ea7cf80	i965/fs: Don't consider LOAD_PAYLOAD with stride > 1 source to behave like a raw copy. Noticed the problem by inspection while typing in the previous commit. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	1164aa1a1b	i965/fs: Don't consider LOAD_PAYLOAD with sub-GRF offset to behave like a raw copy. This was likely the original intention, and at least register coalesce relies on it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	a5bbe4c127	i965/vec4: Take into account misalignment in regs_written() and regs_read(). Unlike the FS counterpart of this commit this was likely not (yet) a bug, but let's fix it already in preparation for implementing support for sub-GRF offsets in the VEC4 back-end. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	717d8efd58	i965/fs: Take into account misalignment in regs_written() and regs_read(). There was a workaround for this in fs_inst::size_read() for the SHADER_OPCODE_MOV_INDIRECT instruction and FIXED_GRF register file only. We should take this possibility into account for the sources and destinations of all instructions on all optimization passes that need to quantize dataflow in 32B increments by adding the amount of misalignment to the size read or written from the regs_read() and regs_written() helpers respectively. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	e540045df5	i965/fs: Take into account trailing padding in regs_written() and regs_read(). This fixes regs_written() and regs_read() to return a more accurate value when the padding left between components due to a stride value greater than one causes the region bounds given by size_written or size_read to overflow into the next register. This could become a problem in optimization passes that keep track of dataflow using fixed-size arrays with register granularity, because the overflow register (not actually accessed by the region) may not have been allocated at all which could lead to undefined memory access. An alternative to this would be to subtract the trailing padding already during the calculation of fs_inst::size_read and ::size_written, but that would break code that currently assumes that ::size_read and _written are whole multiples of the component size, and would be hard to maintain looking forward because size_written is assigned from a bunch of different places. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	937373eb25	i965/fs: Handle fixed HW GRF subnr in reg_offset(). This will be useful later on when we start using reg_offset() on fixed hardware registers. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	1a4b7fdd88	i965/fs: Handle arbitrary offsets in brw_reg_from_fs_reg for MRF/VGRF registers. This restriction seemed rather artificial... Removing it actually simplifies things slightly. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	d6b60934aa	i965/fs: Return more accurate read size for LINTERP from fs_inst::size_read. The LINTERP virtual instruction only reads three scalar components from the first 16B of the second source, we can now teach size_read() about it since its return value is represented with byte granularity. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	31a40202b8	i965/fs: Return more accurate read size from fs_inst::size_read for IMM and UNIFORM files. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	728dd30c0a	i965/vec4: Replace vec4_instruction::regs_read with ::size_read using byte units. The previous regs_read value can be recovered by rewriting each reference of regs_read() like 'x = i.regs_read(j)' to 'x = DIV_ROUND_UP(i.size_read(j), reg_unit)'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	e1a918ba7b	i965/fs: Replace fs_inst::regs_read with ::size_read using byte units. The previous regs_read value can be recovered by rewriting each reference of regs_read() like 'x = i.regs_read(j)' to 'x = DIV_ROUND_UP(i.size_read(j), reg_unit)'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	27cb6b081e	i965/ir: Drop backend_instruction::regs_written field. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	69fdf13c21	i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	69570bbad8	i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	d28cfa35fe	i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written. This is in preparation for dropping vec4_instruction::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	c458eeb946	i965/fs: Add wrapper functions for fs_inst::regs_read and ::regs_written. This is in preparation for dropping fs_inst::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	be095e11e4	i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes. The fs_reg::subreg_offset and ::offset fields are now redundant, the sub-GRF offset can just be added to the single ::offset field expressed in byte units. The current subreg_offset value can be recovered by applying the following rule: Replace each rvalue reference of subreg_offset like 'x = r.subreg_offset' with 'x = r.offset % reg_unit', and each lvalue reference like 'r.subreg_offset = x' with 'r.offset = ROUND_DOWN_TO(r.offset, reg_unit) + x'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	9a523dd051	i965/ir: Remove backend_reg::reg_offset. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	fba020e5af	i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:52 -07:00
Francisco Jerez	86944e063a	i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. The fs_reg::offset field in byte units introduced in this patch is a more straightforward alternative to the current register offset representation split between fs_reg::reg_offset and ::subreg_offset. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:52 -07:00
Eero Tamminen	8ad5fb3a8f	glsl: grammar fix Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-14 13:35:47 -07:00
Kenneth Graunke	aa70ac172e	docs: Mention AEP in release notes	2016-09-14 12:43:16 -07:00
Kenneth Graunke	8c9dddadad	i965: Enable ANDROID_extension_pack_es31a on Gen9+. AEP requires ASTC, which is currently only enabled on Skylake and later. (It may be possible to extend this to Cherryview/Braswell in the future, but earlier hardware doesn't have ASTC support.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-14 12:16:25 -07:00
Kenneth Graunke	2d8a3fa7ea	nir: Report progress from nir_lower_phis_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:51 -07:00
Kenneth Graunke	32630e211e	nir: Report progress from nir_lower_alu_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:49 -07:00
Kenneth Graunke	e6eed3533e	nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar(). This is mandatory. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:39 -07:00
Rob Clark	bff90aedf1	nir/lower_tex: fix typo with sample_dim Numeric 2 is actually GLSL_SAMPLER_DIM_3D, which I don't think is what was intended. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Rob Clark	1a8424ceba	nir: move tex_instr_remove_src I want to re-use this in a different pass, so move to nir.h Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Rob Clark	2c3f966276	nir/lower_tex: remove tex_instr_find_src() Turns out it already exists.. so don't duplicate it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Kyle Brenneman	7206b3a556	egl: Add storage for EGL_KHR_debug's state to EGL objects Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	1d535c1e83	egl: Factor out _eglGetSyncAttribCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	5b0b844ac9	egl: Factor out _eglWaitSyncCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	9a992038e7	egl: Lock the display in _eglCreateSync's callers Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	58338c6b65	egl: Factor out _eglCreateImageCommon (v2) v2: - Pass disp to RETURN_EGL_ERROR so we unlock the display Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	82a2e2cb50	egl: Factor out _eglWaitClientCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	8cc3d9855f	egl: Use _eglCreatePixmapSurfaceCommon consistently This moves the native pixmap fixup to a helper function so we don't repeat ourselves. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	7d7ae5e1c3	egl: Use _eglCreateWindowSurfaceCommon consistently This moves the native window fixup to a helper function so we don't repeat ourselves. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	017946b724	egl: Factor out _eglGetPlatformDisplayCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	fe6ffa79be	egl: Fix typo Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Adam Jackson	e2c067d256	egl: Tear down images and syncs at eglTerminate Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	6e50f12b04	egl: Update eglext.h (v2) Updated eglext.h to revision 33111 from the Khronos repository. v2: - Don't (re)move extension includes from eglext.h (Emil Velikov) - Bump to revision 33111 (Adam Jackson) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-09-14 11:45:58 -04:00
Brendan King	95f3e5861c	configure.ac: fix the name of the Wayland Scanner pc file The Wayland Scanner pkg-config file is called wayland-scanner.pc. Fixes: `153539bd9d` ("configure: rework wayland_scanner handling (fix make distcheck)") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Brendan King <Brendan.King@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 14:38:30 +01:00
Eric Engestrom	4bb9efb592	gbm: remove left-over array `e7c8c85785` ("gbm: Removed unused function.") forgot to remove the global array used only by that function. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 14:37:34 +01:00
Martina Kollarova	2527e18eeb	gallium: fix return value check A possible error (-1) was being lost because it was first converted to an unsigned int and only then checked. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Martina Kollarova <martina.kollarova@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-14 14:36:43 +01:00
Marek Olšák	ab29788250	radeonsi: reload PS inputs with direct indexing at each use (v2) The LLVM compiler can CSE interp intrinsics thanks to LLVMReadNoneAttribute. 26011 shaders in 14651 tests Totals: SGPRS: 1146340 -> 1132676 (-1.19 %) VGPRS: 727371 -> 711730 (-2.15 %) Spilled SGPRs: 2218 -> 2078 (-6.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 35841268 -> 36009732 (0.47 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222559 -> 224779 (1.00 %) Wait states: 0 -> 0 (0.00 %) v2: don't call load_input for fragment shaders in emit_declaration Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:33:00 +02:00
Marek Olšák	007b512f9d	radeonsi: get rid of constant buffer preloading 26011 shaders in 14651 tests Totals: SGPRS: 1152636 -> 1146340 (-0.55 %) VGPRS: 728198 -> 727371 (-0.11 %) Spilled SGPRs: 3776 -> 2218 (-41.26 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 35835152 -> 35841268 (0.02 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222372 -> 222559 (0.08 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	16be87c904	radeonsi: get rid of img/buf/sampler descriptor preloading (v2) 26011 shaders in 14651 tests Totals: SGPRS: 1251920 -> 1152636 (-7.93 %) VGPRS: 728421 -> 728198 (-0.03 %) Spilled SGPRs: 16644 -> 3776 (-77.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 36001064 -> 35835152 (-0.46 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222221 -> 222372 (0.07 %) Wait states: 0 -> 0 (0.00 %) v2: merge codepaths where possible Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:32:59 +02:00
Marek Olšák	22797d7d83	radeonsi: rename get_sampler_desc -> load_sampler_desc Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	5f0a8fbcc8	radeonsi: cosmetic changes in si_shader.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	afaf27bff3	radeonsi: load streamout buffer descriptors before use (v2) v2: inline the code and remove the conditional that's a no-op now Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:32:59 +02:00
Eric Anholt	f597ac3966	vc4: Implement job shuffling Track rendering to each FBO independently and flush rendering only when necessary. This lets us avoid the overhead of storing and loading the frame when an application momentarily switches to rendering to some other texture in order to continue rendering the main scene. Improves glmark -b desktop:effect=shadow:windows=4 by 27% Improves glmark -b desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4 by 17% While I haven't tested other apps, this should help X rendering a lot, and I've heard GLBenchmark needed it too.	2016-09-14 06:25:41 +01:00
Eric Anholt	f473348468	vc4: Handle resolve skipping at job submit time. This is done in vc4_flush currently, but I'm going to make the job always track the surfaces it might be rendering to instead of putting in the destinations at flush time.	2016-09-14 06:08:03 +01:00
Eric Anholt	9688166bd9	vc4: Move the render job state into a separate structure. This is a preparation step for having multiple jobs being queued up at the same time.	2016-09-14 06:08:03 +01:00
Eric Anholt	c31a7f529f	vc4: Always unref the current job surfaces at job reset time. Drops some tricky logic in vc4_flush() trying to update the pointers, and fixes a broken lack of unref for MSAA surfaces at context destroy time.	2016-09-14 06:08:03 +01:00
Eric Anholt	774a556b6d	vc4: Move job-submit skip cases to vc4_job_submit(). For calling job_submit() directly, I need the skipping here.	2016-09-14 06:08:03 +01:00
Eric Anholt	0ef1b32ebb	vc4: Move bin CL trailer to job_submit() time. To implement job shuffling, I want to be able to call submit() on specific jobs, turning vc4_flush() into the context's flush-all-jobs hook.	2016-09-14 06:08:03 +01:00
Eric Anholt	a2014c2eb9	vc4: Simplify the DISCARD_RANGE handling It's really just an upgrade to attempting WHOLE_RESOURCE. Pulling the logic out caught two bugs in it: We would try to do so on cubemaps (even though we're only mapping 1 of the 6 slices), and we would break persistent coherent mappings by trying to reallocate when we shouldn't.	2016-09-14 06:08:03 +01:00
Eric Anholt	21a27ad956	vc4: Fix incorrect clearing of Z/stencil when cleared separately. The clear of Z or stencil will end up clearing the other as well, instead of masking. There's no way around this that I know of, so if we are clearing just one then we need to draw a quad. Fixes a regression in the job-shuffling code, where the clear values move to the job and don't just have the last clear's value lying around when you do glClear(DEPTH) and then glClear(STENCIL) separately (ext_framebuffer_multisample-clear 4 depth). This causes regressions in ext_framebuffer_multisample/multisample-blit depth and ext_framebuffer_multisample/no-color depth, but these were formerly false positives due to the reference image also being black. Now the reference and test images are both being drawn, and it looks like there's an incorrect resolve of depth during blitting to an MSAA FBO.	2016-09-14 06:08:03 +01:00
Ilia Mirkin	89a49af31e	glsl: add core plumbing for GL_ANDROID_extension_pack_es31a Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:55 -04:00
Ilia Mirkin	83116d084f	mesa: introduce glPrimitiveBoundingBoxARB entrypoint This requires a bit of rejiggering, since normally ES entrypoints alias core ones, not vice-versa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:50 -04:00
Ilia Mirkin	a69dc2c412	mesa: add a GLES3.2 enums section, and expose new MS line width params This also exposes them for ARB_ES3_2_compatibility. While both specs refer to the new MS line width parameters being separate from the existing AA line widths, reality begs to differ. It's the same on all hardware currently supported by mesa. Should hardware come along that wants these to be different, they're easy enough to separate out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:47 -04:00
Sirisha Gandikota	aa7b410592	aubinator: Remove bogus "end" parameter in gen_disasm_disassemble() Earlier, the loop pretends to loop over instructions from "start" to "end", but the callers always pass 8192 for end, which is some huge bogus value. The real loop termination condition is send-with-EOT or 0. (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 16:32:42 -07:00
Sirisha Gandikota	1ab92d80a8	aubinator: Make gen_disasm_disassemble handle split sends Skylake adds new SENDS and SENDSC opcodes, which should be handled in the send-with-EOT check. Make an is_send() helper that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken) v2: Make is_send() much crisper, Mix declaration and code to make the code compact (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 16:32:39 -07:00
Sirisha Gandikota	5d2440532f	aubinator: Simplify print_dword_val() method Remove the float/dword union and use the iter->p[f->start / 32] directly as printf formatter %08x expects uint32_t (Ken) v2: Make the cleanup much crisper (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 16:32:24 -07:00
Jason Ekstrand	1eebb60917	anv/image: Set correct base_array_layer and array_len for storage images Since Vulkan doesn't allow single-slice 3D storage images, we need to just set the base_array_layer and array_len to the full size of the 3-D LOD. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-13 14:45:49 -07:00
Jason Ekstrand	106709db7b	Revert "intel/isl: Ignore base_array_layer and array_len for 3D storage..." This reverts commit `3943888c94`. It turns out that commit was pretty-much bogus since it breaks binding a 3-D texture as a 2-D storage image. The correct fix for the Vulkan CTS tests needs to be in the Vulkan driver itself rather than ISL. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-13 14:45:15 -07:00
Jason Ekstrand	330104464f	anv: Use blorp for doing MSAA resolves Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	6bcb1f753e	anv: Use blorp for ClearColorImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	57e87862eb	anv: Delete meta_blit2d Everything that we were once using the blit2d framework for is now done with blorp. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	36286ccb96	anv/blorp: Add a gcd_pow2_u64 helper and use it for buffer alignments This is a lot cleaner and easier to read than the old piles of if statements. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	af5d30de55	anv: Use blorp for CopyBuffer and UpdateBuffer Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	0f1ca5407a	anv: Use blorp for CopyImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	58593f24cb	anv: Use blorp for CopyBufferToImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	f07f44a5bc	anv: Use blorp for CopyImageToBuffer Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	9f44745eca	anv: Use blorp to implement VkBlitImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	52fa3e8347	anv: Make image_get_surface_for_aspect_mask const Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	8f780af968	anv: Add initial blorp support Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	1fe8bf82b2	intel/anv: Use #defines for all __gen_ helpers This allows us to #undef them later if we don't want them to persist Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	4a6c9e20b8	anv: Generalize emit_urb_setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	8cb144bd93	anv/pipeline: Roll compute_urb_partition into emit_urb_setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	823ab83432	intel/blorp: Use #defines for all __gen_ helpers This allows us to #undef them later if we don't want them to persist Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	c0b9776cd6	intel/isl: Divide QPitch by 2 for 3-D stencil textures on SKL+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	00e79cec99	isl/state: Don't set QPitch for GEN4_3D surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	cb780c9ccf	intel/blorp: Rework alloc_binding_table The original blorp_alloc_binding_table helper was supposed to return the binding table offset and map along with the surface state maps. This isn't quite what we want, however. What we really want is the binding table offsets, surface state offsets, and surface state maps. In the GL driver, the binding table map is an array of surface state offsets. However, in Vulkan, this isn't quite true as the entries in the binding table are surface state offsets combined with another binding table block offset. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:11 -07:00
Marek Olšák	524fd55d2d	tgsi/scan: don't set interp flags for inputs only used by INTERP instructions radeonsi depends on the interp flags a little bit too much. This fixes 9 randomly failing tests: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	15a127bc2c	radeonsi: fix FP64 UBO loads with indirect uniform block indexing No known tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	35d284d08e	winsys/amdgpu: don't assume GTT if the VRAM flag isn't set Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	6df872df59	radeonsi: clean up CP DMA emit code Unify the clear and copy paths, clean up the definitions. It looks more like a rework. It's a preparation for GDS support, which might or might not come. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	84860dd0bb	radeonsi: print the IB and buffer list in VM fault reports This is a fallout from reworking the debug flags. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	fd69fa65a8	radeonsi: add sampler view BOs to the BO list last If si_sampler_view_add_buffer ends up flushing, then the code in begin_new_cs would previously have added the buffer(s) for whatever was previously bound to that slot. Now it would add only the new buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	275c073c6a	radeonsi: export SampleMask from pixel shaders at full rate Heaven and Valley write gl_SampleMask and not Z. Use 16_ABGR instead of 32_ABGR if Z isn't written. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	b89854b0c7	gallium/radeon: set new r600_resource fields correctly in other places too This was missed in: commit `0d2e43fcb1` Author: Marek Olšák <marek.olsak@amd.com> Date: Thu Aug 18 16:30:00 2016 +0200 gallium/radeon: derive buffer placement and flags only at initialization Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	c723acc03d	ddebug: dump shader buffers and images this was unimplemented Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	fdd457c89f	ddebug: fix a crash in resource_get_handle broken recently Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Jan Vesely	b671909d27	radeon: Don't check DCC on pipe buffers Fixes segfaults in EG compute since: commit `21de3be8e6` radeonsi: fix texture format reinterpretation with DCC Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-13 14:23:26 -04:00
Andy Furniss	304f70536a	vl/util: Fix YV12/I420 convert to NV12 U/V reversal Fix VAAPI YV12/I420 convert to NV12 U/V reversal. Input order is YVU when this is called. Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-09-13 13:58:40 -04:00
Jason Ekstrand	6ac469a6c3	anv/allocator: Use VG_NOACCESS_WRITE in anv_bo_pool_free Previously, we were relying on the fact that VALGRIND_MEMPOOL_FREE came later on in the function to prevent "link->bo = bo" from causing an invalid write. However, in the case where the size requested by the user is very small (less than sizeof(struct anv_bo)), this isn't sufficient. Instead, we should call VALGRIND_MEMPOOL_FREE early and then use VG_NOACCESS_WRITE. We do, however, have to call VALGRIND_MEMPOOL_FREE after reading bo_in because it may be stored in the bo itself. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-13 10:44:03 -07:00
Jason Ekstrand	3943888c94	intel/isl: Ignore base_array_layer and array_len for 3D storage surfaces The time we want to restrict the Z range of a 3-D surface is when rendering to it. For storage surfaces, we always want the full range. However, we still need to set MinimumArrayElement and RenderTargetViewExtent to sensible values so we'll just set them to the reasonable defaults we used before we started respecting the base_array_layer and array_len. This fixes a bunch of Vulkan CTS regressions caused by `48f195d7c6`. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97790 Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-13 10:43:21 -07:00
Jose Fonseca	62affedbed	appveyor: Update winflexbison download URL. This particular version got moved into a `old_versions` subdirectory.	2016-09-13 17:54:51 +01:00
Jason Ekstrand	a1e49be713	i965: Use blorp_copy for all copy_image operations on gen6+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	540395bf9b	i965/blorp: Add a copy_miptrees helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	d038adca0e	intel/isl: Add support for RGB formats in X and Y-tiled memory Normally, using a non-linear tiling format helps improve cache locality by ensuring that neighboring pixels are usually close-by in memory. For RGB formats, this still sort-of holds, but it can also lead to rather terrible memory access patterns where a single RGB pixel value crosses a tile boundary and gets split into two pieces in different 4K pages. It also makes for some rather awkward calculations because your tile size is no longer an even multiple of surface element size. For these reasons, we chose to simply never create tiled RGB images in the Vulkan driver. The GL driver, however, is not so kind so we need to support it somehow. I briefly toyed with a couple of different schemes but this is the best one I could come up with. The fundamental problem is that a tile no longer contains an integer number of surface elements. I briefly considered a couple other options but found them wanting: 1) Using floats for the logical tile size. This leads to potential rounding error problems. 2) When presented with a RGB format, just make the tile 3-times as wide. This isn't so nice because now our tiles are no longer power-of-two size. Also, it can force the row_pitch to be larger than needed which, while not strictly a problem for ISL, causes incompatibility problems with the way the GL driver chooses surface pitches. The chosen method requires that you pay attention and not just assume that your tile_info is in the units you think it is. However, it's nice because it provides a nice "these are the units" declaration in isl_tile_info itself. Previously, the tile_info wasn't usable as a stand-alone structure because you had to also know the format. It also forces figuring out how to deal with inconsistencies between tiling and format back to the caller which is good because the two different consumers of isl_tile_info really want to deal with it differently: Computation of the surface size wants the fewest number of horizontal tiles possible while get_intratile_offset is far more concerned with things aligning nicely. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	883086500b	intel/isl: Allow valign2 for texture-only Y-tiled surfaces on gen7 The restriction that Y-tiled surfaces must have valign == 4 only applies to render targets but we were applying it universally. This causes problems if ISL_FORMAT_R32G32B32_FLOAT is used because it requires valign == 2; this should be okay because you can't render to that format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	54db5afd2c	intel/blorp: Work in terms of logical array layers When Ivy Bridge introduced array multisampling, someone made the decision to do lots of stuff throughout the driver in terms of physical array layers rather than logical array layers. In ISL, we use logical array layers most of the time and it really makes no sense to use physical array layers in the blorp API. Every time someone passes physical array layers into blorp for an array multisampled surface, they're always divisible by the number of samples and we divide right away. Eventually, I'd like to rework most of the GL driver internals to use logical array layers but that's going to be a big project and will probably happen as part of the ISL conversion. For now, we'll do the conversion in brw_blorp and let blorp just use the logical layers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	fa4627149d	intel/blorp: Increase the precision of coordinate transform calculations The result of this calculation goes into an fma() in the shader and we would like it to be as precise as possible. The division in particular was a source of imprecision whenever dst1 - dst0 was not a power of two. This prevents regressions in some of the new Vulkan CTS tests for blitting using a filtering of NEAREST. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	c70be1ead5	intel/blorp: Add a swizzle parameter to blorp_clear While we're here, we also re-arrange the parameters to better match the parameter order of blorp_blit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	ea1399aba0	intel/blorp: Make color_write_disable const and optional Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	9286f62f11	intel/blorp: Add support for clearing R9G9B9E5 surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	ab03e59867	intel/blorp: Add support for RGB destinations in copies Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	5ae8043fed	intel/blorp: Add an entrypoint for doing bit-for-bit copies Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	941b4d063a	intel/blorp: Pull the guts of blorp_blit into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	4e03edf189	intel/blorp: Stop using the X/YOffset field of RENDER_SURFACE_STATE While it can be useful, the field has substantial limitations. In particular, the bottom 2 or 3 bits are missing so your offset always has to be a multiple of 4 or 8. While surface alignments usually work out to make this ok, when you start trying to fake compressed surfaces as uncompressed (which we will want to do) this falls apart. The easiest solution is to simply align all offsets to a tile boundary and munge the regions we're copying to account for the intratile offset. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	c170606fc6	intel/blorp: Use fake_interleaved_msaa in retile_w_to_y Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	a613449f71	intel/blorp: Use isl_get_interleaved_msaa_px_size_sa Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	8ac99eabb6	intel/isl: Add a helper for getting the size of an interleaved pixel Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	3cc15ba5bb	intel/blorp: Handle 3D surfaces in convert_to_single_slice Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	43d25edf78	intel/isl: Fix an assert in get_intratile_offset_sa Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	6da968b651	intel/blorp: Fix the early return condition in convert_to_single_slice The convert_to_single_slice operation is mostly idempotent. The only non-repeatable thing it does is that, when it sets the intratile offset fields, it just overwrites them instead of doing a += operation. This is supposed to be ok because we have an early return at the top that should make it bail if the surface is already a single slice. Unfortunately, the if condition has been broken ever since it was first added in `96fa98c18`. This commit fixes the condition and adds an assert to ensure we don't stomp any non-zero intratile offsets. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	ec7e0d62c5	intel/blorp: Use the surface format for computing offsets If we use the view format, it may be an uncompressed view of a compressed image which throws things off. Since we're computing offsets of images, we want the actual surface offset anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	7f2fecd114	intel/blorp: Don't assume R8_UINT in convert_to_single_slice We're going to use it for more than just stencil textures Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	2fc9c7e3d9	intel/blorp: Take a destination swizzle in blorp_blit Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	2dba5489ae	intel/blorp: Take an isl_swizzle instead of a SWIZZLE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	7ddb21708c	intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles This should be more compact than the enum isl_channel_select[4] that we were using before. It's also very convenient because we already had such a structure in the Vulkan driver we just needed to pull it over. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Kenneth Graunke	376d1dc2f1	docs: Add OES_tessellation_shader to the release notes.	2016-09-12 17:24:35 -07:00
Kenneth Graunke	049cee2c16	docs: Mark OES_tessellation_shader as done.	2016-09-12 17:23:20 -07:00
Ilia Mirkin	742832434a	st/mesa: fix is_scissor_enabled when X/Y are negative Similar to commit `49c24d8a24` ("i965: fix noop_scissor range issue on width/height") - take the X/Y into account to determine whether the scissor covers the whole area or not. Fixes the recently-added gl-1.0-scissor-depth-clear-negative-xy piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-09-12 20:07:21 -04:00
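The check described in commit 742832434a can be sketched roughly as follows. This is an illustrative sketch only, not the st/mesa implementation: `struct scissor_rect` and `scissor_clips_framebuffer` are hypothetical names. The point it shows is that the scissor origin has to be added to the width and height before comparing against the framebuffer size, so that a scissor with a negative origin is not mistaken for a no-op.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type for illustration; not a Mesa structure. */
struct scissor_rect {
   int x, y;
   int width, height;
};

/* Returns true when the scissor actually clips something, i.e. when it
 * does NOT cover the whole fb_width x fb_height framebuffer. */
static bool
scissor_clips_framebuffer(const struct scissor_rect *s,
                          int fb_width, int fb_height)
{
   assert(s->width >= 0 && s->height >= 0);

   /* Compare the far edges (origin + size), not just the size, so that a
    * negative X/Y shrinks the covered area as it should. */
   bool covers_all = s->x <= 0 && s->y <= 0 &&
                     (long long)s->x + s->width >= fb_width &&
                     (long long)s->y + s->height >= fb_height;
   return !covers_all;
}
```

For a 100x100 framebuffer, a scissor of (-10, -10, 100, 100) only reaches x = 90 and y = 90, so it still clips even though its width and height match the framebuffer.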
Mauro Rossi	6b9d7e69ee	android: add support for libmesa_amdgpu_addrlib Android porting of the following commits: `f1f1ba3` "radeonsi: move sid.h/r600d_common.h to a common place." `69fca64` "amd/addrlib: move addrlib from amdgpu winsys to common code" This patch fixes android building errors Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-13 10:06:04 +10:00
Dave Airlie	0fe9152868	u_endian: add android to glibc clause Tested-by: Mauro Rossi <issor.oruam@gmail.com>	2016-09-13 10:04:13 +10:00
Jason Ekstrand	24be630660	Revert "i965: Drop the maximum 3D texture size to 512 on Sandy Bridge" This reverts commit `6ba88bce64`. The commit was erroneous because GL has a separate limit, GL_MAX_FRAMEBUFFER_LAYERS which guards the number of layers you are allowed to render into. The GL 4.5 spec says: "The framebuffer attachment point attachment is said to be framebuffer attachment complete if [...] all of the following conditions are true: [...] If image is a three-dimensional, one- or two-dimensional array, or cube map array texture and the attachment is layered, the depth or layer count of the texture is less than or equal to the value of the implementation-dependent limit MAX_FRAMEBUFFER_LAYERS." and goes on to say that "framebuffer complete" requires all attachments to be "framebuffer attachment complete". On Sandy Bridge, we set GL_MAX_FRAMEBUFFER_LAYERS to 512 so creating a 3D texture bigger than 512 is fine; you just can't render into all of the slices at once. Fixes ES3-CTS.gtf.GL3Tests.npot_textures.npot_tex_image on Sandy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 16:52:10 -07:00
Jason Ekstrand	2519237c24	intel/blorp: Handle the 512 layers restriction on Sandy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:48:56 -07:00
Jason Ekstrand	48f195d7c6	intel/isl: Treat 3-D textures as 2-D arrays for rendering In particular, this means that isl_view::base_array_layer and isl_view::array_len get applied to 3-D textures but only when rendering. We were already applying isl_view::base_array_layer for rendering into 3-D textures so this isn't a huge deviation. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:48:56 -07:00
Sirisha Gandikota	63fe9ab894	aubinator: Simplify gen_disasm_create()'s devinfo handling Copy the whole devinfo structure instead of just few fields (Ken) Earlier, copied only couple of fields which added more code. So, simplify code by copying the whole structure. Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:20:04 -07:00
Sirisha Gandikota	d2869c95fb	aubinator: Fix compiler warning Add 'const' qualifier to gen_field_iterator::p pointer (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:19:56 -07:00
Julien Isorce	bf901a2f8c	st/va: also honors interlaced preference when providing a video format This fixes a crash when using the preferred video format with vaapisink on Nvidia hardware. Also caught by the following assert: nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed. TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! vaapisink Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com> Tested-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-12 22:17:40 +01:00
Samuel Pitoiset	3f3640c86c	tgsi: document semantics for compute shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-12 22:15:10 +02:00
Kenneth Graunke	54138af1cd	mesa: Enable OES/EXT_tessellation_shader for ES 3.1 + ARB_tess drivers. Drivers which support ARB_tessellation_shader and ES 3.1 now will expose OES_tessellation_shader and EXT_tessellation_shader as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-12 13:07:38 -07:00
Marek Olšák	546bc07349	radeonsi: don't preload constants at the beginning of shaders LLVM can CSE the loads, thus we can always re-load constants before each use. The decrease in SGPR spilling is huge. The best improvements are the dumbest ones. 26011 shaders in 14651 tests Totals: SGPRS: 1453346 -> 1251920 (-13.86 %) VGPRS: 742576 -> 728421 (-1.91 %) Spilled SGPRs: 52298 -> 16644 (-68.17 %) Spilled VGPRs: 397 -> 369 (-7.05 %) Scratch VGPRs: 1372 -> 1344 (-2.04 %) dwords per thread Code Size: 36136488 -> 36001064 (-0.37 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 219315 -> 222221 (1.33 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-12 21:06:57 +02:00
Jason Ekstrand	e2fb044115	intel/blorp: Add a TODO file This provides a nice little place to share notes on what still needs to be done and/or would be nice to have in BLORP. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 10:14:49 -07:00
Alejandro Piñeiro	6165603209	i965: check for GL_TEXTURE_EXTERNAL_OES at miptree_create_for_teximage Forgotten on commit "i965: Fix calculation of the image height at start level". Thanks to Ilia Mirkin for pointing it out. Fixes the following regressions on Haswell and Broadwell: ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestSimpleUnassociated (crash back to pass) ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestSimple (crash back to fail) ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestVertexShader (crash back to fail) https://bugs.freedesktop.org/show_bug.cgi?id=97761 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 18:10:50 +02:00
Chuanbo Weng	9a1eb54237	gbm: fix potential NULL deref of mapImage/unmapImage. The mapImage/unmapImage functions of DRIimage extension can be NULL, so we should add additional check for them. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-12 16:52:55 +01:00
Emil Velikov	63faf7de61	Remove GL_GLEXT_PROTOTYPES guards from non-ext headers. An earlier sync with the Khronos headers added _extension_ prototype guards to all the GLES2/3/31/32 core entry points, effectively breaking all the applications that aim to be portable and do not set the define. The issue has been reported to Khronos (internal bugzilla #14206) and is being worked on. Until updated/fixed headers are released, fix the issue locally. The following report is when building weston. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97773 Cc: Armin Krezović <krezovic.armin@gmail.com> Cc: Emmanuel Gil Peyrot <emmanuel.peyrot@collabora.com> Cc: Pekka Paalanen <ppaalanen@gmail.com> Fixes: `6a5504de2f` ("Update Khronos-supplied headers to r33100") Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-12 16:52:43 +01:00
Emil Velikov	ceaa2e1738	aubinator: rework print_help() Rather than using platform specific methods to retrieve the program name pass it explicitly. The function is called directly from main(). Similarly - basename comes in two versions POSIX (can modify string, always pass a copy) and GNU (never modifies the string). Just printout the complete program name, esp. since the program is not meant to be installed. Thus using $basename is unlikely to work, not to mention it is misleading. Reported-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jonathan Gray <jsg@jsg.id.au>	2016-09-12 16:49:59 +01:00
Adam Jackson	0cb1428fbb	docs: Note MESA_configless_context as superseded Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-12 11:29:11 -04:00
Adam Jackson	d9f5b1915b	egl: Rename MESA_configless_context bit to KHR_no_config_context Keep the old name in the extension string, but refer to the KHR extension internally. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-12 11:29:09 -04:00
Adam Jackson	cc45a5c308	egl: QueryContext on a configless context returns zero MESA_configless_context does not specify the interaction with QueryContext at all, and the code to generate an error in this case predates the Mesa extension. Since EGL_NO_CONFIG_{KHR,MESA} are numerically identical there's no way to distinguish which one the application asked for, so use the KHR behaviour. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-12 11:28:38 -04:00
Boyuan Zhang	e5009b7c26	st/va: enable vbr rate control for vaapi encode This patch enables variable bit-rate for vaapi encoding. According to va.h, the target bit-rate equals the maximum bit-rate multiplied by target_percentage. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-12 10:34:53 -04:00
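The rate relation quoted in commit e5009b7c26 can be written out as a small sketch. The helper name `vbr_target_bitrate` is hypothetical and this is not the st/va code; it assumes `target_percentage` is an integer percentage in the range 0 to 100 of the maximum bit-rate, which is how the commit message reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive the VBR target bit-rate from the maximum
 * bit-rate and a percentage (assumed to be 0..100). */
static uint32_t
vbr_target_bitrate(uint32_t max_bits_per_second, uint32_t target_percentage)
{
   assert(target_percentage <= 100);
   /* Widen to 64 bits so the intermediate product cannot overflow. */
   return (uint32_t)(((uint64_t)max_bits_per_second * target_percentage) / 100);
}
```

Under these assumptions, a 10 Mbit/s maximum with a target percentage of 95 gives a 9.5 Mbit/s target.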
Leo Liu	6a7f79af9b	vl/rbsp: match initial escaped bits with valid in the buffer Otherwise the check for the three byte will not make sense. Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-09-12 10:09:27 -04:00
Timothy Arceri	2da15a3b89	egl: fix gcc warning braces around scalar initializer Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-09-12 22:43:49 +10:00
Nicolai Hähnle	b8703e363c	winsys/radeon: rename nrelocs, crelocs to max_relocs, num_relocs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:55 +02:00
Nicolai Hähnle	d66bbfbede	winsys/radeon: don't pre-allocate the relocations array It's really not necessary. Switch to an exponential resizing strategy. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:53 +02:00
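The "exponential resizing strategy" mentioned in commit d66bbfbede can be sketched as below. This is a generic illustration rather than the winsys/radeon code: `struct reloc_list` and `reloc_list_append` are hypothetical, the element type is simplified to `unsigned`, and the initial capacity of 16 is an arbitrary choice. Only the field names `num_relocs` and `max_relocs` are taken from the rename in commit b8703e363c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical growable array; starts empty with no pre-allocation. */
struct reloc_list {
   unsigned *relocs;
   unsigned num_relocs;   /* entries in use */
   unsigned max_relocs;   /* entries allocated */
};

/* Append one entry, doubling the allocation when it is full.
 * Returns false (leaving the list unchanged) on allocation failure. */
static bool
reloc_list_append(struct reloc_list *list, unsigned value)
{
   assert(list->num_relocs <= list->max_relocs);

   if (list->num_relocs == list->max_relocs) {
      if (list->max_relocs > UINT_MAX / 2)
         return false;   /* doubling would overflow */

      unsigned new_max = list->max_relocs ? list->max_relocs * 2 : 16;
      unsigned *new_relocs = realloc(list->relocs,
                                     (size_t)new_max * sizeof(*new_relocs));
      if (!new_relocs)
         return false;

      list->relocs = new_relocs;
      list->max_relocs = new_max;
   }

   list->relocs[list->num_relocs++] = value;
   return true;
}
```

Doubling keeps the amortized cost of an append constant, so nothing needs to be allocated up front; a zero-initialized `struct reloc_list` is a valid empty list.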
Nicolai Hähnle	f47da2e34f	winsys/radeon: remove unused radeon_cs_context::priority_usage Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:51 +02:00
Nicolai Hähnle	17fff0c2de	winsys/amdgpu: remove amdgpu_cs_lookup_buffer The radeonsi driver doesn't and shouldn't care about the buffer index. Only the virtual addresses matter. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:47 +02:00
Nicolai Hähnle	12657a7abf	winsys/amdgpu: remove unused field domains from amdgpu_cs_buffer Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:07 +02:00
Nicolai Hähnle	3cdeb2a177	winsys/amdgpu: remove initial buffer list allocation It's really not necessary. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:04 +02:00
Nicolai Hähnle	cc53dfda9f	winsys/amdgpu: extract adding a new buffer list entry into its own function While at it, try to be a little more robust in the face of memory allocation failure. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:01 +02:00
Nicolai Hähnle	11cbf4d7ae	winsys/amdgpu: use only one fence per BO The fence that is added to the BO during flush is guaranteed to be signaled after all the fences that were in the fences array of the BO before the flush, because those fences are added as dependencies for the submission (and all this happens atomically under the bo_fence_lock). Therefore, keeping only the last fence around is sufficient. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:59 +02:00
Nicolai Hähnle	480ac143df	winsys/amdgpu: add do_winsys_deinit function The idea is to have matching init/deinit functions so that deinit can be re-used for cleanup in the error path of amdgpu_winsys_create. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:56 +02:00
Nicolai Hähnle	9fb8d354ca	winsys/amdgpu: clean up error paths in amdgpu_winsys_create No need to call pb_cache_deinit, because the cache hasn't been initialized at that point. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:53 +02:00
Nicolai Hähnle	a6c38d47d4	gallium/radeon: page alignment for buffers is unnecessary In some places (e.g. shader program pointers) we require 256 bytes alignment. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:45 +02:00
Nicolai Hähnle	339867c077	gallium/radeon/winsyses: remove #includes of pb_bufmgr.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:36 +02:00
Topi Pohjolainen	e54b70b3d4	i965/rbc: Clarify rationale given for shader image resolves Original commit added documentation explaining lossless compression case: commit `56f29911ec` Author: Topi Pohjolainen <topi.pohjolainen@intel.com> Date: Tue Feb 2 10:00:41 2016 +0200 i965: Add a flag telling color resolve pass to ignore CCS_E It, however, easily gives the impression that the sole purpose of the intel_miptree_resolve_color() is to address lossless compression. Original intention is to document the lack of INTEL_MIPTREE_IGNORE_CCS_E flag given for the resolve call. This patch fixes this along with a typo spotted further down. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:30 +03:00
Topi Pohjolainen	1df4b666ed	i965/blorp: Use hw generated primitive copies for layered clears Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:30 +03:00
Topi Pohjolainen	b712aa2614	i965/blorp: Sanity check all layers before actual clear Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:30 +03:00
Topi Pohjolainen	a1c7de09dc	intel/blorp: Add plumbing for setting color clear layer count Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	514afdce95	intel/blorp: Allow multiple layers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	e597821ef2	i965/blorp: Instruct vertex fetcher to provide prim instance id This will indicate target layer (Render Target Array Index) needed for layered clears. v2: Use 3DSTATE_VF_SGVS for gen8+ Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	39712b2a14	i965/rbc: Allocate mcs directly such as we do for compressed msaa. In case of non-compressed single sampled buffers the allocation of mcs is deferred until there is actually a clear operation that needs the mcs. In case of render buffer compression the mcs buffer is always needed and there is no real reason to defer the allocation. Doing it directly allows dropping quite a bit of unnecessary complexity. Patch leaves brw_predraw_set_aux_buffers() a no-op. Subsequent patches will re-use it and it seemed cleaner to leave it instead of removing and re-introducing. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	024a39511f	isl/gen8+: Allow 1D and 3D auxiliary surfaces Otherwise once mcs buffer gets allocated without delay for lossless compression (same as we do for msaa), assert starts to fire in piglit case: tex3d. The test uses depth of one which is in fact supported even now. v2 (Jason): Allow also 1D case as there is nothing in the specs constraining it either. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	6939532593	i965: Add sanity check for non-compressible texture views v2: Fix missing inline declaration Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	1b6fcc08df	i965/rbc: Consult rb settings for texture surface setup Once mcs buffer gets allocated without delay for lossless compression (same as we do for msaa), one gets regression in: GL45-CTS.texture_barrier_ARB.same-texel-rw Setting the auxiliary surface for both sampling engine and data port seems to fix this. I haven't found any hardware documentation backing this though. v2 (Jason): Prepare also for the case where surface is sampled with non-compressible format forcing also rendering without compression. v3: Split asserts and decision making. v4: Detailed comment provided by Jason explaining the need for using auxiliary buffer for texturing when the same surface is also used as render target. Added check for existence of renderbuffer before considering if underlying miptree matches. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:46:13 +03:00
Topi Pohjolainen	22d9a4824b	i965: Track non-compressible sampling of renderbuffers v3: - Actually set the flags when needed instead of falsely overwriting them (Jason). - Use more generic name for flag (dropped RENDERBUFFER) - Consult also shader images v4: - Consult only lossless compressed shader images v5: - Check the existence of renderbuffer before considering if it matches the given miptree Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 08:58:38 +03:00
Topi Pohjolainen	1f51217d99	i965: Replace boolean rb surface state setup argument with flags And add plumbing to provide it all the way to surface state emitter. This is not used yet but will be in subsequent patches to carry additional constraints. v2 (Jason): Use uint32_t instead of int as the type Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 08:58:38 +03:00
Topi Pohjolainen	1634a4963c	i965/rbc: Allow integer formats as advertised in isl_format.c Blorp consults brw_is_color_fast_clear_compatible() to see if any restrictions apply for fast clear in addition to the capabilities advertised in isl_format.c::format_info[]. On Gen8+ integer formats are blacklisted for plain old fast clear but there is no reason why lossless compression shouldn't be supported. In fact, lossless compression of integer formats is already supported for normal render paths. This patch prepares for dropping the delayed allocation of the mcs buffer for lossless compression. Until now the skip of fast clear also prevented the mcs being allocated and hence the lossless compression being effectively turned off for integer formats. Once the mcs buffer is allocated beforehand, the assertion addressed here would start triggering. v2: Drop the assert instead of relaxing it (Jason) Fix typo while at it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 08:58:38 +03:00
Alejandro Piñeiro	e77bf32475	i965: remove unused variable at intel_miptree_create_for_teximage After commit "i965: Fix calculation of the image height at start level", it is not needed. This commit removes the "warning: unused variable ‘i’" warning. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 07:21:32 +02:00
Thomas Helland	08c5b10ae9	mesa/glsl: Move string_to_uint_map into the util folder This clears the last bits of the usecases of the hash table located in mesa/program, allowing us to remove it. V2: Rebase on top of changes to Makefile.sources Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	e55eb2b7ea	glsl: Convert glcpp-parse to the util hash table And change the include in glcpp.h accordingly. V2: Whitespace fix Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	16fb318d0c	glsl: Convert loop analysis to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	ec453979db	mesa: Convert symbol table to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	f224ef4392	glsl: Convert varying test to the util hash table V2: remove now unused ht_count_callback() (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9efa977be5	glsl: Convert output read lowering to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	6adcc8f283	glsl: Convert interface block lowering to the util hash table V2: move comment to correct location (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	5482d31b86	glsl: Convert if lowering to use a set Also do some minor whitespace cleanups Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	85a197c4ed	glsl: Convert linker to the util hash table We are getting the util hash table through the include in program/hash_table.h for the moment until we migrate the string_to_uint_map to a separate file. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	f10cc9407b	glsl: Convert link_varyings to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	e7f91d9de1	glsl: Change link_functions to use a set The "locals" hash table is used as a set, so use a set to avoid confusion and also spare some minor memory. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	2228548f83	glsl: Convert recursion detection to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9b3c0f81a7	glsl: Convert constant_expression to the util hash table V2: Fix incorrect ordering on hash table insert V3: null check value returned by _mesa_hash_table_search() (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9f188be8a6	glsl: Convert ast_to_hir to the util hash table V2: Rebase to the adaption of new hashing functions V3: move previous_label declaration to where it is used (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9ac6d61751	glsl: Convert ir_clone to the util hash table V2: add braces to multiline if (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	5b5d4ea4a0	glsl: Convert function inlining to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	eef2be6822	mesa: Convert string_to_uint_map to the util hash table And remove the now unused hash_table_replace. V2: Actually do the equivalent thing, and don't leak memory V3: fix minor typo in comment (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	ddb8639b18	util: Move hash_table_call_foreach to util hash table It is included through the util/hash_table include in the program hash_table, so this should be safe. This will be needed when we start converting each use of the program_hash_table, as some places need this function. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	cf4a4820ac	mesa: Remove prog_hash_table.c Here we make the prog_hash_table functionally equivalent to the one in util by wrapping the remaining functions that differ. We also move the functions to the header so we can remove the c file. This enables us to do a step-by-step replacement of the table. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	42ba435fd1	mesa: Remove unused hash table includes This should prevent us from rebuilding the world. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Ilia Mirkin	148fbf32a8	freedreno/a3xx: disable filtering for texture buffers and int textures Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-11 13:14:06 -04:00
Niels Ole Salscheider	cfa914a1b4	st/clover: Define __OPENCL_VERSION__ on the device side This is required by the OpenCL standard. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-09-10 15:48:54 -07:00
Ilia Mirkin	a8c0c7301c	gm107/ir: allow indirect inputs to be loaded by frag shader Looks like the GM107 IPA op does not allow a separate offset when using an indirect register. Instead we must use AL2P like we do for indirect vertex operations on Kepler+. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-09-10 13:40:04 -04:00
Ilia Mirkin	a22aee5ad1	gm107/ir: AL2P writes to a predicate register We have to force it to write to predicate 7 (aka PT) in order for it not to mess up another predicate. Unclear what would be returned in the predicate, perhaps an error code for out-of-bounds requests. Blob doesn't seem to check it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-10 13:36:20 -04:00
Antia Puentes	83e8617f4b	i965: Fix calculation of the image height at start level - Fixes CTS tests: * GL44-CTS.shader_image_size.advanced-nonMS-cs-float * GL44-CTS.shader_image_size.advanced-nonMS-cs-int * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint * GL44-CTS.shader_image_size.advanced-nonMS-gs-float * GL44-CTS.shader_image_size.advanced-nonMS-gs-int * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint * GL44-CTS.shader_image_size.advanced-nonMS-tes-float * GL44-CTS.shader_image_size.advanced-nonMS-tes-int * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint * GL44-CTS.shader_image_size.advanced-nonMS-vs-float * GL44-CTS.shader_image_size.advanced-nonMS-vs-int * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint v1: (written by Dave Airlie) Always shift height images for levels. Fixed the CTS test. v2: Only shift height if the texture is not an 1D_ARRAY, it fixes assertion in GL44-CTS.texture_view.gettexparameter due to the original patch (Antia). v3: Remove the loop. Do not shift height either for 1D textures. Use an explicit switch and add an assertion (levels == 0) for multisampled textures (Jason). v4: Rectangle textures can not have levels either (Ilia Mirkin). Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-10 12:52:32 +02:00
Marek Olšák	08bcbfdc07	radeonsi: flush TC L2 before using a compute indirect buffer There is no known test for this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:07 +02:00
Marek Olšák	a5a2cc530c	radeonsi: fix the VGT performance tweak for small instances Based on the VGT spec. The Vulkan driver doesn't do it optimally and they plan to fix it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	a67d81580b	radeonsi: remove the cache_flush atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	f9750932ea	winsys/amdgpu: replace OUT_CS with radeon_emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	81da78bfc3	winsys/radeon: replace OUT_CS with radeon_emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Christoph Haag	55ba5fa9a6	doc: document GALLIUM_DRIVER v2: Add dot at end of sentence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 09:24:28 +02:00
Haixia Shi	b1d636aa00	egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT Set config attributes EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT to hard-coded non-zero values. These two attributes are required on Android. v2: use _EGL_MAX_PBUFFER_WIDTH/HEIGHT from egldefines.h (based on discussion on the first version) Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-09 07:51:04 +03:00
Tapani Pälli	478fbc2348	android: depend on libmesa_genxml from i965 Android.gen.mk Static library dependency is required to pull the generated XML headers into the generated C file. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-09 07:51:04 +03:00
Tapani Pälli	4542c7ed5f	i965: release GLSL IR in LinkShader after it's not needed Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-09 07:51:04 +03:00
Tapani Pälli	2cd02e30d2	glsl: use hash instead of exec_list in copy propagation This change makes copy propagation pass faster. Complete link time spent in test case attached to bug 94477 goes down to ~400 secs from over 500 secs on my HSW machine. Does not fix the actual issue but brings down the total. No regressions seen in CI. v2: do not leak hash_table structure Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-09 07:50:42 +03:00
Jason Ekstrand	175ac629be	i965/fs: Fail the shader compile instead of asserting when we can't spill Blorp doesn't handle spilling so we set allow_spilling to false in that case. The blorp 16x MSAA resolve shader spills in 16-wide but not 8-wide. This commit makes it so that we fail the 16-wide compile and successfully fall back to 8-wide instead of just assert-failing when trying to compile the 16-wide shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-08 20:53:01 -07:00
Jason Ekstrand	88a2a2e053	nir/gcm: Add global value numbering support Unlike the current CSE pass, global value numbering is capable of detecting common values even if one does not dominate the other. For instance, if you have if (...) { ssa_1 = ssa_0 + 7; /* use ssa_1 */ } else { ssa_2 = ssa_0 + 7; /* use ssa_2 */ } Global value numbering doesn't care about dominance relationships so it figures out that ssa_1 and ssa_2 are the same and converts this to if (...) { ssa_1 = ssa_0 + 7; /* use ssa_1 */ } else { /* use ssa_1 */ } Obviously, we just broke SSA form which is bad. Global code motion, however, will repair this for us by turning this into ssa_1 = ssa_0 + 7; if (...) { /* use ssa_1 */ } else { /* use ssa_1 */ } This is intended to eventually mostly replace CSE. However, conventional CSE may still be useful because it's less of a scorched-earth approach and doesn't require GCM. This makes it a bit more appropriate for use as a clean-up in a late optimization run. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-08 20:53:01 -07:00
Jason Ekstrand	99ff4b3eb2	nir/gcm: Call nir_metadata_preserve Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-08 20:53:01 -07:00
Max Staudt	02675622b0	r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering On the RSxxx chip series, HW TCL is missing and r300_emit_vs_state() is never called. However, if R300_VAP_CNTL is never set, the hardware (at least the RS690 I tested this on) comes up with rendering artifacts, and parts that are uploaded before this "fix" remain broken in VRAM. This causes artifacts as in fdo#69076 ("triangle flickering"). It seems like this setup needs to happen at least once after power on for 3D rendering to work properly. In the DDX with EXA, this happens in RADEON_SWITCH_TO_3D() when processing an XRENDER Composite or an Xv request. So playing back a video or starting a GTK+2 application fixes 3D rendering for the rest of the session. However, this auto-fix doesn't happen when EXA is not used, such as with GLAMOR or Wayland. This patch ensures the register is configured even in absence of the DDX's EXA module. The register setting is taken from: xf86-video-ati -- RADEONInit3DEngineInternal() mesa/src/mesa/drivers/dri/r300 -- r300EmitClearState() Tested on RS690. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Max Staudt <mstaudt@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-09 13:30:47 +10:00
Marek Olšák	5981ab5445	gallium: remove PIPE_BIND_TRANSFER_READ/WRITE not used in any useful way Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	0fbaf74977	radeonsi: unify si_set_optimal_micro_tile_mode call sites There is nothing special happening in those code blocks. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	758bc52959	radeonsi: fix texture reinterpretation after DCC fast clear The problem is that TC-compatible DCC clear codes translate into different clear values when you change the format. I have a new piglit reproducing the issue. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	46c425e7c8	radeonsi: enable DCC fast clear for 128-bit formats Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	831c0c80f1	radeonsi: clamp integer clear color values for DCC fast clear It should be possible to get TC-compatible fast clear more often now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	93f3d8e10d	Revert "radeonsi: enable SDMA on CIK" This reverts commit `0241d8300f`. It doesn't work with mobile Bonaire. It looks like the programming of tiling parameters is wrong on some chips.	2016-09-08 22:51:33 +02:00
Christoph Haag	7b414bc512	doc: fix typo of GALLIUM_HUD_TOGGLE_SIGNAL In the original commit message in `56a1c10` it was wrongly used too: - env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 20:19:35 +02:00
Jason Ekstrand	a00bd7bc27	nir/spirv: Refactor variable decoration handling Previously, we didn't apply variable decorations to the members of a split structure variable. This doesn't quite work, unfortunately, because things such as the "flat" qualifier may get applied to an entire structure instead of propagated to the members. This fixes 9 of the new CTS tests in the dEQP-VK.glsl.linkage.varying.struct.* group. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-09-08 10:45:23 -07:00
Jason Ekstrand	f5505730d3	nir/spirv: Break variable decoration handling into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-09-08 10:45:23 -07:00
Jonathan Gray	d50c56f868	aubinator: only use program_invocation_short_name with glibc/cygwin program_invocation_short_name is a gnu extension. Limit use of it to glibc and cygwin and otherwise use getprogname() which is available on BSD and OS X. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 18:37:02 +01:00
Jonathan Gray	2d3ebb474c	aubinator: include libgen.h for basename(3) Include libgen.h for basename as required by posix. The definition is not found on at least OpenBSD otherwise. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 18:37:02 +01:00
Jonathan Gray	0ba9e281fc	aubinator: stop using non portable error() function error() is a gnu extension and is not present on OpenBSD and likely other systems. Convert use of error to fprintf/strerror/exit. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 18:37:02 +01:00
Adam Jackson	dbda375d6f	egl: Fix up indentation on previous commit This was requested in review but I pushed the wrong version. Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-08 13:21:27 -04:00
Adam Jackson	a279760536	egl: Document why EGL_OPENGL{, _ES}_API are mostly identical Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-09-08 13:19:58 -04:00
Chad Versace	bad80c26e7	anv: Link to libX11-xcb only when needed The Makefile unconditionally linked libX11-xcb into libvulkan_intel.so. But it's needed only if HAVE_PLATFORM_X11. Fixes build of libvulkan_intel.so on Chromium OS, which has no X11 libraries. Fixes: `71258e9462` ("anv/x11: Add support for Xlib platform") Cc: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 09:24:30 -07:00
Tim Rowley	7514e326f8	swr: fixes for format mapping and texture sizing Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-08 10:43:21 -05:00
Topi Pohjolainen	b863f4a39a	intel/blorp: Allow single slice converter to suppress number of layers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-08 08:53:45 +03:00
Lionel Landwerlin	0ad84b4366	spirv/nir: Implement OpAtomicLoad/Store for shared variables Missing bits from `2afb950161`. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-07 17:37:37 +01:00
Jason Ekstrand	37763bf446	nir/spirv: Remove an erroneous "fall through" comment	2016-09-07 09:04:34 -07:00
Kyle Brenneman	6e066f76ee	EGL: Combine the GL and GLES current contexts (v2) Only keep track of a single current context, instead of separate contexts for GL and GLES. In EGL 1.4 (and 1.5), EGL_OPENGL_API and EGL_OPENGL_ES_API are supposed to be interchangeable for all purposes except for eglCreateContext. The _EGLThreadInfo::CurrentContexts array is now a single pointer to the current context, which may be a GL or GLES context. In addition, it now keeps track of the current API as an enum instead of an index. eglMakeCurrent will now replace the current context, regardless of which client API is used for the current and new contexts. It no longer checks for a conflicting context. In addition, calling eglMakeCurrent with EGL_NO_CONTEXT will now release the current context regardless of the current API. v2: Rebased against master (Adam Jackson) Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-09-07 11:56:48 -04:00
Rob Clark	74b1969d71	gbm: wire up fence extension v2: make fence extension optional to not break non-i965 classic drivers, and move __DRI2_FENCE into core extensions, based on comments from Emil Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-07 11:54:00 -04:00
Rob Clark	32c061b110	freedreno: reject imports with bogus pitch Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-07 11:41:38 -04:00
Rob Clark	b4e88b500c	gbm: add missing R8 and GR88 formats Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-07 11:30:41 -04:00
Lionel Landwerlin	2afb950161	spirv/nir: Add support for OpAtomicLoad/Store Fixes new CTS tests : dEQP-VK.spirv_assembly.instruction.compute.opatomic.load dEQP-VK.spirv_assembly.instruction.compute.opatomic.store v2: don't handle images like ssbo/ubo (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-07 11:00:30 +01:00
Marek Olšák	fe40a65fb6	radeonsi: skip redundant INDEX_TYPE writes Ported from Vulkan. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	bdf767dac4	radeonsi: add more unlikely() uses into si_draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	a8e7ea6abc	radeonsi: skip draws with instance_count == 0 loosely ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	53d74e055e	gallium/radeon/winsyses: fix counting mapped memory Not all buffers are unmapped explicitly. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Ilia Mirkin	8c8874eafb	nir: fix definition of pack_uvec2_to_uint Found by inspection. Untested beyond compilation. This also matches the logic used in nir_lower_alu_to_scalar. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2016-09-06 22:45:44 -04:00
Ilia Mirkin	c42acd93d4	mesa/formatquery: limit ES target support, fix core context support First off, as late as ES 3.2, GetInternalformat only supports RENDERBUFFER and 2DMS(_ARRAY) targets. Secondly, the _mesa_has_ext helpers are very accurate... a little too accurate, some might say. If we only show an extension in compat profiles because core profiles have the functionality guaranteed, they will return false. Fix these to either check for a core profile explicitly, or to a different-but-identical extension available in core profile. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matteo Bruni <matteo.mystral@gmail.com> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-06 22:45:44 -04:00
Ilia Mirkin	f654b4983a	mapi: add gl32.h to the list of GLES3 headers for installation This was missed when I added the updated (and new) Khronos headers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-09-06 22:45:44 -04:00
Ilia Mirkin	36347c8d6f	main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer Add a separate extension check for that format. Prevents glTexImage from trying to find a matching format, which fails on drivers without support for this format. Fixes: sized-texture-format-channels (on a3xx) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-06 22:41:48 -04:00
Jason Ekstrand	2b18a3f5d3	nir/spirv: Use fill_common_atomic_sources for image atomics We had two almost identical copies of this code and they were both broken but in different ways. The previous two commits fixed both of them. This one just unifies them so that it's easier to handle in the future. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-06 17:08:13 -07:00
Jason Ekstrand	f2a10937d8	nir/spirv: Use the correct sources for CompareExchange on images The CompareExchange operation has two "Memory Semantics" parameters instead of one so the real arguments start at w[7] instead of w[6]. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-06 17:08:13 -07:00
Jason Ekstrand	0ead7bef6b	nir/spirv: Swap the argument order for AtomicCompareExchange SPIR-V has the two arguments in the opposite order from GLSL. NIR uses the GLSL order so we had them backwards. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-06 17:08:13 -07:00
Tim Rowley	edd688d986	vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords Increases the performance of legacy geometry-heavy apps still using display lists. Performance increase for a targeted testcase is on the order of 8x, and applications like ParaView 4.x (5.x no longer uses display lists) improve by about 10%-20%. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-06 15:15:11 -05:00
Vinson Lee	215075ae30	glsl: Add positional argument specifiers. Fix build with Python < 2.7. File "./glsl/ir_expression_operation.py", line 360, in get_enum_name return "ir_{}op_{}".format(("un", "bin", "tri", "quad")[self.num_operands-1], self.name) ValueError: zero length field name in format Fixes: `e31c72a331` ("glsl: Convert tuple into a class") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2016-09-06 12:03:30 -07:00
Roland Scheidegger	31a380c8dd	util: (trivial) add <stdint.h> include to slab.c should fix "src/util/slab.c:57:13: error: ‘uint8_t’ undeclared"	2016-09-06 19:47:14 +02:00
Jason Ekstrand	92162dbe32	glsl: Add .gitignore for make check warnings test	2016-09-06 08:32:19 -07:00
Jason Ekstrand	20b2f1ecb9	anv/pipeline: Lower indirect outputs when EmitNoIndirectOutput is set Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-06 08:27:23 -07:00
Rob Herring	244f0aba16	Android: glsl: add rules to generate ir_expression.h header files Recent changes to generate ir_expression.h header files broke Android builds. This adds the generation rules. This change is complicated due to creating a circular dependency between libmesa_glsl, libmesa_nir, and libmesa_compiler. Normally, we add static libraries so that include paths are added even if there's no linking dependency. That is the case here. Instead, we explicitly add the include path using $(MESA_GEN_GLSL_H) to libmesa_compiler. This in turn requires shuffling the order of make includes. It also uncovered missing dependency tracking of glsl_parser.h. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-06 15:58:55 +01:00
Leo Liu	2593354643	st/omx/dec: enable hevc omx decode support Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	1a534d31fe	st/omx/dec/h265: get the reference list for uvd Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	7d63b80728	st/omx/dec/h265: add short term reference picture sets Specified by subclause 7.3.7 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	fa7c4f151d	st/omx/dec/h265: add slice header Specified by subclause 7.3.6.1 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	a639a2868e	st/omx/dec/h265: add picture parameter sets Specified by subclause 7.3.2.3 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	b3c1583e17	st/omx/dec/h265: add sequence parameter sets Specified by subclause 7.3.2.2 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	6d186a79f2	st/omx/dec: add initial omx hevc support Mainly based on the h264 implementation. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	0c374a7770	st/omx/dec: set dst rect to match src size When creating an interlaced video buffer, the height is set to "template.height = align(tmpl->height/ array_size, VL_MACROBLOCK_HEIGHT);", and we use "template.height *= array_size;" for the buffer height, so it is actually aligned to 32. With a progressive video buffer it is still aligned to 16, thus causing a different height between the interlaced buffer and the progressive buffer for 4K (height=2160) and 720p (height=720). When transcoding the video, this causes 16 lines of corruption at the bottom of the encoded video. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:01:24 -04:00
Marek Olšák	e7a73b75a0	gallium: switch drivers to the slab allocator in src/util	2016-09-06 14:24:04 +02:00
Marek Olšák	761ff40302	util: import the slab allocator from gallium There are also some cosmetic changes.	2016-09-06 14:24:04 +02:00
Michel Dänzer	dc3bb5db8c	loader/dri3: Always use at least two back buffers This can make a significant difference for performance with some extreme test cases such as vblank_mode=0 glxgears. Fixes: `1e3218bc5b` ("loader/dri3: Overhaul dri3_update_num_back") Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97549 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-09-06 13:04:48 +09:00
Kenneth Graunke	d0cd504046	glsl: Fix locations of variables in patch qualified interface blocks. As of commit `d82f8d9772`, we actually parse and attempt to handle the 'patch' qualifier on interface blocks. This patch fixes explicit locations for variables in such blocks. Without it, many program interface query dEQP/CTS tests hit this assertion in ir_set_program_inouts.cpp if (is_patch_generic) { assert(idx >= VARYING_SLOT_PATCH0 && idx < VARYING_SLOT_TESS_MAX); bitfield = BITFIELD64_BIT(idx - VARYING_SLOT_PATCH0); } because the location was incorrectly based on VARYING_SLOT_VAR0. Note that most of the tests affected currently fail before they hit this, due to confusion about what the program interface query name of those resources should be. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-05 17:37:55 -07:00
Kenneth Graunke	096ad19a2b	mesa: Fix types in _mesa_get_color_read_format(). This is a mesa_format, not a GLenum. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-05 17:37:55 -07:00
Dave Airlie	69fca64259	amd/addrlib: move addrlib from amdgpu winsys to common code Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:33 +10:00
Dave Airlie	1add3562e3	gallium/util: move endian detect into a separate file This just ports the simpler endian detection bits, addrlib sharing wants this outside gallium. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:24 +10:00
Dave Airlie	a86be7b6ad	radeon: move radeon_family/chip_class definitions to common This just moves these to a common header file. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:04 +10:00
Dave Airlie	f1f1ba3781	radeonsi: move sid.h/r600d_common.h to a common place. Step one to merging radv would be to move some files around. This only adds the include path to r600/radeonsi, because later we want to avoid having to add it to the generic target paths. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:05:13 +10:00
Marek Olšák	0d7ec8b7d0	gallium/radeon: remove VPORT_ZMIN/ZMAX from init config states It's part of the viewport state now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	687c4be9cf	gallium/radeon: set VPORT_ZMIN/MAX registers correctly Calculate depth ranges from viewport states and pipe_rasterizer_state::clip_halfz. The evergreend.h change is required to silence a warning. This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	8b0507672e	gallium/radeon: unify viewport emission code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	6c8b76263d	radeonsi: also do VS_PARTIAL_FLUSH before updating VGT ring pointers ported from Vulkan Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	22cb5aecbe	radeonsi: fix variable naming in si_emit_cache_flush Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	911202817d	radeonsi: don't emit CS_PARTIAL_FLUSH if compute is not used for less noise in the HUD Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	addca75f4e	radeonsi: add HUD queries for counting VS/PS/CS partial flushes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	1d0593abd7	gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes num-cs-flushes will mean compute shader flushes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	1469c70c2a	radeonsi: fix a badly implemented GS bug workaround Limit it to geometry shaders and Hawaii. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	21de3be8e6	radeonsi: fix texture format reinterpretation with DCC DCC is limited in how texture formats can be reinterpreted using texture views. If we get a view format that is incompatible with the initial texture format with respect to DCC, disable DCC. There is a new piglit which tests all format combinations. What works and what doesn't was deduced by looking at the piglit failures. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	63da0c991d	radeonsi: fix Gather4 with integer formats The closed compiler does the same thing. This fixes: GL45-CTS.texture_gather.-int- (18 tests) Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	3e756f09d4	radeonsi: fix a crash in imageSize for cubemap arrays Sometimes it was f32, other times it was i32. Now it's always i32. This fixes: GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	03708deed2	radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader This fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	a4fa215058	radeonsi: fix cubemaps viewed as 2D This fixes: GL43-CTS.texture_view.view_sampling v2: fix a typo, merge both if statements Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	2975230fdc	radeonsi: always use the same function signature for llvm.SI.export Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	1c13c71ef8	radeonsi: return correct eviction stats for NVX_gpu_memory_info Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	f660d1cb21	gallium/radeon: also eliminate DCC fast clear in resource_get_handle just do what the comment says Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	01dd73f2f4	gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	d22feeaa9d	gallium/radeon: use the current ctx for DCC decompression in resource_get_handle For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	0d2e43fcb1	gallium/radeon: derive buffer placement and flags only at initialization Invalidated buffers don't have to go through it. Split r600_init_resource into r600_init_resource_fields and r600_alloc_resource. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	a14c50bceb	radeonsi: set more sampler settings Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Emil Velikov	4ea90682ab	docs: add news item and link release notes for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-05 16:13:48 +01:00
Emil Velikov	2099d5df97	docs: add sha256 checksums for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `614fb93a6d`)	2016-09-05 16:12:08 +01:00
Emil Velikov	f541530bbc	docs: add release notes for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2fc6a31f10`)	2016-09-05 16:12:07 +01:00
Marek Olšák	b012a13af5	noop: implement resource_get_handle X+DRI3 locks up if the returned handle is invalid.	2016-09-05 16:12:04 +02:00
Marek Olšák	1c71bccdaa	noop: set missing functions	2016-09-05 16:12:04 +02:00
Marek Olšák	ed164f0d6b	noop: simplify some functions	2016-09-05 16:12:04 +02:00
Emil Velikov	62b224d428	glx/glvnd: list the strcmp arguments in correct order Currently, due to the inverse order, strcmp will produce negative result when the needle is towards the start of the haystack. Thus on the next iteration(s) we'll end up further towards the end and eventually fail to locate the entry. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-05 11:59:07 +01:00
Jason Ekstrand	821e366385	nir/tests: Update the CF tests to not assume fake edges In `aad4f1550`, we removed the concept of "fake" edges from NIR. Now, if you have a block at the end of an infinite loop it really has no predecessors. This updates the unit tests to match. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97587 Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-09-04 20:44:59 -07:00
Ilia Mirkin	61e978524a	gk110/ir: fix quadop dall emission We recently started to always emit the NDV (== dall) bit for quadops. However it was folded into the wrong code word. Fixes: `e0a067ed48` (nv50/ir: always emit the NDV bit for OP_QUADOP) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-09-04 18:28:29 -04:00
Mauro Rossi	98f734e758	android: intel: fix include paths in new "common" library Fixes building error in libmesa_intel_common static library Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 20:03:16 -07:00
Ilia Mirkin	ca313e00b6	a3xx: use window scissor to simulate viewport xy clip Unfortunately a3xx does not have a separate disable for depth clipping, so when depth clamp is enabled, we disable the whole 3d clipper logic. This in turn also gets rid of the xy clip that it would normally do. When we detect this would happen, instead we integrate the viewport into the window scissor. This may have slightly different behavior around wide points, but it's unlikely that anything depends on this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-03 19:58:42 -04:00
Ilia Mirkin	83d7230fd5	a3xx: make use of software clipping when hw can't handle it The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs, or a clip vertex, or clip distances are in use, then we must use the fallback discard-based clipping from the frag shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-03 19:58:42 -04:00
Ilia Mirkin	dac72234c7	a3xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-03 19:58:42 -04:00
Karol Herbst	ae7eb93e6c	nvc0/ir: allow min/max instructions to be dual-issued in pairs changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640: inst_executed: 1.03G inst_issued1: 614M -> 580M inst_issued2: 213M -> 230M score: 1021 -> 1030 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-03 13:53:09 -04:00
Jason Ekstrand	7e891f90c7	anv: Move cmd_buffer_config_l3 into anv_cmd_buffer.c This is the only remaining part of genX_l3.c and there's really no good reason for it to be in its own file. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	17968e2dfd	anv/cmd_buffer: Move emit_lri and emit_lrm higher up Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	42d03c204c	anv: Refactor pipeline l3 config setup Now that we're using gen_l3_config.c, we no longer have one set of l3 config functions per gen and we can simplify a bit. Also, we know that only compute uses SLM so we don't need to look for it in all of the stages. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	6448c0e324	anv: Leverage the shared L3$ config code When Jordan first implemented L3$ configuration for Vulkan, he copied+pasted from the GL driver because we had no good place to share it. Now that we have src/intel/common, we should be sharing these tables. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	49981891f7	intel: Pull the guts of gen7_l3_state.c into a shared helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	979d0aca62	intel: Rename brw_get_device_name/info to gen_get_device_name/info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	527f371999	intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.h sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.c sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.cpp sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:06 -07:00
Jason Ekstrand	55364ab5b7	intel: Add a new "common" library for more code sharing The first thing to go in this new library is brw_device_info. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:06 -07:00
Mauro Rossi	4218c32166	intel/blorp: fix typo in android makefile Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 08:22:53 -07:00
Timothy Arceri	1692228a38	nir: remove unused variable This was left over from `aad4f15506` Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-09-03 20:30:19 +10:00
Connor Abbott	356d101af3	nir: remove some fields from nir_shader_compiler_options I accidentally added these with `0dc4cab`. Oops!	2016-09-03 00:49:58 -04:00
Connor Abbott	c62b58c216	nir: fix bug with moves in nir_opt_remove_phis() In `144cbf8` ("nir: Make nir_opt_remove_phis see through moves."), Ken made nir_opt_remove_phis able to coalesce phi nodes whose sources are all moves with the same swizzle. However, he didn't add the logic necessary for handling the fact that the phi may now have multiple different sources, even though the sources point to the same thing. For example, if we had something like: if (...) a1 = b.yx; else a2 = b.yx; a = phi(a1, a2) ... = a then we would rewrite it to if (...) a1 = b.yx; else a2 = b.yx; ... = a1 by picking a random phi source, which in this case is invalid because the source doesn't dominate the phi. Instead, we need to change it to: if (...) a1 = b.yx; else a2 = b.yx; a3 = b.yx; ... = a3; Fixes 12 CTS tests: ES31-CTS.functional.tessellation.invariance.outer_edge_symmetry.quads* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 00:37:48 -04:00
Connor Abbott	0dc4cabee2	nir: add nir_after_phis() cursor helper And re-implement nir_after_cf_node_and_phis() using it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 00:37:48 -04:00
Ilia Mirkin	64a69059ce	glsl: expose max atomic counter/buffer consts for tess in ES 3.2 Curiously OES/EXT_tessellation_shader leave these out, while ES 3.2 adds them in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-03 00:26:36 -04:00
Ilia Mirkin	8122e30aec	mapi: don't forget to expose GetPointerv in GL ES 3.2 I left this out of my previous commit that went around enabling all of the other ES 3.2 entrypoints. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-03 00:26:36 -04:00
Ilia Mirkin	346de79ffd	main: add KHR_robustness to ES 3.2 extension requirements Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-03 00:26:36 -04:00
Ilia Mirkin	163a029eba	nv50,nvc0: respect render condition enable flag when clearing rt/zs This is a newly added flag. We always pass false into it from nv50_clear_texture, but other callers may want to respect the render condition. (And the functions were originally spec'd to respect it.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-03 00:01:07 -04:00
Karol Herbst	d0cf7a6beb	nvc0/ir: don't dual-issue ops that depend or interfere with each other Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: rewrite to split up the helpers and move more logic to target] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-03 00:01:06 -04:00
Jason Ekstrand	aad4f15506	nir: Remove fake edges in the CF handling code When NIR was first introduced, Connor added this fake-edge hack to work around issues related to unreachable blocks. Thanks to GLSL IR's jump lowering code, the only unreachable code you can have is a block after an infinite loop. With SPIR-V, we didn't have the jump lowering code so we could also end up with the "if (...) { break; } else { continue; }" case which generates an unreachable block after the if. Because of this, most of NIR had to be fixed up for handling unreachable blocks. The only remaining case of not handling unreachable blocks was specifically the block-after-infinite-loop case in dead_cf which was fixed by the previous commit. We can now delete the fake edge hack. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-09-02 11:24:09 -07:00
Jason Ekstrand	9a4d76e534	nir/dead_cf: Don't crash on unreachable after-loop blocks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-09-02 11:24:09 -07:00
Samuel Pitoiset	ea7b475968	nvc0: reduce the initial code segment size to 512KB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:39 +02:00
Samuel Pitoiset	6557058827	nvc0: allow to resize the code segment dynamically When an application uses a ton of shaders, we need to evict them when the code segment is full but this is not really a good solution if monster shaders are used because code eviction will happen a lot. To avoid this, it seems better to dynamically resize the code segment area after each eviction. The maximum size is arbitrarily fixed to 8MB which should be enough. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:35 +02:00
Samuel Pitoiset	96e21ad763	nvc0: add a new bin for the code segment To keep the bins list from growing indefinitely when the code segment size is bumped, we need to separate that bin from the SCREEN one because it contains other resources like the uniform bo. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:31 +02:00
Samuel Pitoiset	63ac80879e	nvc0: add nvc0_screen_resize_text_area() helper This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:28 +02:00
Samuel Pitoiset	3d928d9082	nvc0: re-upload currently bound shaders after code eviction This fixes a very old issue which happens when the code segment is full. A bunch of real applications like Tomb Raider, F1 2015, Elemental, hit that issue because they use a ton of shaders. In this case, all shaders are evicted (for freeing space) but all currently bound shaders also need to be re-uploaded and SP_START_ID has to be updated accordingly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:25 +02:00
Samuel Pitoiset	34883626d1	nvc0: refactor the program upload process This refactoring will help fix the "out of code space" eviction issue because we will need to re-upload the code for all currently bound shaders, which is slightly different from uploading fresh code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:17 +02:00
Jordan Justen	49c24d8a24	i965: fix noop_scissor range issue on width/height If scissor X or Y was set to a negative value then the previous code might have indicated noop scissors when the scissor range actually was masking a portion of the framebuffer. Since fb->_Xmin, _Xmax, _Ymin and _Ymax take scissors into account, we can use these to test for a noop scissor. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-01 11:45:13 -07:00
Kenneth Graunke	9c562956f9	glsl: Only force varyings to be flat when varying packing. Varying packing would like to mark certain variables as flat. This works as long as both sides of the interfaces are changed accordingly. However, with SSO, we disable varying packing on the outermost stages. We also disable varying packing for certain tessellation stages. With SSO, we operate on the producer and consumer separately. Checks based on the consumer stage and variable are risky, and can easily lead to altering one half of the interface between stages, breaking SSO pipeline IO validation. Just stop monkeying around with interpolation modes unless required for varying packing. There's no point. This also disables it in unsafe SSO cases. Fixes CTS tests: *.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_MaxPatchVertices_Position_PointSize Also fixes Piglit's spec/oes_geometry_shader/sso_validation: - user-defined-gs-input-not-in-block.shader_test - user-defined-gs-input-in-block.shader_test Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-01 11:24:17 -07:00
Kenneth Graunke	72b56e8b1a	glsl: Reject TCS/TES input arrays not sized to gl_MaxPatchVertices. We handled the unsized case, implicitly sizing arrays to the value of gl_MaxPatchVertices. But if a size was present, we failed to raise a compile error if it wasn't the value of gl_MaxPatchVertices. Fixes CTS tests: *.tessellation_shader.compilation_and_linking_errors.{tc,te}_invalid_array_size_used_for_input_blocks Piglit's tcs-input-read-nonconst-* tests have recently been fixed. This patch will break older copies of those tests, but the latest should continue working. Update to Piglit 75819c13af2ed5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-01 11:07:07 -07:00
Frank Binns	2f3154f464	wayland-drm: add missing NULL check Although malloc is unlikely to fail, check its return value nevertheless. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 15:48:52 +01:00
Frank Binns	d5f65b8bf5	loader: fix sysfs uevent file parsing When trying to get a device name for an fd using sysfs, it would always fail as it was expecting key/value pairs to be delimited by '\0', which is not the case. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 15:48:34 +01:00
Frank Binns	d6f669ba83	egl: only store device name when Wayland support is built The device name is only needed for WL_bind_wayland_display so make this clear by only storing the device name when Wayland support is built. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 15:47:58 +01:00
Lionel Landwerlin	2dc6930a5a	isl: round format alignment to nearest power of 2 A few inline asserts in anv assume alignments are power of 2, but with formats like R8G8B8 we have odd alignments. v2: round up to power of 2 (Ilia) v3: reuse util_next_power_of_two() from gallium/aux/util/u_math.h (Ilia) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-01 11:36:09 +01:00
Thomas Hellstrom	fc6be40011	gallium/postprocess: Fix resource freeing The code was triggering asserts in DEBUG builds of the SVGA driver since the reference count of the resource was never decremented before destroy. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-01 07:59:49 +02:00
Ilia Mirkin	e3db415456	st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-31 20:12:55 -04:00
Eric Engestrom	3bd885d09c	Introduce .editorconfig A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files to try and enforce the formatting of the code, to which Michel Dänzer suggested [1] we start by importing the existing .dir-locals.el settings. The first draft was discussed in the RFC [2]. These .editorconfig are a first step, one that has the advantage of requiring little to no intervention from the devs once the settings files are in place, but the settings are very limited. This does have the advantage of applying while the code is being written. This doesn't replace the need for more comprehensive formatting tools such as clang-format & clang-tidy, but those reformat the code after the fact. [0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html [1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-08-31 17:06:54 -07:00
Eric Anholt	509e2dbc10	vc4: Add missing break statement. This opcode isn't used yet, so it didn't affect anything. Caught by Coverity, reported to me by imirkin.	2016-08-31 17:06:54 -07:00
Brian Paul	c87e8c8515	gallium/docs: clarify render_condition_enabled parameter to clear functions If false, it means do the clear unconditionally. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-31 15:51:06 -06:00
Jason Ekstrand	b8bff0823b	mesa: Add some more .gitignore	2016-08-31 13:45:27 -07:00
Matt Turner	90eaf01616	i965: Pass start_offset to brw_set_uip_jip(). Without this, we would pass over the instructions in the SIMD8 program (which is located earlier in the buffer) when brw_set_uip_jip() is called to handle the SIMD16 program. The assertion about compacted control flow was bogus: halt, cont, break cannot be compacted because they have both JIP and UIP. Instead, we should never see a compacted instruction in this code at all. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 13:11:27 -07:00
Kenneth Graunke	bea048752e	i965: Merge gen7_clip_state atom into gen6_clip_state atom. The original motivation was that gen6_clip_state ignored _NEW_POLYGON as it didn't care about early culling. The only other change was that Gen6 ignored BRW_NEW_TES_PROG_DATA as it doesn't have tessellation shaders, but listening to this is harmless as it'll never be signalled. Now that we've added _NEW_POLYGON for is_drawing_lines/points, we can merge the two as the distinction is meaningless. This actually fixes a bug, though: Gen8+ was using the gen6_clip_state atom because it doesn't care about early culling, but it also needs BRW_NEW_TES_PROG_DATA, which was missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 12:42:09 -07:00
Kenneth Graunke	4c116cbafb	i965: Use gs_prog_data in is_drawing_points/lines(). State upload code should use prog_data rather than poking at core Mesa shader data structures wherever possible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 11:50:15 -07:00
Kenneth Graunke	cd19db4ee6	i965: Fix missing dirty bits related to is_drawing_points/lines. calculate_attr_overrides() uses is_drawing_points(), which depends on tessellation and geometry program state, as well as polygon state. v2: Add missing _NEW_POLYGON as well. Caught by Iago Toral. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 11:50:15 -07:00
Samuel Pitoiset	3df8615dcd	nvc0: remove an attempt at uploading all IMMD into a CB This has never been used because info->immd.bufSize is always 0, and in any case this is experimental code that was never completed. This gets rid of some unused code in the program validation process. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-31 19:05:16 +02:00
Samuel Pitoiset	b2f3d50ca7	nv50: remove unused nv50_program::immd_size field Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-31 19:05:13 +02:00
Ilia Mirkin	6118bcab4e	nv30: set usage to staging so that the buffer is allocated in GART The code a few lines below expects to migrate the bo in question to VRAM. Since we're filling the initial data via CPU, it's more efficient to create the temporary buffer in GART. There is no "push" method implemented, otherwise we'd use that instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-31 10:28:33 -04:00
Frank Binns	5505845945	egl/x11_dri3: provide an authentication function To support WL_bind_wayland_display an authentication function needs to be provided but this was not being done for this platform as it's not strictly necessary. However, as this isn't an optional function there's the potential for a segfault to occur if authentication is mistakenly performed. Protect against this by providing a function that prints an error. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-08-31 15:10:14 +02:00
Frank Binns	4c28c916ef	egl/x11_dri3: disable WL_bind_wayland_display for devices without render nodes Up until now, DRI3 was only used for devices that have render nodes, unless overridden via an environment variable, with it falling back to DRI2 otherwise. This limitation was there in order to support WL_bind_wayland_display as it requires client opened device node fds to be authenticated, which isn't possible when using DRI3. This is an unfortunate compromise as DRI3 provides security benefits over DRI2. Instead, allow DRI3 to be used for devices without render nodes but don't advertise WL_bind_wayland_display in this case. Applications that need this extension can still be run by disabling DRI3 support via the LIBGL_DRI3_DISABLE environment variable. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-08-31 15:09:12 +02:00
Jose Fonseca	55e417222f	scons: Fix MinGW cross compilation. The generated GLSL header files were only being built for the host platform, and not the target platform. Trivial.	2016-08-31 12:18:34 +01:00
Ilia Mirkin	8caf2cb0c0	nv30: only bail on color/depth bpp mismatch when surfaces are swizzled The actual restriction is a little weaker than I originally thought. See https://bugs.freedesktop.org/show_bug.cgi?id=92306#c17 for the suggestion. This also explains why things weren't always failing before, only sometimes. We will allocate a non-swizzled depth buffer for NPOT winsys buffer sizes, which they almost always are. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-31 01:17:55 -04:00
Kenneth Graunke	d82f8d9772	glsl: Handle patch qualifier on interface blocks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 22:09:36 -07:00
Ilia Mirkin	a0b1260fe0	i965: enable OES_primitive_bounding_box with the no-op implementation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 21:31:30 -04:00
Ilia Mirkin	bf47b2bf88	st/mesa: provide the null implementation of bounding box outputs in tcs Until hardware appears (in a gallium driver) that can make use of the TCS-outputted gl_BoundingBox, we just request that the variable gets assigned as a regular patch variable. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	891d7e3c9e	glsl: add gl_BoundingBox and associated varying slots Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	10663c648e	mesa: add support for GL_PRIMITIVE_BOUNDING_BOX storage and query Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	3b81c998a2	mesa: add scaffolding for OES/EXT_primitive_bounding_box Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	5ce0969df2	docs: add GL_OES_viewport_array to features Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 20:25:15 -04:00
Timothy Arceri	64a48efb9e	aubinator: fix if indentation and add brackets to multiline body Fixes misleading indentation warning in gcc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-31 10:19:45 +10:00
Francisco Jerez	6df215d97e	i965/fs: Assert that the number of color targets is one when dual-source blend is enabled. Requested by Anuj during review of `4a87e4ade7`, adding as follow-up since it led to assertion failures due to various GLSL bugs that should be fixed now.	2016-08-30 16:54:19 -07:00
Francisco Jerez	fd04d048ae	glsl: Fix gl_program::OutputsWritten computation for dual-source blending. In the fragment shader OutputsWritten is a bitset of FRAG_RESULT_* enumerants, which represent the location of each color output written by the shader. The secondary and primary color outputs of a given render target using dual-source blending have the same location, so the 'idx' computation below will give the wrong bit as result if the 'var->data.index' term is non-zero -- E.g. if the shader writes the primary and secondary colors of the FRAG_RESULT_COLOR output, ir_set_program_inouts will think that the shader writes both FRAG_RESULT_COLOR and FRAG_RESULT_SAMPLE_MASK, which is just bogus. That would cause the brw_wm_prog_key::nr_color_regions computation done in the i965 driver during fragment shader precompilation to be wrong, which currently leads to unnecessary recompilation of shaders that use dual-source blending, and triggers an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:19 -07:00
Francisco Jerez	965934f38a	glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in. gl_SecondaryFragColorEXT should have the same location as gl_FragColor for the secondary fragment color to be replicated to all fragment outputs. The incorrect location of gl_SecondaryFragColorEXT would cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0 as being written to, which isn't allowed by the spec and would ultimately lead to an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch. This should also fix the code below for multiple dual-source-blended render targets, which no driver currently supports but we have plans to enable eventually in the i965 driver (the comment saying that no hardware will ever support it seems rather hilarious). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:19 -07:00
Francisco Jerez	342f945b13	st/glsl_to_tgsi: Use SecondaryOutputsWritten to determine dual-source fragment outputs. Currently the mesa state tracker relies on there being two bits set per dual-source output in the gl_program::OutputsWritten bitset, but that only worked due to a GLSL front-end bug that caused it to set the OutputsWritten bit for both location and location+1 even though at the GLSL level the primary and secondary color outputs used for dual-source blending have the same location. Fix it by extending outputMapping[] to 2*FRAG_RESULT_MAX elements in order to represent a mapping from a (location, index) pair to its TGSI output, which should also make it slightly easier to add support for dual-source blending in combination with multiple render targets in the long run. No Piglit regressions on llvmpipe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:19 -07:00
Francisco Jerez	cb4b38af41	glsl: Calculate bitset of secondary outputs written in ir_set_program_inouts. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:18 -07:00
Ian Romanick	c011d7d900	glsl: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	aee9ab7de7	glsl: Replace most assertions with unreachable() text data bss dec hex filename 7669233 277176 28624 7975033 79b079 i965_dri.so before generated code 7647081 277176 28624 7952881 7959f1 i965_dri.so before this commit 7669289 277176 28624 7975089 79b0b1 i965_dri.so with this commit Looking at the generated assembly, it appears that some of the changes made in the generated code prevent some loops from being unrolled. Removing the default cases (via unreachable()) allows these loops to unroll again. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	dd574be54c	glsl: Refactor handling of horizontal operations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	d6e73150a4	glsl: Use constant_template_horizontal instead of constant_template_horizontal_single_implementation for unops This changes the "shape" of all the pack and unpack operators, but they should function the same. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	822b5c5eb2	glsl: Eliminate constant_template2 constant_template_common can now handle the case where the result type is different from the input type by using type_signature_iter. This changes the "shape" of all the cast-style operators, but they should function the same. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	abc81f7883	glsl: Eliminate constant_template5 constant_template_common can now handle the case where the result type is different from the input type by using type_signature_iter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	53c54a6c73	glsl: Eliminate constant_template0 This template is mostly an artefact of the development of the original patch series and to minimize the differences between the original code and the generated code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	ddb4b53de3	glsl: Eliminate one of the templates for simpler operations The differences between these two templates were mostly an artefact of the development of the original patch series and to minimize the differences between the original code and the generated code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	ee3cdac785	glsl: Use the generated constant expression code Immediately previous to this patch, diff -wud src/glsl/ir_constant_expression.cpp \ src/glsl/ir_expression_operation_constant.h should be "minimal." v3: With much help from José Fonseca, fix the SCons build. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	f3fcfe001f	glsl: Generate code for constant ir_triop_csel expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	2761190baa	glsl: Generate code for constant ir_triop_lrp expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	6e09c8715d	glsl: Generate code for constant ir_quadop_vector expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	f8e185a65f	glsl: Generate code for constant ir_quadop_bitfield_insert expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	4d8ac28b20	glsl: Generate code for constant ir_triop_vector_insert expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	9f1d7c5235	glsl: Generate code for constant ir_binop_vector_extract expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	d8dd49419a	glsl: Generate code for constant ir_binop_mul expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	8954a019f7	glsl: Generate code for constant ir_triop_fma and ir_triop_bitfield_extract expressions ir_triop_bitfield_extract is a little weird because the second and third operands are always int, so they may differ in type from the first operand. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	da61c94db8	glsl: Generate code for constant ir_binop_dot expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	13106e1041	glsl: Generate code for constant ir_binop_lshift and ir_binop_rshift expressions The code generated is quite different from what was previously used. I believe that it is still correct by the GLSL spec, and I believe, due to C rules about shifts, the behavior will be the same. Section 5.9 (Expressions) of the GLSL 4.50 spec says: The result is undefined if the right operand is negative, or greater than or equal to the number of bits in the left expression's base type. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	90da8bf547	glsl: Generate code for constant ir_binop_ldexp expressions ldexp is weird because its two operands have different types. Add support for directly specifying the exact signatures of all the possible variations of an operation. v2: Use tuple() instead of () for clarity. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	0f87c54d1c	glsl: Generate code for constant unary expressions that don't assign the destination These are operations like the pack functions that have separate functions that assign multiple outputs from a single input. v2: Correct the source and destination types. They were previously transposed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	8cf9157786	glsl: Generate code for some constant binary expressions that are horizontal Only operations where the implementation is identical code regardless of type. The only such operations are ir_binop_all_equal and ir_binop_any_nequal. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	d5bfe6b9c4	glsl: Generate code for constant unary expressions that are horizontal Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	8f5357b1d6	glsl: Generate code for constant expressions that have an output type that differs from the input types v2: Remove extra int() cast in find_lsb. Suggested by Matt. 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	74e335c762	glsl: Generate code for constant binary expressions that combine vector and scalar operands v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	f81b1c7fa7	glsl: Generate code for constant binary expressions that have one operand type Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	598929aee7	glsl: Generate code for constant unary expressions that have different implementations for each source type v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	aa9f4fc53e	glsl: Generate code for constant unary expressions that map one type to another ir_unop_i2b is omitted because its source can either be int or uint. That makes it special. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	3fcb6b85c0	glsl: Begin generating code for the most basic constant expressions Unary operations where all of the supported types use the same C expression to evaluate them. v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	e31c72a331	glsl: Convert tuple into a class This makes things a little more clear now, and it will make future changes... possible. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	6ef27003ac	glsl: Compact a bunch of things onto one line Even though they are much too long for that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	0cef8c683e	glsl: Sort constant expression handling by IR operand enum value Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	8d54b5f756	glsl: Trivial whitespace and punctuation changes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	fd2dabbb9f	glsl: Sort GLSL type enums in switch-statements in enum order Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	13ef8c46b8	glsl: Always use correct float types in constant expression handling Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	ea05a72258	glsl: Extract ir_quadop_bitfield_insert implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	fe153309a8	glsl: Extract ir_triop_bitfield_extract implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	54ec6e1b8b	glsl: Extract ir_binop_ldexp implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	6d5fe1815c	glsl: Use find_msb_uint to implement ir_unop_find_lsb (X & -X) calculates a value with only the least significant bit of X set. Since there is only one bit set, the LSB is the MSB. v2: Remove extra int() cast. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	5c24750a49	glsl: Extract ir_unop_find_msb implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	d75034b3a2	glsl: Extract ir_unop_bitfield_reverse implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	4b0606e0a7	glsl: Use _mesa_bitcount to implement constant ir_unop_bit_count Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	f4af9f36e7	glsl: Delete spurious comment about mod not taking integer operands This hasn't been true since we added support for GLSL 1.30. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	d6ad3e2dd9	glsl: Delete spurious comment about updating ir_expression::get_num_operands This hasn't been necessary since `007f48815`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	dc41d998f2	glsl: Do not generate comments or extra whitespace in expression files The comments and whitespace can live in the Python code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	c6e8fd82ea	glsl: Just access the ir_expression_operation strings table directly The operator_string functions gave us some protection against a malformed table. Now that the table is generated from the same data that generates the enum, this is not a concern. Just cut out the middle man. text data bss dec hex filename 7531892 273992 28584 7834468 778b64 i965_dri-64bit-before.so 7531828 273992 28584 7834404 778b24 i965_dri-64bit-after.so Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	fb44f69779	glsl: Generate ir_expression_operation_strings.h from Python 'diff -ud' is clean. v2: Massive rebase. v3: With much help from José Fonseca, fix the SCons build. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	90781eee4d	glsl: Pull operator_strs out to its own file No change except to the copyright symbol. The next patch will generate this file with Python, and Unicode + Python = pure rage. v2: Massive rebase. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	140ec58a07	glsl: Generate the ir_last_* values This ensures that they remain correct if the list is rearranged or new opcodes are added. I checked a diff of before and after to ensure that each ir_last_ had the same value. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	7d6af9e599	glsl: Generate ir_expression_operation.h from Python There are differences in where end-of-line comments are placed, but 'diff -wud' is clean. v2: Massive rebase. v3: With much help from José Fonseca, fix SCons build. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Jason Ekstrand	10f9901bce	anv: Rework pipeline caching The original pipeline cache that Kristian wrote was based on a now-false premise that the shaders can be stored in the pipeline cache. The Vulkan 1.0 spec explicitly states that the pipeline cache object is transient and you are allowed to delete it after using it to create a pipeline with no ill effects. As nice as Kristian's design was, it doesn't jibe with the expectation provided by the Vulkan spec. The new pipeline cache uses reference-counted anv_shader_bin objects that are backed by a large state pool. The cache itself is just a hash table mapping key hashes to anv_shader_bin objects. This has the added advantage of removing one more hand-rolled hash table from mesa. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476 Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	6899718470	anv: Add a struct for storing a compiled shader This new anv_shader_bin struct stores the compiled kernel (as an anv_state) as well as all of the metadata that is generated at shader compile time. The struct is very similar to the old cache_entry struct except that it is reference counted and stores the actual pipeline_bind_map. Similarly to cache_entry, much of the actual data is floating-size and stored after the main struct. Unlike cache_entry, which was stored in GPU-accessible memory, the storage for anv_shader_bin kernels comes from a state pool. The struct itself is reference-counted so that it can be used by multiple pipelines at a time without fear of allocation issues. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	13c09fdd0c	anv: Add pipeline_has_stage guards in a few places All of these worked before because they were depending on prog_data to be null. Soon, we won't be able to depend on a nice prog_data pointer and it's nice to be more explicit anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	b259d86ad6	anv: Remove unused fields from anv_pipeline_bind_map Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	d5945bec12	anv/pipeline: Properly handle OOM during shader compilation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	a0f5c496e3	anv/allocator: Correctly set the number of buckets The range from ANV_MIN_STATE_SIZE_LOG2 to ANV_MAX_STATE_SIZE_LOG2 should be inclusive and we have asserts that ensure that you never try to allocate a state larger than (1 << ANV_MAX_STATE_SIZE_LOG2). However, without adding 1 to the difference, we allocate 1 too few buckets and so, even though we have an assert, anything landing in the last bucket will fail to allocate properly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	4200c2266e	anv/pipeline: Fix bind maps for fragment output arrays Found by inspection. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	d316cec1c1	anv/descriptor_set: memset anv_descriptor_set_layout We hash this data structure so we can't afford to have uninitialized data even if it is just structure padding. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Eric Engestrom	d5899b3010	docs/helpwanted: fix GL3.txt/features.txt link Fixes: `f926cf5bd0` ("docs: Rename GL3.txt to features.txt") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> CC: Andreas Boll <andreas.boll.dev@gmail.com>	2016-08-30 14:38:57 -07:00
Eric Engestrom	aac91fffae	anv/wayland: fix assert typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:51 -07:00
Eric Engestrom	4e68bb620f	anv/meta: fix unreachable() typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:51 -07:00
Eric Engestrom	b0acebd41f	st/nine: fix unreachable() typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:46 -07:00
Eric Engestrom	e2627e34ba	glsl: fix unreachable() typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:42 -07:00
Eric Engestrom	352f0d9180	get_reviewer.pl: fix mesa check This script was broken for the last few days and I couldn't figure out why. Turns out it was checking for the existence of a file that got renamed, so rename it in here too. Fixes: `f926cf5bd0` ("docs: Rename GL3.txt to features.txt") CC: Ian Romanick <ian.d.romanick@intel.com> CC: Rob Clark <robclark@freedesktop.org> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-30 16:44:00 -04:00
Kenneth Graunke	6699403651	glsl: Initialize outputs[] array in lower_blend_equation_advanced. Caught by Coverity. Likely fixes real issues if an output component is not present. CID: 1372278 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-30 13:11:00 -07:00
Samuel Pitoiset	6820f75c91	nvc0: fix indentation in nvc0_screen_init() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 18:42:02 +02:00
Samuel Pitoiset	0fc3b7c88e	nvc0: check return value of nvc0_screen_resize_tls_area() While we are at it, make it static and change the return values policy to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 18:41:59 +02:00
Samuel Pitoiset	b489ac88f6	nvc0: make use of FAIL_SCREEN_INIT in nvc0_screen_create() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 18:41:57 +02:00
Samuel Pitoiset	e0a067ed48	nv50/ir: always emit the NDV bit for OP_QUADOP This silences a divergent error found with F1 2015. Basically, the NDV bit has to be set when a FSWZ instruction is inside divergent code, but it's not needed otherwise. The correct fix should be to set it only in divergent code situations. GM107 emitter already sets that bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-30 18:41:46 +02:00
Jason Ekstrand	9514c5a30f	intel/blorp: Inline get_vs_entry_size into emit_urb_config Topi asked to have the prefix removed because there's nothing gen7 about it. However, now that everything is in a single file, there is no good reason to have it split out into a helper function anyway. Let's just put the contents in emit_urb_config and call it a day. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-30 09:24:50 -07:00
Tim Rowley	175052507c	swr: [rasterizer] add archrast instrumentation Statistics measurement system	2016-08-30 10:32:36 -05:00
Emil Velikov	5de640a518	i915: Check return value of screen->image.loader->getBuffers Ported from the i965 commit `e7ab358e81`. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-08-30 14:50:47 +01:00
Emil Velikov	4f5f9575d0	egl/android: remove config post-processing No longer needed as of last commit, since we no longer add OPENGL to the ClientAPIs thus, RenderType and Conformant don't have the desktop GL bit set. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-08-30 14:50:28 +01:00
Emil Velikov	03eaa6c596	egl/dri2: check if the EGL API is valid before adding it to ClientAPIs In the rather unlikely case that the API is considered invalid, don't add it to the (supported) ClientAPIs bitmask. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> --- Strictly speaking we only need this in the Android case for OpenGL. Adding it everywhere doesn't hurt us since the compiler will const propagate and optimise/remove these.	2016-08-30 14:50:10 +01:00
Emil Velikov	4472b6e469	egl/android: annotate static const data as such Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-08-30 14:50:08 +01:00
Emil Velikov	7563c39641	egl: treat EGL_OPENGL_API as invalid on Android At the moment one can use OpenGL in eglBindAPI() only to clear the EGL_OPENGL_BIT from RenderableType and Conformant for _each_ config. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-08-30 14:49:24 +01:00
Ilia Mirkin	a165e5cb7c	nouveau: make color/depth bpp match for pre-nv10 chips This avoids generating fbconfigs whose winsys framebuffers will be incomplete (see nouveau_check_framebuffer_complete). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 00:21:42 -04:00
Ilia Mirkin	357d8261f1	nouveau: always enable at least one RC Experimentally, this is required for glxgears and others to display the proper colors. This is also what the code used to do before the referenced commit. Fixes: `c703658b39` (mesa: Drop _EnabledUnits.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-30 00:21:42 -04:00
Ilia Mirkin	91681302d0	nouveau: allow NV3x's to be used with nouveau_vieux NV34 and possibly other NV3x hardware has the capability of exposing the NV25 graph class. This allows forcing nouveau_vieux to be used instead of the gallium driver, primarily for testing purposes. (Among other things, NV2x only ever came as AGP or inside an Xbox, never PCI/PCIe). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 00:21:42 -04:00
Ilia Mirkin	ab0917311f	nvc0: undo overzealous enum usage Commit `7413625ad3` flipped a few functions too many to use pipe_shader_type. These functions actually take an integer that does not correspond 1:1 with the enum. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 00:17:54 -04:00
Brian Paul	ec16a5b091	svga: fix a texture readback bug Backing views/surfaces are used to handle the case when a resource is bound both as a render target and as a sampler source (such as when doing auto mipmap generation). This patch fixes a bug where mapping a resource (to do a glReadPixels) was reading the stale data in the original surface rather than the backing surface which was rendered to. We need to propagate the backing resource (which we rendered to) back to the original resource before we read from it. The problem was the svga_propagate_rendertargets() function was examining the wrong surface views. This fixes the "poc9" test described in VMware bug 1686661. Also tested with Piglit, Cinebench, Lightsmark, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-29 17:46:50 -06:00
Brian Paul	646afc6ff7	svga: move surface propagation code into new function Put new svga_propagate_rendertargets() function where all the other surface propagation code lives. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-29 17:46:50 -06:00
Brian Paul	b9b88516f8	mesa: fix format conversion bug in get_tex_rgba_uncompressed() We need to set the need_convert flag with each loop iteration, not just when the rgba pointer is null. Bug reported by Markus Müller <mueller@imfusion.de> on mesa-users list. Fixes new piglit arb_texture_float-get-tex3d test. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-29 17:46:50 -06:00
Dave Airlie	f235dc08ac	radeonsi: add support for cull distances. (v1.1) This should be all that is required for cull distances to work on radeonsi. v1.1: whitespace cleanup, add docs, fix clipdist_mask usage. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-30 09:35:56 +10:00
Timothy Arceri	5025e88703	spirv: replace assert with unreachable Fixes uninitialised warning for coord_components. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-08-30 09:29:26 +10:00
Jason Ekstrand	f4314d06e8	isl/state: Add some asserts about format capabilities This keeps invalid surface states from leaking through and potentially hanging the GPU. We shouldn't actually be hitting this on a regular basis, but a helpful assert is better than a hang. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	87214414fd	intel/blorp: Add a format parameter to blorp_fast_clear This allows us to use the actual render format as opposed to the texture format. I don't know that the hardware actually cares in the case of fast clears, but it certainly seems more correct. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	348509269e	i965: Move blorp into src/intel/blorp At this point, blorp is completely driver agnostic and can be safely moved into its own folder. Soon, we hope to start using it for doing blits in the Vulkan driver. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	8bd35d8bd2	i965/blorp: Remove the remaining brw prefixes from the blorp.h API Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	3e46f11409	i965/blorp: Use isl_format_get_depth_format for setting depth formats Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	555b22a446	i965: Move the type_size function declarations to brw_nir.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	007d8a6d04	i965: Move get_fast_clear_rect to blorp_clear.c This has been the only caller since we deleted the meta fast clear code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	c8ff36228d	i965: Roll brw_get_ccs_resolve_rect into blorp_ccs_resolve Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	12a2fe5389	i965/blorp: Get rid of most brw and mesa includes Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	87a1cb6979	i965: Move the hiz_op enum to blorp Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	db95a8108f	i965/blorp: Add a fast_clear_op enum Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	71dc2e0106	i965/blorp: Make blorp_addres::buffer a void* The Vulkan driver doesn't use libdrm so we don't want to bake that in. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	2191f5cb7e	i965/blorp: Get rid of brw_context This commit switches all of blorp from taking a brw_context to taking a blorp_context and, where useful, a void *batch. In the GL driver, we only have one active batch at a time so the brw_context is the batch but in Vulkan, batch will point to the anv_cmd_buffer in which we are building instructions. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	99b9e9b86e	i965/blorp: Take a blorp_context in compile_nir_shader Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	a818a32244	i965/meta_util: Take an isl_device in get_fast_clear_rect Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	bc159ff0f7	i965/blorp: Add an "exec" function pointer to blorp_context Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	cea360a708	i965/blorp: Remove some i965-isms from genX_blorp_exec.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	cf14b52478	i965/blorp: Move the guts of brw_blorp_exec into genX_blorp_exec.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	28ae664e3b	i965/blorp: Pull the guts of blorp_exec into a driver-agnostic header Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	9a842c61fe	i965/blorp/exec: Refactor to use a new blorp_batch struct This gets rid of brw_context throughout the core of the state setup code. Instead, it is replaced with blorp_batch which contains a pointer to the blorp_context and a void* that the driver can use for its own blorp data. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	4e7bddf8a3	i965/blorp: Add a helper for allocating binding tables and surface states Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	8a39069dfe	i965/blorp: Use BT_INDEX enums for setting up the binding table Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	1367af159e	i965/blorp: Shorten binding table index enum names Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	da2a078deb	i965/blorp/genX: Add a blorp_surface_reloc helper Previously, we passed the buffer address (as per the latest offset from the kernel) to ISL to use when it filled out the surface state. We then called drm_intel_bo_emit_reloc() to add the relocation to the list. The newly added blorp_surface_reloc helper adds the relocation to the list and then writes the buffer address directly into the surface state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	ac08bc8ac2	i965/blorp: Use blorp_address in brw_blorp_surface instead of bo+offset Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	33cc1f6bb4	i965/blorp: Pull emit_surface_state into genX_blorp_exec.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	6d2f8f8f5f	i965/blorp: Add driver mocs settings to the context Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	9c380b639f	i965/blorp/genX: Move emit_urb_config into another helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	28991c9601	i965/blorp: Use gen6_upload_urb Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	7ecbb9bada	i965/gen6: Refactor gen6_upload_urb This splits it into two functions very similar to gen7_upload_urb. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	3e4b43d11d	i965/blorp/genX: Pull emit_3dstate_multisample into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	becd434d14	i965/blorp/genX: Add helpers for allocating various bits of state This pulls most of the brw-specific bits into helpers with generic names. Later, those will become the driver hooks for generic code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	600446ccc7	i965/blorp: Expose the shader cache through function pointers This sanitizes blorp's access to the i965 driver's shader cache by patching it through the blorp_context. When we start using blorp in Vulkan, we will simply have to implement such a caching interface in the Vulkan driver. Note: In my first attempt at this, I simplified it down to a single upload_shader entrypoint and implemented the caching inside of blorp. This doesn't work, however, because the i965 driver will, on occasion, dump its entire cache and start over. When this happens, blorp needs to be able to recompile its shaders and re-upload them. It's easiest to just expose the caching interface. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	a14d1b63ce	i965/blorp: Add a blorp_context struct and init/finish funcs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Mauro Rossi	cd18bbeef3	android: intel: Flatten the makefile structure Android porting of commit `bebc1a1` "intel: Flatten the makefile structure" Automake approach was followed, by moving makefiles a level up, naming them Android.genxml.mk and Android.isl.mk, performing the necessary adjustments to the paths, adding src/intel/Android.mk and fixing mesa top level makefile. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-29 12:17:34 -07:00
Jan Vesely	083746bc48	clover: Use device cap to query pointer size instead of hardcoded 32bits Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97513 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-29 14:40:15 -04:00
Jan Vesely	c7af84968d	gallium: add cap to export device pointer size v2: document the new cap v3: fix 80 char limit in screen.rst Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-29 14:40:15 -04:00
Brian Paul	f5602c27ec	svga: s/unsigned/enum pipe_shader_type/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-29 12:40:45 -06:00
Jordan Justen	5e76baa2ad	i965/hsw: Enable ARB_ES3_1_compatibility extension Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-08-29 11:23:08 -07:00
Rhys Kidd	b1b7e921f8	r600g: Clean up defined magic numbers for TGSI opcodes Small code clean up that removes magic numbers where a TGSI opcode has been defined. No functional change expected as each opcode is unsupported on the respective hardware. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: James Harvey <lothmordor@gmail.com>	2016-08-29 11:03:20 -07:00
Rhys Kidd	d4cb3ee95c	r600g: Avoid duplicated initialization of TGSI_OPCODE_DFMA As reported by Clang, TGSI_OPCODE_DFMA (defined magic number 118) is currently initialized twice for Cayman and Evergreen. When Jan Vesely added double precision FMA opcode it did make sense to locate it immediately after TGSI_OPCODE_DMAD, although this is out of order. This change cleans up the prior magic number definition and ensures any later reordering of this struct will not create problems. Prior change was: commit `015e2e0fce` Author: Jan Vesely <jan.vesely@rutgers.edu> Date: Sat Jul 2 16:14:54 2016 -0400 r600g: Add double precision FMA ops Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782 Fixes: `54c4d525da` ("r600g: Enable FMA on chips that support it") Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: James Harvey <lothmordor@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: James Harvey <lothmordor@gmail.com>	2016-08-29 11:03:20 -07:00
Rhys Kidd	8ba1fd339c	i915g: Fix typo in i915_translate_instruction() Noticed this error in a debug message whilst reviewing https://bugs.freedesktop.org/show_bug.cgi?id=97477 This patch doesn't go towards fixing that bug, but at least may clarify future debug output. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-29 11:03:20 -07:00
Eric Anholt	60bed14d0f	vc4: Handle discards while in control flow. I missed this while adding loop support because the discard test inside a loop was crashing before, anyway. Fixes piglit glsl-fs-discard-04.	2016-08-29 11:03:11 -07:00
Eric Anholt	b9a74fbec7	vc4: Mark when we add discards while lowering blend state.	2016-08-29 10:57:04 -07:00
Eric Anholt	a99d70d105	nir: Update shader info when adding discards vc4 is about to start using the shader info field to set up discard handling. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-29 10:56:59 -07:00
Tim Rowley	fa8f87132a	swr: [rasterizer core] fix GetRasterizerFunc selection Only rasterize scissor edges if one or more scissor/viewport rects are not hottile aligned. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:36 -05:00
Tim Rowley	8e41a65fc5	swr: [rasterizer core] whitespace cleanup Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:30 -05:00
Tim Rowley	cc7f655177	swr: [rasterizer jitter] reimplement SCATTERPS Implement SCATTERPS as a dynamic loop based on mask set bits instead of a static compile time loop. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:23 -05:00
Tim Rowley	c7e21183a1	swr: [rasterizer core] upper left rule for scissors Fixes upper left rule for scissors and viewport/scissor macrotile alignment. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:15 -05:00
Tim Rowley	e54df2c7e4	swr: [rasterizer scripts] undef DEFINE_KNOB after usage Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:10 -05:00
Tim Rowley	a4efbd14d3	swr: [rasterizer core] minor cleanup to thread initialization Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:04 -05:00
Tim Rowley	7472a8ee75	swr: [rasterizer core] remove KNOB_MAX_THREADS Use dynamic memory allocation for per-thread data Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:41:58 -05:00
Tim Rowley	9e4a482d46	swr: [rasterizer core] track guardbands per viewport rect Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:41:51 -05:00
Tim Rowley	b473bec878	swr: [rasterizer core] per-primitive viewports/scissors - use per-primitive viewports throughout the pipeline. - track whether all available scissor rects are tile aligned. Causes failures, so not taken into account when choosing rasterizer yet. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:41:16 -05:00
Tom Stellard	63ed11cde9	radeonsi: Don't use global variables for tess lds We were allocating global variables for the maximum LDS size which made the compiler think we were using all of LDS, which isn't the case. Reviewed-By: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-29 16:36:46 +00:00
Roland Scheidegger	f48ccb8c07	softpipe: (trivial) honor render_condition_enabled for clear_rt/clear_ds	2016-08-29 18:15:08 +02:00
Roland Scheidegger	c5d7624e1d	llvmpipe: (trivial) honor render_condition_enabled for clear_rt/clear_ds	2016-08-29 18:14:49 +02:00
Kai Wasserbäch	4c53267b8f	gallium: Use enum pipe_shader_type in set_shader_images() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:37 -06:00
Kai Wasserbäch	15fe288dea	gallium: Use enum pipe_shader_type in set_shader_buffers() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:33 -06:00
Kai Wasserbäch	532db3b788	gallium: Use enum pipe_shader_type in set_sampler_views() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:25 -06:00
Kai Wasserbäch	7413625ad3	gallium: Use enum pipe_shader_type in bind_sampler_states() (v2) v1 → v2: - Fixed indentation (noted by Brian Paul) - Removed second assert from nouveau's switch statements (suggested by Brian Paul) Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 08:45:48 -06:00
Marek Olšák	ed24d79ed7	gallium/radeon: clear dirty_level_mask when discarding CMASK This fixes: GL45-CTS.texture_barrier.* Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-08-29 14:23:58 +02:00
Marek Olšák	d301efb400	tgsi/scan: remember sampler view types Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 14:16:57 +02:00
Nayan Deshmukh	5f0ea3db16	st/vdpau: use temporary buffers while applying filters Use temporary buffers so that we don't read and write to the same surface at the same time. We don't need to use linear layout now. v2: rebase the patch against reverted change Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-29 11:23:56 +02:00
Christian König	77e4424106	st/vdpau: Revert "change the order in which filters are applied(v3)" This reverts commit `09dff7ae2e`. Turned out this can cause some artifacts in the output. Let's revert it for now until we have sorted out all issues. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>	2016-08-29 11:23:51 +02:00
Iago Toral Quiroga	9c9f45b824	i965/vec4: remove the generator hack for dual instanced GS This hack was introduced in commit `03ac2c7223`: i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs Specifically to fixup the code we emitted to deal with gl_PointSize inputs in dual instance mode, where we were emitting a MOV to copy the point size from .w (where the hardware delivers it) to .x (because code will expect this to be a float). This meant that we were emitting a MOV to an ATTR destination that could have a width of 4 (in dual instanced mode) so it was necessary to fix the execution size and regioning of the instruction. Fortunately, Ken fixed this in `67c5d00273`: i965/vec4/gs: Stop munging the ATTR containing gl_PointSize. by using a WWWW swizzle instead of a MOV, and as the commit log in that patch states, we no longer emit instructions with ATTR destinations, so that makes the fixup code in the generator unnecessary. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-29 08:09:09 +02:00
Timothy Arceri	22cec6dc5e	glsl: initialise pointer to NULL Fixes uninitialised warning and Coverity defect. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-29 13:13:42 +10:00
Ilia Mirkin	6a5504de2f	Update Khronos-supplied headers to r33100 As retrieved from opengl.org and khronos.org. Maintained the APPLE hack in GL/glext.h manually. Added gl32.h. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Dave Airlie <airlied@redhat.com>	2016-08-28 21:41:47 -04:00
Ilia Mirkin	d49a231c33	mesa: add EXT_texture_cube_map_array support This is identical to OES_texture_cube_map_array support. dEQP has tests which use this extension. Also it is part of AEP. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-28 21:38:55 -04:00
Ilia Mirkin	4ec1c2bb7f	mesa: remove OES_shader_io_blocks enable This extension should just be available whenever ES 3.1 is available. With the new extension verification infrastructure, it will only be enable-able on a #version 310 es shader, rendering the original reason for having a separate enable moot. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-28 21:38:55 -04:00
Ilia Mirkin	89e95d15f9	main: use KHR_blend_equation_advanced enable for ES 3.2 availability Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-08-28 21:38:55 -04:00
Ilia Mirkin	05b37e20de	main: add missing EXTRA_END in OES_sample_variables get check Fixes: `3002296cb6` (mesa: add GL_OES_shader_multisample_interpolation support) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-28 21:38:55 -04:00
Jose Fonseca	09dafb9630	scons: Take indirect gl_and_es_API.xml dependencies in consideration. Same as `26a8f76ba1`. Trivial.	2016-08-27 22:59:06 +01:00
Ilia Mirkin	5b18e5fd7b	docs: sort extensions in relnotes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-27 17:51:44 -04:00
Jason Ekstrand	fb89551047	isl: Allow multisampled array textures This probably isn't the only thing that needs to be done to get multisampled array textures working in Vulkan but I think this is all that ISL really needs and it does fix 8 of the new CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-26 19:00:02 -07:00
Ian Romanick	cf7be70aa7	mesa/version: OpenGL ES 3.2 depends on OES_texture_cube_map_array This has a separate enable from ARB_texture_cube_map_array. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	b387bc90c8	i965: Enable OES_texture_cube_map_array on Gen8+ These are the only platforms that currently expose OES_geometry_shader. Once OpenGL ES 3.1 and OES_geometry_shader are enabled on Gen7, this extension can be enabled there as well. Gen6 will never get OpenGL ES 3.1, so it will never get this extension... even though it has the desktop OpenGL extension. Alas. NOTE: This causes a failure on Gen8+ platforms in ES3-CTS.gtf.GL3Tests.texture_storage.texture_storage_texture_targets. The test only fails because it doesn't know that 0x9009 is a valid value when the extension exists. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	dc4f53b683	mesa: Add support for OES_texture_cube_map_array This has a separate enable flag because this extension also requires OES_geometry_shader. It is possible that some drivers may support OpenGL ES 3.1 and ARB_texture_cube_map but not support OES_geometry_shader. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	87fa462ffd	mesa: Add and use _mesa_has_texture_cube_map_array helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	66b988d09a	mesa: Use _mesa_has_ARB_texture_cube_map_array instead of open-coding it Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	daf1a61e11	mesa: Cosmetic changes in legal_texobj_target Use bool instead of GLboolean and constify ctx. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	d79c950eeb	mesa: Rearrange legal_texobj_target to look more like _mesa_legal_get_tex_level_parameter_target This makes it a bit easier to add support for more features in different APIs. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	ef5bad09c4	glsl: Add and use has_texture_cube_map_array helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	c879dbc4e4	glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10 All the GLSL 4.x keywords were added to the list of reserved keywords in GLSL ES 3.10. As far as I can tell, these are the only ones that were missed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	8fb4af7789	glsl: Silence unused parameter warning glsl/lower_buffer_access.cpp:324:55: warning: unused parameter ‘var’ [-Wunused-parameter] ir_variable *var, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	63af53dcd3	i965: Enable GL_OES_geometry_shader on Gen8+ Gen7 can get this extension (and GL_OES_shader_io_blocks) as soon as the rest of OpenGL ES 3.1 is enabled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	259fc50545	glsl/linker: Fail linking on ES if uniform precision qualifiers don't match When GL_OES_geometry_shader is enabled, this fixes dEQP-GLES31.functional.shaders.linkage.geometry.uniform.rules.type_mismatch_1. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	06201e4f1a	glsl: Allow invocations layout qualifier with GL_OES_geometry_shader Fixes dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_2d_array dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_2d_multisample_array dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_3d dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_cubemap dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_2d_array dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_2d_multisample_array dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_3d dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_cubemap dEQP-GLES31.functional.geometry_shading.query.geometry_shader_invocations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	3a0ae7b55c	glsl: Allow gl_InvocationID and gl_Layer with GL_OES_geometry_shader Fixes dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_2d_array dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_2d_multisample_array dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_3d dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_cubemap v2: Don't enable gl_ViewportIndex in GLSL ES 3.20. Noticed by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	1a72fbf9e6	mesa: Allow GL_EXT_geometry_shader and GL_EXT_geometry_point_size Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	658e90f9a8	mesa: Document reasons for allowing XFB drawing modes in GLES 3.1 w/GL_OES_geometry_shader Originally this patch added the checks to allow the draw calls with XFB, but commit `2dabd497` beat me to it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	aa228eb1a6	mesa: Remove redundant _mesa_has_shader_subroutine The checks in _mesa_has_shader_subroutine are slightly different than _mesa_has_ARB_shader_subroutine, but they're not different in a way that matters. The only way to have ctx->Version >= 40 is if ctx->Extensions.ARB_shader_subroutine is set. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-26 15:03:14 -07:00
Ian Romanick	0115f356ee	nouveau: Enable EXT_texture_env_dot3 on NV10 and NV20 GL_DOT3_RGB_EXT and GL_DOT3_RGBA_EXT are nearly identical to GL_DOT3_RGB and GL_DOT3_RGBA. The only difference is the _EXT versions do not apply the post-scale. Just smash logscale to 0 so that RC_OUT_SCALE_1 is always used. NOTE: I have not actually tested this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-26 15:03:14 -07:00
Ian Romanick	a7d92c3c0b	nouveau: Fix non-1x post-scale factor with DOT3 combiner Fixes long standing bug on NV10 and NV20 where using a non-1x RGB or A post-scale with GL_DOT3_RGB or GL_DOT3_RGBA texture environment would not work. The old combiner math uses HALF_BIAS_NORMAL and HALF_BIAS_NEGATE. The GL_NV_register_combiners defines these as HALF_BIAS_NORMAL_NV max(0.0, e) - 0.5 HALF_BIAS_NEGATE_NV -max(0.0, e) + 0.5 In order to get the correct result from the dot-product, the intermediate dot-product must be multiplied by 4. This is a literal implementation of the GL_ARB_texture_env_dot3 spec. It also requires using the register combiner post-scale. As a result, the post-scale cannot be used for the post-scale set by the application. The new combiner math uses EXPAND_NORMAL and EXPAND_NEGATE. The GL_NV_register_combiners defines these as EXPAND_NORMAL_NV 2.0 * max(0.0, e) - 1.0 EXPAND_NEGATE_NV -2.0 * max(0.0, e) + 1.0 Since this fully expands the value to [-1, 1] range, the intermediate dot-product result is the desired value. This leaves the register combiner post-scale available for application use. NOTE: I have not actually tested this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-26 15:03:14 -07:00
Ian Romanick	f926cf5bd0	docs: Rename GL3.txt to features.txt Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-08-26 15:03:14 -07:00
Ian Romanick	8cd5c3cfe7	docs: Update GL3.txt for OpenGL 4.x on i965-ish hardware v2: Note that GL_KHR_blend_equation_advanced and GL_KHR_blend_equation_advanced_coherent are done. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Nicholas Bishop	d0a4c36dd6	docs: add links to clarify patch mailing section * Changed "Mesa mailing list" to "mesa-dev mailing list" to clarify which list patches should be sent to * Added an explicit link to https://lists.freedesktop.org/mailman/listinfo/mesa-dev to show where to subscribe to the list * Added a link to https://git-scm.com/docs/git-send-email to help new users of that command v2: add signed-off-by Signed-off-by: Nicholas Bishop <nicholasbishop@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-26 14:54:26 -07:00
Brian Paul	ea33df7b58	svga: minor whitespace, etc clean-ups in svga_pipe_misc.c Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	8433b43337	svga: move some code in svga_propagate_surface() Move computation of zslice, layer inside the conditional where they're used. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	1a10b37ac3	svga: simplify surface propagation code in svga_set_framebuffer_state() Rewrite the comment too. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	bb7f094b37	svga: add some comments in the svga_surface struct Give more info about backing resources/surfaces. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	dcf63339e7	svga: use new svga_check_sampler_framebuffer_resource_collision() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	ff500ed5a1	svga: add new svga_check_sampler_framebuffer_resource_collision() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	d3d20d650d	svga: remove assertions in svga_surface cast wrappers We don't do this for other cast wrappers. And this will simplify some code at call sites. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	c6e89fa215	svga: minor code simplification in svga_texture_transfer_unmap() Use the tex variable instead of using svga_texture() again. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	fe5a2704ec	svga: reformat some expressions in svga_texture_transfer_map() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	10ef6ddcf9	svga: remove duplicated variable in svga_texture_transfer_map() tex was already declared at the function body scope. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	09d2780b39	svga: move some assignments in svga_texture_transfer_map() Put near other assignments to the svga_transfer variable. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	4a52512666	svga: minor simplifications in svga_texture_transfer_map() Use local vars instead of jumping through a pointer. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	088dd8f45e	svga: minor reformatting of svga_texture() cast wrapper Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	e206f67261	svga: rewrite svga_buffer() cast wrapper To make it symmetric with the svga_texture() cast wrapper. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	c72dcd9a71	svga: remove local variable in create_backed_surface_view() To simplify the code a bit. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Kenneth Graunke	bc13e5f42a	docs: Add GL_KHR_blend_equation_advanced to relnotes.	2016-08-26 13:17:22 -07:00
Mario Kleiner	2cc880cba5	r600: increase performance for DRI PRIME offloading if 2nd GPU is Evergreen+ This is a direct port of Marek Olšák's patch "radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI" to r600. It uses SDMA for the detiling blit from renderoffload VRAM to GTT, as SDMA is much faster for tiled->linear blits from VRAM to GTT. Testing on a dual Radeon HD-5770 setup reduced the time for the render offload gpu to get its rendering into system RAM from approximately 16 msecs for simple rendering at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup! This was measured using ftrace to trace the time the radeon kms driver waited on the dmabuf fence of the renderoffload gpu to complete. All in all this brought the time for a flip down from 20 msecs to 9 msecs, so the prime setup can display at full 60 fps instead of barely 30 fps vsync'ed. The current r600 implementation supports SDMA on Evergreen and later, but not R600/R700 due to some bugs apparently present in their SDMA implementation. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-26 19:57:21 +02:00
Jordan Justen	7970238fcf	docs: Update stencil texturing & ES 3.1 status for i965 Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	93f5eb7ae7	i965: Enable OpenGLES 3.1 for Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	116b6e12d4	i965: Enable ARB_texture_stencil8 for Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	f20f616324	i965: Enable ARB_stencil_texturing for Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	751682434e	i965/gen7: Use R8_UINT stencil copy when sampling the stencil texture v2: * Check gen <= 7, rather than gen == 7. (Ian) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	8d78b096f8	i965/gen7: Copy stencil when sampling the stencil texture Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	7af51b8f03	i965: Add function to copy a stencil miptree to an R8_UINT miptree v2: * Cleanups suggested by Ian, Matt and Topi Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	c8194dc737	i965: Track that the stencil data was updated when using Tex*Image Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	101b56bab2	i965: Track that the stencil data was updated when rendering Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	7bd87c1e6e	i965: Track that the stencil data was updated when clearing Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	2a9c65a01d	i965/gen7: Add R8_UINT stencil miptree copy for sampling For gen < 8, we can't sample from the stencil buffer, which is required for the ARB_stencil_texturing extension. We'll make a copy of the stencil data into a new texture that we can sample using the R8_UINT surface type. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	91627d1956	i965: Fix assert with multisampling and cubemaps Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	b82bb98441	i965/hsw: Adjust uploading default color for stencil surfaces v2: * has_component (Ken); const bits_per_channel (Topi) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	30fee52036	i965/hsw: Don't advertise more than 64 threads for compute shaders thread_width_max in the GPGPU walker command limits us to a maximum of 64 threads. This fixes a crash on Haswell in the OpenGLES 3.1 conformance test suite which tests the advertised limits of the max invocation counts. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	861c9cbee3	main: Add MESA_VERBOSE=api support for glClearStencil Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	9a1f950bef	main: Add MESA_VERBOSE=api support for glTexImage Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Charmaine Lee	0035f7f136	svga: add guest statistic gathering interface This file was supposed to be added with the previous "svga: add guest statistic gathering interface" patch but went MIA for some reason. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 08:04:02 -06:00
Marek Olšák	49c798e902	radeonsi: disable CE on SI + AMDGPU Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	281f1a5980	winsys/amdgpu: disable IB chaining on SI Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	a6869e7c06	winsys/amdgpu: finish up SI addrlib integration Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-26 15:50:10 +02:00
Ronie Salgado	97b55243fb	winsys/amdgpu: initial SI support Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-26 15:50:10 +02:00
Marek Olšák	971ef7518f	gallium/radeon: add a driver query for AMDGPU_INFO_NUM_EVICTIONS If the kernel driver doesn't support it, it returns 0. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	7172906c0c	radeonsi: fix printing shaders and states on a VM fault This was missed while rewriting the PIPE_DUMP flags. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	5ee3cac138	radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI SDMA is much faster for tiled->linear blits from VRAM to GTT. I have Bonaire in my second PCIe slot. $ glxinfo | grep OpenGL.renderer OpenGL renderer string: Gallium 0.4 on AMD TONGA ... $ DRI_PRIME=1 glxinfo | grep OpenGL.renderer OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ... Without SDMA: $ DRI_PRIME=1 glxgears 8796 frames in 5.0 seconds = 1759.074 FPS 8899 frames in 5.0 seconds = 1779.672 FPS With SDMA: $ DRI_PRIME=1 glxgears 12765 frames in 5.0 seconds = 2552.788 FPS 12888 frames in 5.0 seconds = 2577.495 FPS The 1st GPU is irrelevant. The improvement should be much lower at 60 fps, but definitely measurable. SI will get this once we add SDMA blit support for it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	0241d8300f	radeonsi: enable SDMA on CIK It passes R600_DEBUG=testdma on Bonaire/radeon. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	bcfd49e511	gallium/radeon: increase priority for shader binaries Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	c3f716fe67	gallium/radeon: merge USER_SHADER and INTERNAL_SHADER priority flags there's no reason to separate these Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Miklós Máté	b9ac72b511	vbo: set draw_id Fixes conditional jump depending on uninitialized value in si_state_draw.c:593 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 07:34:22 -06:00
Neha Bhende	10f6e08549	svga: fix regression related to srgb This regression was caused by commit `3190c7ee97`. It stems from following the OpenGL 4.4 spec rules for GL_FRAMEBUFFER_SRGB in Mesa. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Neha Bhende	3b7341d547	svga: use local variable blit instead of pointer Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Brian Paul	b09e4ab13c	svga: s/INDEX_0D/INDEX_IMMEDIATE32/ Both are zero, but the latter is the right token.	2016-08-26 06:19:52 -06:00
Brian Paul	93779b87a1	svga: add comment about unsupported blend modes	2016-08-26 06:19:52 -06:00
Charmaine Lee	b1772651b7	svga: fix ordering of mksstats counter strings String for SVGA_STATS_COUNT_TEXREADBACK was swapped with the string for SVGA_STATS_COUNT_SURFACEWRITEFLUSH. Trivial fix.	2016-08-26 06:19:52 -06:00
Charmaine Lee	2781d60375	svga: avoid emitting redundant SetShaderResource command Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:52 -06:00
Charmaine Lee	5313b294e6	svga: add a cleanup function to clean up sampler state This patch adds a cleanup function to clean up sampler state at context destruction time. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:52 -06:00
Brian Paul	e292f38c6c	svga: loosen the condition to flush in get_query_result_vgpu10() Fixes piglit spec/ext_transform_feedback/overflow-edge-cases segfaults because the query's fence pointer was null. Tested with Piglit, Sauerbraten, ETQW. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Brian Paul	99d8fe20ab	svga: fix vgpu10 query fencing We don't want to flush the command buffer or sync on the fence when ending a query (that kind of defeats the whole purpose of async queries). Do that instead in get_query_result(). Tested with Piglit, arbocclude, Sauerbraten game, Nobel Clinician Viewer, ETQW. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Charmaine Lee	3f51a3f6ac	svga: avoid emitting redundant DXSetSamplers command This patch avoids emitting redundant DXSetSamplers commands. Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:52 -06:00
Neha Bhende	6a43148e20	svga: enable ARB_clear_texture extension in the driver. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Neha Bhende	2111795d51	svga: define svga_clear() in svga_init_clear_functions() Put all the clearing related functions in svga_init_clear_functions() Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Neha Bhende	40557ae07c	svga: add svga_init_clear_functions() define svga_init_clear_functions() and svga_clear_texture as svga->pipe.clear_texture. This is part of ARB_clear_texture extension Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Neha Bhende	52d88b67be	svga: add new function svga_clear_texture() To clear texture this function can be used. This is part of ARB_clear_texture extension. Basically this extension allows you to clear texture with given color values. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Neha Bhende	1da538f85b	svga: add new begin_blit() Saving all blitter states will be done in begin_blit() so that begin_blit() can be used before performing any blit operation. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Charmaine Lee	a5fd54f8bf	svga: add opt to the list of valid build types For opt build, add VMX86_STATS to the list of cpp defines. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:51 -06:00
Charmaine Lee	2e1cfcc431	svga: add guest statistic gathering interface With this patch, guest statistic gathering interface is added to svga winsys interface that can be used to gather svga driver statistic. The winsys module can then share the statistic info with the VMX host via the mksstats interface. The statistic enums used in the svga driver are defined in svga_stats_count and svga_stats_time in svga_winsys.h Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:51 -06:00
Charmaine Lee	4791991808	svga: fix indirect non-indexable temp access If the shader has indirect access to non-indexable temporaries, convert these non-indexable temporaries to indexable temporary array. This works around a bug in the GLSL->TGSI translator. Fixes glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test on DX11Renderer. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:51 -06:00
Brian Paul	d221a6545c	gallium/hud: move signo declaration inside PIPE_OS_UNIX block To silence unused var warning with MSVC, MinGW. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-26 06:19:51 -06:00
Chris Wilson	f92a87a140	i965: Embrace "unlimited" GTT mmap support From about kernel 4.9, GTT mmaps are virtually unlimited. A new parameter, I915_PARAM_MMAP_GTT_VERSION, is added to advertise the feature so query it and use it to avoid limiting tiled allocations to only fit within the mappable aperture. A couple of caveats: - fence support is still limited by stride to 262144 and the stride needs to be a multiple of tile_width (as before, and same limitation as the current 3D pipeline in hardware) - the max_gtt_map_object_size forcing untiled may be hiding a few bugs in handling of large objects, though none were spotted in piglits. See kernel commit 4cc6907501ed ("drm/i915: Add I915_PARAM_MMAP_GTT_VERSION to advertise unlimited mmaps"). v2: Include some commentary on mmap virtual space vs CPU addressable space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-08-26 09:09:34 +01:00
Tobias Klausmann	bc5be5323f	mesa/main: Fix missing return in non void function This was found by obs: I: Program returns random data in a function E: Mesa no-return-in-nonvoid-function main/program_resource.c:109 v2: Remove the ! on the string (Ian Romanick) Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 08:46:03 +02:00
Kenneth Graunke	219a451497	i965: Implement GL_KHR_blend_equation_advanced_coherent on Gen9+. We always use a coherent read, and ignore the "opt out" enable flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	1bf9b2a600	mesa: Implement GL_KHR_blend_equation_advanced_coherent. This adds the extension enable (so drivers can advertise it) and the extra boolean state flag, GL_BLEND_ADVANCED_COHERENT_KHR, which can be set to request coherent blending. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	c2b10cabed	i965: Enable GL_KHR_blend_equation_advanced on G45 and later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	40241d40d0	i965: Disable hardware blending if advanced blending is in use. We'll do blending in the shader in this case, so just disable the hardware blending. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	8ab50f5dd1	glsl: Add a lowering pass to handle advanced blending modes. Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and need to emulate it in the pixel shader. This lowering pass implements all the necessary math for advanced blending. It fetches the existing framebuffer value using the MESA_shader_framebuffer_fetch built-in variables, and the previous commit's state var uniform to select which equation to use. This is done at the GLSL IR level to make it easy for all drivers to implement the GL_KHR_blend_equation_advanced extension and share code. Drivers need to hook up MESA_shader_framebuffer_fetch functionality: 1. Hook up the fb_fetch_output variable 2. Implement BlendBarrier() Then to get KHR_blend_equation_advanced, they simply need to: 3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled 4. Call this lowering pass. Very little driver specific code should be required. v2: Handle multiple output variables per render target (which may exist due to ARB_enhanced_layouts), and array variables (even with one render target, we might have out vec4 color[1]), and non-vec4 variables (it's easier than finding spec text to justify not handling it). Thanks to Francisco Jerez for the feedback. v3: Lower main returns so that we have a single exit point where we can add our blending epilogue (caught by Francisco Jerez). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	e299661166	compiler: Add a new STATE_VAR_ADVANCED_BLENDING_MODE built-in uniform. This will be used for emulating GL_KHR_blend_equation_advanced features in shader code. We'll pass in the blending mode that's in use, and use that in (effectively) a switch statement in the shader. v2: Use the new _AdvancedBlendMode field. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	acf57fcf7f	mesa: Add draw time validation for advanced blending modes. v2: Add null checks (requested by Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	75ae338d14	mesa: Restyle _mesa_check_blend_func_error(). I'm about to add more error conditions to this function, so I wanted to move the current spec citation above the code that checks it. Indenting it required reformatting, so I tried to move it to our newer style. While there, I also decided to drop some GL type usage, and drop the unnecessary "_mesa_" prefix on a static function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	0745e039a2	mesa: Track the current advanced blending mode. This will be useful for a number of things: - Checking the current advanced blending mode against the shader's blend_support_* qualifiers. - Disabling hardware blending when emulating advanced blending. - Uploading the current advanced blending mode as a state var. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	74837e3e91	mesa: Allow advanced blending enums in glBlendEquation[i]. Don't allow them in glBlendEquationSeparate[i], though, as required by the spec. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	80df3c030e	glsl: Merge blend_support qualifiers when linking. Since each qualifier represents a blending mode the shader can be used with, we take the union of all possible modes when linking. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	4b6819b407	glsl: process blend_support_* qualifiers v2 (Ken): Add a BLEND_NONE enum value (no qualifiers in use). v3 (Ken): Rename gl_blend_support_qualifier to gl_advanced_blend_mode. v4 (Ken): Mark map[] as static const (Ilia). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	e682f94594	glsl: add basic KHR_blend_equation_advanced infrastructure Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	3b0406457a	mesa: add KHR_blend_equation_advanced enable and extension string Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	a8ae1bc767	glapi: add KHR_blend_equation_advanced dispatch v2 (Ken): Fix enum values, drop _mesa_BlendBarrierKHR stub as Curro has already implemented it. v3 (Ken): Rework for _mesa_BlendBarrierKHR -> _mesa_BlendBarrier rename. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	1a1f4496c6	mesa: Rename _mesa_BlendBarrierMESA to _mesa_BlendBarrier. Note that _mesa_BlendBarrierMESA is not currently hooked up in the glapi XML, so we can just rename it. We'll hook it up for the KHR_blend_equation_advanced extension shortly. We may as well use the ES 3.2 core name with no suffixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	c2fd6b0f5d	i965: Safely iterate the predecessors of the end block. We want to insert code in each of the predecessors of the end block. This code includes a nir_if, which would split the block, altering the set. To avoid that, I emitted a dead constant at the end of each block before splitting it, so that the set of predecessors remained unchanged. This was admittedly ugly. Connor suggested instead saving a copy of the set, so we can iterate it safely. This is also a little ugly, but a much better plan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:24 -07:00
Kenneth Graunke	3203fe3d50	nir: Use nir_shader_get_entrypoint in TCS quad workaround code. We want to insert the code at the end of the program. Looping over all the functions (of which there was only one) was the old way of doing this, but now we have nir_shader_get_entrypoint(), so let's use it. Suggested by Connor Abbott. v2: Update for nir_shader_get_entrypoint API change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:24 -07:00
Kenneth Graunke	93bfa1d7a2	nir: Change nir_shader_get_entrypoint to return an impl. Jason suggested adding an assert(function->impl) here. All callers of this function actually want ->impl, so I decided just to change the API. We also change the nir_lower_io_to_temporaries API here. All but one caller passed nir_shader_get_entrypoint(), and with the previous commit, it now uses a nir_function_impl internally. Folding this change in avoids the need to change it and change it back. v2: Fix one call I missed in ir3_compiler (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:24 -07:00
Kenneth Graunke	8479b03c58	nir: Make nir_lower_io_to_temporaries store an impl internally. This changes the pass internals to work with a nir_function_impl directly rather than a nir_function. The next patch will change the API. v2: Rebase after framebuffer fetch landed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:11 -07:00
Francisco Jerez	da85b5a9f1	i965: Expose shader framebuffer fetch extensions on Gen9+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:09 -07:00
Francisco Jerez	4135fc22ff	i965/fs: Hook up coherent framebuffer reads to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:09 -07:00
Francisco Jerez	be12a1f36e	i965/fs: Remove special casing of framebuffer writes in scheduler code. The reason why it was safe for the scheduler to ignore the side effects of framebuffer write instructions was that its side effects couldn't have had any influence on any other instruction in the program, because we weren't doing framebuffer reads, and framebuffer writes were always non-overlapping. We need actual memory dependency analysis in order to determine whether a side-effectful instruction can be reordered with respect to other instructions in the program. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:09 -07:00
Francisco Jerez	3daa0fae4b	i965/fs: Don't CSE render target messages with different target index. We weren't checking the fs_inst::target field when comparing whether two instructions are equal. For FB writes it doesn't matter because they aren't CSE-able anyway, but this would have become a problem with FB reads which are expression-like instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	db123df747	i965/fs: Define logical framebuffer read opcode and lower it to physical reads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	f2f75b0cf0	i965/fs: Define framebuffer read virtual opcode. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	71d639f69e	i965/disasm: Fix RC message type strings on Gen7+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	26ac16fe2f	i965/eu: Add codegen support for the Gen9+ render target read message. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	29eb8059fd	i965/eu: Take into account the target cache argument in brw_set_dp_read_message. brw_set_dp_read_message() was setting the data cache as send message SFID on Gen7+ hardware, ignoring the target cache specified by the caller. Some of the callers were passing a bogus target cache value as argument relying on brw_set_dp_read_message not to take it into account. Fix them too. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	8a2f19a777	i965: Flip the non-coherent framebuffer fetch extension bit on G45-Gen8 hardware. This is not enabled on the original Gen4 part because it lacks surface state tile offsets so it may not be possible to sample from arbitrary non-zero layers of the framebuffer depending on the miptree layout (it should be possible to work around this by allocating a scratch surface and doing the same hack currently used for render targets, but meh...). On Gen9+ even though it should mostly work (feel free to force-enable it in order to compare the coherent and non-coherent paths in terms of performance), there are some corner cases like 1D array layered framebuffers that cannot be handled easily by the non-coherent path because of the incompatible layout in memory of 1D and 2D miptrees (it should be possible to work around this too by doing state-dependent recompiles, but it's hard to care enough since Gen9 has native support for coherent render target reads...) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	ecc4800383	i965: Implement glBlendBarrier. This is a no-op if the platform supports coherent framebuffer fetch. If it doesn't, we just need to flush the render cache and invalidate the texture cache in order for previous rendering to be visible to framebuffer fetch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	786108e7b2	i965: Upload surface state for non-coherent framebuffer fetch. This iterates over the list of attached render buffers and binds appropriate surface state structures to the binding table block allocated for shader framebuffer read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	dc96968dbf	i965: Implement support for overriding the texture target in brw_emit_surface_state. This allows the caller to bind a miptree using a texture target other than the one it was created with. The code should work even if the memory layouts of the specified and original targets don't match, as long as the caller only intends to access a single slice of the miptree structure. This will be exploited by the next commit in order to support non-coherent framebuffer fetch of a single layer of a 3D texture (since some generations lack the minimum array element control for 3D textures bound to the sampler unit), and multiple layers of a 1D array texture (since binding it as an actual 1D array texture would require state-dependent recompiles because the same shader couldn't simultaneously work for 1D and 2D array textures due to the different texel fetch coordinate ordering). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	49ea2bd175	i965: Massage argument list of brw_emit_surface_state(). This commit does three different things in a single pass in order to keep the amount of churn low: Remove the for_gather boolean argument which was unused, pass the isl_view argument by value rather than by reference since I'll have to modify it from within the function, and add a target argument to allow callers to bind textures using a target other than the original. The prototype of the function now looks like: void brw_emit_surface_state(struct brw_context *brw, struct intel_mipmap_tree *mt, GLenum target, struct isl_view view, uint32_t mocs, uint32_t *surf_offset, int surf_index, unsigned read_domains, unsigned write_domains); Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	74e4baec59	i965: Add missing has_surface_tile_offset flag to the Gen8+ device info structures. This surface state control has been supported by all hardware generations since G45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	0fe732e66f	i965: Return the correct layout from get_isl_dim_layout for pre-ILK cube textures. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	5759eb458b	i965: Factor out isl_surf_dim/isl_dim_layout calculation into functions. The logic to calculate the right layout and dimensionality for a given GL texture target is going to be useful elsewhere, so factor it out from intel_miptree_get_isl_surf(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	99fb167839	i965: Resolve color for non-coherent FB fetch at UpdateState time. This is required because the sampler unit used to fetch from the framebuffer is unable to interpret non-color-compressed fast-cleared single-sample texture data. Roughly the same limitation applies for surfaces bound to texture or image units, but unlike texture sampling, non-coherent framebuffer fetch is by definition non-coherent with previous rendering, so the brw_render_cache_set_check_flush() call can be omitted except after resolve. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	071665c161	i965: Return whether the miptree was resolved from intel_miptree_resolve_color(). This will allow optimizing out the cache flush in some cases when resolving wasn't necessary. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	f24e393bd5	i965/fs: Translate nir_intrinsic_load_output on a fragment output. This gets the non-coherent framebuffer fetch path hooked up to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	b00a236d6a	i965/fs: Allocate fragment output temporaries on demand. This gets rid of the duplication of logic between nir_setup_outputs() and get_frag_output() by allocating fragment output temporaries lazily whenever get_frag_output() is called. This makes nir_setup_outputs() a no-op for the fragment shader stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	7dac882073	i965/fs: Rework representation of fragment output locations in NIR. The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of a given output within the nir_outputs array is dependent on the type and size of all previously allocated outputs. Instead this defines the driver location of an output to be the pair formed by its GLSL-assigned location and index (I've borrowed the bitfield macros from brw_defines.h in order to represent the pair of integers as a single scalar value that can be assigned to nir_variable_data::driver_location). nir_assign_var_locations is no longer useful for fragment outputs. Because fragment outputs are now allocated independently rather than within the nir_outputs array, the get_frag_output() helper becomes necessary in order to obtain the right temporary register for a given location-index pair. The type_size helper passed to nir_lower_io is now type_size_dvec4 rather than type_size_vec4_times_4 so that output array offsets are provided in terms of whole array elements rather than in terms of scalar components (dvec4 is the largest vector type supported by GLSL so this will cause all individual fragment outputs to have a size of one regardless of the type). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	4e990b67ce	i965: Fix undefined signed overflow in INTEL_MASK for bitfields of 31 bits. Most likely we had only ever used this macro on bitfields of less than 31 bits -- That's going to change shortly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
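A minimal sketch of the overflow this commit describes, assuming the mask is built by shifting the literal 1. `FIELD_MASK` and `field_get` are hypothetical stand-ins, not the Mesa macro itself: with a signed `1`, a 31-bit field needs `1 << 31`, which overflows `int`, whereas an unsigned literal keeps the arithmetic well defined.

```c
#include <stdint.h>

/* Hypothetical stand-in for a bitfield mask macro (not copied from Mesa).
 * Using 1u instead of 1 keeps the shift in unsigned arithmetic, so a
 * 31-bit field (high - low + 1 == 31) no longer overflows a signed int. */
#define FIELD_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1) << (low))

/* Extract the field [high:low] from a packed 32-bit value. */
static inline uint32_t
field_get(uint32_t value, unsigned high, unsigned low)
{
   return (value & FIELD_MASK(high, low)) >> low;
}
```

With this definition `FIELD_MASK(30, 0)` evaluates to `0x7fffffff`. A full 32-bit field would still need separate handling, since shifting a 32-bit value by 32 is undefined as well.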
Francisco Jerez	f3cb2c34f2	i965/fs: Special-case nir_intrinsic_store_output for the fragment shader. I'm about to change how fragment shader output locations are represented, so the generic nir_intrinsic_store_output implementation that assumes that outputs are just contiguous elements in the big nir_outputs array won't work anymore. This somewhat simplified implementation of nir_intrinsic_store_output for fragment shaders should be functionally equivalent to the current fall-back one. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	af0cc743e6	i965/fs: Implement non-coherent framebuffer fetch using the sampler unit. v2: Memoize sample ID, misc codestyle changes. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	fe6abb5755	i965/fs: Emit interpolation setup if non-coherent framebuffer fetch is in use. This will be required for the next commit since the non-coherent path makes use of the fragment coordinates implicitly, so they need to be calculated. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	98d61ee083	i965/fs: Force per-sample dispatch if the shader reads from a multisample FBO. The result of a framebuffer fetch from a multisample FBO is inherently per-sample, so the spec requires at least those sections of the shader that depend on the framebuffer fetch result to be executed once per sample. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	08705badfe	i965: Allocate space in the binding table for non-coherent FB fetch. Unfortunately due to the inconsistent meaning of some surface state structure fields, we cannot re-use the same binding table entries for sampling from and rendering into the same set of render buffers, so we need to allocate a separate binding table block specifically for render target reads if the non-coherent path is in use. The slight noise is due to the change of brw_assign_common_binding_table_offsets to return the next available binding table index rather than void. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	40b23ad57e	i965/fs: Add brw_wm_prog_key bit specifying whether FB reads should be coherent. Some of the following changes in this series are specific to the non-coherent path, so I need some way to tell whether the coherent or non-coherent path is in use. The flag defaults to the value of the gl_extensions::MESA_shader_framebuffer_fetch enable so that it can be overridden easily on hardware that supports both framebuffer fetch extensions in order to test the non-coherent path, like: MESA_EXTENSION_OVERRIDE=-GL_EXT_shader_framebuffer_fetch (Of course trying to force-enable the coherent framebuffer fetch extension on hardware without native support won't work and will lead to assertion failures). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	4a87e4ade7	i965/fs: Get rid of fs_visitor::do_dual_src. This boolean flag was being used for two different things: - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can just set it based on whether the dual_src_output register is valid, which will be the case if the shader writes the secondary blending color. - To decide whether to call emit_single_fb_write() once, or in a loop that would iterate only once, which seems pretty useless. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:00 -07:00
Francisco Jerez	aee3d8f0d9	nir: Handle FB fetch outputs correctly in nir_lower_io_to_temporaries. This requires emitting a series of copies at the top of the program from each output variable to the corresponding temporary. The initial copy can be skipped for non-framebuffer fetch outputs whose initial value is undefined, and the final copy needs to be skipped for read-only outputs (i.e. gl_LastFragData), since it would be illegal to emit a store output intrinsic for it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:33:29 -07:00
Francisco Jerez	97ac3eba58	nir: Pass through fb_fetch_output and OutputsRead from GLSL IR. The NIR representation of framebuffer fetch is the same as the GLSL IR's until interface variables are lowered away, at which point it will be translated to load output intrinsics. The GLSL-to-NIR pass just needs to copy the bits over to the NIR program. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:33:29 -07:00
Eric Anholt	00c72acba5	vc4: Add support for fddx/fddy Based vaguely on a patch by jonasarrow on github.	2016-08-25 17:24:11 -07:00
Eric Anholt	e763e19808	vc4: Add register allocation support for MUL output rotation. We need the source to be in r0-r3, so make a new register class for it. It will be up to the surrounding passes to make sure that the r0-r3 allocation of its source won't conflict with any other class requirements on that temp.	2016-08-25 17:24:11 -07:00
Eric Anholt	8ce6526178	vc4: Add support for MUL output rotation. Extracted from a patch by jonasarrow on github.	2016-08-25 17:24:11 -07:00
Eric Anholt	074f1f3c0c	vc4: Add support for the 2-bit LOAD_IMM variants. Extracted and fixed up from a patch by jonasarrow on github. This ended up not getting used for ddx/ddy, but seems like it might still be useful.	2016-08-25 17:24:11 -07:00
Eric Anholt	3da4e38f48	vc4: Add QPU scheduling to handle MUL rotate sources. We need MUL rotates to do ddx/ddy support.	2016-08-25 17:24:11 -07:00
Eric Anholt	b0b99a7952	vc4: Add disassembly for constant MUL rotates	2016-08-25 17:24:11 -07:00
Eric Anholt	b160708e03	vc4: Add real validation for MUL rotation. Caught problems in the upcoming DDX/DDY implementation.	2016-08-25 17:24:11 -07:00
Eric Anholt	31da39ddc9	vc4: Add a QIR value for the QPU element register. This will be used in the ddx/ddy support for "Am I the top half?" or "Am I the left half?" checks.	2016-08-25 17:24:11 -07:00
Chad Versace	5b03975889	i965: Respect miptree offsets in intel_readpixels_tiled_memcpy() Respect intel_miptree_slice::x_offset,y_offset and intel_mipmap_tree::offset. All three may be non-zero when glReadPixels is called on an EGLImage created from the non-base slice of a miptree. Patch 2/2 that fixes test 'dEQP-EGL.functional.image.create.gles2_cubemap_*'. Reported-by: Haixia Shi <hshi@chromium.org> Diagnosed-by: Haixia Shi <hshi@chromium.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Change-Id: I4b397b27e55a743a7094d29fb0a6a4b6b34352b0	2016-08-25 16:52:00 -07:00
Chad Versace	c82f99e883	i965: Fix miptree layout for EGLImage-based renderbuffers When glEGLImageTargetRenderbufferStorageOES() was given an EGLImage created from the non-base slice of a miptree, intel_image_target_renderbuffer_storage() forgot to apply the intra-tile offsets __DRIimage::tile_x,tile_y to the miptree layout. This patch fixes the problem with a quick hack suitable for cherry-picking. A proper fix requires more thorough plumbing in intel_miptree_create_layout() and brw_tex_layout(). Patch 1/2 that fixes test 'dEQP-EGL.functional.image.create.gles2_cubemap_*'. Reported-by: Haixia Shi <hshi@chromium.org> Diagnosed-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Change-Id: I8a64b0048a1ee9e714ebb3f33fffd8334036450b	2016-08-25 16:52:00 -07:00
Jason Ekstrand	bebc1a1d99	intel: Flatten the makefile structure This pulls isl and genxml into a single make file so that they can properly build in parallel. This isn't terribly important now as genxml just generates sources which happens serially first anyway but it will be more important as we add more stuff to src/intel. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-25 15:29:48 -07:00
Jason Ekstrand	c19fc5e019	isl/tests: Use a longer path for isl.h The tests assumed that isl would be in the include path but that usually isn't the case. Instead, we usually have src/intel and you need to add an "isl/" prefix. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-25 15:29:47 -07:00
Jason Ekstrand	8bdf605214	intel/isl/gen9: Only use the magic 1D alignment for GEN9_1D surfaces If the surface has a layout of GEN4_2D then we need to compute a normal 2D alignment and not use the magic linear 1D alignment. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-25 14:11:15 -07:00
Jason Ekstrand	cda1a5dc0e	intel/isl: Pass the dim_layout into choose_alignment_el Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-25 14:10:43 -07:00
Jason Ekstrand	f68cfb05fa	intel/isl: Use DIM_LAYOUT_GEN4_2D for tiled 1-D surfaces on SKL The Sky Lake 1D layout is only used if the surface is linear. For tiled surfaces such as depth and stencil the old gen4 2D layout is used. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-25 14:09:44 -07:00
Jason Ekstrand	78715c7211	nir/phi_builder: Don't recurse in value_get_block_def In some programs, we can have very deep dominance trees and the recursion can cause us to risk stack overflows. Instead, we replace the recursion with a pair of loops, one at the start and one at the end. This is functionally equivalent to what we had before and it's actually a bit easier to read in the new form without the recursion. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-25 14:08:07 -07:00
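The recursion-to-loops change described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the NIR code: `struct block`, `NO_DEF`, `UNDEF_DEF` and `get_block_def` are hypothetical, and an `int` stands in for an SSA definition. The first loop climbs the immediate-dominator chain until it finds a block with a known definition; the second loop walks the same chain again and caches the result in every block that was missing one.

```c
#include <stddef.h>

/* Hypothetical, simplified block type; real NIR blocks carry much more. */
struct block {
   struct block *imm_dom;  /* immediate dominator, NULL at the root */
   int def;                /* cached definition, NO_DEF if not yet known */
};

#define NO_DEF 0          /* no definition cached for this block yet */
#define UNDEF_DEF (-1)    /* stands in for "value is undefined here" */

static int
get_block_def(struct block *block)
{
   /* Loop 1: climb the dominance tree to the closest block with a def. */
   struct block *dom = block;
   while (dom && dom->def == NO_DEF)
      dom = dom->imm_dom;

   int def = dom ? dom->def : UNDEF_DEF;

   /* Loop 2: walk the same chain again and cache the result, so later
    * queries on these blocks return immediately. */
   for (dom = block; dom && dom->def == NO_DEF; dom = dom->imm_dom)
      dom->def = def;

   return def;
}
```

Both loops use constant stack space, so the depth of the dominance tree no longer matters.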
Chad Versace	3eddf5219e	.mailmap: Update my address again I joined Google's Chrome OS graphics team.	2016-08-25 13:55:52 -07:00
Matt Turner	e53130cc27	nir: Walk blocks in source code order in lower_vars_to_ssa. Prior to this commit rename_variables_block() is recursively called, performing a depth-first traversal of the control flow graph. The function uses a non-trivial amount of stack space for local variables, which puts us in danger of smashing the stack, given a sufficiently deep dominance tree. XCOM: Enemy Within contains a shader with such a dominance tree (1574 nir_blocks in total, depth of at least 143). Jason tells me that he believes that any walk over the nir_blocks that respects dominance is sufficient (a DFS might have been necessary prior to the introduction of nir_phi_builder). In fact, the introduction of nir_phi_builder made the problem worse: rename_variables_block() walks to the bottom of the dominance tree before calling nir_phi_builder_value_get_block_def(), which walks back to the top of the dominance tree... In any case, this patch ensures we avoid that problem as well. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 13:45:39 -07:00
Marek Olšák	a491b9e945	radeonsi: don't use allocas for arrays with LLVM 3.8 It crashes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413	2016-08-25 21:19:17 +02:00
Marek Olšák	fe91ae06d3	gallium/radeon: unify and simplify checking for an empty gfx IB We can take advantage of the fact that multi_fence does the obvious thing with NULL fences. This fixes unflushed fences that can get stuck due to empty IBs.	2016-08-25 21:19:17 +02:00
Matt Turner	e6673e7ac2	mesa: Drop sed of now dead Plo files. gen6/7/8_blorp.c were removed in commits `c8bc1ae96a`, `e198983c61`, and `16a9fcbbb6` respectively.	2016-08-25 11:20:54 -07:00
Kenneth Graunke	6cf8708ce5	meta: Always do GenerateMipmaps in linear colorspace. When generating mipmaps for sRGB textures, force both decode and encode, so the filtering is done in linear colorspace, regardless of settings. Fixes a WebGL conformance test in Chrome: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/misc/tex-srgb-mipmap.html?webglVersion=2 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97322 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-25 11:07:01 -07:00
Eric Engestrom	ed871af91c	configure.ac: raise Mako required version to 0.8.0 It seems [0] old versions of Mako are no longer supported. Emil mentioned it might need v0.8.0 [1] for isl_format_layout [2], although I didn't get a confirmation that it's really the minimum. Let's raise it to that to avoid getting other bugs. We might lower it a bit again later if it turns out we can. [0] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122772.html [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122775.html [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123278.html Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dave Airlie <Airlied@redhat.com>	2016-08-25 16:51:27 +01:00
Brian Paul	2a2dc416b6	swrast: fix incorrectly positioned putImage() in swrast driver Some front buffer rendering was in the wrong position. This included scissored clears, glDrawPixels and glCopyPixels. The problem was the y coordinate passed to putImage() didn't match the y coordinate passed to getImage(). We fix this by setting xrb->map_y to the inverted coordinate in swrast_map_renderbuffer() which is used later by the putImage() call. Also pass xrb->map_y to getImage() to be symmetric. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97426 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-25 07:19:35 -06:00
Marek Olšák	3ff0b67e1b	radeonsi: disable SDMA texture copying on Carrizo Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-08-25 14:51:08 +02:00
Marek Olšák	1276316d67	gallium/noop: use 3-space indentation Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-25 14:09:48 +02:00
Marek Olšák	9daaa6f5a6	gallium: add a pipe_context parameter to resource_get_handle radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL interop and this is the only way to make it coherent with the current context. It can optionally be set to NULL. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-25 14:09:48 +02:00
Nicolai Hähnle	b662c70aea	st/mesa: fix sRGB BlitFramebuffer regression Broken since: `3190c7ee97` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97285 Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-25 13:21:05 +02:00
Michel Dänzer	1e3218bc5b	loader/dri3: Overhaul dri3_update_num_back Always use 3 buffers when flipping. With only 2 buffers, we have to wait for a flip to complete (which takes non-0 time even with asynchronous flips) before we can start working on the next frame. We were previously only using 2 buffers for flipping if the X server supports asynchronous flips, even when we're not using asynchronous flips. This could result in bad performance (the referenced bug report is an extreme case, where the inter-frame stalls were preventing the GPU from reaching its maximum clocks). I couldn't measure any performance boost using 4 buffers with flipping. Performance actually seemed to go down slightly, but that might have been just noise. Without flipping, a single back buffer is enough for swap interval 0, but we need to use 2 back buffers when the swap interval is non-0, otherwise we have to wait for the swap interval to pass before we can start working on the next frame. This condition was previously reversed. Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260 Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-25 17:40:24 +09:00
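The buffer-count policy this commit message spells out can be condensed into a small sketch. `choose_num_back` is a hypothetical helper name, and the function only restates the rules from the message (three buffers when flipping, otherwise two for a non-zero swap interval and one for interval 0); it is not the loader code.

```c
#include <stdbool.h>

/* Hypothetical helper restating the policy from the commit message. */
static int
choose_num_back(bool flipping, int swap_interval)
{
   /* Flipping: a third buffer avoids waiting for the flip to complete. */
   if (flipping)
      return 3;

   /* Not flipping: a non-zero swap interval needs a second back buffer so
    * the next frame can start before the interval has passed; with swap
    * interval 0 a single back buffer is enough. */
   return swap_interval != 0 ? 2 : 1;
}
```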
Jason Ekstrand	2301705dee	anv: Include the pipeline layout in the shader hash The pipeline layout affects shader compilation because it is what determines binding table locations as well as whether or not a particular buffer has dynamic offsets. Since this affects the generated shader, it needs to be in the hash. This fixes a bunch of CTS tests now that the CTS is using a pipeline cache. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-24 20:42:05 -07:00
Jason Ekstrand	05f36435ef	anv: Add a --disable-vulkan-icd-full-driver-path option This option makes installed Vulkan ICD files contain only a driver library name and not a path. This is intended for distros to help them work around multi-arch issues. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-08-25 10:32:31 +10:00
Francisco Jerez	c8f5bd2c99	i965/fs: Don't consider the stencil output to be a color output. This would cause gl_FragStencilRef to be counted as a color output incorrectly during the precompile phase, which leads to unnecessary recompilation on master and could trigger an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	2018371692	glsl: Keep track of the set of fragment outputs read by a GL program. This is the set of shader outputs whose initial value is provided to the shader by some external means when the shader is executed, rather than computed by the shader itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	711213fb72	glsl: Don't consider read-only fragment outputs to be written to. Since they cannot be written. This prevents adding fragment outputs to the OutputsWritten set that are only read from via the gl_LastFragData array but never written to. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	913ae618c6	glsl/linker: Allow fragment output overlap for gl_LastFragData. gl_LastFragData overlaps gl_FragData by definition. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	6b3d23dcc0	glsl/ast: Allow redeclaration of gl_LastFragData with different precision qualifier. v2: No need to check the GLSL version. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	5e1d34394e	glsl: Don't attempt to do dead varying elimination on gl_LastFragData arrays. Apparently this pass can only handle elimination of a single built-in fragment output array, so the presence of gl_LastFragData (which it wouldn't split correctly anyway) could prevent it from splitting the actual gl_FragData array. Just match gl_FragData by name since it's the only built-in it can handle. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	6b33eab959	glsl: Define a gl_LastFragData built-in for older GLSL versions. The EXT_shader_framebuffer_fetch extension defines alternative language for GLES2 shaders where user-defined fragment outputs are not allowed. Instead of using inout user-defined fragment outputs the shader is expected to read from the gl_LastFragData built-in array. In addition this allows using the same language on desktop GLSL versions prior to 4.2 that support the deprecated gl_FragData built-in in preparation for the MESA_shader_framebuffer_fetch desktop GL extension. Both legacy and user-defined inout outputs have a common representation at the GLSL IR level, so it shouldn't make any difference for optimization passes and back-ends whether the application is using gl_LastFragData or user-defined outputs, all they'll see is a variable dereference of a fragment output at a certain interface location with the fb_fetch_output bit set to one. v2: Don't define the built-in variable on GLSL versions for which gl_FragData exists but is deprecated. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	19e929a177	glsl: Handle the inout qualifier in fragment shader output declarations. According to the EXT_shader_framebuffer_fetch extension the inout qualifier can be used on ESSL 3.0+ shaders to declare a special kind of fragment output that gets implicitly initialized with the previous framebuffer contents at the current fragment coordinates. In addition we allow using the same language to define FB fetch outputs in GLSL 1.3+ shaders in preparation for the desktop MESA_shader_framebuffer_fetch extensions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	b49d8f20f4	glsl: Add support for representing framebuffer fetch in the GLSL IR. The GLSL IR representation of framebuffer fetch amounts to a single bit in the ir_variable object applicable to fragment shader outputs. The flag indicates that the variable will be implicitly initialized to the previous contents of the render buffer at the same fragment coordinates and sample index. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	d7cd7b9c49	glsl: Add parser state enables for the framebuffer fetch extensions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	303fb5881c	mesa: Add blend barrier entry point and driver hook. Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent variant of KHR_blend_equation_advanced will use this driver hook to request coherency between framebuffer reads and writes. This intentionally doesn't hook up glBlendBarrierMESA to the dispatch layer since the extension isn't exposed to applications yet, see [1] for more details. [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	6a976bbf84	mesa: Move shader memory barrier functions into barrier.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	83d2f9db29	mesa: Rename "texturebarrier" source files to "barrier". In preparation for collecting all pipeline barrier GL entry points into a single source file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	642aa58577	mesa: Add support for querying GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT. This can currently only give true as result since the only way you can expose EXT_shader_framebuffer_fetch right now is by flipping the MESA_shader_framebuffer_fetch bit, but that could potentially change in the future, see [1] for an explanation. [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	115a27357c	mesa: Add extension enables for framebuffer fetch extensions. This allows drivers to expose EXT_shader_framebuffer_fetch in GLES2+ contexts if desired. Note that this adds boolean flags for two MESA extensions, but only the EXT GLES-only extension is exposed for the moment, see the cover letter of this series [1] for the rationale. [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	acb12a1228	glapi: Add XML for GL_EXT_shader_framebuffer_fetch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Samuel Pitoiset	a227b0a4f1	nvc0: invalidate textures/samplers on GK104+ Like Fermi, textures and samplers are aliased between 3D and compute, especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate these resources when switching between the two pipelines. This fixes a GPU hang with Elemental (and most likely with other UE4 demos). Tested on GK107 and GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> CC: <mesa-stable@lists.freedesktop.org>	2016-08-24 22:26:36 +02:00
Rhys Kidd	c9c989763a	gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization Duplicate line is currently on 1535. Identified by Clang, when run through Eric Anholt's Travis harness. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-24 11:54:50 -07:00
Eric Anholt	78ab62b1e9	travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Eric Anholt	084678ccbb	travis: Enable vc4 in libdrm to satisfy vc4 test build dependency. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Eric Anholt	80a872f3f0	travis: Update to the Ubuntu Trusty image. This will hopefully fix wget from x.org (no real reason explained in Travis CI bug reports), and may also mean that we can enable LLVM driver builds. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Eric Anholt	ecbc76cf6e	travis: Parse configure.ac to pick an updated LIBDRM_VERSION. Travis has been broken a couple of times by configure.ac updates. To make it useful, auto-update the version necessary. This could potentially be used for other dependencies, too, but those get bumped less frequently. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Lionel Landwerlin	91987c51e3	anv: meta_blit2d: adapt texel fetch pitch for fake w-tiled We need to compute detiling coordinates using the physical size of W tiling (128x32) rather than the logical size (64x64). v2: Correct comment (Jason) Fixes dEQP-VK.api.copy_and_blit.image_to_image_stencil Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97448 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-24 11:29:23 -07:00
Eric Anholt	87a88f2daa	vc4: Fix GPU hangs with >16 varying values. Fixes glsl-routing in piglit and hangs in glbenchmark 2.0.2.	2016-08-24 10:43:22 -07:00
Leo Liu	5277f25480	vl/rbsp: fix another three byte not detected This happens when the three-byte sequence "00 00 03" is only partly loaded into vlc->buffer: the valid bits at the bottom of the buffer end in "00" or "00 00", and the remaining "00 03" or "03" is still in the data, so the three-byte emulation check does not detect it. The reason is that the escaped bit was set to 0 by the rbsp init. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-08-24 11:17:16 -04:00
Marek Olšák	2c13abb491	radeonsi: fix VM faults due to NULL internal const buffers on CIK They are harmless, but the interrupts do decrease performance. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039 Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-08-24 15:39:57 +02:00
Tomasz Figa	577f85e2bb	gallium/winsys/kms: Look up the GEM handle after importing a prime FD drmPrimeHandleToFD() will return the same GEM handle every time the same buffer is imported, even from a different prime FD. Since GEM handles are not reference counted, we need to make sure that each GEM handle is referenced only by one display target struct, by looking it up in kms_sw->bo_list first and bumping the refcount of the found dt on hit and falling back to creating a new dt only on miss. v2: Split into separate function. Use helper function for lookup. v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan) Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:23 +01:00
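A sketch of the lookup-before-create pattern this commit describes, assuming a simple singly linked list keyed by GEM handle. The type and function names (`display_target`, `dt_find_and_ref`, `dt_import`) are hypothetical and the list is a stand-in for the winsys bookkeeping; the point is only that a hit bumps the reference count of the existing entry, and a new entry is created on a miss.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified display target record. */
struct display_target {
   uint32_t handle;              /* GEM handle, unique per buffer */
   int ref_count;
   struct display_target *next;
};

/* Return the entry that already owns this handle, taking a reference,
 * or NULL if the handle is not tracked yet. */
static struct display_target *
dt_find_and_ref(struct display_target *list, uint32_t handle)
{
   for (struct display_target *dt = list; dt; dt = dt->next) {
      if (dt->handle == handle) {
         dt->ref_count++;
         return dt;
      }
   }
   return NULL;
}

/* Import path: reuse the existing entry on a hit, create one on a miss. */
static struct display_target *
dt_import(struct display_target **list, uint32_t handle)
{
   struct display_target *dt = dt_find_and_ref(*list, handle);
   if (dt)
      return dt;

   dt = calloc(1, sizeof(*dt));
   if (!dt)
      return NULL;

   dt->handle = handle;
   dt->ref_count = 1;
   dt->next = *list;
   *list = dt;
   return dt;
}
```

Importing the same handle twice therefore yields one entry with a reference count of 2, so the handle is closed only when the last user releases it.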
Tomasz Figa	0465c72d46	gallium/winsys/kms: Move display target handle lookup to separate function As a preparation to use the lookup in more than one place, move the code that looks up given KMS/GEM handle to a separate function. This change should not introduce any functional changes. v2: Split into separate patch. Move lookup code into separate function. v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan) Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:23 +01:00
Tomasz Figa	e71b78ebf9	gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2) Currently kms_sw_displaytarget_add_from_prime() allocates the struct and fills in only some of the fields, resulting in a half-baked struct that needs to be further completed by the caller. To make this a bit more consistent, pass width, height and stride to this function and fill in everything there, so that caller can take the returned struct as is. v2: Split from one big patch into four fixing one thing at a time. Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:23 +01:00
Tomasz Figa	0aa6a818ef	gallium/winsys/kms: Fix double refcount when importing from prime FD (v2) Currently the code creates a display target struct with refcount field initialized to 1 and then the caller again increments it, leading to a leaked reference. Let's remove the unnecessary increment. v2: Split from one big patch into four fixing one thing at a time. Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:22 +01:00
Alejandro Piñeiro	b4959e17f1	shaderapi: don't generate not linked error on GetProgramStage in general Neither ARB_shader_subroutine nor the GL core spec lists any error when the program is not linked. We left an error generation for the uniform location, in order to be consistent with other methods from the spec that generate them. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-24 14:57:13 +02:00
Eric Engestrom	9411eb67ec	gallium/cso: avoid unnecessary null dereference The label `out:` calls `destroy()` which dereferences `ctx`. This is unnecessary as there is nothing to destroy. Immediately return instead. CovID: 1258255 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-24 11:35:05 +01:00
Eric Engestrom	2f86582b92	.gitignore: Ignore tags generated by `make tags` Signed-off-by: Eric Engestrom <eric@engestrom.ch> [Emil Velikov: rebase] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:33:48 +01:00
Eric Engestrom	f6b9fb6e4c	st/xvmc: fix a couple 'unused-but-set-variable' warnings Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:32:00 +01:00
Eric Engestrom	49dad1aafd	egl: turn a couple asserts static (compile-time) Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:30:15 +01:00
Eric Engestrom	8af1b540c5	i915: remove unnecessary `if` if (x) return true; else return false; can be simplified as: return x; since `x` is already a boolean expression. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:17:05 +01:00
Eric Engestrom	253274351f	i965: remove unnecessary `if` if (x) return true; else return false; can be simplified as: return x; since both `x` are already boolean expressions. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:17:05 +01:00
Alejandro Piñeiro	07fe2d565b	program_resource: subroutine active uniforms should return NumSubroutineUniforms Before this commit, GetProgramInterfaceiv for pname ACTIVE_RESOURCES and all the <shader>_SUBROUTINE_UNIFORM programInterface were returning the count of resources on the shader program using that interface, instead of the num of uniform resources. This would get a wrong value (for example) if the shader has an array of subroutine uniforms. Note that this means that in order to get a proper value, the shader needs to be linked, something that is not explicitly mentioned on ARB_program_interface_query spec, but comes from the general definition of active uniform. If the program is not linked we return 0. v2: don't generate an error if the program is not linked, returning 0 active uniforms instead, plus extra spec references (Tapani Palli) Fixes GL44-CTS.program_interface_query.subroutines-compute Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-24 11:33:04 +02:00
Stencel, Joanna	690ead4a13	egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface. Segfault occurs when destroying EGL surface attached to already destroyed Wayland window. The fix is to set to NULL the pointer of surface's native window when wl_egl_destroy_window() is called. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Stencel, Joanna <joanna.stencel@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 10:18:13 +01:00
Kai Wasserbäch	f033d97155	st/va: Remove unused variable coded_size from vlVaEndPicture() Removes the following GCC warning: ../../../../../src/gallium/state_trackers/va/picture.c:542:17: warning: unused variable 'coded_size' [-Wunused-variable] unsigned int coded_size; ^~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-24 10:35:53 +02:00
Kai Wasserbäch	83d08d4cab	st/va: Remove else case in vlVaEndPicture() made superfluous by `c59628d11b` Commit `c59628d11b` made the else statement and duplication of the context->decoder->end_frame() call superfluous. Cc: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-24 10:35:20 +02:00
Eric Engestrom	cd340052ad	st/va: add missing mutex_unlock Fixes: `c59628d11b` ("st/va: enable dual instances encode by sync surface") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-24 10:33:07 +02:00
Kenneth Graunke	e7530bfcd6	aubinator: Style fixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-23 21:19:58 -07:00
Sirisha Gandikota	56ba9656bb	aubinator: Fix the tool to correctly decode the DWords Several fixes have been added as part of this as listed below: 1) Fix the mask and add disassembler handling for STATE_DS, STATE_HS as the mask returned wrong values of the fields. 2) Fix the GEN_TYPE_ADDRESS/GEN_TYPE_OFFSET decoding - the address/ offset were handled the same way as the other fields and that gives the wrong values for the address/offset. 3) Decode nested/recursive structures - Many packets contain nested structures, ex: 3DSTATE_SO_BUFFER, STATE_BASE_ADDRESS, etc contain MOC structures. Previously, the aubinator printed 1 if there was a MOC structure. Now we decode the entire structure and print out its fields. 4) Print out the DWord address along with its hex value - For a better clarity of information, it is helpful to print both the address and hex value of the DWord along with the DWord count. Since the DWord0 contains the instruction code and the instruction length, it is unnecessary to print the decoded values for DWord0. This information is already available from the DWord hex value. 5) Decode the <group> and the corresponding fields in the group- The <group> tag can have fields of several types including structures. A group can contain one or more number of fields and this has to be correctly decoded. Previously, aubinator did not decode the groups or the fields/structures inside them. Now we decode the <group> in the instructions and structures where the fields in it repeat for any number of times specified. v2: Fix the formatting (per Matt) Make the start and end pos calculation to extract fields from a DWord more appropriate by moving %32 away from mask() method Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ben Widawsky <ben@bwidawsk.net>	2016-08-23 21:19:55 -07:00
Kristian Høgsberg Kristensen	3e218ad7f8	aubinator: Add a new tool called Aubinator to the src/intel/tools folder. The Aubinator tool is designed to help the driver developers in debugging the driver functionality by decoding the data in the .aub files. Primary Authors of this tool are Damien Lespiau <damien.lespiau at intel.com> and Kristian Høgsberg Kristensen <krh at bitplanet.net>. v2: Review comments are incorporated by Sirisha Gandikota as below: 1) Make Makefile.am more crisp, reuse intel_aub.h from libdrm (per Emil) 2) Aubinator will use platform name instead of GEN number (per Matt) 3) Disassembler gets created based on pciid rather than GEN number (per Matt) 4) Other formatting comments (per Ken, Matt and Emil) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ben Widawsky <ben@bwidawsk.net>	2016-08-23 21:19:33 -07:00
Kenneth Graunke	eb1a0ddfd5	glsl: Mark tessellation qualifier maps static const. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-23 21:15:59 -07:00
Jason Ekstrand	70bc891c42	isl/formats: Integer formats are not filterable In `ca2a8e5628`, we updated the format table to add more formats (most of which are new on SKL) but accidentally marked some integer formats as filterable. You can't filter an integer format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 16:51:34 -07:00
Ilia Mirkin	361678edd7	st/dri: respect driver's request to avoid mixed color/depth bit configs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-23 18:30:53 -04:00
Ilia Mirkin	9515d651f9	gallium: add a cap to expose whether driver supports mixed color/zs bits Some hardware can't render to color/depth buffers of mixed bitness. When that happens a fallback has to happen, but this allows the driver to express that this isn't an optimal scenario. The purpose of this is to remove such fbconfigs from the GLX/EGL config list. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-23 18:30:49 -04:00
Ilia Mirkin	528390021f	dri: add a way to request that modes have matching color/zs depths Some GPUs, notably nv3x/nv4x can't render to mismatched color/zs framebuffer depths. Fallbacks can be done by the driver, with shadow surfaces, but no reason to encourage applications to select non-matching glx visuals. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-23 18:30:30 -04:00
Ilia Mirkin	092f994a03	nv50/ir: make sure cfg iterator always hits all blocks In some very specially-crafted cases, we could attempt to visit a node that has already been visited, and then run out of bb's to visit, while there were still cross blocks on the list. Make sure that those get moved over in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-23 18:30:12 -04:00
Jason Ekstrand	7bdccd104b	anv/clear: Clear E5B9G9R9 images as R32_UINT We can't actually clear these images normally because we can't render to them. Instead, we have to manually unpack the rgb9e5 color value on the CPU and clear it as R32_UINT. We still have a bit of work to do to clear non-power-of-two images, but this should get all of the power-of-two clears working on at least Haswell. This fixes three of the new Vulkan CTS tests in the dEQP-VK.api.image_clearing.clear_color_image.* group. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:25 -07:00
Jason Ekstrand	afa7ca0f77	anv/clear: Make cmd_clear_image take an actual VkClearValue Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	cf3cf2ecfc	anv/blit2d: Add support for RGB destinations This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	16ddda8452	anv/blit2d: Add a format parameter to bind_dst and create_iview Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	954c0bfb20	anv/image: Don't create invalid render target surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	ca2a8e5628	isl/formats: Update the table with more samplable formats There were a lot of formats where support was added on Haswell or later but we never updated the format table. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	aba9e25b70	isl/formats: Report ETC as being samplable on Bay Trail Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	f6967ddd32	i965/surface_formats: Don't advertise 8 or 16-bit RGB formats We have implicitly been not advertising these formats since we had them turned off in the format capabilities table. We are about to update that table and this prevents a change in behavior. The only change in behavior created by this patch is that we no longer advertise support for R16G16B16_FLOAT which means that it's now renderable which seems like a bonus. Maybe someday we'll want to change things to start supporting 16-bit RGB formats natively but, at the moment, there's no need. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	fb90291dd5	anv/formats: Don't use an RGBX format if it isn't renderable The whole point of using RGBX is so that we can render to it so if it isn't renderable, that kind-of defeats the purpose. Some formats (one example is R32G32B32X32_SFLOAT) exist in the format table but aren't actually renderable. Eventually, we'd like to get away from RGBX entirely, but this fixes hangs on BDW today. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-23 11:45:24 -07:00
Nicolas Boichat	4f3f8bb59d	egl/dri2: dri2_initialize: Do not reference-count TestOnly display In the case where dri2_initialize is called with a TestOnly display, the display is not actually initialized, so dri2_egl_display always fails, and we cannot do any reference counting. Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible with LIBGL_ALWAYS_SOFTWARE=1). Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-23 18:08:17 +01:00
Jan Ziak	6687037f1f	vbo: fix format string compiler warning for 32-bit machines Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-23 07:31:28 -06:00
Dongwon Kim	c6e97aaf75	egl/dri2: remove error checks on return values from mtx_lock and cnd_wait This removes unnecessary error checks on return result of mtx_lock and cnd_wait calls as in all other places in MESA source since there is no chance that any of these functions return any of error codes in current implementation. This patch also removes a redundant _eglError call that follows EGL_FALSE check in the bottom of dri2_client_wait_sync. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-23 12:00:45 +01:00
Dave Airlie	96ea753d9e	i965: report bound buffer size not underlying buffer size for image size (v2) This seems to make sense, the image is bound to a subset of the buffer so the image size should be from the bound size not the underlying object. This fixes: GL44-CTS.shader_image_size.advanced-nonMS-fs-int v2: get minimum of the two values, same as we write to the hw. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-23 13:39:15 +10:00
Jason Ekstrand	34ff4fbba6	anv: Throw INCOMPATIBLE_DRIVER for non-fatal initialization errors The only reason we should throw INITIALIZATION_FAILED is if we have found useable intel hardware but have failed to bring it up for some reason. Otherwise, we should just throw INCOMPATIBLE_DRIVER which will turn into successfully advertising 0 physical devices Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-22 18:49:49 -07:00
Dave Airlie	26187f3890	st/glsl_to_tgsi: fix st_src_reg_for_double constant. This needs to set the src swizzle so it doesn't access the .zw members ever when we are just emitting a 0 constant here. This fixes: vert-conversion-explicit-dvec3-bvec3.shader_test and a bunch of other fp64 tests on softpipe and radeonsi. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-23 11:14:03 +10:00
Dave Airlie	0bce055d9e	mesa/subroutines: drop the old subroutine index uploads. We used to upload the indices when they changed, now we rely on the drivers calling the correct hook to have the values updated from the context storage. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:46 +10:00
Dave Airlie	6a332a389a	st/mesa: use the new subroutine index upload API. This plugs the new API into the gallium state tracker. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Dave Airlie	4adad99cfb	i965: use new subroutine index uploader. This plugs the subroutine index updates into the i965 backend, where it loads constants. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Dave Airlie	ea783667e4	mesa: add api to write subroutine indices to the program storage. This writes the subroutine indices to the program storage for a stage. This API is intended to be used by drivers to update the uniform storage before uploading to the hw. This isn't the most thread safe effort, but it will be significantly more multi-context safe. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Dave Airlie	4566aaaa5b	mesa/subroutines: start adding per-context subroutine index support (v1.1) One piece of ARB_shader_subroutine I ignored was the fact that it needs to store the subroutine index data per context and not per shader program. There is one CTS test that tests this: GL45-CTS.shader_subroutine.multiple_contexts However the test only does a write to context and readback, it never renders using the values, so this is enough to fix the test however not enough to do what the spec says. So with this patch the info is now stored per context, but it gets updated into the program at UseProgram and when the values are inserted into the context, which won't help if multiple contexts are in use in multiple threads. v1.1: cleanups and nit-picks (Andres) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Matt Turner	27d20ee264	vbo: Make #if 0'd debugging code compile.	2016-08-22 16:31:50 -07:00
Timothy Arceri	8ee909ee42	nir: avoid segfault when ssa src not found Without this the following line will segfault and we don't get to see the results of the validate_assert() above. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-08-23 09:06:29 +10:00
Eric Anholt	47e3cc7557	vc4: Tell state_tracker that we would prefer NIR. Before this series, the code generation path was: GLSL IR -> TGSI -> NIR -> NIR clone -> QIR -> QPU Now it's (generally) GLSL IR -> NIR -> NIR clone -> QIR -> QPU	2016-08-22 12:11:08 -07:00
Eric Anholt	d08f09c24e	st/nir: Trim out unused VS input variables. If we're going to skip setting up vertex input data in them, we should probably not leave them as vertex inputs with a driver_location that happens to alias to something else. Fixes a regression in glsl-mat-attribute on vc4 when enabling GTN. v2: Change commit message shortlog, lower the new globals away before handing off to the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-22 12:11:05 -07:00
Eric Anholt	3ef1853f7d	nir: Fix crash in nir_lower_drawpixels. Generally you'd see the gl_Color reference first and get some cursor set. However, in piglit draw-pixel-with-texture we're now seeing the TexCoord dereferenced first. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-22 11:52:27 -07:00
Eric Anholt	0a8ff1681b	nir: Fix a comment typo in nir_lower_drawpixels. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-22 11:52:26 -07:00
Eric Anholt	f4d143f0d9	vc4: Use proper type sizes for uniforms.	2016-08-22 11:52:26 -07:00
Eric Anholt	bdb54cdc16	vc4: Add VARYING_SLOT_PNTC support. We end up with this when doing GLSL-to-NIR.	2016-08-22 11:52:26 -07:00
Eric Anholt	3c1ea6e651	vc4: Fix vc4_nir_lower_io for non-vec4 I/O. To support GLSL-to-NIR, we need to be able to support actual float/vec2/vec3 varyings.	2016-08-22 11:52:26 -07:00
Eric Anholt	e8378fee0c	nir: Define system values for vc4's blending-lowering arguments. In the GLSL-to-NIR conversion of VC4, I had a bit of trouble with what I was calling the "state uniforms" that I was putting into the NIR fighting with its other lowering passes. Instead of using magic uniform base numbers in the backend, follow the lead of load_user_clip_plane and just define system values for them. v2: Fix unintended change to channel_num, drop unspecified const_index value on blend_const_color_r_float. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-22 11:52:26 -07:00
Lionel Landwerlin	475ce61d1a	anv: GetDeviceImageFormatProperties: fix TRANSFER formats We let the user believe we support some transfer formats which we don't. This can lead to crashes when actually trying to use those formats for example on dEQP-VK.api.copy_and_blit.image_to_image.* tests. Let all formats we can render to or sample from as meta implements transfers using attachments. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-22 10:41:30 -07:00
Marek Olšák	0328b20050	gallium/hud: round max_value to print nicely rounded numbers next to graphs This improves readability a lot. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	0f1befe926	gallium/hud: generalize code for drawing numbers next to graphs Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	a33eb48d61	gallium/hud: draw numbers with 3 decimal places if those aren't 0 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	b9c9551c09	gallium/hud: use sRGB for nicer AA lines Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	6ffde82083	gallium/hud: use AA lines for graphs this looks a lot better (with the next patch) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	6902f9e82a	gallium/hud: don't enable blending for all objects Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Tapani Pälli	0abebec012	util: add assert that key cannot be NULL on insertion Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-22 07:37:55 +03:00
Tapani Pälli	68233801ae	glsl: fix key used for hashing switch statement cases Implementation previously used value itself as the key, however after hash implementation change by `ee02a5e` we cannot use 0 as key. v2: use constant pointer as the key and implement comparison for contents (Eric Anholt) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309	2016-08-22 07:36:33 +03:00
Mauro Rossi	a5f445640e	android: i965: add per-gen libmesa_i965_gen{8,9} static Needed to fix android build after commit `16a9fcb` which enabled genxml for gen{8,9} state setup This is the last patch needed, android build tested successfully.	2016-08-20 16:18:31 -07:00
Mauro Rossi	9dc70a71f8	android: i965: add per-gen libmesa_i965_gen{7,75} static libraries Needed to fix android build after commit `e198983` which enabled genxml for gen{7,75} state setup Android build fix for gen{8,9} will follow as incremental patch, build tested successfully with all per-gen patches applied.	2016-08-20 16:18:28 -07:00
Mauro Rossi	7478ddad29	android: i965: add per-gen libmesa_i965_gen6 static library Needed to fix android build after commit `c8bc1ae` where new per-gen genX_blorp.c source replaced gen6_blorp.c for gen6 Android build fixes for gen{7,75} and gen{8,9} will follow as incremental patches, build tested successfully with all per-gen patches applied.	2016-08-20 16:18:26 -07:00
Kenneth Graunke	7db81d9a87	glsl: Rename link_fs_input_layout_qualifiers to "inout". We're going to handle output qualifiers here too, and calling it "inout" seems to be the going convention. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-20 13:52:25 -07:00
Matt Turner	7e3e1bed03	i965/cfg: Factor common code out of switch statement. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-20 11:40:42 -07:00
Jason Ekstrand	a2ae67aa47	anv: Give the installed intel_icd.json file an absolute path Not providing a path allows the ICD to work on multi-arch systems but breaks it if you install anywhere other than /usr/lib. Given that users may be installing locally in .local or similar, we probably do want to provide a filename. Distros can carry a revert of this commit if they want an intel_icd.json file without the path. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chad@kiwitree.net>	2016-08-20 00:50:03 -07:00
Daniel Scharrer	16ef7ab5c1	mesa: Fix fixed function spot lighting on newer hardware (again) This was first fixed in commit `b3f9c5c` and then broken again in commit `fe2d2c7`, which removed the abs modifier from input registers. v2: Don't change the size of struct ureg. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Daniel Scharrer <daniel@constexpr.org>	2016-08-19 20:46:53 -07:00
Matt Turner	a9033d1dc1	i965: Remove comment within a comment.	2016-08-19 20:44:37 -07:00
Roland Scheidegger	0849621891	llvmpipe: fix issues with depth clamp We only did depth clamp when the value was written from the fs. This is very wrong both for d3d10 and GL, and only passed the corresponding piglit test due to pure luck (it no longer does with the enhanced test). Also, interpolation clamped values to 1.0 always, which can legitimately happen if depth clip is disabled, so fix that as well (untested). There is one unresolved issue left, d3d10 always does depth clamping, whereas GL does not (but does [0,1] clamp instead for fs depth outputs) - this information isn't in any gallium state object, leave it as-is for now (though it looks like llvmpipe misses the [0,1] clamp as well). This (with the previous patch) fixes piglit depth-clamp-range test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-08-20 04:05:33 +02:00
Roland Scheidegger	b0a647f284	llvmpipe: fix depth clamping wrt reversed near/far values This wasn't handled before (the result was that no matter what value got clamped, it always ended up as the near value in this case) (if clamping actually happened). Fix this by using the util helper for that (the math is otherwise "mostly" the same, mostly because there could actually be differences due to float rounding, but I don't even know which one would be more correct). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-08-20 04:05:33 +02:00
Matt Turner	a73116ecc6	i965/sched: Simplify work done by add_barrier_deps(). Scheduling barriers are implemented by placing a dependence on every node before and after the barrier. This is unnecessary as we can limit the number of nodes we place dependencies on to those between us and the next barrier in each direction. Runtime of dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 is reduced from ~25 minutes to a little more than three. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94681 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 16:52:25 -07:00
Matt Turner	e7c376adfd	i965/vec4: Ignore swizzle of VGRF for use by var_range_end(). var_range_end(v, n) loops over the n components of variable number v and finds the maximum value, giving the last use of any component of v. Therefore it expects v to correspond to the variable associated with the .x channel of the VGRF. var_from_reg() however returns the variable for the first channel of the VGRF, post-swizzle. So, if the last register had a swizzle with y, z, or w in the swizzle component, we would read out of bounds. For any other register, we would read liveness information from the next register. The fix is to convert the src_reg to a dst_reg in order to call the dst_reg version of var_from_reg() that doesn't consider the swizzle. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 16:52:25 -07:00
Matt Turner	3ef31122d0	i965/vec4: Print spills:fills. Allows shader-db to work on vec4 programs (has been broken since shader-db commit 646df5ca98b2 from April!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 16:52:25 -07:00
Ilia Mirkin	89f00f749f	a4xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-19 19:40:04 -04:00
Ilia Mirkin	cd8e30452f	a4xx: only disable depth clipping, not all clipping, when requested The previous bit disables the whole clipper, including the regular viewport-related clipping that would go on. The two new bits disable near and far clipping (separately, as verified with the depth-clamp-range piglit). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-19 19:40:04 -04:00
Eric Anholt	5adee83806	vc4: Switch store_output to using nir_lower_io_to_scalar / component.	2016-08-19 13:11:36 -07:00
Eric Anholt	f8fecc396a	vc4: Use the intrinsic's first_component for vattr VPM index. Avoids another multiplication by 4 of the base in the NIR.	2016-08-19 13:11:36 -07:00
Eric Anholt	cbf8c19410	vc4: Convert to using nir_lower_io_scalar for FS inputs. The scalarizing of FS inputs can be done in a non-driver-dependent manner, so extract it out of the driver.	2016-08-19 13:11:36 -07:00
Eric Anholt	c30b22c421	vc4: Switch to using the intrinsic accessors. The const_index[] values have always felt magic, and this documents them a bit better.	2016-08-19 13:11:36 -07:00
Eric Anholt	9f1411d1ec	nir: Add an IO scalarizing pass using the intrinsic's first_component. vc4 wants to have per-scalar IO load/stores so that dead code elimination can happen on a more granular basis, which it has been doing in the backend using a multiplication by 4 of the intrinsic's driver_location. We can represent it properly in the NIR using the first_component field, though. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	c35f979220	nir: Add nir_builder support for individual system value loads. The previous nir_load_system_value(b, nir_intrinsic_load_whatever), 0) was rather verbose, when system values should be easy to generate. The index is left out because only one system value had an index included in it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	24728637e2	nir: Move the undef of nir_intrinsics.h macros to the .h. I wanted to include this from nir_builder as well, so it also needed the undefs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	c078c41520	ttn: Use nir_load_front_face instead of the TGSI-style input. This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR more optimization to work on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	3f607f9e4f	nir: Use the system-value front face for twoside lowering. GLSL-to-NIR generates system value usage, and vc4/freedreno would both like the system value instead of the varying, so switch this pass over to it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	ed92241d78	ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn. This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all their source paths. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-19 13:11:36 -07:00
Eric Anholt	d80d03b830	vc4: Dump the TGSI before trying to convert it to NIR. In the case of debugging a crash in TTN, this is nice to have.	2016-08-19 13:11:36 -07:00
Boyuan Zhang	c0be51f270	radeon/vce: set flag based on dual instance enablement Set the flag on when dual instance encoding is supported, otherwise set it to off. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-19 10:36:44 -04:00
Boyuan Zhang	c59628d11b	st/va: enable dual instances encode by sync surface This patch improves the performance of Vaapi Encode by enabling dual instances encoding. flush function is not called after each end_frame call. radeon/vce will do flush whenever 2 frames are submitted for encoding. Implement sync surface function to flush only if the frame hasn't been flushed yet. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-19 10:36:44 -04:00
Jason Ekstrand	93d2b5c576	i965/blorp: Remove no longer used state setup helpers Now that we're using genxml for everything, we no longer need the hand-rolled state emit helpers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	16a9fcbbb6	i965/blorp: Use genxml for gen8-9 state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e198983c61	i965/blorp: Use genxml for gen7 state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	344841fcba	i965/blorp: Add genxml-based vertex setup helpers Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	7b035fd0c9	i965/blorp: Add a helper for emitting surface states The new helper emits surface states and the binding table in one go. It's nice to have it pulled out of the main blorp_exec function. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	48f13545dd	i965/blorp: Add genxml-based sampler state emit function Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eb655c4fc2	i965/blorp: Add genxml-based dynamic state emit functions Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	c8bc1ae96a	i965: Move gen6_blorp.c to a file that gets recompiled per-gen At the moment, it's only used for gen6 but that will change soon. We use the genX prefix for recompiled things in the Vulkan driver. It isn't great, but it seems to have worked ok. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eea6a66222	i965/blorp/gen6: Use genxml packing structs for state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	b5c20a98c1	i965/blorp: Stop setting point and line rasterization rules Blorp never uses points or lines and the default values of 0 are perfectly fine. Explicitly setting them is just noise. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	5e2dd7a381	i965/blorp/gen8: Move viewport setup to after wm state This matches gen6 and gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	802f0f8596	i965/blorp/gen6-7: Move multisample setup to right after samplers This mimics gen8 blorp Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	75304fdbd8	i965/blorp/gen6-7: Move surfaces and samplers closer together This mimics what we do on gen8. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	8b0426ddd4	i965/blorp/gen7-8: Emit depth stencil state with CC and BLEND All three go together on SNB so let's keep them together for gen7+ as well. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	38c1909c0a	i965/blorp/gen6: Move constant disables higher up This is what gen7-8 do and it's a bit cleaner. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e0bc2cb145	i965/blorp: Don't clear an empty region Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e4d6ffbbf6	i965/blorp: Move the non-static blorp state setup helpers to another file We're about to start replacing blorp state setup code with packing structs and we want to feel free to delete files as we go. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	50768a3879	i965/blorp: Make gen6 VS and GS disable helpers static Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	949a892026	i965: Roll intel_reg.h into brw_defines.h More than half of the stuff in intel_reg.h had nothing whatsoever to do with registers and really belongs in brw_defines.h anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	8455f9430f	i965: Stop including brw_defines.h in brw_state.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	4c3acf94da	i965/state: Move is_drawing_lines/points to gen6_clip_state.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	04f3594cd5	genxml/gen9: Make 3DSTATE_SBE::AttributeActiveComponentFormat an array Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	bfdff28d68	genxml: Add a uint MOCS field to VERTEX_BUFFER_STATE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	373613fa4b	genxml: Make a couple of VERTEX_BUFFER_STATE fields boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	29f1f945a6	genxml: Make VERTEX_ELEMENT_STATE::Valid a bool Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eb2589cba6	genxml/gen6: Make SAMPLER_STATE look a bit more like gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	2a84e40dae	genxml: Add a uint MOCS field to DEPTH_BUFFER packets This is easier than dealing with structs all the time Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	3f1022b029	genxml/gen6: Make "Depth Clear Value" a uint The actual data stored is in float, UNORM24, or UNORM16 depending on the actual depth format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	be62e7645e	genxml/gen6: Add the 3D_Prim_Topo_Type enum Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	cca95a7bd6	genxml/gen6: Fix the length of 3DSTATE_WM Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	3ddb6f6e2a	genxml/gen6: Add a Surface Base Address field to HIER_DEPTH_BUFFER Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	be52e16dbc	genxml/gen6: Add uint MOCS fields for most things Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Kenneth Graunke	7d0554f341	nir: Rely on the fact that bcsel takes a well formed boolean. According to Connor, it's safe to assume that the first operand of bcsel, as well as the operand of b2f and b2i, must be well formed booleans. https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html With the previous improvements to a@bool handling, this now has no change in shader-db instruction counts on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-19 02:05:23 -07:00
Francisco Jerez	7ceb42ccc5	i965/sched: Change the scheduling heuristics to favor early program termination. This uses the unblocked time of the exit assigned to each available node to attempt to unblock exit nodes as early as possible, potentially reducing the runtime of the shader when an exit branch is taken. There is a natural trade-off between terminating the program as early as possible and reducing the worst-case latency of the program as a whole (since this will typically move exit-unblocking nodes closer to its dependencies potentially causing additional stalls of the execution pipeline), but in practice the bandwidth and ALU cycle savings from terminating the program earlier tend to outweigh the slight increase in worst-case program execution latency, so it makes sense to prefer nodes likely to unblock an earlier exit regardless of the latency benefits of other available nodes. I haven't observed any benchmark regressions from this change after testing on VLV, HSW, BDW, BSW and SKL. The FPS of the GfxBench Manhattan benchmark increases by 10%-20% and the FPS of Unigine Valley improves by roughly 5% depending on the platform and settings. The change to the register pressure-sensitive heuristic is rather conservative and gives precedence to the existing heuristic in order to avoid increasing register pressure and causing spill count and SIMD width regressions in shader-db. It may make sense to revisit this with additional performance data. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	4147ca75d5	i965/sched: Assign a preferred exit node to each node of the dependency graph. This adds a bit of metadata to schedule_node that will be used to compare available nodes in the scheduling heuristic code based on which of them unblocks the earliest successor exit node. Note that assigning exit nodes wouldn't be necessary in a bottom-up scheduler because we could achieve the same effect by scheduling the exit nodes themselves appropriately. No shader-db changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	b295d7ca32	i965/sched: Calculate the critical path of scheduling nodes non-recursively. The critical path of each node is calculated by induction based on the critical paths of its children, which can be done in a post-order depth-first traversal of the dependency graph. The current code implements graph traversal by iterating over all nodes of the graph and then recursing into its children -- But it turns out that recursion is unnecessary because the lexical order of instructions in the block is already a good enough reverse post-order of the dependency graph (if it weren't a reverse post-order some instruction would have been located before one of its dependencies in the original ordering of the basic block, which is impossible), so we just need to walk the instruction list in reverse to achieve the same result more efficiently. No shader-db changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	b2b621a0ec	i965/fs: Switch to per-subspan discard jumps. ANY4H is more efficient than ANY8H and ANY16H because it makes sure that whenever a whole subspan hits a discard statement it gets disabled by the EU until the end of the program, regardless of whether the discard condition is uniform across all channels of the SIMD8-16 thread. OTOH ANY8H/ANY16H would cause the rest of the program to be executed for all channels if only one of the channels hadn't taken the discard branch, potentially increasing the bandwidth and ALU usage of the program unnecessarily. This change increases the FPS by over 3x of a simple micro-benchmark that discards a bunch of fragments and then does a single costly texturing operation. I've just re-verified the FPS change on HSW and SKL, but I expect all platforms from Gen6 up to get a similar benefit. Note that we could potentially be more aggressive and use the NORMAL predicate to discard individual channels, but that would need to happen post-scheduling because the scheduler currently doesn't care to reorder HALT instructions with respect to other instructions, and the NORMAL predicate would cause the results of subsequent derivative computations to become undefined -- If the scheduler didn't reorder HALT instructions it would actually be safe to switch to NORMAL because the behavior of derivative computations after a non-uniform discard statement is undefined by the GLSL spec, but that would make the optimization implemented by one of the following commits somewhat more difficult. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	01b321f242	i965/fs: Drop bogus writemasking disable bit from HALT instructions. This may have been the reason people ran into problems with non-uniform HALT instructions and ended up using the inefficient ANY16H/ANY8H predicates instead of ANY4H or NORMAL in order to prevent non-uniform discard. The HALT instruction is able to handle non-uniform execution masks just fine. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:04:59 -07:00
Ilia Mirkin	27e59ed477	mesa: avoid valgrind warning due to opaque only being set sometimes Valgrind complains with a "Conditional jump or move depends on uninitialised value(s)" warning due to opaque being conditionally initialized. However in the punchthrough_alpha == true case, it is always initialized, so just flip the condition around to silence the warning. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-08-18 22:48:55 -04:00
Ilia Mirkin	59bb821180	vbo: remove unnecessary max_basevertex computation The max basevertex is already computed and added into max_index by the caller, _tnl_draw_prims. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-18 20:26:34 -04:00
Ilia Mirkin	659dc10d32	vbo: add basevertex when looking up elements for vbo splitting Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97351 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-18 20:26:22 -04:00
Marek Olšák	07ccec002b	radeonsi: initialize and finalize the LLVM function pass manager Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-08-18 21:36:03 +02:00
Emil Velikov	d61d259518	isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 15:06:19 +01:00
Emil Velikov	ebd5dc8826	anv: remove dummy VK_DEBUG_MARKER_EXT entry points The vkCmdDbgMarker{Begin,End} symbols are exported, yet the json does not advertise that the driver supports the extension. Furthermore the functions are empty stubs. Remove those until we get a proper implementation and json notation. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 15:05:32 +01:00
Emil Velikov	49394e8d77	anv: do not export the Vulkan API With version 1 of the Loader interface there is an internal/private symbol (vk_icdGetInstanceProcAddr) which is used to retrieve all the API from the Vulkan entrypoints from the ICD. Implying that exposing the Vulkan API is not recommended. Version 2 goes a step further explicitly forbidding the ICD from exposing Vulkan symbols (and adding a negotiation API) As a reference: - Nvidia 367.35 Missing negotiation API - version 1. Exposes only vk_icdGetInstanceProcAddr. - AMD 16.30.3.306809 Have negotiation API - version 2, Exposes vk_icdGetInstanceProcAddr. Exposes a couple of Vulkan entry points - seems to be in violation with the spec. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 14:55:42 +01:00
Emil Velikov	1cdb6ca40b	anv: automake: build with -Bsymbolic Explicitly suggested in the Loader interface version 2 section, but it's a good idea either way. It essentially ensures that our symbols are not interposed. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 14:53:33 +01:00
Emil Velikov	40e4fff563	anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility Hide the internal symbols and annotate the vk_icdGetInstanceProcAddr as public since the loader needs it (since v1 of the loader interface). v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 14:53:30 +01:00
Emil Velikov	b0d56f2f4f	anv: remove internal 'validate' layer Presently the layer has only a single entry point. As mentioned by Jason the function does not validate anything that isn't checked elsewhere, thus we can drop the whole thing. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 14:53:24 +01:00
Kenneth Graunke	3a9e6102b4	nir/search: Extend 'a@bool' to handle a couple of system values. load_front_face and load_helper_invocation produce booleans. On Broadwell: total instructions in shared programs: 11638956 -> 11638011 (-0.01%) instructions in affected programs: 115093 -> 114148 (-0.82%) helped: 628 HURT: 14 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 01:27:27 -07:00
Kenneth Graunke	e8543feba7	nir/search: Fold src_is_bool()/alu_instr_is_bool() into src_is_type(). I don't want src_is_bool() and src_is_type(x, nir_type_bool) to behave differently. Having the logic spread out over three functions makes it harder to decide where to put new logic, as well. So, combine them all. It's a bit simpler because there's now only one recursive function rather than a pair of mutually recursive functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 01:27:15 -07:00
Kenneth Graunke	241870fe5b	nir/search: Introduce a src_is_type() helper for 'a@type' handling. Currently, 'a@type' can only match if 'a' is produced by an ALU instruction. This is rather limited - there are other cases we can easily detect which we should handle. Extending the code in-place would be fairly messy, so we introduce a new src_is_type() helper. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 01:26:47 -07:00
Kenneth Graunke	d14dd727f4	i965: Fix barrier count shift in scalar TCS backend. The "Barrier Count" field goes in 14:9 of m0.2. The vec4 backend correctly shifts by 9, but the scalar backend only shifted by 8. It's not like this changed - I think I just made a typo when writing the original scalar TCS backend code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-18 00:47:00 -07:00
Kenneth Graunke	159f037755	i965: Fix execution size of scalar TCS barrier setup code. Previously, the scalar TCS backend was generating: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q }; shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x0000000bUD { align1 WE_all 1Q }; or(8) g17.2<1>UD g17.2<8,8,1>UD 0x00008200UD { align1 WE_all 1Q }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal region. Not to mention it clobbers 8 channels of data when we only wanted to touch m0.2. Instead, we want: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(1) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all }; shl(1) g17.2<1>UD g17.2<0,1,0>UD 0x0000000bUD { align1 WE_all }; or(1) g17.2<1>UD g17.2<0,1,0>UD 0x00008200UD { align1 WE_all }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; Using component() accomplishes this. Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers. barrier_guarded_read_write_calls on Skylake. Probably fixes other barrier issues on Gen8+. v2: Use a group(1, 0) builder so inst->exec_size is set correctly (thanks to Francisco Jerez for catching that it was incorrect). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-18 00:47:00 -07:00
Kenneth Graunke	9e778837ff	i965: Implement the WaPreventHSTessLevelsInterference workaround. Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases: - vertex_spacing - tessellation_shader_point_mode.points_verification - tessellation_shader_quads_tessellation.inner_tessellation_level_rounding Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-18 00:46:55 -07:00
Kenneth Graunke	d8971128ac	nir/builder: Add bany_inequal and bany helpers. The first simply picks the bany_inequal[234] opcodes based on the SSA def's number of components. The latter implicitly compares with zero to achieve the same semantics of GLSL's any(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-18 00:46:04 -07:00
Kenneth Graunke	01e99cba04	mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case. GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers: 0.0, if E == 0 and M == 0, 2^-14 * (M / 32), if E == 0 and M != 0, 2^(E-15) * (1 + M/32), if 0 < E < 31, INF, if E == 31 and M == 0, or NaN, if E == 31 and M != 0, In the second case (E == 0 and M != 0), we were multiplying the mantissa by 2^-20, when we should have been multiplying by 2^-19 (which is 2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32). The previous section defines the formula for 11-bit numbers, which is: 2^-14 * (M / 64), if E == 0 and M != 0, In other words, we had accidentally copy and pasted the 11-bit code to the 10-bit case, and neglected to change the exponent. Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles when run with surface dimensions of 1536x1152 or 1920x1080. Cc: mesa-stable@lists.freedesktop.org References: https://code.google.com/p/chrome-os-partner/issues/detail?id=56244 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2016-08-17 17:26:11 -07:00
Tim Rowley	0ff57446e3	swr: [rasterizer core] only use Viewport/Scissors during SwrDraw* operations Add explicit rects for: - SwrClearRenderTarget - SwrDiscardRect - SwrInvalidateTiles - SwrStoreTiles Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	6209dbf5a4	swr: [rasterizer common] reorder SWR_FORMAT_INFO Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	2a25ce7472	swr: [rasterizer core] make dirtytile list point directly to macrotilequeues Speeds up high geometry HPC workloads. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	550503e776	swr: [rasterizer core] portability - remove use of INT64 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	d70f96fd67	swr: [rasterizer core] viewport transform disabled fix When viewport transform is disabled (ie. screen space coords are passed in directly), the W component should be interpreted as RHW. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	812b45d049	swr: [rasterizer core] clamp scissor rects to current tile rect Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	93fb768c7e	swr: [rasterizer core] align stats structures Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	9a25987b4a	swr: [rasterizer core] use AVX2 permute to simplify PaTriList Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	c7c1a03f90	swr: [rasterizer core] move some global variables to SWR_CONTEXT Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	b8c4717567	swr: [rasterizer core] change scale on VP matrix element gathers Was 1, which led to pulling denorms for non-zero indices. Changed to sizeof(float). Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:54 -05:00
Tim Rowley	d816c5d6ad	swr: [rasterizer] implementing native AVX-512 simd16 intrinsics Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:49 -05:00
Jason Ekstrand	342756a100	i965/blorp: Use nir_alu_type for the texture data type This lets us remove the brw_reg.h include Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ce2a9831cc	i965: brw_blorp_blit.cpp -> blorp_blit.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	934adf1c30	i965: brw_blorp_clear.cpp -> blorp_clear.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	f5fbcc3683	i965: Split brw_blorp.c/h into multiple files This mega-commit pulls most of the i965-specific bits of blorp into the brw_blorp.c/h files which now contain nothing but i965 wrappers around "core blorp" calls. The "core blorp" api is moved into blorp.h and the internal blorp data structures are moved into blorp_priv.h. The new file blorp.c is created to house "core blorp" internals which are pulled from the old brw_blorp.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	075cc874bb	i965/blorp: Factor the guts of blorp_hiz_exec into a helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	9d22fd934a	i965/blorp: Break the guts of do_single_blorp_clear into two helpers The helpers are completely miptree-unaware and each fairly cleanly do a single thing. This does come at the downside of not doing proper debug reporting on whether or not we're doing replicated clears. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	7cddca39c0	i965/meta_util: Convert get_fast_clear_rect to take an isl_surf Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	376ce1d26e	i965/blorp/clear: Move isl_surf setup higher in the function Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	583f040fda	i965/blorp: Refactor fast-clear logic a bit This pulls the mcs allocation into the if statement where we initially determine that we are doing a fast clear and moves the programming of wm_inputs and figuring out the fast clear rect into its own if statement. The next commit will put code in between the two. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	457a408932	i965/blorp/clear: Stop stomping the destination format The blorp_surface_info_init call above should set the format for us and stomping it later does nothing whatsoever. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	a6c2091da6	i965/meta_util: Only modify the input parameters in get_fast_clear_rect We had another inline copy of brw_meta_get_buffer_rect embedded in get_fast_clear_rect for no good reason. This lets us get rid of the gl_framebuffer parameter to get_fast_clear_rect. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	f748e15735	i965/blorp: Stop calling brw_meta_get_buffer_rect We already have an inlined version of the function slightly higher up in do_single_blorp_clear and all calling it does is stomp the values with the same thing. We might as well just get rid of it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	18aad17ce2	i965/blorp: Pull the guts of resolve_color into a miptree-agnostic helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	dff74b83e1	i965/meta_util: Convert get_resolve_rect to use ISL Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	8fccdf85ba	i965/blorp: Make the guts of brw_blorp_blit_miptrees miptree-unaware Now that we have the brw_blorp_surf struct, we can start to make bits of blorp completely miptree-unaware. To start things off, we split the guts of brw_blorp_blit_miptrees into a brw_blorp_blit function which knows nothing about miptrees. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	75deae9c90	i965/blorp: Add a new brw_blorp_surf intermediate struct At the moment, this seems to make all of the interfaces messier rather than cleaner. However, it does provide a representation of a surface that simultaneously contains everything and is completely unaware of miptrees. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	57664c869f	i965/blorp: Use the isl_surf for more params setup The isl_surf munging doesn't happen until fairly late in the blorp_blit function. We can use the isl_surf for the vast majority if not all of our params setup. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d8644f3eb6	i965/blorp: Do gen6 stencil offsets up-front This keeps all of the nastiness of gen6 stencil on the i965 side of the API line and lets us delete that nasty hand-rolled ISL-based offset path that we were using for ALL_SLICES_AT_EACH_LOD. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	406c503396	i965/blorp: Set up HiZ surfaces up-front Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	4d86b3fa2d	i965/blorp: Set up most aux surfaces up-front This commit also adds support for an offset for aux surfaces. In GL, this only gets used for HiZ on SNB at the moment. However, in Vulkan, all aux surfaces are at a non-zero offset and that is likely to happen in GL eventually. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d540864730	i965/blorp: Stop using the miptree in state setup for tex/rt surfaces This commit moves us from a miptree model to a surf+bo+offset model. In the GL driver, miptrees are almost always at the start of the bo so the offset is zero but we don't want to always make that assumption. In the short term, gen6 stencil and HiZ will be at an offset but, in the long term, any Vulkan surface is liable to be at a non-zero offset. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	8b02cd44d7	i965/blorp/blit: Move format work-arounds before surface_info_init Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	20c06d2b79	i965/miptree: Add real support for HiZ The previous HiZ support was bogus because all of get_aux_isl_surf looked at mt->mcs_mt directly. For HiZ buffers, you need to look at either mt->hiz_buf or mt->hiz_buf->mt. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	dc880c99b6	isl/state: Only set clear color if aux is used Otherwise, the clear color will get ignored. This prevents assertion errors if clear color is set to something invalid and aux is not used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	2684e48321	i965/miptree: Use the isl helpers for creating aux surfaces In order for the calculations of things such as fast clear rectangles to work, we need more details of the auxiliary surface to be correct. In particular, we need to be able to trust the width and height fields. (These are not necessarily what you want coming out of the miptree.) The only values state setup really cares about are the row and array pitch and those we can safely stomp from the miptree. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d9df82f2ff	isl: Add helpers for creating different types of aux surfaces Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	3c44d99653	i965/miptree: Use mcs_mt->qpitch for aux surfaces At one point, we were doing this correctly. It must have gotten lost in one of the many rebases. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	67ea60db0b	i965/miptree: Allow get_aux_isl_surf when there is no aux surface Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	dd46c8da31	i965/miptree: Support depth in get_isl_clear_color Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	6155d4ef56	isl/state: Add an assertion for IVB multisample array textures Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	3c75b315e1	isl: Add a #define for DEV_IS_BAYTRAIL Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	56746d04d5	i965/blorp: Remove unused fields from blorp_surface_info The only reason why we need layer or level is that we need the z-offset for 3-D surfaces. Let's just have the one field for that. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	1495b6315e	i965/blorp: Simplify depth buffer state setup a bit The data comes in via ISL in a format that's almost directly usable by the hardware so we can avoid some of the conversion headache. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d814353365	i965/blorp: Use the generic surface state path for gen8 textures Now that the generic blorp path uses base level/layer, there's no need to make gen8 special. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ed432fd681	isl: Add asserts for gen8+ X/YOffset rules Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	96fa98c18e	i965/blorp: Only do offset hacks for fake W-tiling and IMS Since the dawn of time, blorp has used offsets directly to get at different mip levels and array slices of surfaces. This isn't really necessary since we can just use the base level/layer provided in the surface state. While it may have simplified blorp's original design, we haven't been using the blorp path for surface state on gen8 thanks to render compression and there's really no good need for it most of the time. This commit restricts such surface munging to the cases of fake W-tiling and fake interleaved multisampling. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	9f9abc8214	i965/blorp: Add a z_offset field to blorp_surface_info The layer field is in terms of physical layers which isn't quite what the sampler will want for 2-D MS array textures. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	a9a6df807e	i965/blorp: Pass the Z component into all texture operations Multisample array surfaces on IVB don't support the minimum array element surface attribute so it needs to come through the sampler message. We may as well just pass it through everything. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	7abcdfbe13	i965/blorp: Rework hiz rect alignment calculations At the moment, the minify operation does nothing because params.depth.view.base_level is always zero. However, as soon as we start using actual base miplevels and array slices, we are going to need the minification. Also, we only need to align the surface dimensions in the case where we are operating on miplevel 0. Previously, it didn't matter because it aligned on miplevel 0 and, for all other miplevels, the miptree code guaranteed that the level was already aligned. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	871893cda2	i965/blorp: Map 1-D render targets with DIM_LAYOUT_GEN4_2D as 2D on gen9 The sampling hardware can handle them ok. It just looks at the tiling to determine whether it's the new gen9 1-D layout or the old one. The render hardware isn't so smart. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ecd9789368	i965/miptree: Fill out the isl_surf::usage field Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	560a92c4fd	isl: Take the slice0_extent shortcut for interleaved MSAA The shortcut works just fine for MSAA and the comment even says so. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	1e02611276	isl: Remove duplicate px->sa conversions In all three cases, we start with width and height taken from isl_surf::phys_slice0_extent_sa which is already in samples. There is no need to do the conversion and doing so gives us an incorrect value. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	603d5f7638	i965/blorp: Use the isl_view from the blorp_surface_info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	c097160463	i965/blorp: Get rid of brw_blorp_surface_info::width/height Instead, we manually mutate the surface size as needed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	2095f932ef	i965/blorp: Move surface offset calculations into a helper The helper does a full transformation on the surface to turn it into a new 2-D single-layer single-level surface representing the original layer and level in memory. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	90ab43d1bb	i965/blorp: Use ISL to compute image offsets For the moment, we still call the old miptree function; we just assert that the two are equal. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ba88a9622d	isl: Add functions for computing surface offsets in samples Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	f6c75df083	isl: Fix get_image_offset_sa_gen4_2d for multisample surfaces The function takes a logical array layer but was assuming it was a physical array layer. While we're here, we also make it not assert-fail on gen9 3-D surfaces. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	7997f4f95b	i965/blorp: Add an isl_view to blorp_surface_info Eventually, this will be the actual view that gets passed into isl to create the surface state. For now, we just use it for the format and the swizzle. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	e046a46460	i965/blorp: Move intratile offset calculations out of surface state setup Previously we multiplied full x/y offsets, resolved tile aligned buffer offset and intra tile offset based on that. Now we let ISL take into account the msaa setting and we only multiply the resolved intra tile offsets. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	27a58615d3	i965/blorp: Refactor interleaved multisample destination handling We put all of the code for fake IMS together. This requires moving a bit of the program key setup code further down so that it gets the right values out of the final surface. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	3c25caa318	i965/blorp: Get rid of brw_blorp_surface_info::array_layout Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	09879eff30	i965/blorp: Use isl_msaa_layout instead of intel_msaa_layout We also remove brw_blorp_surface_info::msaa_layout. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	e2a1bdb3c5	i965/blorp: Use the ISL aux_layout for deciding whether to do an MCS fetch Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	28b0ad890c	i965/blorp: Get rid of brw_blorp_surface_info::num_samples Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	aa6c058ac4	i965/blorp: Make sample count asserts a bit more lazy Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	aa4117a9e4	i965/blorp: Get rid of brw_blorp_surface_info::map_stencil_as_y_tiled Now that we're carrying around the isl_surf, we can just modify it directly instead of passing an extra bit around. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	801189e199	i965/blorp: Remove compute_tile_offsets We have a handy little function in ISL that does exactly the same thing. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	b82de88008	i965/blorp: Create the isl_surf up-front Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ffeb5f67ac	i965/blorp/clear: Initialize surface info after allocating an MCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	1666d029aa	isl/state: Use a valid alignment for 1-D textures The alignment we use doesn't matter (see the comment) but it should at least be an alignment we can represent with the enums. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	0aa0b39769	i965/miptree: Remove the stencil_as_y_tiled parameter from get_tile_masks It's only used to stomp the tiling to Y and it's only used by blorp so there's no reason why blorp can't do it itself. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	573f6ffd04	isl: Fix the parameter names for get_intratile_offset It's been in elements for a while but, for whatever reason, the parameter names in the header file never got updated. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Brian Paul	5de29aeef0	util: try to use SSE instructions with MSVC and 32-bit gcc The lrint() and lrintf() functions are pretty slow and make some texture transfers very inefficient. This patch makes a better effort at using those intrinsics for 32-bit gcc and MSVC. Note, this patch doesn't address the use of SSE4.1 with MSVC. v2: get rid of the ROUND_WITH_SSE symbol, per Matt. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 12:53:20 -06:00
Brian Paul	18e6e0796a	svga: fix src/dst typo in can_blit_via_copy_region_vgpu10() The function was always returning false because of this typo. Retested with piglit. There are some sRGB-related blit failures, but they seem unrelated. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-17 12:53:20 -06:00
Brian Paul	55417140cd	svga: initialize a variable to silence a gcc warning Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-17 12:53:20 -06:00
Ian Romanick	607ab6d3bf	glsl: Pull enum ir_expression_operation out to its own file No change except to the copyright symbol. The next patch will generate this file with Python, and Unicode + Python = pure rage. v2: Massive rebase... I guess a lot can change in a year. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 13:48:25 +01:00
Ian Romanick	de71bc9eb6	glsl: Make the generated sources build rules more like NIR Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 13:48:25 +01:00
Francesco Ansanelli	120c9c6380	mesa/st: use llabs instead of abs for long args (v2) v2: long has 32bit on Windows (Marek) Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 14:16:29 +02:00
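The reason for preferring llabs over abs on wide values can be sketched as follows. This is a minimal simulation, not the st/mesa code: it assumes that passing a 64-bit value to C's `abs` narrows it to a 32-bit `int` by truncation (typical behaviour, but formally implementation-defined), and the helper names `abs_via_int32` and `llabs_like` are hypothetical.

```python
import ctypes

def abs_via_int32(value):
    # Simulates C abs() receiving a value that was first narrowed to a
    # 32-bit int. ctypes integer types do no overflow checking, so the
    # high bits are simply dropped (assumed truncation behaviour).
    return abs(ctypes.c_int32(value).value)

def llabs_like(value):
    # Simulates llabs(): the full 64-bit magnitude is preserved.
    return abs(ctypes.c_int64(value).value)

wide = (1 << 32) + 5          # does not fit in 32 bits
print(abs_via_int32(wide))    # high bits lost
print(llabs_like(wide))       # full value kept
```

For values that fit in 32 bits the two helpers agree; they diverge only once the argument needs more than 32 bits.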
Marek Olšák	57a8991020	radeonsi: fix up buffer descriptor upper-bound checking st/mesa does this too, so we're safe. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	325379096f	gallium: change pipe_image_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	7cd256ce7e	gallium: change pipe_sampler_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
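A rough sketch of why a byte-based offset/size pair is more expressive than element indices, using the RGBA32F example from the two commits above. It assumes that first_element/last_element were inclusive indices counted in units of the view format; the function names are hypothetical and not part of Gallium.

```python
RGBA32F_BYTES = 16  # four 32-bit floats per element

def elements_to_bytes(first_element, last_element, bytes_per_element):
    # Old-style inclusive element range converted to a byte offset/size.
    offset = first_element * bytes_per_element
    size = (last_element - first_element + 1) * bytes_per_element
    return offset, size

def bytes_to_elements(offset, size, bytes_per_element):
    # Returns None when the byte range cannot be expressed with element
    # indices, i.e. when the offset or size is not a whole number of elements.
    if offset % bytes_per_element or size % bytes_per_element:
        return None
    first = offset // bytes_per_element
    return first, first + size // bytes_per_element - 1

print(elements_to_bytes(1, 4, RGBA32F_BYTES))
print(bytes_to_elements(4, 64, RGBA32F_BYTES))
```

An RGBA32F view starting at byte 4 falls between elements 0 and 1, so the second call has no element-index representation.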
Marek Olšák	1ac23a9359	gallium/radeon: assign the highest priority to scratch; make rings second just FYI, the kernel receives priority/4 Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:29 +02:00
Marek Olšák	9009516501	gallium/winsys: re-number winsys priority flags free 60..63, move CP_DMA up Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	95020c6dfd	gallium/radeon: mark shader rings as highest-priority buffers and rename the enum Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	e2bb24f213	gallium/radeon: set SHADER_RW_BUFFER priority for streamout buffers Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	a6b5845a0d	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental This is just a workaround. The problem is described in the code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541 v2: say that it's only between the current context and aux_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-17 12:24:35 +02:00
Marek Olšák	9812a50ae6	radeonsi: simplify CB_TARGET_MASK logic we can now rely on CB_COLORn_INFO to disable empty slots. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	2d2b384066	radeonsi: don't set CB_COLOR1_INFO for dual src blending Vulkan doesn't do this. The reason may be that CB_COLOR1_INFO.SOURCE_FORMAT from NI was moved to SPI_SHADER_COL_FORMAT for SI. I asked CB guys about this 2 days ago and they still haven't replied. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	e722b90bc9	radeonsi: eliminate PS OUT[1] if dual src blending is off and CB1 is not bound All VP DX9 ports benefit from this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	3de8ffe836	gallium/radeon: use unflushed fences for PIPE_QUERY_GPU_FINISHED Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Nicolai Hähnle	c5798d6314	gallium/radeon: use lp_build_alloca_undef Avoid building all those store 0 / store undef instruction pairs that end up getting removed anyway. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:25 +02:00
Nicolai Hähnle	41001ca4bd	gallivm: add lp_build_alloca_undef Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	17e88e276c	gallivm: add create_builder_at_entry helper function Reduces code duplication. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	f4204ba53d	gallium/radeon: protect against out of bounds temporary array accesses They can lead to VM faults and worse, which goes against the GL robustness promises. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	ea283779be	gallium/radeon: add radeon_llvm_bound_index for bounds checking Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	8916d1e2fa	gallium/radeon: reduce alloca of temporaries based on usagemask v2: take actual writemasks into account Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	6bba956073	gallium/radeon: use tgsi_scan_arrays for temp arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	7c2295d7ef	gallium/radeon: allocate temps array info in radeon_llvm_context_init Also, prepare for using tgsi_array_info. This also opens the door for properly handling allocation failures, but I'm leaving that for a separate change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	850c8dcc9c	gallium/radeon: always do the full store in store_value_to_array Doing the write-back of the temporary vector in radeon_llvm_emit_store makes no sense. This also allows us to get rid of get_alloca_for_array. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	4b150931c9	gallium/radeon: extract common getelementptr logic into get_pointer_into_array Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	dfbb8ea284	gallium/radeon: pass indirect register info into get_alloca_for_array To have the same signature as get_array_range. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	b76aabffa2	gallium/radeon: extract common lookup code into get_temp_array function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	fa84296a5a	gallium/radeon: clarify the comment on the array alloca heuristic Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	92b66b38c9	gallium/radeon: more descriptive names for LLVM temporaries in debug builds Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	eacfc86d83	gallium/radeon: simplify radeon_llvm_emit_store for direct array addressing We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	87fa7cea23	gallium/radeon: simplify radeon_llvm_emit_fetch for direct array addressing We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	eb50cbf3bd	gallium/radeon: clean up emit_declaration for temporaries In the alloca'd array case, no longer create redundant and unused allocas for the individual elements; create getelementptrs instead. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	cb9ed66cc5	st_glsl_to_tgsi: use calloc the way it's meant to be used Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	67c0f077a2	tgsi/scan: add tgsi_scan_arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:21 +02:00
Ian Romanick	2ec3a3e151	glsl: Add missing ir_quadop_vector constant evaluation for Boolean types Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cf58e3f522	glsl: Fix typo in ir_unop_f2u implementation This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	8b123b08cb	glsl: Fix typo in ir_unop_b2i implementation This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cd8764737e	glsl: Don't support integer types for operations that can't handle them ir_unop_fract already forbade integer types in ir_validate. ir_unop_rcp, ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	437e612bd7	glsl: Don't support ir_unop_abs or ir_unop_sign for unsigned integers Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cceb50e14e	nir/algebraic: Optimize common array indexing sequence Some shaders include code that looks like: uniform int i; uniform vec4 bones[...]; foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]); CSE would do some work on this: x = i * 3; foo(bones[x], bones[x + 1], bones[x + 2]); The compiler may then add '<< 4 + base' to the index calculations. This results in expressions like x = i * 3; foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]); Just rearranging the math to produce (i * 48) + 16 saves an instruction, and it allows CSE to do more work. x = i * 48; foo(bones[x], bones[x + 16], bones[x + 32]); So, ~6 instructions becomes ~3. Some individual shader-db results look pretty bad. However, I have a really, really hard time believing the change in estimated cycles in, for example, 3dmmes-taiji/51.shader_test after looking at that change in the generated code. G45 total instructions in shared programs: 4020840 -> 4010070 (-0.27%) instructions in affected programs: 177460 -> 166690 (-6.07%) helped: 894 HURT: 0 total cycles in shared programs: 98829000 -> 98784990 (-0.04%) cycles in affected programs: 3936648 -> 3892638 (-1.12%) helped: 894 HURT: 0 Ironlake total instructions in shared programs: 6418887 -> 6408117 (-0.17%) instructions in affected programs: 177460 -> 166690 (-6.07%) helped: 894 HURT: 0 total cycles in shared programs: 143504542 -> 143460532 (-0.03%) cycles in affected programs: 3936648 -> 3892638 (-1.12%) helped: 894 HURT: 0 Sandy Bridge total instructions in shared programs: 8357887 -> 8339251 (-0.22%) instructions in affected programs: 432715 -> 414079 (-4.31%) helped: 2795 HURT: 0 total cycles in shared programs: 118284184 -> 118207412 (-0.06%) cycles in affected programs: 6114626 -> 6037854 (-1.26%) helped: 2478 HURT: 317 Ivy Bridge total instructions in shared programs: 7669390 -> 7653822 (-0.20%) instructions in affected programs: 388234 -> 372666 (-4.01%) helped: 2795 HURT: 0 total cycles in shared programs: 68381982 -> 68263684 (-0.17%) cycles in affected programs: 1972658 -> 1854360 (-6.00%) helped: 2458 HURT: 307 Haswell total instructions in shared programs: 7082636 -> 7067068 (-0.22%) instructions in affected programs: 388234 -> 372666 (-4.01%) helped: 2795 HURT: 0 total cycles in shared programs: 68282020 -> 68164158 (-0.17%) cycles in affected programs: 1891820 -> 1773958 (-6.23%) helped: 2459 HURT: 261 Broadwell total instructions in shared programs: 9002466 -> 8985875 (-0.18%) instructions in affected programs: 658784 -> 642193 (-2.52%) helped: 2795 HURT: 5 total cycles in shared programs: 78503092 -> 78450404 (-0.07%) cycles in affected programs: 2873304 -> 2820616 (-1.83%) helped: 2275 HURT: 415 Skylake total instructions in shared programs: 9156978 -> 9140387 (-0.18%) instructions in affected programs: 682625 -> 666034 (-2.43%) helped: 2795 HURT: 5 total cycles in shared programs: 75591392 -> 75550574 (-0.05%) cycles in affected programs: 3192120 -> 3151302 (-1.28%) helped: 2271 HURT: 425 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-17 10:52:38 +01:00
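The arithmetic behind the array-indexing rewrite described in the nir/algebraic commit can be checked with a small sketch. It only illustrates the identity ((i * 3) + k) << 4 == i * 48 + k * 16 on plain integers; it is not the NIR pattern itself, and the function names are hypothetical.

```python
def index_before(i, k):
    # Form left after CSE: multiply, add the constant, then shift by 4.
    return ((i * 3) + k) << 4

def index_after(i, k):
    # Rearranged form: one multiply by 48 shared across all accesses,
    # plus a constant (k * 16) that can be folded at compile time.
    return i * 48 + k * 16

# Check the identity over a range of indices and the three offsets
# used in the bones[] example.
for i in range(256):
    for k in range(3):
        assert index_before(i, k) == index_after(i, k)

print(index_before(5, 2), index_after(5, 2))
```

Because i * 48 is the same for all three accesses, CSE can share it, and each access then needs at most one add of a constant.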
Michel Dänzer	4ac640e3d2	glx: Don't use current context in __glXSendError There's no guarantee that there is one, and we don't need one anyway. Fixes piglit tests: glx@glx-fbconfig-bad glx@glx_ext_import_context@import context, multi process glx@glx_ext_import_context@import context, single process Fixes: `2e3f067458` ("glx: fix error code when there is no context bound") Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-17 17:16:34 +09:00
Ilia Mirkin	e988999791	nv50/ir: fix bb positions after exit instructions It's fairly rare that the BB layout puts BBs after the exit block, which is likely the reason these issues lingered for so long. This fixes a fraction of issues with the giant pixmark piano shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-16 21:56:16 -04:00
Ilia Mirkin	0b5f40b881	nv50/ir: properly clear upper bits of a bitset fill Found by inspection. In practice, val is always == 0, so this never got triggered. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-16 21:56:16 -04:00
Francisco Jerez	4d436c011f	i965/fs: Estimate maximum sampler message execution size more accurately. The current logic used to determine the execution size of sampler messages was based on special-casing several argument and opcode combinations, which unsurprisingly missed the possibility that some messages could exceed the payload size limit or not depending on the number of coordinate components present. In particular: - The TXL, TXB and TEX messages (the latter on non-FS stages only) would attempt to use SIMD16 on Gen7+ hardware even if a shadow reference was present and the texture was a cubemap array, causing it to overflow the maximum supported sampler payload size and crash. - The TG4_OFFSET message with shadow comparison was falling back to SIMD8 regardless of the number of coordinate components, which is unnecessary when two coordinates or less are present. Both cases have been handled incorrectly ever since cubemap arrays and texture gather were respectively enabled (the current logic used by the SIMD lowering pass is almost unchanged from the previous no16 fall-back logic used pre-SIMD lowering times). Fixes the following GL4.5 conformance test on Gen7-8 (the bug also affects Gen9+ in principle, but SKL passes the test by luck because it manages to use the TXL_LZ message instead of TXL): GL45-CTS.texture_cube_map_array.sampling Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97267 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Francisco Jerez	61a02fb74c	i965/fs: Return zero from fs_inst::components_read for non-present sources. This makes it easier for the caller to find out how many scalar components are actually read by the instruction. As a bonus we no longer need to special-case BAD_FILE in the implementation of fs_inst::regs_read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Francisco Jerez	0c754d1c42	i965/fs: Lower TEX to TXL during NIR translation. This simplifies the code slightly and will allow the SIMD lowering pass to find out easily what the actual texturing opcode is in order to determine the maximum execution size of texturing instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Rob Clark	5def00875d	freedreno/a3xx: fix generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 19:26:03 -04:00
Brian Paul	df2dcf6200	st/mesa: use pipe var instead of st->pipe in st_create_context_priv() As is done in most other places in the function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
Brian Paul	038b1b11fe	gallium: remove unused u_clear.h file Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
Brian Paul	22b8288b33	gallium/i915: inline the util_clear() code into i915_clear_blitter() This is the only place the util_clear() function was used. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:32 -06:00
Brian Paul	66debeae9d	gallium/util: minor reformatting in u_box.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:32 -06:00
Brian Paul	b6c81a780f	svga: remove unused var in svga_mark_surfaces_dirty() Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:28:22 -06:00
Brian Paul	1e5eb79d9a	svga: avoid a calloc in svga_buffer_transfer_map() Just initialize the two other pipe_transfer fields explicitly. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:53 -06:00
Brian Paul	f934117bbb	svga: don't call os_get_time() when not needed by Gallium HUD The calls to os_get_time() were showing up higher than expected in profiles. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:53 -06:00
Brian Paul	dcf2126f90	svga: remove unneeded memset() call in draw_vgpu10() All three fields of the vbuffer_attrs[] array are assigned in the following loop. The remaining elements of the array are not used. Tested with full Piglit run, Heaven 4.0, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	ced0dd0e95	svga: reduce looping in svga_mark_surfaces_dirty() We don't need to loop over the max number of color buffers, just the current number (which is usually one). Tested with full Piglit run, Heaven 4.0, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	88efaf9878	svga: minor clean-ups in define_rasterizer_object() Add const qualifiers, new comment. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	ce9c05a593	svga: remove incorrect buffer invalidation code Fixes regression with team_fortress_2 trace. This change has been in our in-house tree for some time. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	06b23f747d	svga: additional comments for svga_hw_draw_state members And re-order a few fields. Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	7c5eda6f4e	svga: use the sws local var to simplify some code Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	7b821941f6	svga: minor whitespace and code clean-ups Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:24:52 -06:00
Rob Clark	27f12dd8fd	freedreno/a4xx: use generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 09:21:13 -04:00
Rob Clark	f77e59e76c	freedreno/a3xx: use generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 09:21:13 -04:00
Rob Clark	a8e6734a83	freedreno: support for using generic clear path Since clears are more or less just normal draws, there isn't that much benefit in having hand-rolled clear path. Add support to use u_blitter instead if gen specific backend doesn't implement ctx->clear(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 09:21:13 -04:00
Rob Clark	142dd7b9c0	gallium/u_blitter: split out a helper for common clear state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Rob Clark	2b2f436c69	gallium/u_blitter: add helper to save FS const buffer state Not (currently) state that is overridden by u_blitter itself, but drivers with custom blit/clear which are reusing part of the u_blitter infrastructure will use it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Rob Clark	433e12fea8	gallium/u_blitter: export some functions Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Nicolas Boichat	78e3cea419	egl/dri2: dri2_make_current: Release previous context's display eglMakeCurrent can also be used to change the active display. In that case, we need to decrement ref_count of the previous display (possibly destroying it), and increment it on the next display. Also, old_dsurf/old_rsurf cannot be non-NULL if old_ctx is NULL, so we only need to test if old_ctx is non-NULL. v2: Save the old display before destroying the context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97214 Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Alexandr Zelinsky <mexahotabop@w1l.ru> Tested-by: Alexandr Zelinsky <mexahotabop@w1l.ru> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>	2016-08-16 17:30:35 +09:00
Nayan Deshmukh	09dff7ae2e	st/vdpau: change the order in which filters are applied (v3) Apply the median and matrix filter before the compositing; we apply the deinterlacing first to avoid the extra overhead in processing the past and the future surfaces in deinterlacing. v2: apply the filters on all the surfaces (Christian) v3: use get_sampler_view_planes() instead of get_sampler_view_components() and iterate over VL_MAX_SURFACES (Christian) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-16 10:07:35 +02:00
Kenneth Graunke	1f47f78fc3	glcpp: Update tests for new #undef of built-in macro rules. Ian recently changed the preprocessor to allow this in most GLSL versions, but not GLSL ES 3.00+. This patch converts the existing test that expects a failure to a #version 300 es shader, and adds a #version 110 shader to make sure that it's allowed. Fixes 'make check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2016-08-15 22:55:34 -07:00
Dave Airlie	c2f2252037	anv: fix writemask on blit fragment shader. I'm not sure if anything even uses this, but I found this on radv, so just fix it on anv for consistency. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-16 10:29:44 +10:00
Nicolas Boichat	c0580f6a38	egl/android: Set dpy->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Nicolas Boichat	a9e8fb7397	egl/drm: Set disp->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Nicolas Boichat	0e67d86540	egl/surfaceless: Set disp->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Nicolas Boichat	48fd952f28	egl/wayland: Set disp->DriverData to NULL on error Avoid use-after-free, fix spec@egl_khr_fence_sync@conformance. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Jan Ziak	769ac1ec78	egl/x11: avoid using freed memory if dri2 init fails Found with valgrind: ==4841== Invalid read of size 4 ==4841== at 0x56BDC80: dri2_initialize (egl_dri2.c:783) ==4841== by 0x56BAFE5: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB15E: _eglMatchDriver (egldriver.c:295) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main ==4841== Address 0x6a05824 is 148 bytes inside a block of size 480 free'd ==4841== at 0x4C2B680: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==4841== by 0x56C2AAE: dri2_initialize_x11_swrast (platform_x11.c:1233) ==4841== by 0x56C2AAE: dri2_initialize_x11 (platform_x11.c:1493) ==4841== by 0x56BDCEB: dri2_initialize (egl_dri2.c:805) ==4841== by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB0C9: _eglMatchDriver (egldriver.c:292) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main ==4841== Block was alloc'd at ==4841== at 0x4C2A868: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==4841== by 0x56C2A47: dri2_initialize_x11_swrast (platform_x11.c:1171) ==4841== by 0x56C2A47: dri2_initialize_x11 (platform_x11.c:1493) ==4841== by 0x56BDCEB: dri2_initialize (egl_dri2.c:805) ==4841== by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB0C9: _eglMatchDriver (egldriver.c:292) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:29 +01:00
Emil Velikov	6b4b2a4dd6	anv: add genX_multisample.h to the sources list(s). Otherwise it won't end up in the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:29 +01:00
Kevin Strasser	71258e9462	anv/x11: Add support for Xlib platform Some applications continue to use the Xlib client library and expect that VK_KHR_xlib_surface will be available in the driver. Service these applications by converting the Display pointer to xcb_connection_t and use the existing xcb code in the driver. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 09:47:06 -07:00
Tapani Pälli	5d9b50e596	glx: apple specific occurrences of dummyContext check Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2016-08-15 09:24:10 +03:00
Bernard Kilarski	2e3f067458	glx: fix error code when there is no context bound v2: change all related NULL checks to check against dummyContext v3: really check for dummyContext only when ctx was from __glXGetCurrentContext v4: cover more checks, add dummyBuffer, dummyVtable (Emil) Signed-off-by: Bernard Kilarski <bernard.r.kilarski@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-08-15 09:24:10 +03:00
Mathias Fröhlich	312ece9cd7	mesa: Remove duplicate include. In api_validate.c stdbool.h was included twice. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	84984b9986	vbo: Remove always true return from vbo_bind_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	72f1566f90	mesa: Move check for vbo mapping into api_validate.c. Instead of checking for mapped buffers in vbo_bind_arrays do this check in api_validate.c. This additionally enables printing the draw call's name into the error string. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	b7b0c51f1f	mesa: Move _mesa_all_buffers_are_unmapped to arrayobj.c. Move the function to check if all vao buffers are unmapped into the vao implementation file. Rename the function to _mesa_all_buffers_are_unmapped. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	c17cf1c8f5	vbo: Array draw must not care about glBegin/glEnd vbo mapping. In array draw do not check if the vertex buffer object that is used to implement immediate mode glBegin/glEnd is mapped. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Ilia Mirkin	5c1ccd8053	nv50,nvc0: fix depth range when halfz is enabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-08-14 17:41:49 -04:00
Ilia Mirkin	c85b7f0e87	gallium/util: add helper to compute zmin/zmax for a viewport state Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-08-14 17:41:33 -04:00
Ilia Mirkin	68b64f32e8	vbo: allow DrawElementsBaseVertex in display lists Looks like it was missed originally. The multi version is there already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97331 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-14 12:06:51 -04:00
Rob Clark	561fd226d4	freedreno/a3xx+a4xx: move common VBOs to fd_context These are the same for a3xx and later. (a2xx could probably use them too, but due to limited hw support and ancient downstream kernels, it isn't so easy to test.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-13 13:59:03 -04:00
francians@gmail.com	a49fb4ab2d	freedreno/a2xx: add missing casts to silence notices Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-13 09:37:41 -04:00
Rob Clark	78ba262d00	freedreno/ir3: fix issue with emit_tex() For various tex fetch instructions, coords get fixed up in different ways. But modifying the array returned from get_src() has side-effects if the same SSA src is used again: the later instruction will see the previous fixups. Fix this, and const'ify things to prevent this sort of mistake in the future. Noticed by Varad when adding support for txf_ms. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-13 09:33:47 -04:00
Ilia Mirkin	a32c87f74b	glsl: emit a specific error when ast_*_assign changes type For regular ast_add, we can implicitly change either a or b's type. However in an assignment situation, the type of the lvalue is fixed. So if the implicit conversion logic decides to change it, it means that the rhs's type could not be converted to the lhs type. Emit a specific error for this rather than the rather mysterious "is not an lvalue" error that results from having a i2f or other operation as the lvalue. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96729 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-12 22:45:20 -04:00
Ilia Mirkin	d816a51b81	st/mesa: provide GL_OES_copy_image support by caching the original ETC data The additional provision of GL_OES_copy_image is that it work for ETC. However many desktop GPUs don't have native ETC support, so st/mesa does the decoding by hand. Instead of discarding the compressed data, keep it around in CPU memory. Use it when performing image copies. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-08-12 20:21:08 -04:00
Ilia Mirkin	7727e6f67c	st/mesa: refactor duplicated etc fallback checks Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-12 20:21:08 -04:00
Ilia Mirkin	1baae00089	glsl: look for frag data bindings with [0] tacked onto the end for arrays The GL spec is very unclear on this point. Apparently this is discussed without resolution in the closed Khronos bugtracker at https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The recommendation is to allow dropping the [0] for looking up the bindings. The approach taken in this patch is to instead tack on [0]'s for each arrayness level of the output's type, and doing the lookup again. That way, for out vec4 foo[2][2][2] we will end up looking for bindings for foo, foo[0], foo[0][0], and foo[0][0][0], in that order of preference. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-12 20:21:08 -04:00
Lionel Landwerlin	0294dd00cc	anv: pipeline: gen7: fix assert in debug mode SampleMask is only 8bits long on gen7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97278 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-12 17:03:48 -07:00
Haixia Shi	8c56ff643b	mesa: change state query return value for RGB565 The GL_BGR and GL_UNSIGNED_SHORT_5_6_5_REV are not defined anywhere in OpenGL ES 3.2 (or earlier) specification, and there are no known extensions in the Khronos registry that would add these enums as valid responses for glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE) and glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT) queries. Note that this patch does not change the bit layout returned by the query. As defined by the GL spec, the bit layout of GL_RGB + GL_UNSIGNED_SHORT_5_6_5 and GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV are identical. TEST=dEQP-GLES3.functional.state_query.integers.* Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Stéphane Marchesin <marcheu@chromium.org> Change-Id: I81bbc8ccdc7e125edaeae443baf6fa8fdefcc6b6	2016-08-12 15:34:09 -07:00
Anuj Phogat	0bf531aee6	anv/device: Add limits for InterpolationOffset Fixes the vulkan cts regression in test dEQP-VK.api.info.device.properties Cc: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Anuj Phogat	7f6136d7db	i965: Change 8X MSAA sample mapping This is required following the change in 8X sample positions. Fixes the recently modified multisample-scaled-blit piglit tests. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Anuj Phogat	fb1bc5007d	i965: Change 8x multisample positions There are no standard sample positions defined in OpenGL and OpenGL ES specs. Implementations have the freedom to pick the positions which give plausible results. But the Vulkan 1.0 spec does define standard sample positions for different sample counts. Defined positions in Vulkan for all the sample counts except 8X match with the positions we set in i965. We have an upcoming plan to share the blorp code between OpenGL and Vulkan driver in near future. Keeping the 8X sample positions same on both the drivers will help us move in that direction. Here is an argument by Neil Roberts (from commit `20250e85`) against any advantage of current 8X sample positions over the new ones: "The comment above for the 8x sample positions says that the hardware implements centroid interpolation by picking the centre-most sample that is inside the primitive. That implies that it might be worthwhile to pick a pattern that includes 0.5,0.5. However by experimentation this doesn't seem to actually be the case. With the sample positions in this patch, if I modify the piglit test below so that it instead reports the centroid position, it reports 0.492188,0.421875 which doesn't match any of the positions. If I modify the sample positions so that they include one at exactly 0.5,0.5 it doesn't help and it reports another position which is even further from the center for some reason. arb_gpu_shader5-interpolateAtSample-different Kenneth Graunke experimented with some other patterns that have a higher standard deviation but I think after some discussion it was decided that it would be better to pick the same pattern as the other graphics API in case there are games that rely on this pattern." Observed no regressions in jenkins testing. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Anuj Phogat	1fe36d849c	anv: Use macro to avoid code duplication for sample positions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Marek Olšák	317e136ef0	st/mesa: BufferData should flag NewDriverState because NewDriverState is filtered depending on active shader states, while st->dirty isn't. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	085aa7f91e	st/mesa: don't update atomic, SSBO, UBO and TBO states that have no effect Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	ac032d800e	st/mesa: _NEW_TEXTURE & CONSTANTS shouldn't flag states that aren't used Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	c323d5b809	st/mesa: when changing shaders, only dirty states that are affected by them This reduces the amount of state processing that has no effect. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	8c1775c14c	st/mesa: determine states used or affected by shaders at compile time At compile time, each shader determines which ST_NEW flags should be set at shader bind time. This just sets the new field for all shaders. The next commit will use it. v2: small code unification Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-12 18:49:24 +02:00
Marek Olšák	a7d33315a7	st/mesa: remove TES/TCS/GS state dirtying optimization This will be replaced with a better mechanism. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:47:24 +02:00
Marek Olšák	0be30ea1a8	st/mesa: don't update clip state on VS changes if it has no effect Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:47:24 +02:00
Marek Olšák	412bd7360c	st/mesa: don't update clip state if it has no effect Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:47:24 +02:00
Chad Versace	dd93cbc894	mesa: Document that _mesa_enum_to_string() returns non-null (v2) It always returns non-null, even if the number is an invalid enum. Cc: Haixia Shi <hshi@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Change-Id: I26e8843c96130be972e66f48a49e362442e1bf97	2016-08-12 09:09:55 -07:00
Kenneth Graunke	f9f462936a	glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00. Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant" to be specified on both inputs and outputs, and match when linking. New languages only allow outputs to be qualified as "invariant" and remove the "invariant must match" restriction when linking varyings (because no input can have that qualifier). Commit `426a50e208` introduced the new behavior for ES 3.00. It also removed the "must match" restriction for ES 1.00 shaders, which I believe is incorrect. This patch adds that back, as well as making 4.30+ follow the new rules. Thanks to Qiankun Miao for noticing this discrepancy. Fixes a WebGL 2.0 conformance test when run in Chromium: https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2 Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-11 23:56:53 -07:00
Kenneth Graunke	0ed316360f	glsl: Tidy stream handling in merge_qualifier(). The previous commit fixed xfb_buffer handling, which was largely copy and pasted from the stream handling. The difference is that stream was set in input_layout_mask, so it worked. However, that's totally rubbish: stream is only valid on geometry shader outputs. Presumably this was to hack around inout. Instead, apply the solution I used in the previous fix. Really, we just need to separate shader interface and parameter qualifier handling so this isn't a mess, but this patch at least tidies it slightly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-11 23:56:48 -07:00
Kenneth Graunke	dffa371665	glsl: Fix inout qualifier handling in GLSL 4.40. inout variables have q.in and q.out set. We were trying to set xfb_buffer = 1 for shader output variables (and inadvertently setting it on inout parameters, too). But input_layout_mask doesn't have xfb_buffer set, so it was seen as an invalid input qualifier. This meant that all 'inout' parameters were broken. Caught by running a WebGL conformance test in Chromium: https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2 Fixes Piglit's tests/spec/glsl-4.40/compiler/inout-parameter-qualifier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-11 23:56:40 -07:00
Miklós Máté	17f1c49b9a	swrast: fix active attribs with atifragshader Only include the ones that can be used by the shader. This fixes texture coordinates, which were completely wrong, because WPOS was included in the list of attribs. It also increases performance noticeably. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-11 08:29:23 -06:00
Indrajit Das	8074c6b6ea	st/omx/dec/h264: pass default scaling lists in raster format Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-08-11 16:02:28 +02:00
Jose Fonseca	06b63f1f43	appveyor: Force Visual Studio 2013 image. It seems the default build image is now Visual Studio 2015, and Visual Studio 2013 is not installed.	2016-08-11 14:39:39 +01:00
Jose Fonseca	16627fc87d	appveyor: Install pywin32 extensions. AppVeyor build images seem to have been upgraded to Python 2.7.12, but no longer have pywin32 pre-installed.	2016-08-11 14:39:39 +01:00
Timothy Arceri	33b3815773	glsl/tests: fix segfault in uniform initializer test Caused by `549222f5` Tested-by: Aaron Watry <awatry@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97286	2016-08-11 14:57:18 +10:00
Ian Romanick	50b49d242d	glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says: It is an error to undefine or to redefine a built-in (pre-defined) macro name. The GLSL ES 1.00 spec does not contain this text. Section 3.3 (Preprocessor) of the GLSL 1.30 spec says: #define and #undef functionality are defined as is standard for C++ preprocessors for macro definitions both with and without macro parameters. At least as far as I can tell GCC allows '#undef __FILE__'. Furthermore, there are desktop OpenGL conformance tests that expect '#undef __VERSION__' and '#undef GL_core_profile' to work. Fixes: GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-10 16:42:02 -07:00
Ian Romanick	eda6349346	glcpp: Track the actual version instead of just the version_resolved flag Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-10 16:42:02 -07:00
Timothy Arceri	30e5ff7067	glsl: remove remaining tabs in link_uniform_initializers.cpp Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:38 +10:00
Timothy Arceri	549222f5f8	glsl: use UniformHash to find storage location There is no need to be looping over all the uniforms. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:30 +10:00
Timothy Arceri	82e153daff	glsl: remove dead builtins before assigning varying locations Builtins already have locations assigned so this shouldn't change anything. We want to call it earlier so we can transform GLSL IR to NIR earlier. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:21 +10:00
Timothy Arceri	588702cc41	glsl: split out varying and uniform linking code Here a new function link_varyings_and_uniforms() is created; this should help make it easier to follow the code in link_shader() which was getting very large. Note the end of the new function contains a for loop with some lowering calls that currently don't seem related to varyings or uniforms but they are a dependency for converting to NIR earlier so we move things here now to keep things easy to follow. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:12 +10:00
Jason Ekstrand	4c3a6b07e2	i965/vec4: Make opt_vector_float reset at the top of each block The pass isn't really control-flow aware and you can get into a case where it tries to combine instructions from different blocks. This can actually lead to an assertion failure when removing unneeded instructions if part of the vector is set in one block and part in another. This prevents regressions in the next commit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-10 15:19:55 -07:00
Eric Anholt	ac6966360f	mesa: Use a temporary set to track whether we've added a resource yet. Saves another .1s on servo.trace. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Eric Anholt	ee02a5e330	prog_hash_table: Convert to using util/hash_table.h. Improves glretrace -b servo.trace (a trace of Mozilla's servo rendering engine booting, rendering a page, and exiting) from 1.8s to 1.1s. It uses a large uniform array of structs, making a huge number of separate program resources, and the fixed-size hash table was killing it. Given how many times we've improved performance by swapping the hash table to util/hash_table.h, just do it once and for all. This just rebases the old hash table API on top of util/, for minimal diff. Cleaning things up is left for later, particularly because I want to fix up the new hash table API a little bit. v2: Add UNUSED to the now-unused parameter. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Eric Anholt	91945f9e91	prog_hash_table: Convert compare funcs to match util/hash_table.h. I'm going to replace this hash table with util/hash_table.h, and the first step is to compare things the same way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Eric Anholt	60f1b436b9	nir: Drop an unused program/hash_table.h include. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Tim Rowley	6198160250	swr: [rasterizer core] unused variable warning fixes Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:48 -05:00
Tim Rowley	9aa75e5d46	swr: [rasterizer jitter] add core string to JitManager Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:42 -05:00
Tim Rowley	b311bdf92d	swr: [rasterizer core] fix OOB check of viewport indices Use correct comparison intrinsic for OOB check of viewport indices. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:36 -05:00
Tim Rowley	2eae02f77c	swr: [rasterizer common] add linux definition for InterlockedAdd64 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:22 -05:00
Tim Rowley	e8b35a2321	swr: [rasterizer jitter] add VMASKSTOREPS intrinsic Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:16 -05:00
Tim Rowley	3393279fc9	swr: [rasterizer jitter] add mask support for odd format fetch Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:10 -05:00
Tim Rowley	92621ac5d5	swr: [rasterizer core] routing of viewport indexes through frontend Viewport transform performed based on per-prim viewport index if available. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:00 -05:00
Tim Rowley	4e8763cb09	swr: [rasterizer core] split FE and BE stats Separated FE stats out into its own structure. There are 17 FE vs 3 BE stat fields. Since there is only one FE thread per DC then we don't have to loop over all threads and sum up FE stats over all the worker threads. This also reduces size of DC since we only need to store one copy of the FE stats and not one per worker. Finally, we can use the new FE callback mechanism to update these. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:51 -05:00
Tim Rowley	f833b694cd	swr: [rasterizer core] remove all old stats code Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:45 -05:00
Tim Rowley	ad153189ec	swr: [rasterizer core] viewport array support Change viewport matrix storage from AOS to SOA to support viewport arrays. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:40 -05:00
Tim Rowley	d86e2487a0	swr: [rasterizer jitter] fetch support for offsetting VertexID Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:33 -05:00
Tim Rowley	6625fd08db	swr: [rasterizer core] fundamentally change how stats work Add a per draw stats callback to update driver stats. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:23 -05:00
Tim Rowley	047493c198	swr: [rasterizer core] add rasterizerSampleCount to PS context Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:17 -05:00
Tim Rowley	a83beb936e	swr: [rasterizer core] remove cygwin threads.cpp stubs Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:11 -05:00
Tim Rowley	29e1c4a8a9	swr: [rasterizer core] allow override of KNOB thread settings - Remove HYPERTHREADED_FE support - Add threading info as optional data passed to SwrCreateContext. If supplied this data will override any KNOB thread settings. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:05 -05:00
Tim Rowley	e0c10306f5	swr: [rasterizer core] add SwrWaitForIdleFE This is a blocking call that waits until all FE work is complete. This is useful for waiting for FE work to complete such as for streamout. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:59 -05:00
Tim Rowley	8dfaf249cc	swr: [rasterizer core] change threadsDone to be a 32-bit value. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:53 -05:00
Tim Rowley	6624e01114	swr: [rasterizer core] update trivial accept test conditions enable/disable raster tile trivial accept test based on scissor enable trait. Can be optimized further. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:47 -05:00
Tim Rowley	7cf187d08a	swr: [rasterizer core] improve implementation for SoWriteOffset 1. SoWriteOffset is no longer treated as a stat 2. Added callback from core to update streamout write offset Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:40 -05:00
Tim Rowley	8d3b20135e	swr: [rasterizer common] make disabled asserts always print (but not break) Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:00 -05:00
Leo Liu	6575ebdc45	vl/rbsp: add a check for emulation prevention three byte This is the case when the "00 00 03" is very close to the beginning of nal unit header v2: move the check to rbsp init Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-10 09:52:44 -04:00
Ilia Mirkin	bc5df3b321	Re-apply "glsl: don't try to lower non-gl builtins as if they were gl_FragData" If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-10 15:43:36 +02:00
Marek Olšák	9c63fd9056	radeonsi: set CB_COLORn_INFO.ROUND_MODE just do what the register spec says Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Marek Olšák	667ad9fa3e	radeonsi: set CB_COLORn_INFO.SIMPLE_FLOAT This can help enable some blend optimizations (see the register spec). Vulkan always sets this. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Marek Olšák	36057ff12a	radeonsi: disallow MIN/MAX blend equations for dual source blending Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Marek Olšák	947e0614d0	radeonsi: only set dual source blending for MRT0 This is the proper fix for Overlord and Witcher 2 hangs. The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel shader while MRT1 is disabled in CB_TARGET_MASK (does this generate unflushable pixel quads? I don't know), and another app (e.g. Glamor) must enable dual source blending in both MRT0 and MRT1. The hw gets confused, which leads to corruption and hangs. Cc: 12.0 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Miklós Máté	88c2fc6b2d	st/mesa: in ATI fs don't assume TEMP0=REG0 The temporaries are allocated dynamically. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-10 15:03:58 +02:00
Trevor Davenport	9a4d5db4d2	st/nine: Fix invalid attempt to use indirect draws. Since commit `6d7177f01b`, radeonsi would take a different path if info->indirect_params was not initialized properly. Nine was not initializing this field. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-10 15:02:20 +02:00
Mathias Fröhlich	0ce5ec8ece	util: Use win32 intrinsics for util_last_bit if present. v2: Split into two patches. v3: Fix off by one problem. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-08-10 09:30:07 +02:00
Marek Olšák	3f100b77f9	gallium/radeon: use unflushed fences for deferred flushes (v2) +23% Bioshock Infinite performance. v2: - use the new fence_finish interface - allow deferred fences with multiple contexts - clear the ctx pointer after a deferred flush Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	1cc95a1255	st/mesa: set the ctx parameter of fence_finish for deferred flushes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	54272e18a6	gallium: add a pipe_context parameter to fence_finish required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush the context Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	c6043e7d54	st/mesa: use PIPE_USAGE_STREAM for GL_CLIENT_STORAGE_BIT without READ_BIT (v2) v2: keep STAGING for GL_MAP_READ_BIT Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	33a9b4e8a1	gallium/radeon: add HUD queries for mapped VRAM/GTT mainly for monitoring visible VRAM congestion Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	645d395d9a	winsys/radeon: track the amount of mapped memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	1e04483c22	winsys/amdgpu: track the amount of mapped memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	8276776e64	winsys/amdgpu: don't try to unmap userptr buffers no app calls this AFAIK Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	ef836c0d04	gallium/radeon: increase the size of the renderer string Mine is longer than 64 bytes. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	739d526b07	gallium/radeon: implement ARB_clear_texture (v3) Some ideas copied from Jakob Sinclair's implementation, but the color clearing is completely different. v2: remove leftover code, disable conditional rendering disable render condition cleanly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	7df15389af	gallium/radeon: handle render_condition_enable for clear_rt/ds Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Marek Olšák	a909210131	gallium: add render_condition_enable param to clear_render_target/depth_stencil Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Haixia Shi	a7c6993a33	egl: android: query native window default width and height (v2) On android platform, the width and height of a native window surface may be updated after initialization. It is therefore necessary to query android framework for the current width and height. v2: remove Android specific #ifdef's and just implement the fallback directly if the platform query_surface() callback is not provided. TEST=dEQP-EGL.functional.resize.surface_size#* on cyan-cheets Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> (v1) Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I673f7d2f1d90c3bf572b30f63da537f2cae1496e	2016-08-09 15:49:28 -07:00
Anuj Phogat	c4cd0e8ecd	anv/device: Enable sample shading on gen7+ Passes all 30 min_sample_shading tests in vulkan cts. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Anuj Phogat	f16295a198	anv/gen7_pipeline: Set multisample state using shared function Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Anuj Phogat	2ef5063ad7	anv/pipeline: Add sample locations for gen7-7.5 V1: Add multisample positions (Nanley) V2: Fix 8x sample positions to match OpenGL (Anuj) V3: Vulkan has standard sample locations. They need not be same as in OpenGL. (Anuj) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Anuj Phogat	dc49dd7f10	anv/pipeline: Move emit_ms_state() to genX_pipeline_util.h This will help sharing multisample state setting code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Mathias Fröhlich	aa920736fe	gallium: Add c99_compat.h to u_bitcast.h We need this for 'inline'. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 21:20:56 +02:00
Mathias Fröhlich	027cbf00f2	util: Move _mesa_fsl/util_last_bit into util/bitscan.h As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util. v2: Split into two patches. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 21:20:46 +02:00
Nicolai Hähnle	e4cb3af524	radeonsi: enable multi-draw related pipe caps This enables GL_shader_draw_parameters and GL_ARB_indirect_parameters as well as a properly accelerated implementation of GL_ARB_multi_draw_indirect. Enabling the feature requires sufficiently up-to-date firmware. Such firmware was released a long time ago, although this does mean that the feature only works with the amdgpu kernel module, since the radeon module doesn't have a way to query the firmware version. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	6d7177f01b	radeonsi: program additional multi draw parameters Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	b6c71d37c7	radeonsi: program the DRAWID SGPR Note that for indirect draws, the new MULTI firmware packets are required. There's also no need to reset last_{start_instance,sh_base_reg}, since resetting last_base_vertex is sufficient. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	8dbf2a8570	radeonsi: add DRAWID parameter to vertex shaders Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	febb5dbf72	radeonsi: wire up TGSI_SEMANTIC_BASEINSTANCE Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Nicolai Hähnle	d34292a77f	radeonsi: remove an incorrect assertion Byte indices don't need any alignment, so remove this assertion (it got moved into a path where a piglit test hit it during the refactoring of commit `64ff23a58c`). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Nicolai Hähnle	2852dedaa0	radeonsi: flush TC L2 cache for indirect draw data This fixes a bug when indirect draw data is generated by transform feedback. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Nicolai Hähnle	76c4a3b567	radeonsi/sid: add additional bits for the DRAW_(INDEX)_INDIRECT_MULTI packets Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Brian Paul	60dc36a680	st/mesa: define ST_NEW_ flags as uint64_t values, not enums MSVC doesn't support 64-bit enum values, at least not with C code. The compiler was warning: c:\users\brian\projects\mesa\src\mesa\state_tracker\st_atom_list.h(43) : warning C4309: 'initializing' : truncation of constant value c:\users\brian\projects\mesa\src\mesa\state_tracker\st_atom_list.h(44) : warning C4309: 'initializing' : truncation of constant value ... And at runtime we crashed since the high 32-bits of the 'dirty' bitmask was always 0xffffffff and the 32+u_bit_scan() index went out of bounds of the atoms[] array. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-09 07:50:18 -06:00
Miklós Máté	d9519c6f06	mesa: simplify ff fs generator a bit Literally. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 07:46:37 -06:00
Marek Olšák	06b2fd04f6	ddebug: dump driver states and shaders for apitrace calls I think this was an oversight when the PIPE_DUMP flags were added. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-09 15:35:42 +02:00
Timothy Arceri	8c4d9afb7e	nir: make use of nir_cf_list_extract() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-09 13:21:30 +10:00
Matt Turner	b1d9c742e9	nir: Always print non-identity swizzles. Previously we would not print a swizzle on ssa_52 when only its .x component is used (as seen in the definition of ssa_53): vec3 ssa_52 = fadd ssa_51, ssa_51 vec1 ssa_53 = flog2 ssa_52 vec1 ssa_54 = flog2 ssa_52.y vec1 ssa_55 = flog2 ssa_52.z But this makes the interpretation of the RHS of the definition difficult to understand and dependent on the size of the LHS. Just print swizzles when they are not the identity swizzle, so the previous example is now printed as: vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz vec1 ssa_53 = flog2 ssa_52.x vec1 ssa_54 = flog2 ssa_52.y vec1 ssa_55 = flog2 ssa_52.z Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-08 17:52:35 -07:00
Lionel Landwerlin	8cde4ddbce	anv/pipeline/gen7: Set multisample modes Fixes the following failures : dEQP-VK.api.copy_and_blit.resolve_image.whole_4_bit dEQP-VK.api.copy_and_blit.resolve_image.whole_8_bit dEQP-VK.api.copy_and_blit.resolve_image.partial_4_bit dEQP-VK.api.copy_and_blit.resolve_image.partial_8_bit dEQP-VK.api.copy_and_blit.resolve_image.with_regions_4_bit dEQP-VK.api.copy_and_blit.resolve_image.with_regions_8_bit Tested on IVB/HSW Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-08 14:44:25 -07:00
Lionel Landwerlin	a3c472a2ec	anv/pipeline: rename info to rs_info in emit_rs_state Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-08 14:44:25 -07:00
Marek Olšák	1ebf3c4b67	Revert "glsl: don't try to lower non-gl builtins as if they were gl_FragData" This reverts commit `a37e46323c`. It broke the game Overlord such that it hung a GCN GPU. While I don't know how the hang happened because of its randomness and gfx corruption precedes it, many of the shaders contain this: out vec4 FragData[gl_MaxDrawBuffers];	2016-08-08 23:24:20 +02:00
Tomasz Figa	3723e9826f	egl/android: Add support for YV12 pixel format (v2) This patch adds support for YV12 pixel format to the Android platform backend. Only creating EGL images is supported, it is not added to the list of available visuals. v2: Use const array defined just for YV12 instead of trying to be overly generic. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I4aeb2d67a95c5cdd10b530c549b23146c8f0b983	2016-08-08 14:18:38 -07:00
Kenneth Graunke	3190c7ee97	st/mesa: Make Gallium's BlitFramebuffer follow the GL 4.4 sRGB rules. OpenGL 4.4 specifies that BlitFramebuffer should perform sRGB encode and decode like ES 3.x does, but only when GL_FRAMEBUFFER_SRGB is enabled. This is technically incompatible in certain cases, but is more consistent across GL, ES, and WebGL, and more flexible. The NVIDIA 367.35 drivers appear to follow this behavior. For the awful spec analysis, please read Piglit's tests/spec/arb_framebuffer_srgb/blit.c, which explains the differences between GL 4.1, 4.2, 4.3 (2012), 4.3 (2013), and 4.4, and why this is the right rule to implement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-08 14:04:18 -07:00
Kenneth Graunke	f6dc71483a	meta: Make Meta's BlitFramebuffer() follow the GL 4.4 sRGB rules. Just avoid whacking GL_FRAMEBUFFER_SRGB altogether, so we respect the application's setting. This appears to work. v2: Update one more comment (requested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:04:01 -07:00
Kenneth Graunke	ad32dcf630	i965: Make BLORP's BlitFramebuffer follow the GL 4.4 sRGB rules. OpenGL 4.4 specifies that BlitFramebuffer should perform sRGB encode and decode like ES 3.x does, but only when GL_FRAMEBUFFER_SRGB is enabled. This is technically incompatible in certain cases, but is more consistent across GL, ES, and WebGL, and more flexible. The NVIDIA 367.35 drivers appear to follow this behavior. For the awful spec analysis, please read Piglit's tests/spec/arb_framebuffer_srgb/blit.c, which explains the differences between GL 4.1, 4.2, 4.3 (2012), 4.3 (2013), and 4.4, and why this is the right rule to implement. Note that ctx->Color.sRGBEnabled is initialized to _mesa_is_gles(ctx), and ES doesn't have enable/disable flags for GL_FRAMEBUFFER_SRGB, so it's effectively on all the time. This means the ES behavior should be unchanged. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	352401f6a9	i965: Make BLORP do sRGB encode/decode on ES 2 as well. This should have no effect, as all drivers which support BLORP also support ES 3.0 - so ES 2.0 would be promoted and follow the ES 3 rules. ES 1.0 doesn't have BlitFramebuffer. This is purely to clarify the next patch a bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	0c7047ab9c	Revert "st/mesa: use sRGB formats for MSAA resolving if destination is sRGB" This reverts commit `4e549ddb50`, dropping the hack from Gallium that I just deleted from i965. See the previous commit for rationale. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	cc27c7fe38	i965: Drop the "do resolves in sRGB" hack. I've never quite understood the purpose of this hack - supposedly, doing resolves in the sRGB colorspace is slightly more accurate. Currently, BlitFramebuffer() ignores sRGB encoding and decoding on OpenGL, although it encodes and decodes in GLES 3.x. The updated OpenGL 4.4 rules also allow for encoding and decoding if GL_FRAMEBUFFER_SRGB is enabled, allowing the application to control what colorspace blits are done in. I don't think this hack makes any sense in such a world - the application can do what it wants, and we shouldn't second guess them. A related Piglit patch, "Make multisample accuracy test set GL_FRAMEBUFFER_SRGB when resolving." makes the Piglit MSAA accuracy test explicitly request SRGB encoding/decoding during resolves when running "srgb" subtests. Without that patch, this commit will regress those tests, but with it, they should continue to work just fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	b1586526e8	i965: Bail on the BLT path if BlitFramebuffer requires sRGB conversion. Modern OpenGL BlitFramebuffer requires sRGB encode/decode when GL_FRAMEBUFFER_SRGB is enabled. The blitter can't handle this, so we need to bail. On Gen4-5, this means falling back to Meta, which should handle it. We allow sRGB <-> sRGB blits, as decode then encode ought to be a noop (other than potential precision loss, which nobody wants anyway). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Tomasz Figa	7dfb1a4074	egl/android: Make get_fourcc() accept HAL formats There are DRI_IMAGE_FOURCC macros, for which there are no corresponding DRI_IMAGE_FORMAT macros. To support such formats we need to make the lookup function take the native format directly. As a side effect, it simplifies all existing calls to this function, because they all called get_format() first to convert from native to DRI_IMAGE_FORMAT. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I4674000fb5ccfd02e38b8fa89bc567ac1d4fc16b	2016-08-08 11:40:41 -07:00
Tomasz Figa	e77b493390	egl/android: Refactor image creation to separate flink and prime paths (v2) This patch splits current dri2_create_image_android_native_buffer() into main entry point and two additional functions, one for creating an image from flink name and one for handling prime FDs using the generic DMA-buf path. This makes the code cleaner and also prepares for disabling flink path more easily in the future. v2: Split into separate patch. Add error messages. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: Ifdfb5927399d56992fe707160423c29278f49172	2016-08-08 11:40:37 -07:00
Tomasz Figa	217af75a40	egl/android: Respect buffer mask in droid_image_get_buffers (v2) Drivers can request different set of buffers depending on the buffer mask they pass to the get_buffers callback. This patch makes droid_image_get_buffers() respect this mask. v2: Return error only in case of real error condition and ignore requests of unavailable buffers. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I6c3c4eca90f4c618579f6725dec323c004cb44ba	2016-08-08 11:40:31 -07:00
Tomasz Figa	c6c26bc589	egl/android: Remove unused variables in droid_get_buffers_with_format() Fix compilation warnings due to unused variables left after some earlier code changes. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: Iec09eb2a62887f3a38dff156756ed8385f3f3447	2016-08-08 11:40:26 -07:00
Jason Ekstrand	52fcc40760	anv/pipeline/gen7: Set the depth format in 3DSTATE_SF Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:46 -07:00
Jason Ekstrand	21d5c1be6a	isl: Add a helper for getting a depth format from an isl_format Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:44 -07:00
Jason Ekstrand	ce980541d5	anv/pipeline: Unify 3DSTATE_RASTER and 3DSTATE_SF setup between gen7 and gen8 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:41 -07:00
Jason Ekstrand	960e8a1260	anv/pipeline/gen8: Set 3DSTATE_SF::StatisticsEnable We've been setting it in gen7 forever but never in gen8; best to make it consistent. This hasn't caused any problems yet because we don't advertise support for statistics queries yet. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:36 -07:00
Jason Ekstrand	12e653adec	anv/pipeline/gen8: Unconditionally set DXMultisampleRasterizationEnable The multisample rasterization mode is computed based on this field, 3DSTATE_RASTER::DXMultisampleRasterizationMode (only for forced multisampling), 3DSTATE_RASTER::APIMode, and the number of samples. There are two tables in the SKL PRM that describe how the final multisample mode is calculated: "Windower (WM) Stage >> Multisampling >> Multisample ModeState >> Table 1" and the formula for "SF_INT::Multisample Rasterization Mode". The "DX Multisample Rasterization Enable" bit changes whether multisample mode is set to OFF_PIXEL or ON_PATTERN in the samples > 1 case. In the samples == 1 case, the bit has no effect. Since Vulkan has no concept of disabling multisampling for samples > 1, we can just set the bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:33 -07:00
Jason Ekstrand	1df511b6f0	anv/pipeline/gen8: Use fewer designated initializers in emit_rs_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:31 -07:00
Jason Ekstrand	6136fb8687	genxml: Make 3DSTATE_SF more consistent between gen7 and gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:28 -07:00
Jason Ekstrand	2d76dcae71	anv/pipeline/gen8: Remove an old comment This is now handled in emit_3dstate_clip Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:04 -07:00
Kenneth Graunke	7314007925	mesa: Skip ES 3.0/3.1 transform feedback primitive counting error. This error condition is not implementable when using tessellation or geometry shaders. The text was also removed from the ES 3.2 spec. I believe the intended behavior is to remove the error condition when either OES_geometry_shader or OES_tessellation_shader are exposed. v2: Quote a better part of issue 13 (suggested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 10:01:30 -07:00
Kenneth Graunke	23b2bcd460	mesa: Share code between _mesa_validate_DrawArrays[_Instanced]. Mostly, I want to share the GLES 3 transform feedback handling, though most of the rest of the code is identical as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 10:01:30 -07:00
Kenneth Graunke	522b5d4566	glsl: Implicitly enable OES_shader_io_blocks if geom/tess are enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	0eaa84e8af	glsl: Expose gl_PointSize if OES/EXT_tessellation_point_size is enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	58709d36d7	glsl: Add extension plumbing for OES/EXT_tessellation_shader. This adds the #extension directive support, built-in #defines, lexer keyword support, and updates has_tessellation_shader(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	722fd10456	mesa: Move tessellation shader gets to GL_CORE, GLES31 section. This makes them available in the GLES 3.1 API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	c8438b62b7	mesa: Add {OES,EXT}_tessellation_shader to the extensions table. Also update _mesa_has_tessellation to know about the new extensions. For now, these are dummy_false, to avoid turning on the extension until everything's in place. Eventually, we'll move them over to the "ARB_tessellation_shader" bit so that any drivers supporting both the desktop extension and ES 3.1 get the feature. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	73554c47e0	mapi: Add PatchParameteriOES and PatchParameteriEXT. The OES_tessellation_shader and EXT_tessellation_shader specifications have suffixed names. These are identical to the core function, so just alias them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Nicolai Hähnle	96bbb620a5	radeonsi: add has_draw_indirect_multi flag Prefer to use DRAW_(INDEX)_INDIRECT_MULTI when available in the firmware. Versions for SI and CI already added as provided by the firmware team, but keep in mind that they won't currently be used since the radeon kernel module has no interface to query the firmware version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:06 +02:00
Nicolai Hähnle	5c343cce0f	radeonsi: transpose indirect/index draw dispatch This allows better code sharing for indirect draw calls. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:04 +02:00
Nicolai Hähnle	64ff23a58c	radeonsi: move index buffer calculations in si_emit_draw_packets up Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:02 +02:00
Nicolai Hähnle	cf7d18b75c	radeonsi: unify emitting PKT3_SET_BASE for indirect draws Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:52:59 +02:00
Nicolai Hähnle	e0736c438c	winsys/amdgpu: query ME/PFP/CE firmware versions The radeon kernel module doesn't have the firmware query interface, so the corresponding values will remain 0. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:52:41 +02:00
Nicolai Hähnle	7f5a8dc27e	radeonsi: move spi_ps_input_addr override outside of the loop Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:51:32 +02:00
Nicolai Hähnle	287822ee33	radeonsi: drop unnecessary u_pstipple.h include Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:51:29 +02:00
Nicolai Hähnle	3e4c5693a1	radeonsi: do not pass the return type to buffer_load_const Overriding it is not allowed anyway, and actually led to a crash when polygon stippling was used with monolithic shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:51:26 +02:00
Kenneth Graunke	bd1bd03268	glsl: Combine GS and TES array resizing visitors. These are largely identical, except that the GS version has a few extra error conditions. We can just pass in the stage and skip these. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:53:59 -07:00
Kenneth Graunke	398428f406	glsl: Fix location bias for patch variables. We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0. Since "patch" only applies to inputs and outputs, we can just handle this once outside the switch statement, rather than replicating the check twice and complicating the earlier conditions. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:53:42 -07:00
Kenneth Graunke	1556f16e46	glsl: Fix the program resource names of gl_TessLevelOuter/Inner[]. These are lowered to gl_TessLevel{Outer,Inner}MESA. We need them to appear in the program resource list with their original names and types. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:53:28 -07:00
Kenneth Graunke	4a49851da1	glsl: Delete bogus ir_set_program_inouts assert. This assertion is bogus. Varying structs, and arrays of structs, are allowed by GLSL, and we can see them here. While we currently don't have any partial-variable support for those, simply returning false and marking the entire thing as used is certainly legitimate. I believe this is often swept under the rug by varying packing, but that's disabled in certain tessellation situations. Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:51:21 -07:00
Kenneth Graunke	86915b495b	glsl: Simplify interface qualifier parsing. This better matches the grammar in section 4.3.9 of the GLSL 4.5 spec, and also removes some redundant code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:48:48 -07:00
Kenneth Graunke	d0642c52fc	glsl: Add a has_tessellation_shader() helper. Similar to has_geometry_shader(), has_compute_shader(), and so on. This will make it easier to add more conditions here later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:47:55 -07:00
Marek Olšák	3fb4a9b3b3	Revert "gallium/radeon: count contexts" This reverts commit `b403eb3385`. Not needed.	2016-08-06 17:29:23 +02:00
Marek Olšák	11b1d064a3	radeonsi: add GLSL lit tests They can only be run manually as described in HOW_TO_RUN. It should help catch suboptimal code generation. Some of the tests already fail. v2: rename the tests to *.glsl, fix lit.cfg to find FileCheck Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-06 16:11:43 +02:00
Marek Olšák	35942ee8a8	radeonsi: add a standalone compiler amdgcn_glslc This will be used by GLSL lit tests. For developers only. It shouldn't be distributable and it doesn't use the Mesa build system. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 16:11:39 +02:00
Marek Olšák	ad8af99c86	radeonsi: add environment variable SI_FORCE_FAMILY This will be used by: amdgcn_glslc -mcpu=[family] It can also be used for shader-db if you want stats for a different family. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 16:11:35 +02:00
Marek Olšák	d0646cc745	winsys/radeon: implement cs_get_next_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:31 +02:00
Marek Olšák	63b99590db	winsys/amdgpu: implement cs_get_next_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	04a6cb63aa	gallium/radeon: add cs_get_next_fence winsys callback Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	b403eb3385	gallium/radeon: count contexts We don't wanna use unflushed fences when we have multiple contexts. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	16d568d911	gallium/radeon: count gfx IB flushes This will be used as a counter for whether fence_finish needs to flush the IB. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	c5ff0d3e65	gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	076db67217	gallium/radeon: inline radeon_winsys::query_memory_usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	9646ae7799	gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers The following patches will use this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	1c8f17599e	gallium/radeon/winsyses: print CS submission error number Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	0edc2e433e	radeonsi: flush if constant, shader, and streamout buffers use too much memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	c3efdeb8dd	radeonsi: flush if sampler views and images use too much memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	d82cfab84c	radeonsi: deal with high vertex buffer memory usage correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	e62caf576e	radeonsi: take compute shader and dispatch indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	c56ecb68e7	radeonsi: take scratch buffer and draw indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	ed2254d157	radeonsi: check IB memory usage of CP DMA operations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	f4b977bf3d	gallium/radeon: add r600_resource::vram_usage and gart_usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Mathias Fröhlich	62d41162bb	mesa: Copy bitmask of VBOs in the VAO on gl{Push,Pop}Attrib. On gl{Push,Pop}Attrib(GL_CLIENT_VERTEX_ARRAY_BIT) take care that gl_vertex_array_object::VertexAttribBufferMask matches the bound buffer object in the gl_vertex_array_object::VertexBinding array. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2016-08-06 06:27:37 +02:00
Nanley Chery	c495c18b24	anv/gen7_pipeline: Set PixelShaderKillPixel for discards According to the IVB PRM Vol2 P1, this bit must be set if a pixel shader contains a discard instruction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97207 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-05 09:53:52 -07:00
Jason Ekstrand	21f357b66e	util/r11g11b10f: Whitespace cleanups Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:07:06 -07:00
Jason Ekstrand	ffcf8e1049	util/format: Use explicitly sized types Both the rgb9e5 and r11g11b10 formats are defined based on how they are packed into a 32-bit integer. It makes sense that the functions that manipulate them take an explicitly sized type. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:07:04 -07:00
Jason Ekstrand	c7eb9a7565	util/rgb9e5: Get rid of the float754 union There are a number of reasons for this refactor. First, format_rgb9e5.h is not something that a user would expect to define such a generic union. Second, defining it requires checking for endianness which is ugly. Third, 90% of what we were doing with the union was float <-> uint32_t bitcasts and the remaining 10% can be done with a simple left-shift by 23. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:07:01 -07:00
Jason Ekstrand	cda8d95660	util/format_rgb9e5: Get rid of the rgb9e5 union The rgb9e5 format is a packed format defined in terms of slicing up a single 32-bit value. The bitfields are far more confusing than simple shifts and require that we check the endianness. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:59 -07:00
Jason Ekstrand	f29fd7897a	util: Move format_r11g11b10f.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:57 -07:00
Jason Ekstrand	6c665cdfc5	util: Move format_rgb9e5.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:31 -07:00
Andres Gomez	591869e921	glsl: fix indentation, comments and line lengths in ast_function.cpp Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:11 +03:00
Andres Gomez	8f98a120f3	glsl: apply_implicit_conversion is static again Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:11 +03:00
Andres Gomez	1443c10d74	glsl: struct constructors/initializers only allow implicit conversions When an argument for a structure constructor or initializer doesn't match the expected type, only Section 4.1.10 “Implicit Conversions” are allowed to try to match that expected type. From page 32 (page 38 of the PDF) of the GLSL 1.20 spec: " The arguments to the constructor will be used to set the structure's fields, in order, using one argument per field. Each argument must be the same type as the field it sets, or be a type that can be converted to the field's type according to Section 4.1.10 “Implicit Conversions.”" From page 35 (page 41 of the PDF) of the GLSL 4.20 spec: " In all cases, the innermost initializer (i.e., not a list of initializers enclosed in curly braces) applied to an object must have the same type as the object being initialized or be a type that can be converted to the object's type according to section 4.1.10 "Implicit Conversions". In the latter case, an implicit conversion will be done on the initializer before the assignment is done." v2: Remove also the now redundant constant conversion, the constant_record_constructor helper and the replacement code (Timothy). Fixes GL44-CTS.shading_language_420pack.initializer_list_negative Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:03 +03:00
Andres Gomez	de60d549b9	glsl: Refactor implicit conversion into its own helper v2: Refactor also the conversion to constant and replacement code (Timothy). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:03 +03:00
Andres Gomez	af796d756e	glsl/types: disallow implicit conversions before GLSL 1.20 Implicit conversions were added in the GLSL 1.20 spec version. v2: Join the checks for GLSL 1.10 and ESSL (Timothy). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:03 +03:00
Kenneth Graunke	875341c69b	i965: Rework the unlit centroid workaround. Previously, for every input, we moved the dispatch mask to the flag register, then emitted two predicated PLN instructions, one with centroid barycentric coordinates (for normal pixels), and one with pixel barycentric coordinates (for unlit helper pixels). Instead, we can simply emit a set of predicated MOVs at the top of the program which copy the pixel barycentric coordinates over the centroid ones for unlit helper pixel channels. Then, we can just use normal PLNs. On Sandybridge: total instructions in shared programs: 7538470 -> 7534500 (-0.05%) instructions in affected programs: 101268 -> 97298 (-3.92%) helped: 705 HURT: 9 (all of which are SIMD16 programs) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-05 01:43:52 -07:00
Tim Rowley	b521083ffb	swr: [rasterizer core] static analysis fixes for conservative rast Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:35 -05:00
Tim Rowley	68dc544879	swr: [rasterizer core] implement InnerConservative input coverage Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:35 -05:00
Tim Rowley	4034f48833	swr: [rasterizer core] remove CanEarlyZ function Test is now in SetupPipeline. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	b365989875	swr: [rasterizer core] use 32x32 macrotile for openswr Significant performance increase (up to 2x) on high geometry workloads. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	5f4bc9e85b	swr: [rasterizer fetch] add support for 24bit format fetch Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	527d45c8fe	swr: [rasterizer fetch] additional fetch format support Add support for 0 pitch in fetch. Add support for USCALE/SSCALE for 32bit integer fetches. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	f438b7ba81	swr: [rasterizer jitter] fix potential jit exit crash Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	57b07498d2	swr: [rasterizer core] update sync handling Sync now uses a callback to ensure that it's called by the last thread moving past a DC. This will help with the new counter handling. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	191786d0f4	swr: [rasterizer core] rename variable Avoid nested declarations of the same name within a single function. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:37 -05:00
Tim Rowley	61cc012e9a	swr: [rasterizer jitter] adjust extern "C" block scope Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:31 -05:00
Tim Rowley	9f7d99fcfe	swr: [rasterizer core] conservative rast degenerate handling Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:25 -05:00
Tim Rowley	f01827a469	swr: [rasterizer core] allow hexadecimal for integer knobs Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 13:52:12 -05:00
Eric Anholt	49741e1cd2	mesa: Dynamically allocate the matrix stack. By allocating and initializing the matrices at context creation, the OS couldn't even overcommit the pages. This saves about 63k (out of 946k) of maximum memory size according to massif on simulated vc4 glsl-algebraic-add-add-1. It also means we could potentially relax the maximum stack sizes, but that should be a separate commit. v2: Drop redundant Top update, explain why the stack is small at init time. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-04 08:52:11 -07:00
Eric Anholt	2a808219b3	state_tracker: Initialize the draw context only when needed. It's only used for rarely-used deprecated GL features (feedback/rasterpos), so we can skip the memory allocation and initialization for it most of the time. Saves about 659k (out of 1605k) of maximum memory size according to massif on simulated vc4 glsl-algebraic-add-add-1 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-04 08:48:27 -07:00
Eric Anholt	c976e164d2	vc4: Move scalarizing and some lowering to link time. This works out to be a wash in terms of memory usage: We use more memory to store the separate ALU instructions, but we optimize out a lot of code as well. The main result, though, is that we do more of our work at link time rather than draw time.	2016-08-04 08:48:27 -07:00
Eric Anholt	2350569a78	vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far. We don't want to bake the whole array into the FS key, because of the hashing overhead. But we can keep a set of the arrays seen, and use a pointer to the copy in as the array's proxy. Between this and the previous patch, gl-1.0-blend-func now passes on hardware, where previously it was filling the 256MB CMA area with shaders and OOMing. Drops 712 shaders from shader-db.	2016-08-04 08:48:27 -07:00
Eric Anholt	62ea2461ed	vc4: Don't recompile the CS when the FS changes. The compiled_fs_id is a proxy for the vc4->prog.fs->input_slots[], but only the VS dereferences it. Drops 754 shaders from shader-db.	2016-08-04 08:48:27 -07:00
Eric Anholt	d577dbc201	vc4: Move FS inputs setup out to a helper function. It's a pretty big block, and I was about to make it bigger.	2016-08-04 08:48:27 -07:00
Kenneth Graunke	144cbf8987	nir: Make nir_opt_remove_phis see through moves. I found a shader in Tales of Maj'Eyal that contains: if ssa_21 { block block_1: /* preds: block_0 */ ...instructions that prevent the select peephole... vec1 32 ssa_23 = imov ssa_4 vec1 32 ssa_24 = imov ssa_4.y vec1 32 ssa_25 = imov ssa_4.z /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ vec1 32 ssa_26 = imov ssa_4 vec1 32 ssa_27 = imov ssa_4.y vec1 32 ssa_28 = imov ssa_4.z /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ vec1 32 ssa_29 = phi block_1: ssa_23, block_2: ssa_26 vec1 32 ssa_30 = phi block_1: ssa_24, block_2: ssa_27 vec1 32 ssa_31 = phi block_1: ssa_25, block_2: ssa_28 Here, copy propagation will bail because phis cannot perform swizzles, and CSE won't do anything because there is no dominance relationship between the imovs. By making nir_opt_remove_phis handle identical moves, we can eliminate the phis and rewrite everything to use ssa_4 directly, so all the moves become dead and get eliminated. I don't think we need to check "exact" - just the alu sources. Presumably phi sources should match in their exactness. On Broadwell: total instructions in shared programs: 11639872 -> 11638535 (-0.01%) instructions in affected programs: 134222 -> 132885 (-1.00%) helped: 338 HURT: 0 v2: Fix return value to be NULL, not false (caught by Iago). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:42:12 -07:00
Kenneth Graunke	7603b4d3a1	nir: Make nir_alu_srcs_equal non-static. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:41:07 -07:00
Kenneth Graunke	6aa730000f	nir: Turn imov/fmov of undef into undef. On Broadwell: total instructions in shared programs: 11640214 -> 11639872 (-0.00%) instructions in affected programs: 17744 -> 17402 (-1.93%) helped: 78 HURT: 0 total spills in shared programs: 2924 -> 2922 (-0.07%) spills in affected programs: 104 -> 102 (-1.92%) helped: 1 HURT: 0 total fills in shared programs: 4394 -> 4389 (-0.11%) fills in affected programs: 237 -> 232 (-2.11%) helped: 1 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:40:59 -07:00
Kenneth Graunke	12a912586f	i965: Use a separate register for every access to an SSA undef. Previously, we allocated a new VGRF for every undefined definition. Instead, this patch makes us allocate a new VGRF for every use of an undefined definition. This makes sure that undefined values are fully independent of one another, and have live ranges limited to their single use. This allows register coalescing to combine the source and destination of MOVs from undefined sources, eliminating the MOV altogether. On Broadwell: total instructions in shared programs: 11641187 -> 11640214 (-0.01%) instructions in affected programs: 70199 -> 69226 (-1.39%) helped: 213 HURT: 1 v2: Add a comment (based on Iago's suggested one). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:40:10 -07:00
Michel Dänzer	67c5e843b9	vl/dri3: Destroy Present event context when destroying drawable v2 Without this, the X server may accumulate stale Present event contexts if a client performs several video decoding sessions using the same window. v2: Based on Chris Wilson's review: * Use xcb_discard_reply() instead of free(xcb_request_check()) Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>	2016-08-04 15:45:43 +09:00
Michel Dänzer	5d191bafa2	loader/dri3: Destroy Present event context when destroying drawable v2 Without this, the X server may accumulate stale Present event contexts if a client ends up creating and destroying DRI drawables for the same window. v2: Based on Chris Wilson's review: * Use xcb_present_select_input_checked so that protocol errors generated by old X servers can be handled gracefully * Use xcb_discard_reply() instead of free(xcb_request_check())	2016-08-04 15:45:43 +09:00
Ben Widawsky	1743c4184b	gbm: Correct bo_import documentation (trivial) Missed here: commit `a43d286ef7` Author: Kristian Høgsberg <krh@bitplanet.net> Date: Fri Mar 28 10:17:11 2014 -0700 gbm: Add import from fd Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-03 10:56:41 -07:00
Eric Anholt	bc1fc9c985	vc4: Avoid generating a custom shader per level in glGenerateMipmaps(). We were baking in the LOD of the source level to each shader. Instead, pass it in as a uniform -- this requires storing it to a temp register, but that's better than compiling a ton of separate shaders: total instructions in shared programs: 115032 -> 115036 (0.00%) instructions in affected programs: 96 -> 100 (4.17%) LOST: 572	2016-08-03 10:55:54 -07:00
Eric Anholt	e97e9e62a1	vc4: Tell valgrind about BO allocations from mmap time to destroy. This helps in debugging memory pressure. It would be nice if we could tell valgrind about it all the way from allocation time to destroy, but we need a pointer to hand to VALGRIND_MALLOCLIKE_BLOCK.	2016-08-03 10:28:20 -07:00
Jan Ziak	fd32868590	loader: fix memory leak in loader_dri3_open Found via "valgrind --leak-check=full glxgears". Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Acked-by: Boyan Ding <boyan.j.ding@gmail.com> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-03 10:25:09 -07:00
Eric Anholt	a0671d67de	vc4: Fix a leak of the src[] array of VPM reads in optimization. Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-03 10:25:09 -07:00
Eric Anholt	9f95690959	vc4: Fix leak of the bo_handles table.	2016-08-03 10:25:08 -07:00
Eric Anholt	02f8c444e8	vc4: Fix handling of UBO range offsets. The ranges are in units of bytes, not dwords. This wasn't caught by piglit tests because ttn tends to make one big uniform file, so we only had one UBO range with a src and dst offset of 0.	2016-08-03 10:25:08 -07:00
Eric Anholt	9128acfb57	nir: Allow opt_peephole_select to work on empty blocks. nir_opt_peephole_select has the job of removing IF statements with no side effects. However, if the IF statement's successor didn't have any instructions in it, we were skipping it, which occurred in mupen64 on vc4 with glsl_to_nir enabled: instructions in affected programs: 6134 -> 4120 (-32.83%) total uniforms in shared programs: 38268 -> 38219 (-0.13%) No changes on Haswell shader-db. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-03 10:25:08 -07:00
Eric Anholt	36b9eb82c1	vc4: Dump NIR at shader state creation time as well. I keep wanting to see this version of the NIR.	2016-08-03 10:25:08 -07:00
Marek Olšák	435d9595d3	r600g: use last_gfx_fence like radeonsi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	a6bfafa083	gallium/radeon: move last_gfx_fence from radeonsi to common code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c15a9dec29	radeonsi: skip unnecessary si_update_shaders calls Small decrease in draw call overhead. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c2a0e99169	radeonsi: print the command line to VM fault reports (v2) v2: rebase on top of Brian's commit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	6573ad69ef	ddebug: print the command line to all logs (v2) for piglit with the pipelined hang detection mode v2: rebase on top of Brian's commit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	840353059a	ddebug: don't use fmemopen on non-Linux OS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97140 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c88b309fd5	radeonsi: don't set the last parameter component of llvm.AMDGPU.cube LLVM doesn't use it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	42c5f839ad	radeonsi: use llvm.amdgcn.cube* if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	1fb6e55eaf	radeonsi: use llvm.amdgcn.rsq.f64 if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	db2d31dab1	radeonsi: use v_mad_f32 for fma v_fma_f32 runs at FP64 rate (= slow). Alien Isolation and F1 2015 seem to use fma for all d3d multiply-add instructions, which is silly. This tries to restore performance for those games. The main difference between v_mad_f32 and v_fma_f32 is that v_mad doesn't support denormals, which we don't enable anyway, because they are slow too. Also, there is code size reduction: Totals from affected shaders: VGPRS: 109796 -> 109808 (0.01 %) Spilled SGPRs: 29995 -> 30022 (0.09 %) Spilled VGPRs: 12 -> 13 (8.33 %) <-- it's just one shader going from 12 to 13 Code Size: 6667596 -> 6476356 (-2.87 %) bytes Max Waves: 26931 -> 26899 (-0.12 %) I've not actually tested real performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Haixia Shi	4c4bfed670	i965: use mt->offset in intel_miptree_map_movntdqa() We need to include mt->offset in the calculation of src pointer because its value may be non-zero, for example in a cubemap texture. Signed-off-by: Haixia Shi <hshi@chromium.org> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad@kiwitree.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Change-Id: I461ad5b204626d5a1c45611fc6b63735dcf29f63	2016-08-03 08:28:52 -07:00
Timothy Arceri	6fb6201f71	nir: fix validation message Looks like a copy and paste error from `f752effa08` Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-03 09:31:57 +10:00
Chad Versace	2d788a9181	.mailmap: Update my address I left Intel, so make my personal address the canonical address.	2016-08-02 13:29:53 -07:00
Tim Rowley	11072de368	swr: build swr with -fno-strict-aliasing swr rasterizer contains numerous data transfers between vectors and ordinary C types. Fixing for strict aliasing will take time. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-02 14:30:33 -05:00
Andres Gomez	3356ac208b	ast: Updated AST_NUM_OPERATORS for coherence with ast_operators AST_NUM_OPERATORS stores the dimension of the ast_operators enumeration but was not updated after its last modification. This doesn't add any real modification for any code paths but it makes sense for coherence. v2 (Eric Engestrom): Just place the define at the end of the enumeration, not below. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-08-02 21:33:03 +03:00
Matt Turner	c3211ae093	i965: Disable the unlit centroid workaround on Gen7. Once upon a time (commit `8313f44409`) Paul added code for the unlit centroid workaround (WaCopyUnlitCentroidBarys). His commit message claims it fixed the EXT_framebuffer_multisample/interpolation {2,4} {centroid-deriv,centroid-deriv-disabled} piglit tests but does not say on which platform, though he cites the IVB PRM. "3DSTATE_WM [DevIVB, DevHSW]" says "[DevIVB]: Workaround: When Centroid Barycentric mode is required, HW may produce incorrect interpolation results when a 2X2 pixels have unlit pixels." I later disabled it for Haswell (commit `f6db414f3c`) with no known ill effects. The Sandybridge page does not have this text, but the workarounds database (see WaCopyUnlitCentroidBarys) says the issue applies only to Sandybridge, and in fact in commit `1a2de7dce8` I note that disabling the workaround on Sandybridge causes the tests Paul originally mentioned to fail. So this is, and always has been, a huge confusing mess. Disabling the workaround indeed causes the tests Paul originally mentioned to fail on Sandybridge but not on Ivybridge/Baytrail. On Ivybridge: total instructions in shared programs: 6914901 -> 6909599 (-0.08%) instructions in affected programs: 106766 -> 101464 (-4.97%) helped: 884 total cycles in shared programs: 70874764 -> 70813774 (-0.09%) cycles in affected programs: 794144 -> 733154 (-7.68%) helped: 688 HURT: 186 LOST: 1 GAINED: 6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-02 10:37:13 -07:00
Marek Olšák	6db93cd167	gallium/util: fix align64 it cut off the upper 32 bits Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-01 23:28:14 +02:00
Matt Turner	88ad8c7ded	mesa: Drop -fno-strict-aliasing. Improves performance of OglBatch7 by 4.06851% +/- 1.17925% (n=169) on Haswell, and cuts ~18k of .text: text data bss dec hex filename 5824627 287816 29384 6141827 5db783 before/i965_dri.so 5806354 287816 29384 6123554 5d7022 after/i965_dri.so Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-01 12:09:17 -07:00
Matt Turner	12a14052e8	i915: Avoid aliasing violation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-01 12:09:17 -07:00
Matt Turner	be35c6ba92	draw: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	8e68f35d32	r600g: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	d2838f77ec	r300g: Avoid aliasing violation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	16ff8f9ae8	gallium/auxiliary: Add u_bitcast.h header. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	bbe012f02a	glsl_to_tgsi: Avoid aliasing violations. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-08-01 12:09:17 -07:00
Brian Paul	500a3dd11f	st/mesa: silence missing braces warning in st_program.c Silence a gcc warning: state_tracker/st_program.c: In function 'st_create_fp_variant': state_tracker/st_program.c:957:10: warning: missing braces around initializer [-Wmissing-braces] nir_lower_drawpixels_options options = {0}; ^ state_tracker/st_program.c:957:10: warning: (near initialization for 'options.texcoord_state_tokens') [-Wmissing-braces] Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:20:19 -06:00
Brian Paul	13fa051356	auxiliary/os: add new os_get_command_line() function This can be used by the driver to get the command line which started the process. Will be used by the VMware driver for extra logging. For now, this is only implemented for Linux via /proc/self/cmdline and Windows via GetCommandLine(). Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:20:19 -06:00
Charmaine Lee	c2b4942afc	svga: avoid redundant SetVertexBuffer/SetIndexBuffer commands at rebind This patch eliminates the redundant SetVertexBuffers and SetIndexBuffer commands that are emitted for rebind purpose. With this patch, the set commands will be skipped, but we will still reference the associated resources to allow the kernel to bring in the resources. Tested with Lightsmark2008, Valley, MTT glretrace, piglit, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-01 12:20:19 -06:00
Rob Clark	53b2b8bf6f	u_vbuf: fix potentially bogus assert There are cases where we hit u_vbuf path due to alignment or pitch- alignment restrictions, but for an output-format that u_vbuf does not support translating (yet the driver does support natively). In which case we hit the memcpy() path and don't care that u_vbuf doesn't understand it. Fixes crash with debug build of mesa in: dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95000 Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 13:42:11 -04:00
Ben Widawsky	e7c8c85785	gbm: Removed unused function. AFAICT, it's never been used. It was briefly nudged in the right direction here: commit `10e5ffd496` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Sat Jan 25 17:19:10 2014 +0000 gbm: do not export _gbm_mesa_get_device Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2016-08-01 09:11:14 -07:00
Timothy Arceri	cec377eed3	i965: fix comparison warning Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-01 14:52:07 +10:00
Eric Anholt	26ff7e373f	vc4: Zero-initialize the hardware sampler view structure. Fixes failure to initialize the force_first_level flag, causing failures in piglit levelclamp.	2016-07-31 19:23:03 -07:00
Mathias Fröhlich	b730960e77	mesa: Remove set but not used gl_client_array::Stride. The field is only read for printing today and there it was probably a leftover. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	56c65cd315	mesa: Remove set but not used gl_client_array::Enabled. The way it is used today does not care about the Enabled flag anymore. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	43a6f435ca	vbo: Use the VAO array enabled flags in vbo_exec_array. Instead of gl_client_array::Enabled inside a VAO, directly use the gl_vertex_attrib_array::Enabled value which is the origin of the above. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	4cda690019	vbo: Walk the VAO in check_array_data. Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	99b42184f9	vbo: Walk the VAO in print_draw_arrays. Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	eec516d8e1	mesa: Walk the VAO in _mesa_print_arrays. Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	144737a498	vbo: Walk the VAO to check for mapped buffers. Similarly to _mesa_all_varyings_in_vbos walk the VAO to check if we have an illegal mapped buffer object instead of walking all gl_client_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	3f5e5696fe	vbo: Walk the VAO to see if all varyings are in vbos. In vbo_draw_transform_feedback we currently look at exec->array.inputs to determine if all varying vertex attributes reside in vbos. But the vbo_bind_arrays call only happens past the vbo_all_varyings_in_vbos query. Thus we may work on a stale set of client arrays. Using the current VAO's content for this query feels much more logical to me. Additionally with this change mesa makes more use of the information already tracked in the VAO instead of looping across VERT_ATTRIB_MAX vertex arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	f8be969b1b	mesa: Implement _mesa_all_varyings_in_vbos. Implement the equivalent of vbo_all_varyings_in_vbos for vertex array objects. v2: Update comment. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	f7cb46a972	mesa: Unbind deleted vbo using _mesa_bind_vertex_buffer. When a vertex buffer object gets deleted, it is unbound at the VAO. To do this use _mesa_bind_vertex_buffer instead of plain unreferencing the buffer object. This keeps the VAO's internal state consistent. In this case it showed up with gl_vertex_array_object::VertexAttribBufferMask getting out of sync. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Timothy Arceri	f696b712d7	glsl: be more strict on block qualifiers V2: Add spec references and allow patch qualifier (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96528	2016-07-31 09:24:45 +10:00
Timothy Arceri	d3dc1b8b5e	glsl: add name param to validate_flags() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-31 09:24:45 +10:00
Timothy Arceri	2262fe4081	glsl: add component to ast_type_qualifier::validate_flags This was added with ARB_enhanced_layouts. V2: Add an extra format specifier for the new qualifier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-31 09:24:45 +10:00
Timothy Arceri	bbe839379a	docs: Add GL4.4 and ARB_enhanced_layouts to the release notes Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-31 08:19:21 +10:00
Kenneth Graunke	b5661c1d70	anv: Perform rasterizer discard in the SOL stage instead of the clipper. See commit `b0629e6894`, where we discovered that the SOL stage's "Rendering Disable" feature is a lot faster at throwing away all geometry than the clipper's "reject all" mode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-30 12:06:37 -07:00
Roland Scheidegger	99a47391e4	Revert "gallium/util: fix resource leak" This reverts commit `d1fe26a628`. Replacing a resource leak with a segfault isn't the solution.	2016-07-30 18:18:09 +02:00
Eric Engestrom	d1fe26a628	gallium/util: fix resource leak CovID: 401540 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-30 17:27:42 +02:00
francians@gmail.com	e713a9e613	freedreno/a4xx: fix comparison out of range warnings Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:25:42 -04:00
francians@gmail.com	43492c7f2c	freedreno/a3xx: fix comparison out of range warnings Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:25:31 -04:00
francians@gmail.com	089cc74b6a	freedreno/a2xx: fix comparison out of range warnings Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:25:16 -04:00
francians@gmail.com	3fa68fdc90	freedreno/ir3: init ir3_shader_key with memset() To silence missing initializers warning Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:24:59 -04:00
Eric Engestrom	a63bac9271	gallium/freedreno: move cast to avoid integer overflow Previously, the bitshift would be performed on a simple int (32 bits on most systems), overflow, and then be cast to 64 bits. CovID: 1362461 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Eric Engestrom	3563c4d161	freedreno/a2xx: remove duplicate assignment CovID: 1362445, 1362446 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	2d64a003c5	freedreno: defer flush_queue allocation Some apps, like warsow, create a bazillion contexts but don't render on most of them. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	4175606474	freedreno: add some hw query traces Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	e684c32d2f	freedreno: some locking Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	010e4b2d52	os: add pipe_mutex_assert_locked() Would be nice if we could also have lockdep, like in the linux kernel. But this is better than nothing. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9f0eb69527	freedreno: drop needs_rb_fbd We need to emit RB_FRAME_BUFFER_DIMENSION once per batch.. tracking this in fd_context is wrong when the gmem code executes asynchronously from the flush_queue worker. But in fact we don't really need to track it at all. We cannot assume previous value at the beginning of the batch (because of other processes potentially using the GPU), so just drop the tracking and emit it in _tile_init(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	e6bfe1c773	freedreno: move needs_wfi into batch This is also used in gmem code, which executes from the "bottom half" (ie. from the flush_queue worker thread), so it cannot be in fd_context. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	0739bbceec	freedreno: a bit of micro-optimization Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	e1b1052700	freedreno: drop mem2gmem/gmem2mem query stages They weren't really used, and it gets somewhat more complicated to deal with if batches are flushed asynchronously (on another thread). So just drop them, and move _query_set_state(NULL) call into batch (so it is not happening on background thread). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	00bed8a794	freedreno: threaded batch flush With the state accessed from GMEM+submit factored out of fd_context and into fd_batch, now it is possible to punt this off to a helper thread. And more importantly, since there are cases where one context might force the batch-cache to flush another context's batches (ie. when there are too many in-flight batches), using a per-context helper thread keeps various different flushes for a given context serialized. TODO as with batch-cache, there are a few places where we'll need a mutex to protect critical sections, which is completely missing at the moment. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	c44163876a	freedreno: track batch/blit types Add a bit of extra book-keeping about blits and back-blits (from resource shadowing). If the app uploads all mipmap levels, as opposed to uploading the first level and then glGenerateMipmap(), we can discard the back-blit (as opposed to being naive and shadowing the resource for each mipmap level). Also, after a normal blit, we might as well flush the batch immediately, since there is not likely to be further rendering to the surface. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	7f8fd02dc7	freedreno: re-order support for hw queries Push query state down to batch, and use the resource tracking to figure out which batch(es) need to be flushed to get the query result. This means we actually need to allocate the prsc up front, before we know the size. So we have to add a special way to allocate an un- backed resource, and then later allocate the backing storage. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	10baf05b2c	freedreno: use prsc for hw queries Switch to using a pipe_resource (rather than an fd_bo directly) for hw query result buffers. This is first step towards making queries work properly with reordered batches, since we'll need the additional dependency tracking to know which batches to flush. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	ba30096888	freedreno: support discarding previous rendering in special cases Basically, to "DCE" blits triggered by resource shadowing, in cases where the levels are immediately completely overwritten. For example, mid-frame texture upload to level zero triggers shadowing and back-blits to the remaining levels, which are immediately overwritten by glGenerateMipmap(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	7105774bab	freedreno: shadow textures if possible to avoid stall/flush To make batch re-ordering useful, we need to be able to create shadow resources to avoid a flush/stall in transfer_map(). For example, uploading new texture contents or updating a UBO mid-batch. In these cases, we want to clone the buffer, and update the new buffer, leaving the old buffer (whose reference is held by cmdstream) as a shadow. This is done by blitting the remaining other levels (and whatever part of current level that is not discarded) from the old/shadow buffer to the new one. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	dcde4cd114	freedreno: spiff up some debug traces Make it easier to track batches, to ensure things happen properly when they are reordered. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9f219c7047	freedreno: add batch-cache and batch reordering Note that I originally also had an entry-point that would construct a key and do lookup from a pipe_surface. I ended up not needing that (yet?) but it is easy-enough to re-introduce later if we need it for the blit path. For now, not enabled by default, but can be enabled (on a3xx/a4xx) with FD_MESA_DEBUG=reorder. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	f02a64dbdd	freedreno: move more batch related tracking to fd_batch To flush batches out of order, the gmem code needs to not depend on state from fd_context (since that may apply to a more recent batch). So this all moves into batch. The one exception is the gmem/pipe/tile state itself. But this is only used from gmem code (and batches are flushed serially). The alternative would be having to re-calculate GMEM layout on every batch, even if the dimensions of the render targets are the same. Note: This opens up the possibility of pushing gmem/submit into a helper thread. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	eeafaf2d37	freedreno: dynamically sized/growable cmd buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9e4561d3c4	freedreno: push resource tracking down into batch Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9bbd239a40	freedreno: introduce fd_batch Introduce the batch object, to track a batch/submit's worth of ringbuffers and other bookkeeping. In this first step, just move the ringbuffers into batch, since that is mostly uninteresting churn. For now there is just a single batch at a time. Note that one outcome of this change is that rb's are allocated/freed on each use. But the expectation is that the bo pool in libdrm_freedreno will save us the GEM bo alloc/free which was the initial reason to implement a rb pool in gallium. The purpose of the batch is to eventually facilitate out-of-order rendering, with batches associated to framebuffer state, and tracking the dependencies on other batches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Marek Olšák	12aec78993	mesa: remove dd_function_table::UseProgram finally unused Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	b47839ad83	st/mesa: update sampler states when shaders are changed This bug seems to have always been there. Applications changing shaders but not textures between draw calls would have gotten undefined behavior. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	c7954b130a	st/mesa: don't dirty sample shading on _NEW_PROGRAM Already done as part of ST_NEW_FRAGMENT_PROGRAM in st_validate_state. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	79dcd69afa	st/mesa: remove excessive shader state dirtying This just needs to be done by st_validate_state. v2: add "shaders_may_be_dirty" flags for not skipping st_validate_state on _NEW_PROGRAM to detect real shader changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	1f73e2bb94	st/mesa: unreference optional shaders when unbinding Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	0a46e6f410	st/mesa: skip updates of states that have no effect v2: - also don't check edge flags for GLES Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	c8fe3b9dca	st/mesa: completely rewrite state atoms The goal is to do this in st_validate_state: while (dirty) atoms[u_bit_scan(&dirty)]->update(st); That implies that atoms can't specify which flags they consume. There is exactly one ST_NEW_* flag for each atom. (58 flags in total) There are macros that combine multiple flags into one for easier use. All _NEW_* flags are translated into ST_NEW_* flags in st_invalidate_state. st/mesa doesn't keep the _NEW_* flags after that. torcs is 2% faster between the previous patch and the end of this series. v2: - add st_atom_list.h to Makefile.sources Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	53bc28920a	st/mesa: remove st_tracked_state::name Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	f2adba4a4c	st/mesa: remove atom debugging code This won't be needed after the rewrite. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Kenneth Graunke	ebdc82d065	i965: Fix move_interpolation_to_top() pass. The pass I introduced in commit `a2dc11a781` was entirely broken. A missing "break" made the load_interpolated_input case always fall through to "default" and hit a "continue", making it not actually move any load_interpolated_input intrinsics at all. It would only move the simple load_barycentric_* intrinsics, which don't emit any code anyway, making it basically useless. The initial version I sent of the pass worked, but I apparently failed to verify that the simplified version in v2 actually worked. With the obvious fix applied (so we actually tried to move load_interpolated_input intrinsics), I discovered a second bug: we weren't moving the offset SSA def to the top, breaking SSA validation. The new version of the pass actually moves load_interpolated_input intrinsics and all their dependencies, as intended. Papers over GPU hangs on Ivybridge and Baytrail caused by the recent NIR FS input rework by restoring the old behavior. (I'm not honestly sure why they hang with PLN not at the top.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97083 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-29 16:05:24 -07:00
Rob Clark	591eeb7d1c	freedreno: limit non-user constant buffers to a4xx Seems to mostly work on a3xx. Except when it doesn't and kills gpu quite badly. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-29 14:58:39 -04:00
Jan Ziak	427771d1c7	glsl: fix uninitialized instance variable Valgrind detected that variable ir_copy_propagation_visitor::killed_all is uninitialized. Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-29 14:57:51 -04:00
Jan Ziak	b107169eef	configure: add support for LLVM 4.0.0svn static libs Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-07-29 16:24:03 +09:00
Rob Herring	a235765d27	virgl: add exported dmabuf to BO hash table Exported dmabufs can get imported by the same process, but the handle was not getting added to the hash table on export. Add the handle to the hash table on export. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-29 09:09:56 +10:00
Anuj Phogat	6d958c7c16	anv: Enable per sample shading on gen8+ Vulkan CTS test results on gen9: ./deqp-vk --deqp-case=dEQP-VK.pipeline.multisample.min_sample_shading* Test run totals: Passed: 60/90 (66.7%) Failed: 0/90 (0.0%) Not supported: 30/90 (33.3%) Warnings: 0/90 (0.0%) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-28 13:11:12 -07:00
Anuj Phogat	0f94cdc976	anv/pipeline: Fix setting per sample shading in pixel shader We should use the persample_dispatch variable in prog_data. Fixes all (~60) the DEQP sample shading tests. Many tests exited with VK_ERROR_OUT_OF_DEVICE_MEMORY without this patch. V2: Use the shader key bits set in brw_compile_fs (Jason) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-28 13:11:12 -07:00
Nicolas Boichat	9ee683f877	egl/dri2: Add reference count for dri2_egl_display android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls eglTerminate, followed by eglReleaseThread. A similar case is observed in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=69622, where the test calls eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL). With the current code, dri2_dpy structure is freed on eglTerminate call, so the display is not initialized when eglReleaseThread calls MakeCurrent with NULL parameters, to unbind the context, which causes a segfault in drv->API.MakeCurrent (dri2_make_current), either in glFlush or in a later call. eglTerminate specifies that "If contexts or surfaces associated with display is current to any thread, they are not released until they are no longer current as a result of eglMakeCurrent." However, to properly free the current context/surface (i.e., call glFlush, unbindContext, driDestroyContext), we still need the display vtbl (and possibly an active dri dpy connection). Therefore, we add some reference counter to dri2_egl_display, to make sure the structure is kept allocated as long as it is required. One drawback of this is that eglInitialize may not completely reinitialize the display (if eglTerminate was called with a current context), however, this seems to meet the EGL spec quite well, and does not permanently leak any context/display even for incorrectly written apps. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-28 14:08:25 +01:00
Emil Velikov	8431c0e9d4	vc4: automake: remove vc4_drm.h from the sources lists The file was removed with earlier commit breaking 'make dist'. Drop it from Makefile.sources since it's no longer around. Fixes: `16985eb308` ("vc4: Switch to using the libdrm-provided vc4_drm.h.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-28 14:08:24 +01:00
Nicolai Hähnle	bade0cd0fb	ddebug: use pclose to close a popen()'d FILE Found by Coverity. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-28 10:47:51 +01:00
Nicolai Hähnle	21556d86fc	glsl: fix optimization of discard nested multiple levels The order of optimizations can lead to the conditional discard optimization being applied twice to the same discard statement. In this case, we must ensure that both conditions are applied. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762 Cc: mesa-stable@lists.freedesktop.org Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-28 10:47:04 +01:00
Nicolai Hähnle	185b0c15ab	st_glsl_to_tgsi: only skip over slots of an input array that are present When an application declares varying arrays but does not actually do any indirect indexing, some array indices may end up unused in the consuming shader, so the number of input slots that correspond to the array ends up less than the array_size. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-28 10:46:02 +01:00
Dieter Nützel	041b330a32	clover: make GCC 4.8 happy Without this GCC 4.8.x throws below error: error: invalid initialization of non-const reference of type 'clover::llvm::compat::raw_ostream_to_emit_file {aka llvm::raw_svector_ostream&}' from an rvalue of type '<brace-enclosed initializer list>' v2: change commit title and add error message like Eric Engestrom requested Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97019 [ Francisco Jerez: Trivial formatting fix. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-27 20:41:05 -07:00
Timothy Arceri	a86aa87342	i965: remove unnecessary null check We would have hit a segfault already if this could be null. Fixes Coverity warning spotted by Matt. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-28 11:05:57 +10:00
Timothy Arceri	29d70cc964	glsl: free hash tables earlier These are only used by get_matching_input() which has been called at this point so free the hash tables. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-28 08:05:04 +10:00
Samuel Pitoiset	af08cfc626	nvc0: enable ARB_tessellation_shader on GM107+ This exposes OpenGL 4.1 on Maxwell (tested on GM107 and GM206). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-27 23:19:07 +02:00
Samuel Pitoiset	3ac373df6e	gm107/ir: add a legalize SSA pass for PFETCH PFETCH, actually ISBERD on GM107+ ISA only accepts a GPR for src0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-27 23:18:58 +02:00
Samuel Pitoiset	653af07119	nvc0: fix up TCP header on GM107+ The number of outputs patch (limited to 255) has moved in the TCP header, but blob seems to also set the old position. Also, the high 8-bits are now located in between the min/max parallel output read address at position 20. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-27 23:18:41 +02:00
Mathias Fröhlich	2060f19b4f	vbo: Fix handling of POS/GENERIC0 attributes. In case of split primitives we need to restore the original setting of the vtx.attrsz array to make immediate mode attribute array tracking work. v2: Use bool instead of boolean. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96950	2016-07-27 06:43:03 +02:00
Marek Olšák	c98c732158	radeon/llvm: Use alloca instructions for larger arrays [revert a revert] This reverts commit `f84e9d749f`. Bioshock Infinite no longer hangs.	2016-07-26 23:31:56 +02:00
Marek Olšák	8636a718b5	r600g: add support for B5G6R5 PBO uploads via texture buffers (v2) v2: set endian swap to 16 untested	2016-07-26 23:21:45 +02:00
Marek Olšák	1e5f00f9d5	radeonsi: pre-generate shader logs for ddebug This cuts down the overhead of si_dump_shader when ddebug is capturing shader logs, which is done for every draw call unconditionally (that's quite a lot of work for a draw call). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	18475aab6d	radeonsi: add empty lines after shader stats to separate individual shaders dumped consecutively. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	dd66f9d3e7	radeonsi: move the shader key dumping to si_shader_dump Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	b47727a83a	ddebug: implement pipelined hang detection mode For good performance while being able to generate decent hang reports. The report doesn't contain the parsed IB and the buffer list, but it isolates the draw call and dumps shaders while not having to flush the context. This is for GPU hangs that are harder to reproduce and require interactive playing for minutes or even hours. dd_pipe.h explains some implementation details. Initializing, copying (recording) and clearing states is most of the code. The performance should be at least 50% of the normal performance depending on the circumstances. (i.e. 50% is expected to be the worst case scenario, not the best case) The majority of time is spent in dump_debug_state(PIPE_DUMP_CURRENT_SHADERS) and that's after all the optimizations in later patches. There is no obvious way to optimize that further. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	0795a3d54f	ddebug: don't save pointers to call parameters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	e4079677a7	ddebug: move dd_call into dd_pipe.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	d50f9e9b04	ddebug: separate draw call dumping logic Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	95c3025a41	ddebug: move all states into a separate structure Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	f7720948cc	ddebug: write contents of dmesg into hang reports Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	1f85f17998	ddebug: implement create_batch_query Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	6b9924ccb6	ddebug: don't use abort() We don't want a core dump. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	26ef8158ac	ddebug: make dd_get_file_stream accept the screen only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	27fa933a71	ddebug: clean up ddebug_screen_create Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	6bf81de339	gallium: rework flags for pipe_context::dump_debug_state The pipelined hang detection mode will not want to dump everything. (and it's also time consuming) It will only dump shaders after a draw call and then dump the status registers separately if a hang is detected. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Rob Herring	9ace2c1355	vc4: add hash table look-up for exported dmabufs It is necessary to reuse existing BOs when dmabufs are imported. There are 2 cases that need to be handled. dmabufs can be created/exported and imported by the same process and can be imported multiple times. Copying other drivers, add a hash table to track exported BOs so the BOs get reused. v2: Whitespace fixup (by anholt) Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-26 13:47:50 -07:00
Eric Anholt	ce8504d196	vc4: Disable early Z with computed depth. We don't tell the hardware whether we're computing depth, so we need to manage early Z state manually. Fixes piglit early-z.	2016-07-26 13:47:50 -07:00
Eric Anholt	4d0b2c7aaa	ttn: Update shader->info as we generate code. We could use the nir_shader_gather_info() pass to update it after the fact, but this is what glsl_to_nir and prog_to_nir do. Reviewed-by: Rob Clark <robclark@freedesktop.org>	2016-07-26 13:47:50 -07:00
Vedran Miletić	7b9a0f4e38	mesa: standardize naming Mesa3D, MESA -> Mesa Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-26 13:28:01 -07:00
Kenneth Graunke	95c48391ee	mesa: Make MESA_SHADER_CAPTURE_PATH skip shaders with Name == -1. Shaders with shProg->Name == ~0 (aka 4294967295) are internal meta shaders that we don't really want to capture. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-26 13:27:09 -07:00
Matt Turner	20553e4a2d	mesa: Use AC_HEADER_MAJOR to include correct header for major(). Gentoo has been smoke testing an upcoming change to glibc. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=580392	2016-07-26 12:12:41 -07:00
Matt Turner	815135166c	glsl: Remove references to tail_pred.	2016-07-26 12:12:27 -07:00
Matt Turner	5ed3299822	glx: Avoid aliasing violations. Compilers are perfectly capable of generating efficient code for calls like these to memcpy(). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	2a1d2874f1	mesa: Avoid aliasing violation in uniform_query.cpp. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	f5ac1d366e	mesa: Avoid aliasing violation in FXT1. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	a1e9b72102	swrast: Avoid aliasing violation. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	149309a424	glsl: Avoid aliasing violations. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	d1f6f65697	glsl: Separate overlapping sentinel nodes in exec_list. I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing. No difference in OglBatch7 (n=20). Co-authored-by: Davin McCall <davmac@davmac.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Jason Ekstrand	5d76690f17	i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no longer need to be multiplying by 6. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-26 07:58:44 -07:00
Jason Ekstrand	833e389bc0	i965/miptree/isl: Stop multiplying depth by 6 for cubes Now that the logical_depth0 field is in number of 2D slices, we don't need to be multiplying by 6 when creating the surface. It wasn't hurting anything primarily because we get the actual length from the view which was already handling it correctly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-26 07:58:44 -07:00
Jason Ekstrand	d16dc8e963	i965/blorp/gen8: Stop multiplying depth by 6 for cubes intel_mipmap_tree::logical_depth0 is now in 2-D slices so there is no need for us to multiply by 6 when we go to fill out a blorp surface state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-26 07:58:44 -07:00
Samuel Pitoiset	126bd15940	nvc0: use nvc0_m2mf_push_linear() to reduce code duplication Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-26 00:50:34 +02:00
Samuel Pitoiset	c5236f0ecc	nvc0: use nve4_p2mf_push_linear() to reduce code duplication Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-26 00:40:37 +02:00
Andreas Boll	0420666ac0	build: Remove unused AX_CHECK_COMPILE_FLAG macro Unused since `1a6ae84041` Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-25 15:14:12 +02:00
Nils Wallménius	a354c389f5	main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage When possible, do the memcpy on larger blocks. This reduces cycles spent in _mesa_propagate_uniforms_to_driver_storage from 1.51% to 0.62% according to perf during the Unigine Heaven benchmark. It did not affect the framerate of the benchmark. The system used for testing was an i5 6600K with a Radeon R9 380. Piglit hangs randomly on this system both with and without the patch so I could not make a comparison. v2: fixed whitespace Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-25 13:51:16 +02:00
Boyuan Zhang	dd208ea006	st/va: enable h264 VAAPI encode Enable H.264 VAAPI encoding through config. Currently only H.264 baseline is supported. Encode entrypoint is not accepted by driver. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:54 +02:00
Boyuan Zhang	71da1354d7	st/va: add function to handle misc param type frame rate Frame rate can be passed to driver either through VAEncSequenceParameterBufferType or VAEncMiscParameterTypeFrameRate. Previous code only implemented the former one, which is used by Gstreamer-Vaapi. Now adding implementation for VAEncMiscParameterTypeFrameRate. Also adding default frame rate as 30 just in case application never provides frame rate information to driver. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:53 +02:00
Boyuan Zhang	10dec2de2d	st/va: add environmental variable to disable interlace Add environmental variable to disable interlace mode. At VAAPI decoding stage, driver can not distinguish between pure decoding case and transcoding case. And since interlace encoding is not supported, we have to disable interlace for transcoding case. The temporary solution is to use environmental variable to disable interlace mode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:53 +02:00
Boyuan Zhang	b0ceb4cc48	st/va: add preset values for VAAPI encode Add some hardcoded values hardware needs mainly for rate control purpose. With previously hardcoded values for OMX, the rate control result is not correct. This change fixed the rate control result by setting correct values for Vaapi. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:52 +02:00
Boyuan Zhang	85d807f2e0	st/va: add functions for VAAPI encode Add necessary functions/changes for VAAPI encoding to buffer and picture. These changes will allow driver to handle all Vaapi encode related operations. This patch doesn't change the Vaapi decode behaviour. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:52 +02:00
Boyuan Zhang	10c1cc47a6	st/va: get rate control method from configattrib v2 Rate control method is passed from app to driver through config attrib list. That is why we need to store this rate control method to config. And later on, we will pass this value to context->desc.h264enc.rate_ctrl.rate_ctrl_method. v2 (chk): fix broken build and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:51 +02:00
Boyuan Zhang	34f4634843	st/va: add conversion for yv12 to nv12 in putimage v2 For putimage call, if image format is yv12 (or IYUV with U V field swap) and surface format is nv12, then we need to convert yv12 to nv12 and then copy the converted data from image to surface. We can't use the existing logic where surface is destroyed and re-created with yv12 format. v2 (chk): fix some compiler warnings and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:51 +02:00
Boyuan Zhang	23b4ab1738	vl/util: add copy func for yv12image to nv12surface v2 Add function to copy from yv12 image to nv12 surface for VAAPI putimage call. We need this function in VaPutImage call where copying from yv12 image to nv12 surface for encoding. Existing function can't be used because it only works for copying from yv12 surface to nv12 image in Vaapi. v2: cleanup variable types and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:18 +02:00
Boyuan Zhang	5bcaa1b9e9	st/va: add encode entrypoint v2 VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. We will save this encode entry point in config. config_id was used as profile previously. Now, config has both profile and entrypoint field, and config_id is used to get the config object. Later on, we pass this entrypoint to context->templat.entrypoint instead of always hardcoded to PIPE_VIDEO_ENTRYPOINT_BITSTREAM for decoding case previously. Encode entrypoint is not accepted by driver until we enable Vaapi encode in later patch. v2 (chk): fix commit message to match 80 chars, use switch instead of ifs, fix memory leaks in the error path, implement vlVaQueryConfigEntrypoints as well, drop VAEntrypointEncPicture (only used for JPEG). Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:30:42 +02:00
Samuel Pitoiset	e7b2ce5fd8	nvc0: upload sample locations on GM20x This fixes a bunch of multisample piglit tests on GM206, like bin/arb_texture_multisample-texelfetch 2 -auto -fbo Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-24 22:46:26 +02:00
Rob Clark	2f57e57881	freedreno/a4xx: time-elapsed query should be active for clears Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-24 09:33:05 -04:00
Samuel Pitoiset	3a2e67bf78	nvc0/ir: fix up an assertion in emitUADD() It's illegal to have neg modifiers on both sources for OP_ADD, and it's illegal to have OP_SUB with just src0 neg. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-24 00:42:47 +02:00
Samuel Pitoiset	a159a3d5cb	nvc0: fix wrong indentation in nvc0_validate_fb() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-23 23:59:10 +02:00
Ilia Mirkin	e483cb9a3a	glsl: reuse main extension table to appropriately restrict extensions Previously we were only restricting based on ES/non-ES-ness and whether the overall enable bit had been flipped on. However we have been adding more fine-grained restrictions, such as based on compat profiles, as well as specific ES versions. Most of the time this doesn't matter, but it can create awkward situations and duplication of logic. Here we separate the main extension table into a separate object file, linked to the glsl compiler, which makes use of it with a custom function which takes the ES-ness of the shader into account (thus allowing desktop shaders to properly use ES extensions that would otherwise have been disallowed.) We can also now use this logic to generate #define's for all supported extensions automatically, removing the duplicate (and often inaccurate) list in glcpp. The effect of this change should be nil in most cases. However in some situations, extensions like GL_ARB_gpu_shader5 which were formerly available in compat contexts on the GLSL side of things will now become inaccessible. This regresses two ES CTS tests: ES3-CTS.shaders.shader_integer_mix.define ES31-CTS.shader_integer_mix.define however that is due to them using #version 100 instead of 300 es. As the extension is only defined for ES3, I believe this is the correct behavior. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v2) v2 -> v3: integrate glcpp defines into the same mechanism	2016-07-23 13:48:04 -04:00
Rob Clark	9253dcde58	freedreno/a4xx: timestamp queries Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 13:39:30 -04:00
Rob Clark	b888d8e937	freedreno: hw timestamp support If the kernel supports it, use hw counter for timestamps. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 13:39:30 -04:00
Rob Clark	6a4b052820	freedreno: prep work for timestamp queries We need "NULL" state to be a valid bit in the bitmask, because timestamp queries are not restricted to draw/etc stages (ie. the only commands to submit may just be to read the timestamp). And just because there are no draws, isn't a reason to skip the flush and return zero. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 13:39:30 -04:00
Nicolai Hähnle	3d69357da9	radeonsi: ensure sample locations are set for line and polygon smoothing Since commit `d938b8c`, the sample locations are no longer set unconditionally, so we need to set the atom to dirty on all chips, not just Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-23 15:36:39 +02:00
Nicolai Hähnle	f755da0f2f	radeonsi: fix Polaris MSAA regression The regression was introduced by commit `d938b8c`. The problem here is that in order to use the small primitive filter, we need to explicitly set the sample locations to 0. But the DB doesn't properly process the change of sample locations without a flush, and so we can end up with incorrect Z values. Instead of doing a flush, just disable the small primitive filter when MSAA is force-disabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908 Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-23 15:36:38 +02:00
francians@gmail.com	abb2a865a4	freedreno/ir3: Add missing braces in initializer Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 09:14:55 -04:00
francians@gmail.com	c99cdd2175	freedreno/a2xx: silence missing case 'SHADER_COMPUTE' warning (v2) v2: no need for break after an unreachable (Matt Turner) Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 09:14:18 -04:00
Marek Olšák	700de07771	radeonsi: implement buffer_subdata without indirect calls There is less noise in CPU profile data now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-23 13:33:42 +02:00
Marek Olšák	8e3e9d2839	gallium/util: don't modify usage in pipe_buffer_write All drivers were already doing it except virgl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-23 13:33:42 +02:00
Marek Olšák	1ffe77e7bb	gallium: split transfer_inline_write into buffer and texture callbacks to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl - driver_transfer_map - u_transfer_unmap_vtbl - driver_transfer_unmap That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites. The new interface is: pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data) pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride) v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-07-23 13:33:42 +02:00
Kenneth Graunke	0ba7288376	nir: Lower interp_var_at_* like a normal load_var for flat inputs. "flat centroid" and "flat sample" both just mean "flat", so we should ignore interpolateAtCentroid/Sample and just return the flat value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-22 20:31:20 -07:00
Kenneth Graunke	f80bea2d80	mesa: Don't call GenerateMipmap if Width or Height == 0. One of the WebGL 2.0 conformance tests is trying to call glGenerateMipmaps with a width and height of 0. With the meta implementation, this generates a "framebuffer attachment incomplete" status, and falls back to the CPU path, calling MapTextureImage. Except that there's no actual texture to map, and we assert fail. There's no work to do in this case. The test expects it to succeed, so just return early with no error and avoid hassling the driver. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-22 20:31:20 -07:00
Jason Ekstrand	b33bccb519	anv/pipeline: Set up point coord enables Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	9e05e51cff	spirv/nir: Add support for ImageQuerySamples Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	71202352c8	spirv/nir: Handle texture projectors Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	36c31b8fa2	nir/spirv: Refactor coordinate handling in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	b820c8b78c	spirv/nir: Refactor type handling in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	561be50a1a	spirv/nir: Move opcode selection higher up in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	c8da91aa24	anv/image: Assert that the image format is actually supported Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	34a39e91ba	spirv/nir: Don't increment coord_components for array lod queries For lod query instructions, we really don't care whether or not the sampler is an array type because that doesn't factor into the LOD. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	67b7d876e4	i965: Get rid of the do_lower_unnormalized_offsets pass We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	9f32721f86	i965/nir: Enable NIR lowering of txf and rect offsets This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	d9156efc52	nir/lower_tex: Add support for lowering coordinate offsets On i965, we can't support coordinate offsets for texelFetch or rectangle textures. Previously, we were doing this with a GLSL pass but we need to do it in NIR if we want those workarounds for SPIR-V. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:53 -07:00
Jason Ekstrand	843fc8f3e7	nir/lower_tex: Add some helpers for working with tex sources Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:53 -07:00
Jason Ekstrand	09135cd55a	nir: Add a helper for determining the type of a texture source Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	3c0077a6ec	anv/pipeline: Set binding_table.gather_texture_start This should get texture gather working on gen8+ and mostly working on gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	95e9d58bdb	spirv/nir: Properly handle gather components Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	7c7acf53b2	spirv/nir: Add support for shadow samplers that return vec4 While SPIR-V technically doesn't support "old style" shadow, the shadow-compare gather instruction does return a vec4 so we need to be able to set the old_style_shadow bit in NIR. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	2ddefd03b7	spirv/nir: Fix some texture opcode asserts We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an explicit-LOD texturing instruction. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Samuel Pitoiset	3f5cf8c488	nv50/ir: allow to swap sources for OP_SUB This allows the load-propagation pass to swap the sources in presence of immediate values. Maxwell (GM107): total instructions in shared programs :1928187 -> 1927634 (-0.03%) total gprs used in shared programs :330741 -> 330154 (-0.18%) total local used in shared programs :28032 -> 28032 (0.00%) local gpr inst bytes helped 0 271 425 425 hurt 0 0 194 194 Fermi (GF114): total instructions in shared programs :2334474 -> 2333829 (-0.03%) total gprs used in shared programs :380934 -> 380215 (-0.19%) total local used in shared programs :33304 -> 33264 (-0.12%) local gpr inst bytes helped 5 314 521 521 hurt 0 4 195 195 No regressions on GM107 and GF114 with full piglit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-22 22:51:37 +02:00
Marek Olšák	2e890b5350	gallium/radeon: make deferred flushes asynchronous Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-22 22:34:49 +02:00
Marek Olšák	d17b35e671	gallium: add PIPE_FLUSH_DEFERRED There are 2 uses: - Asynchronous flushing for multithreaded drivers. - Return a fence without flushing (mid-command-buffer fence). The driver can defer flushing until fence_finish is called. This is required to make Bioshock Infinite faster, which creates 1000 fences (flushes) per frame. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-07-22 22:34:49 +02:00
Marek Olšák	4cdc482283	gallium/os: use CLOCK_MONOTONIC for sleeps (v2) v2: handle EINTR, remove backslashes Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-22 22:34:49 +02:00
Eric Engestrom	4da9f7e7ce	mapi: fix typo in macro name Fixes: `5ec140c17b` ("mapi: Massage code to allow clang to compile.") Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-22 10:14:00 -07:00
Kenneth Graunke	44ef2ce6ec	docs: Put swr back on the GL_ARB_texture_buffer_object_rgb32 list. Looks like this was lost when resolving merge conflicts in commit `d1fbd4cdb1`.	2016-07-22 09:57:54 -07:00
Andres Gomez	d068b38e46	glsl: subroutine types cannot be compared subroutine variables are to be used just in the way functions are called. Although the spec doesn't say it explicitly, this means that these variables are not to be used in any other way than those left for function calls. Therefore, a comparison between 2 subroutine variables should also cause a compilation error. From The OpenGL® Shading Language 4.40, page 117: " To use subroutines, a subroutine type is declared, one or more functions are associated with that subroutine type, and a subroutine variable of that type is declared. The function currently assigned to the variable function is then called by using function calling syntax replacing a function name with the name of the subroutine variable. Subroutine variables are uniforms, and are assigned to specific functions only through commands (UniformSubroutinesuiv) in the OpenGL API." From The OpenGL® Shading Language 4.40, page 118: " Subroutine uniform variables are called the same way functions are called. When a subroutine variable (or an element of a subroutine variable array) is associated with a particular function, all function calls through that variable will call that particular function." Fixes GL44-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-22 17:30:25 +03:00
Timothy Arceri	a2b3c146d2	i965: fix varying output setup Since `7f53fead5c` we treat every location as using all four components so we only need special handling for doubles when they cross multiple locations. This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations where the outputs array would overflow when a dmat2 was stored at the max varying location i.e 30. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-23 00:04:10 +10:00
Samuel Pitoiset	c2801f9272	nvc0/mme: fix offsets used for indirect draws This fixes a regression introduced in `1da704a94c` because the offset has moved from 0x180 to 0x1a0, and the macros have to be re-compiled. Fixes: `1da704a` ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-22 11:32:09 +02:00
Samuel Pitoiset	dbcff7fdbb	nvc0: fix offsets of MP perf counters input parameters This fixes a regression introduced in `1da704a94c` because the offset has moved from 0x600 to 0x620, and the kernels used for reading MP perf counters have to be re-assembled. This also fixes amd_performance_monitor_measure piglit. Fixes: `1da704a` ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-22 11:32:04 +02:00
Kenneth Graunke	cb70773129	mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats. The GL_EXT_texture_format_BGRA8888 extension specification defines a GL_BGRA_EXT unsized internal format (which is a little odd - usually BGRA is a pixel transfer format). The extension is written against the ES 1.0 specification, so it's a little hard to map, but I believe it's effectively adding it to the table used here, so we should allow it here as well. Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true), so we don't need to check if it's enabled here. This fixes mipmap generation in Skia and ChromeOS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Stéphane Marchesin <marcheu@chromium.org> Cc: mesa-stable@lists.freedesktop.org	2016-07-21 21:31:57 -07:00
Kenneth Graunke	be1c53d2cf	i965: Fix "operation operation" in comment. From the redundant redundant department. Reported-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>	2016-07-21 21:31:57 -07:00
Kenneth Graunke	76e161056a	i965: Fix shared atomic intrinsics to pay attention to base. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-21 21:31:55 -07:00
Kenneth Graunke	cf6f2d3ce7	nir: Add a base const_index to shared atomic intrinsics. Commit `52e75dcb8c` made nir_lower_io start using nir_intrinsic_set_base instead of writing const_index[0] directly. However, those intrinsics apparently don't /have/ a base, so this caused assert failures. However, the old code was happily setting non-existent const_index fields, so it was pretty bogus too. Jason pointed out that load_shared and store_shared have a base, and that the i965 driver uses that field. So presumably atomics should have one as well, so that loads/stores/atomics all refer to variables with consistent addressing. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-21 21:31:41 -07:00
Timothy Arceri	91dde3ddca	glsl: re-enable varying packing in GL4.4+ We can still do packing we just need to get the packing type from the consumer rather than the producer. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97033	2016-07-22 10:21:08 +10:00
Kenneth Graunke	2db357e4c3	i965: Include VUE handles for GS with invocations > 1. We always resort to the pull model for instanced GS inputs. So, we'd better include the VUE handles, or else we can't actually pull anything. Ian reports that on his branch with OES_geometry_shader enabled, this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests:: - instanced.draw_2_instances_geometry_2_invocations - instanced.draw_2_instances_geometry_8_invocations - instanced.draw_4_instances_geometry_2_invocations - instanced.draw_4_instances_geometry_8_invocations - instanced.draw_8_instances_geometry_2_invocations - instanced.draw_8_instances_geometry_8_invocations - instanced.geometry_2_invocations - instanced.geometry_32_invocations - instanced.geometry_8_invocations - instanced.geometry_max_invocations - instanced.geometry_output_different_2_invocations - instanced.geometry_output_different_32_invocations - instanced.geometry_output_different_8_invocations - instanced.geometry_output_different_max_invocations - instanced.invocation_output_vary_by_attribute - instanced.invocation_output_vary_by_texture - instanced.invocation_output_vary_by_uniform - query.primitives_generated_instanced Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 11:15:12 -07:00
Matt Turner	8c8c3f859e	mesa: Add -fno-math-errno -fno-trapping-math to CXXFLAGS. Not sure why I forgot to add them to CXXFLAGS in commit `f55c408067` or commit `875458b778`. Cuts about 1k of .text. text data bss dec hex filename 5806354 287816 29384 6123554 5d7022 i965_dri.so before 5805497 287744 29384 6122625 5d6c81 i965_dri.so after Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-21 10:45:28 -07:00
Matt Turner	5353855e9d	mesa: Drop -fno-builtin-memcmp. According to the referenced bug report, gcc-4.5 and newer do not inline memcmp(). I see no difference in performance of ipers with llvmpipe on a Sandybridge (which does not have "Enhanced REP MOVSB/STOSB") by removing this flag. I attempted to confirm the problem with gcc-4.4, but it fails to compile for quite a few different reasons. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:45:28 -07:00
Matt Turner	5ec140c17b	mapi: Massage code to allow clang to compile. According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code was violating the spec, resulting in it failing to compile. Cc: mesa-stable@lists.freedesktop.org Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-21 10:45:28 -07:00
Ian Romanick	6bc5491193	docs: Add extensions not part of any GL or GL ES version Based loosely on patches submitted ages ago by Thomas Helland. v2: Add lots of missing data provided by Ilia. Fix sort order of GL_ARB_sparse_texture extensions suggested by Ilia. v3: Note that Dave Airlie has started work on GL_ARB_bindless_texture. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:31:04 -07:00
Ian Romanick	d1fbd4cdb1	docs: Update GL3.txt for OpenGL 4.0 on i965-ish hardware Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:30:20 -07:00
Ian Romanick	7dc99da81a	docs: Update GL3.txt for OpenGL ES on i965-ish hardware Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:26:55 -07:00
Timothy Arceri	4f89cf4941	i965: print error messages if gs fails to compile We do this for all other stages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 15:05:05 +10:00
Timothy Arceri	b463b1d7cc	i965: enable GL4.4 for Gen8+ Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 12:06:11 +10:00
Timothy Arceri	4ba9bd138a	i965: enable ARB_enhanced_layouts for gen6+ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	f3805c5f09	i965/vec4: add packing support for tcs load outputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	255388a965	i965/vec4: add support for packing tes inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-21 12:06:11 +10:00
Timothy Arceri	d07cfb31c4	i965/vec4: add support for packing tcs outputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	b25e49a3c7	i965/vec4: support packing tcs inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	d1192bef7e	i965/vec4: add component packing for gs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	d1b1fca0b7	i965/vec4: add support for packing vs/gs/tes outputs Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-21 12:06:11 +10:00
Timothy Arceri	b427abba0c	i965/vec4: add support for packing inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	138aad06b3	i965: add helper for creating packing writemask For example where n=3 first_component=1 this will give us 0xE (WRITEMASK_YZW). V2: Add assert to check first component is <= 4 (Suggested by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
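The writemask arithmetic described in `138aad06b3` can be sketched as follows. This is a minimal illustration, not the helper from the commit: the function name `packing_writemask` and the exact bounds check are assumptions.

```c
#include <assert.h>

/* Hypothetical helper (the name is an assumption): build a writemask
 * covering n consecutive components starting at first_component.
 * Bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W. */
static unsigned packing_writemask(unsigned n, unsigned first_component)
{
    /* Assumed bounds check: the selected components must fit in a vec4. */
    assert(n >= 1 && first_component + n <= 4);

    /* n low bits set, shifted up to the first component. */
    return ((1u << n) - 1u) << first_component;
}
```

With n=3 and first_component=1 this yields 0xE (Y, Z and W set), matching the example in the commit message.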
Timothy Arceri	4b57b53f85	i965: add helpers for creating component layout swizzle This will be used to swizzle components to the beginning or end of the vector based on the component layout qualifier and whether we are doing a load or store. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Eric Anholt	d2b4b16589	vc4: Return V3D version details in the GL renderer info. This is as close as we get to a name for the 3D blocks.	2016-07-20 16:15:15 -07:00
Eric Anholt	d81934cded	vc4: Check the V3D version reported by the kernel. We don't want to bring up an old userspace driver on a kernel for newer hardware. We'll also want to look at the other ident fields in the future.	2016-07-20 16:15:15 -07:00
Eric Anholt	83b8ca58e1	vc4: Detect and report kernel support for branching.	2016-07-20 16:15:15 -07:00
Eric Anholt	16985eb308	vc4: Switch to using the libdrm-provided vc4_drm.h. The required version is set to .69 for the getparam ioctl that will be used in the next commit.	2016-07-20 16:15:15 -07:00
Timothy Arceri	3d8c29ed32	docs: mark ARB_enhanced_layouts as DONE for i965 Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 09:10:53 +10:00
Timothy Arceri	d99a040bbf	i965: enable ARB_enhanced_layouts for gen8+ Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 09:10:53 +10:00
Timothy Arceri	cba6657d8b	nir: add doubles component packing support This makes sure we give the correct driver location for doubles when using component packing. Specifically it handles packing a dvec3 with a double which is the only packing scenario allowed which spans across two locations. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-21 09:10:53 +10:00
Timothy Arceri	ad5dd39984	i965: add component packing support for load_output intrinsics Here we use the component qualifier (which is the first component) as an offset when loading output varyings. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 09:10:53 +10:00
Timothy Arceri	7f53fead5c	i965: enable component packing for vs and fs Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 09:10:53 +10:00
Timothy Arceri	09e46f99ad	i965: bring back type_size_vec4_times_4() We will use this for output varyings. To make component packing simpler we will just treat all varyings as vec4s. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 09:10:53 +10:00
Jason Ekstrand	9d503aea06	nir/inline: Constant-initialize local variables in the callee if needed Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-20 15:29:55 -07:00
Jason Ekstrand	dc9f2436c3	nir: Add a nir_deref_foreach_leaf helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-20 15:29:55 -07:00
Tom Stellard	106946153f	clover: Re-order includes in invocation.cpp to fix build The build was failing because the official CL headers have a few defines, like: # define cl_khr_gl_sharing 1 Which have the same name as some class members of clang's OpenCLOptions class. If we include the cl headers first, this breaks the build because the member names of this class are replaced by the literal 1. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-20 21:15:53 +00:00
Tom Stellard	a73bf11a63	clover: Add missing include v2 clang commit r275822 removed unnecessary includes from header files, so we now need to explicitly include clang/Lex/PreprocessorOptions.h v2: - Use <> instead of "" for the include path. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-20 21:15:53 +00:00
Kenneth Graunke	3dba8516d6	i965: Move VS load_input handling to nir_emit_vs_intrinsic(). TCS/TES/GS and now FS all handle these in stage-specific functions. CS don't have inputs, so VS was the only one left using this code. Move it to the VS-specific function for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:26 -07:00
Kenneth Graunke	1608209952	i965: Delete the FS_OPCODE_INTERPOLATE_AT_CENTROID virtual opcode. We no longer use this message. As far as I can tell, it's fairly useless - the equivalent information is provided in the payload. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:24 -07:00
Kenneth Graunke	1eef0b73aa	i965: Rewrite FS input handling to use the new NIR intrinsics. This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:16 -07:00
Kenneth Graunke	a2dc11a781	i965: Move load_interpolated_input/barycentric_* intrinsics to the top. Currently, i965 interpolates all FS inputs at the top of the program. This has advantages and disadvantages, but I'd like to keep that policy while reworking this code. We can consider changing it independently. The next patch will make the compiler generate PLN instructions "on the fly", when it encounters an input load intrinsic, rather than doing it for all inputs at the start of the program. To emulate this behavior, we introduce an ugly pass to move all NIR load_interpolated_input and payload-based (not interpolator message) load_barycentric_* intrinsics to the shader's start block. This helps avoid regressions in shader-db for cases such as: if (...) { ...load some input... } else { ...load that same input... } which CSE can't handle, because there's no dominance relationship between the two loads. Because the start block dominates all others, we can CSE all inputs and emit PLNs exactly once, as we did before. Ideally, global value numbering would eliminate these redundant loads, while not forcing them all the way to the start block. When that lands, we should consider dropping this hacky pass. Again, this pass currently does nothing, as i965 doesn't generate these intrinsics yet. But it will shortly, and I figured I'd separate this code as it's relatively self-contained. v2: Dramatically simplify pass - instead of creating new instructions, just remove/re-insert their list nodes (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:11 -07:00
Kenneth Graunke	048a56c1fc	i965: Add a pass to demote sample interpolation intrinsics. When working with a non-multisampled render target, asking for "sample" interpolation locations doesn't make sense. We demote them to centroid. In a couple of patches, brw_compute_barycentric_modes will begin looking at these intrinsics to determine the barycentric modes. fs_visitor also will use them to code-generate pixel interpolator messages or payload references. Handling the "but what if it's not MSAA?" logic ahead of time in a NIR pass simplifies things and prevents duplicated logic. This patch doesn't actually do anything useful yet as we don't generate these intrinsics. I decided to keep it separate as it's self-contained, in the hopes of shrinking the "convert everything" patch for reviewers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:08 -07:00
Kenneth Graunke	707ca00fce	nir: Add nir_load_interpolated_input lowering code. Now nir_lower_io can optionally produce load_interpolated_input and load_barycentric_* intrinsics for fragment shader inputs. flat inputs continue using regular load_input. v2: Use a nir_shader_compiler_options flag rather than ad-hoc boolean passing (in response to review feedback from Chris Forbes). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:00 -07:00
Kenneth Graunke	2496462479	nir: Add new intrinsics for fragment shader input interpolation. Backends can normally handle shader inputs solely by looking at load_input intrinsics, and ignore the nir_variables in nir->inputs. One exception is fragment shader inputs. load_input doesn't capture the necessary interpolation information - flat, smooth, noperspective mode, and centroid, sample, or pixel for the location. This means that backends have to interpolate based on the nir_variables, then associate those with the load_input intrinsics (say, by storing a map of which variables are at which locations). With GL_ARB_enhanced_layouts, we're going to have multiple varyings packed into a single vec4 location. The intrinsics make this easy: simply load N components from location <loc, component>. However, working with variables and correlating the two is very awkward; we'd much rather have intrinsics capture all the necessary information. Fragment shader input interpolation typically works by producing a set of barycentric coordinates, then using those to do a linear interpolation between the values at the triangle's corners. We represent this by introducing five new load_barycentric_* intrinsics: - load_barycentric_pixel (ordinary variable) - load_barycentric_centroid (centroid qualified variable) - load_barycentric_sample (sample qualified variable) - load_barycentric_at_sample (ARB_gpu_shader5's interpolateAtSample()) - load_barycentric_at_offset (ARB_gpu_shader5's interpolateAtOffset()) Each of these take the interpolation mode (smooth or noperspective only) as a const_index, and produce a vec2. The last two also take a sample or offset source. We then introduce a new load_interpolated_input intrinsic, which is like a normal load_input intrinsic, but with an additional barycentric coordinate source. The intention is that flat inputs will still use regular load_input intrinsics. 
This makes them distinguishable from normal inputs that need fancy interpolation, while also providing all the necessary data. This nicely unifies regular inputs and interpolateAt functions. Qualifiers and variables become irrelevant; there are just load_barycentric intrinsics that determine the interpolation. v2: Document the interp_mode const_index value, define a new BARYCENTRIC() helper rather than using SYSTEM_VALUE() for some of them (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:00:45 -07:00
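The interpolation model described in `2496462479` (a vec2 of barycentric coordinates used to interpolate linearly between the values at the triangle's corners) can be sketched in plain C. The type and function names are hypothetical, and the convention that the two coordinates weight corners 1 and 2 relative to corner 0 is an assumption made for illustration, not something the commit specifies.

```c
#include <assert.h>

/* Hypothetical stand-in for the vec2 produced by a load_barycentric_* intrinsic. */
struct bary2 {
    float b1; /* assumed weight of corner 1 */
    float b2; /* assumed weight of corner 2 */
};

/* Hypothetical model of load_interpolated_input for one scalar component:
 * linear interpolation between the three corner values v0, v1 and v2. */
static float interpolate_input(struct bary2 b, float v0, float v1, float v2)
{
    /* Corner 0 implicitly has weight 1 - b1 - b2. */
    return v0 + b.b1 * (v1 - v0) + b.b2 * (v2 - v0);
}
```

Under this model, the interpolation location (pixel, centroid, sample, or an explicit offset) only changes which barycentric pair is fed in; the interpolation step itself stays the same.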
Kenneth Graunke	e614062e54	anv: Properly call gen75_emit_state_base_address on Haswell. This should fix MOCS values. Caught by Coverity. CID: 1364155 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	87660579f5	genxml: Rename "API Rendering Disable" to "Rendering Disable". Gen7/7.5 call it "Rendering Disable" while Gen8/9 prefix it with "API". Pick one for consistency, and so we can share code between generations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	bfd9942cdc	anv: Unify 3DSTATE_CLIP code across generations. The bulk of this is the same. There are just a couple fields that only exist on one generation or another, and we can easily handle those with an #ifdef. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	44502afd82	anv: Enable early culling on Gen7. We set the cull mode, but forgot the enable bit. Gen8 uses this. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	0d77f08042	anv: Fix near plane clipping on Gen7/7.5. The Gen7/7.5 clip code used APIMODE_OGL, while the Gen8+ clip code used APIMODE_D3D. The meaning hasn't changed, so one of these must be wrong. It appears that the hardware documentation is completely wrong. It claims that the "API Mode" bit means: 0h APIMODE_OGL NEAR_VP boundary == 0.0 (NDC) 1h APIMODE_D3D NEAR_VP boundary == -1.0 (NDC) However, DirectX typically uses 0.0 for the near plane, while unextended OpenGL uses -1.0. i965's gen6_clip_state.c uses APIMODE_D3D for the GL_ZERO_TO_ONE case, so I believe the meanings are backwards from what the documentation says. Section 23.2 ("Primitive Clipping") of the Vulkan 1.0.21 specification contains the following equations: -w_c <= x_c <= w_c -w_c <= y_c <= w_c 0 <= z_c <= w_c This means that Vulkan follows D3D semantics. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
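The two near-plane conventions discussed in `0d77f08042` can be compared with a small sketch. The first predicate transcribes the Vulkan clip-volume inequalities quoted in the commit message; the second uses the unextended OpenGL near plane at -1.0 in NDC. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Vulkan / D3D style clip volume: 0 <= z_c <= w_c. */
static bool in_clip_volume_zero_to_one(float x, float y, float z, float w)
{
    return -w <= x && x <= w &&
           -w <= y && y <= w &&
           0.0f <= z && z <= w;
}

/* Unextended OpenGL style clip volume: -w_c <= z_c <= w_c. */
static bool in_clip_volume_neg_one_to_one(float x, float y, float z, float w)
{
    return -w <= x && x <= w &&
           -w <= y && y <= w &&
           -w <= z && z <= w;
}
```

A vertex with z_c = -0.5 and w_c = 1 is kept under the OpenGL convention but clipped under the Vulkan one, which is the difference the API mode selection has to get right.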
Kenneth Graunke	6b67270262	genxml: Add APIMODE_D3D missing enum values and improve consistency. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	c31cf532af	genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values. Gen6-7.5 use CLIPMODE_REJECT_ALL, while Gen8+ just used REJECT_ALL. Being consistent will let me unify code, and I prefer having the prefix. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Tim Rowley	0f13a8f770	swr: [rasterizer core] introduce simd16intrin.h Refactoring to leave existing simd_* intrinsics in "simdintrin.h" unchanged, adding corresponding simd16_* intrinsics in "simd16intrin.h" on the side, with emulation, that we can use piecemeal, rather than the all-or-nothing approach to bring up avx512. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	5fe361e2c0	swr: [rasterizer core] fix for possible int32 overflow condition Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	a123d12e14	swr: [rasterizer core] rename _MAX enum values to _COUNT Makes these names semantically correct. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	e41d9dd576	swr: [rasterizer core] centroid correction Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	e0529a4668	swr: [rasterizer core] support range of values in TemplateArgUnroller Fixes Linux warnings. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	0363015964	swr: [rasterizer core] ensure adjacent topologies use the cut-aware PA Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	efdaf5fa3e	swr: [rasterizer] attribute swizzling and linkage Add support for enhanced attribute swizzling. Currently supports constant source overrides to handle PrimitiveID support. No support yet for input select swizzling or wrap shortest. Removes obsoleted linkageMask and associated code. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	a5846fb75a	swr: [rasterizer common] icc declspec definitions Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	0d13f2e801	swr: [rasterizer jitter] rework vertex/instance ID storage in fetch Moved the setting into the existing component control code. Fixes bad interaction between attribute/component setting for vertex/instance ID and component packing. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:14 -05:00
Tim Rowley	1d09b3971a	swr: [rasterizer core] avx512 simd utility work Enabling KNOB_SIMD_WIDTH = 16 for AVX512 pre-work and low level simd utils Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:14 -05:00
Tim Rowley	98641f4e73	swr: [rasterizer core] viewport rounding for disabled scissor Adjust viewport rounding when scissor rect is disabled during macro tile scissor setup. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:14 -05:00
Jason Ekstrand	96dfed49e4	i965: Stop munging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remove the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-07-20 08:19:26 -07:00
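The unit conversion described in `96dfed49e4` (2-D slices in, number of cubes out) amounts to a division by 6. The sketch below only illustrates that arithmetic; the function name is hypothetical and this is not the ISL code.

```c
#include <assert.h>

/* Hypothetical helper: convert an array length expressed in 2-D slices
 * (face-layers) into the number of cubes the surface depth field expects. */
static unsigned cube_depth_from_slices(unsigned num_2d_slices)
{
    /* Every cube consists of exactly 6 faces. */
    assert(num_2d_slices % 6 == 0);
    return num_2d_slices / 6;
}
```

A single cube map (6 slices) therefore maps to a depth of 1, and a cube array with 12 slices maps to 2.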
Jason Ekstrand	e19b7f7f1b	i965/miptree: Set logical_depth0 == 6 for cube maps This matches what we do for cube maps where logical_depth0 is in number of face-layers rather than number of cubes. This does mean that we will temporarily be setting the surface bounds too loose for cube map textures but we are already setting them too loose for cube arrays and we will be fixing that in the next commit anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>	2016-07-20 08:19:22 -07:00
Jason Ekstrand	d4d505d0b0	i965/miptree: Enforce that height == 1 for 1-D array textures The GL API and mesa internals do this differently than we do. In GL, there is no depth parameter for 1-D arrays and height is used. In the i965 miptree code we do the sane thing and make height == 1 and use depth for number of slices. This makes for a mismatch every time we create a 1-D array texture from GL. Instead of actually solving this problem, we just said "1-D is hard, let's make sure it works no matter which way we pass the parameters" and called it a day. This commit fixes the one GL -> i965 transition point where we weren't already handling 1-D array textures to do the right thing and then replaces the magic fixup code with an assert that you're doing the right thing. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>	2016-07-20 08:18:19 -07:00
Stefan Dirsch	27ef7bfd6c	Avoid overflow in 'last' variable of FindGLXFunction(...) This 'last' variable used in FindGLXFunction(...) may become negative, but has been defined as unsigned int resulting in an overflow, finally resulting in a segfault when accessing _glXDispatchTableStrings[...]. Fixed this by defining it as signed int. 'first' variable also needs to be defined as signed int. Otherwise condition for while loop fails due to C implicitly converting signed to unsigned values before comparison. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Stefan Dirsch <sndirsch@suse.de> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 16:05:17 +01:00
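The failure mode fixed in `27ef7bfd6c` is a common one in binary searches. The sketch below is not the FindGLXFunction code; it is a generic lookup over a sorted string table, with a hypothetical function name, showing why both bounds have to be signed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup: return the index of name in a sorted table,
 * or -1 if it is not present. */
static int find_name(const char *const *table, int count, const char *name)
{
    /* Both bounds are signed: last becomes -1 when the name sorts before
     * every entry. With unsigned bounds, -1 would wrap to a huge value,
     * the loop would continue, and table[middle] would be read out of
     * bounds. */
    int first = 0;
    int last = count - 1;

    assert(table != NULL && name != NULL);

    while (first <= last) {
        int middle = first + (last - first) / 2;
        int cmp = strcmp(table[middle], name);

        if (cmp == 0)
            return middle;
        if (cmp < 0)
            first = middle + 1;
        else
            last = middle - 1; /* may reach -1 */
    }
    return -1;
}
```

Declaring only `last` as signed is not enough: in `first <= last` C would convert the signed operand to unsigned before comparing, which is why the commit changes `first` as well.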
Tomasz Figa	9e1248d075	egl/android: Stop leaking DRI images Current implementation of the DRI image loader does not free the images created in get_back_bo() and so leaks memory. Moreover, it creates a new image every time the DRI driver queries for buffers, even if the backing native buffer has not changed, leaking memory again. This patch adds missing call to destroyImage() in droid_enqueue_buffer() and a check if image is already created to get_back_bo() to fix the above. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:48:54 +01:00
Tomasz Figa	565fa6b748	egl/android: Add some useful error messages It is much easier to debug issues when the application gives some meaningful error messages. This patch adds a few to the EGL Android platform backend. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:48:03 +01:00
Tomasz Figa	94282b6dd0	egl/android: Check return value of dri2_get_dri_config() It might return NULL if specific config variant is unsupported. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:47:23 +01:00
Emil Velikov	4f48674d51	i965: store reference to the context within struct brw_fence (v2) As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Tomasz Figa <tfiga@chromium.org>	2016-07-20 15:45:20 +01:00
Nicolas Boichat	9bebef4034	egl/dri2: dri2_make_current: Set EGL error if bindContext fails Without this, if a configuration is, say, available only on GLES2/3, but not on GLES1, and is rejected by the dri module's bindContext call, eglMakeCurrent fails with error "EGL_SUCCESS". In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP dEQP-EGL.functional.surfaceless_context expect. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:10:33 +01:00
Tomasz Figa	ccda100a5a	egl/android: Remove unused variables There are some unused variables left after previous clean-ups triggering compiler warnings. Let's remove them. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:10:33 +01:00
Tomasz Figa	70a28afb29	gallium/dri: Add shared glapi to LIBADD on Android An earlier patch fixed the problem for classic drivers, however Gallium was still left broken. This patch applies the same workaround to Gallium, when compiled for Android. Following is a quote from the original patch: `0cbc90c57c` mesa: dri: Add shared glapi to LIBADD on Android /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platform, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:10:33 +01:00
Emil Velikov	ae9a2baaa6	mesa: scons: remove left over src/glsl include The path no longer exists. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 13:33:43 +01:00
Emil Velikov	1c7c0d77ac	mesa: scons: list builddir before srcdir Analogous to previous commit. Note: scons always uses OOT builds, while the in-tree generated files could be created either manually or by the autoconf build. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 13:32:24 +01:00
Emil Velikov	eafa82e20e	mesa: automake: list builddir before srcdir In the case of building in out-of-tree fashion, while having generated in-tree sources, the latter [likely stale] files will be used. Flip the order to prevent that. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 13:30:50 +01:00
Józef Kucia	14608ef920	radeonsi: advertise 8 bits subpixel precision for viewport bounds Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-07-20 12:45:31 +02:00
Józef Kucia	98aa807188	r600: advertise 8 bits subpixel precision for viewport bounds Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-07-20 12:45:31 +02:00
Józef Kucia	3cd28fe3de	gallium: add a cap for VIEWPORT_SUBPIXEL_BITS (v2) This allows Gallium drivers to advertise the subpixel precision for floating point viewports bounds. v2: - Set ViewportSubpixelBits in st_init_limits. Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 12:45:31 +02:00
Samuel Pitoiset	3c78d89692	nvc0: disable MS images on GM107+ MS images have to be handled explicitly and I don't plan to implement them for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:33 +02:00
Samuel Pitoiset	8489f20689	nv50/ir: print OP_SUREDB subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:30 +02:00
Samuel Pitoiset	1edc44bfd3	gm107/ir: add emission for SUREDx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:26 +02:00
Samuel Pitoiset	4aaacd6dd0	gm107/ir: add emission for SUSTx and SULDx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:21 +02:00
Samuel Pitoiset	e14cb05ce1	gm107/ra: fix constraints for surface operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:16 +02:00
Samuel Pitoiset	c68989b2c8	gm107/ir: lower surface operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:12 +02:00
Samuel Pitoiset	2ae4b5d622	nvc0: bind images for 3d/cp shaders on GM107+ On Maxwell, images binding is slightly different (and much better) regarding Fermi and Kepler because a texture view needs to be uploaded for each image and this is going to simplify the thing a lot. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:03 +02:00
Samuel Pitoiset	1da704a94c	nvc0: increase the tex handles area size in the driver cb Currently, we can store 32 tex handles of 32-bits integer each and that fits perfectly with the underlying hardware except on GM107+ which requires to upload a texture view for each images. This patch increases the number of storable texture handles in the driver constant buffer from 32 to 40 because we expose 8 images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:10:56 +02:00
Kenneth Graunke	f0f466214e	nir: Fix uninitialized use of 'replacement'. For intrinsics we don't care about, just skip to the next loop iteration and process the next instruction. We don't want to execute the rest of the code. This was a bug in commit `cdfc05ea6e`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-19 17:34:59 -07:00
Kenneth Graunke	89873c9b08	i965: Use tex_mocs instead of rb_mocs for GL images. Fixes a 10-20% performance regression in OglCSDof caused by commit `5a8c89038a`, which made images (in the image load/store sense) use BDW_MOCS_PTE instead of BDW_MOCS_WB. This seems sketchy, as the default PTE value is supposed to be WB LLC eLLC, which is the same as our MOCS WB setting. It's only supposed to change when using a surface for display, which won't ever happen for images. Something may be wrong in the kernel... Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-19 17:34:59 -07:00
Marek Olšák	0ab47146c9	winsys/amdgpu: use pb_cache buckets for fewer pb_cache misses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	dea6fdadca	winsys/radeon: use pb_cache buckets for fewer pb_cache misses This makes Bioshock Infinite with deferred flushing 2.2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	8d5944199d	gallium/pb_cache: reduce the number of pointer dereferences Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	3cdc0e133f	gallium/pb_cache: divide the cache into buckets for reducing cache misses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	fec7f74129	gallium/pb_cache: check parameters that are more likely to fail first This makes Bioshock Infinite with deferred flushing 2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	2596ae2b6e	radeonsi: emit PS exports last This effectively removes s_waitcnt instructions after FP16 exports. Before: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v4, v5 ; 5E000B04 v_cvt_pkrtz_f16_f32_e32 v1, v6, v7 ; 5E020F06 exp 15, 1, 1, 0, 0, v0, v1, v0, v0 ; F800041F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v8, v9 ; 5E001308 v_cvt_pkrtz_f16_f32_e32 v1, v10, v11 ; 5E02170A exp 15, 2, 1, 0, 0, v0, v1, v0, v0 ; F800042F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v12, v13 ; 5E001B0C v_cvt_pkrtz_f16_f32_e32 v1, v14, v15 ; 5E021F0E exp 15, 3, 1, 1, 1, v0, v1, v0, v0 ; F8001C3F 00000100 s_endpgm ; BF810000 After: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 v_cvt_pkrtz_f16_f32_e32 v2, v4, v5 ; 5E040B04 v_cvt_pkrtz_f16_f32_e32 v3, v6, v7 ; 5E060F06 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 v_cvt_pkrtz_f16_f32_e32 v4, v8, v9 ; 5E081308 v_cvt_pkrtz_f16_f32_e32 v5, v10, v11 ; 5E0A170A exp 15, 1, 1, 0, 0, v2, v3, v0, v0 ; F800041F 00000302 v_cvt_pkrtz_f16_f32_e32 v6, v12, v13 ; 5E0C1B0C v_cvt_pkrtz_f16_f32_e32 v7, v14, v15 ; 5E0E1F0E exp 15, 2, 1, 0, 0, v4, v5, v0, v0 ; F800042F 00000504 exp 15, 3, 1, 1, 1, v6, v7, v0, v0 ; F8001C3F 00000706 s_endpgm ; BF810000 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	b2b45cecef	radeonsi: set optimal settings in COMPUTE_RESOURCE_LIMITS ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	ad70c3954b	radeonsi: really wait for the second EOP event and not the first one Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	1a1cc67edd	gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flag always set Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Ian Romanick	0b626d7524	nir/algebraic: Optimize fabs(u2f(x)) I noticed this when I tried to do frexp(float(some_unsigned)) in the ir_unop_find_lsb lowering pass. The code generated for frexp() uses fabs, and this resulted in an extra instruction. Ultimately I ended up not using frexp. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:30 -07:00
Ian Romanick	94296be276	st/mesa: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:30 -07:00
Ian Romanick	7cb49b1bd7	i965: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	5726e57f13	i965: Don't lower uaddCarry and usubBorrow in both GLSL IR and NIR Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	d7a47a76e0	i965: Update assertion to account for Gen < 7 Previously SHADER_OPCODE_MULH could only exist on Gen7+, so the assertion assumed the Gen7+ accumulator rules. A future patch will allow this instruction on at least Gen6, so update the assertion. v2: Use get_lowered_simd_width instead of open coding it. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2016-07-19 12:19:29 -07:00
Ian Romanick	3e7cebc8da	i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	c2019c6c26	i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	de20086eed	i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	26c7f04d4a	i965: Always enable GL_ARB_shading_language_packing With the existing lowering passes, the functions from this extension become a bunch of bit twiddling operations that have always been supported. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	4b2b6d4d4d	i965: Move enable of EXT_shader_integer_mix This extension does not depend on the Gen. It only depends on the availability of GLSL 1.30. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	a2379e44aa	glsl: Add lowering pass for ir_bin_imul_high This isn't the lowering pass you want. Most GPUs that can support GLSL 1.30 have a multiply unit that can do something more interesting than 32x32->32. Many have 32x16->48. Any GPU that does, should do the lowering in the backend. This is just the thing that will always work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	1b5477668a	glsl: Add lowering pass for ir_unop_find_msb Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	2a381a3c73	glsl: Add lowering pass for ir_unop_find_lsb Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	ad9acb19c3	glsl: Add lowering pass for ir_unop_bitfield_reverse Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	3079dcb00c	glsl: Add lowering pass for ir_quadop_bitfield_insert Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	4d6d219b58	glsl: Add lowering pass for ir_triop_bitfield_extract Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	7340be8a01	glsl: Add lowering pass for ir_unop_bit_count Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	806add360f	MESA_shader_integer_functions: Allow new function overload matching rules Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	90537e1a0e	MESA_shader_integer_functions: Allow implicit int->uint conversions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	65b0346fdb	MESA_shader_integer_functions: Expose new built-in functions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	15c4ae461d	MESA_shader_integer_functions: Boiler plate extension tracking Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	91482ef226	MESA_shader_integer_functions: Add extension specification v2: Fix typo in #extension line noticed by Ken. v3: Update spec status. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:15 -07:00
Samuel Pitoiset	9c63224540	gm107/ir: make use of ADD32I for all immediates ADD only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-07-19 18:07:15 +02:00
Samuel Pitoiset	0904a2ba97	gm107/ir: add missing NEG modifier for IADD32I Like FADD32I, the NEG modifier of src0 is at position 56. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-07-19 18:07:10 +02:00
Andreas Boll	c482decd4d	ddebug: Fix trivial typo in stderr message Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2016-07-19 16:04:40 +02:00
Andreas Boll	d66cb7c84f	configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too The help string wasn't updated in `cbc37f7`. Fixes: `cbc37f7` ("anv: install the intel_icd.json to ${datarootdir} by default") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-07-19 16:04:01 +02:00
Eric Engestrom	8ba46fbd9e	vl: fix memory leak CovID: 1363008 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:41:00 +02:00
Boyuan Zhang	60c7450f16	vl: add entry point Add entrypoint to distinguish H.264 decode and encode. For example, in patch 5/11 when calling "VaCreateContext", "pps" and "sps" shouldn't be allocated for H.264 encoding. So we need to use the entry_point to determine whether this is H.264 decode or H.264 encode. We can use config to determine the entrypoint since config_id is passed to us for the VaCreateContext call. However, for the VaDestroyContext call, only context_id is passed to us. So we need to know the entrypoint in order to not free the pps/sps in the encoding case. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:36:46 +02:00
Ilia Mirkin	ed9dd3bcd9	nv50,nvc0: srgb rendering is only available for rgba/bgra Mark both L8_SRGB and L8A8_SRGB as non-renderable (the latter already didn't have the bind flags). This makes the state tracker pick a different format when rendering is required, or mark the fb as incomplete. This fixes: bin/getteximage-formats init-by-clear-and-render -auto -fbo bin/getteximage-formats init-by-rendering -auto -fbo which previously ran into srgb-encoding differences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-07-18 20:04:17 -04:00
Ilia Mirkin	8e7893eb53	nvc0: add support for BGRA8 images This is useful for pbo downloads, which are now accelerated with images. BGRA8 is a moderately common format to do that in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-18 20:04:17 -04:00
Jason Ekstrand	905d7dc4d1	i965: Skip update_texture_surface when the plane doesn't exist Thanks to rebase fail, recent surface state changes (commits `7e951cd56`, `8521ce1a7`, and `69c0dc5c53`) effectively reverted `727a9b2493` and `367cf3a2e3` which was unintentional. This should bring it back. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-07-18 16:44:29 -07:00
Timothy Arceri	cd5cbf0f6b	glsl: use linked shaders rather than compiled shaders At this point there is no reason not to be using the linked shaders, using the linked shaders should be faster and will make things simpler for upcoming shader cache work. The previous variable name suggests the linked shaders were intended to be used here anyway. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-19 09:42:00 +10:00
Lars Hamre	198074a41c	The extension is already exposed, this simply marks it as done. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-07-19 01:20:27 +02:00
Anuj Phogat	22935a3040	docs: Fix typo in extension name Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-18 15:53:24 -07:00
Anuj Phogat	7832e18879	docs: Add support for GL_KHR_texture_compression_astc_sliced_3d Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-18 15:44:18 -07:00
Anuj Phogat	c7b787ef90	Revert "docs: Mark KHR_texture_compression_astc_sliced_3d done on i965" This reverts commit `82f8c23950`. KHR_texture_compression_astc_sliced_3d is not a requirement for GLES 3.2. Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-18 15:43:58 -07:00
Anuj Phogat	82f8c23950	docs: Mark KHR_texture_compression_astc_sliced_3d done on i965 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Anuj Phogat	ac0eb36d8e	i965/gen9: Enable KHR_texture_compression_astc_sliced_3d Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Anuj Phogat	15dea5ca82	mesa: Add the infrastructure for KHR_texture_compression_astc_sliced_3d V2: Drop the changes to gl.xml. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Christian König	3e1ad846f9	radeon/uvd: add session context buffer for polaris 10/11 v2 This way we have unlimited UVD sessions. v2: only enable it when kernel supports it as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-18 17:13:17 +02:00
Leo Liu	134d6e4e4f	vl/dri3: fix a memory leak from front buffer Inspired by the fix for the memory leak in VDPAU interop: resource_from_handle sets the texture reference count, which needs to be decremented and released. There is a similar case for DRI3 with the VA-API glx extension, where a temporary TFP (texture from pixmap) is targeted through dma-buf. The leak happens when the reference is not counted down. With the mpv vo=opengl case there is only one static TFP, so the leak happens once, but the totem player using gstreamer VA-API glx creates a dynamic TFP for each frame, so it leaks quite a bit. This fixes the memory leak for mpv and totem. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-18 09:20:40 -04:00
Iago Toral Quiroga	0f2516d88f	i965/tes/scalar: fix 64-bit indirect input loads We totally ignored this before because there were no piglit tests for indirect loads in tessellation stages with doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-18 09:53:51 +02:00
Iago Toral Quiroga	1737e75bfb	i965/tcs/scalar: only update imm_offset for second message in 64bit input loads Our indirect URB read messages take both a direct and an indirect offset so when we emit the second message for a 64-bit input load we can just always increment the immediate offset, even for the indirect case. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-18 09:53:16 +02:00
Kenneth Graunke	18f67c8a69	i965: Move pulls_bary setting to emit_pixel_interpolator_send(). pulls_bary should be set when the shader uses a pixel interpolator message. So, setting it from the function that emits pixel interpolator messages makes a lot of sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:54 -07:00
Kenneth Graunke	7ef7738a61	i965: Write gl_FragCoord directly to the destination. This patch makes emit_general_interpolation take a destination register as an argument, and write directly to that. This is simpler than the old approach of ralloc'ing a register, writing to that temporary, and then making the caller emit per-component MOVs to copy it to the actual destination. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:53 -07:00
Kenneth Graunke	a03812c321	i965: Drop has_pln checks in unlit centroid workaround. The unlit centroid workaround starts being necessary on Gen6, which is the first platform with multisampling. PLN exists on G45+, so all platforms which need this workaround have PLN. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:53 -07:00
Kenneth Graunke	b94890c19f	i965: Drop VARYING_SLOT_FACE special case in barycentric setup. glsl_to_nir always produces a system value for gl_FrontFacing, rather than an input. So there should never be an input with this slot, making this code dead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:53 -07:00
Kenneth Graunke	ac1181ffbe	compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_. Likewise, rename the enum type to glsl_interp_mode. Beyond the GLSL front-end, talking about "interpolation modes" seems more natural than "interpolation qualifiers" - in the IR, we're removed from how exactly the source language specifies how to interpolate an input. Also, SPIR-V calls these "decorations" rather than "qualifiers". Generated by: $ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \ -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \ -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \; Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Dave Airlie <airlied@redhat.com>	2016-07-17 19:26:48 -07:00
Dave Airlie	e7d96e7685	virgl: drop pointless leftover init of virgl_transfer_inline_write. Pointed out by Marek. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-17 06:20:53 +10:00
Ilia Mirkin	062c6b8e54	nv50: fix alphatest for non-blendable formats The hardware can only do alphatest when using a blendable format. This means that the various *16 norm formats didn't work with alphatest. It appears that Talos Principle uses such formats, as well as alpha tests, for some internal renders, which made them be incorrect. However this does not appear to affect the final renders, but in a different game it easily could. The approach we take is that when alphatests are enabled and a suitable format is used (which we anticipate is the vast minority of the time), we insert code into the shader to perform the comparison and discard. Once inserted, that code lives in the shader forever, and we re-upload it each time the function changes with a fixed-up compare. To avoid re-uploading too often, if we switch back to a blendable format, the test is (effectively) disabled and the hw alphatest functionality is used. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-16 11:45:30 -04:00
Rob Clark	cc46fc3c09	mesa/st: reduce size of state->st bitmask In `d035d50` this changed to 64b.. which I'm pretty sure was unintentional. Revert it back to 32b so the entire state struct is a nice round 64b. (Not sure that it would actually be measurable, but I did notice that check_state() was hot in some benchmarks.) Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-16 10:00:04 -04:00
Rob Clark	44bbfedbd9	gallium/u_queue: add optional cleanup callback Adds a second optional cleanup callback, called after the fence is signaled. This is needed if, for example, the queue has the last reference to the object that embeds the util_queue_fence. In this case we cannot drop the ref in the main callback, since that would result in the fence being destroyed before it is signaled. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-16 10:00:04 -04:00
Nicolai Hähnle	6f73c7595f	radeonsi: remove the DRAW_PREAMBLE packet According to firmware guys, the new sequence that we added for Polaris should work on all CIK parts, and should actually be faster on some parts. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-16 13:02:37 +02:00
Brian Paul	b89d0df535	mesa: handle numSamples=0 in _mesa_test_proxy_teximage() Should fix the regressions reported in bug 96949. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96949 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-15 21:32:24 -07:00
Kenneth Graunke	aa6f60f844	nir: Use dest.ssa.num_components rather than intrin->num_components. I recently refactored this to share code between load and atomic lowering. loads used intrin->num_components, while atomics used intrin->dest.ssa.num_components. They should be equivalent, but Jason wanted me to use the latter. I missed applying his review. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-15 19:42:43 -07:00
Kenneth Graunke	da3d4a4c56	nir: Update outdated intrinsic const_index comments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:10 -07:00
Kenneth Graunke	52e75dcb8c	nir: Use nir_intrinsic_set_base in atomic lowering. This is more readable and also offers assertions that protect against setting const_index fields on the wrong kind of intrinsic. Suggested by Jason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:10 -07:00
Kenneth Graunke	50b9bb9421	nir: Split nir_lower_io's input/output/atomic handling into helpers. The original function was becoming a bit hard to read, with the details of creating and filling out load/store/atomic atomics all in one function. This patch makes helpers for creating each type of intrinsic, and also combines them with the *_op() helpers, as they're closely coupled and not too large. v2: Minor style nits from Jason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:10 -07:00
Kenneth Graunke	e12e4af780	nir: Drop bogus nir_var_shader_in case in nir_lower_io's store_op(). This can't happen, the caller asserts that mode is shader_out or shared. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	cdfc05ea6e	nir: Share destination rewriting and replacement code in IO lowering. Both loads and atomics had identical code to rewrite destinations, and all cases had the same two lines to replace instructions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	349fe79c9b	nir: Share get_io_offset handling in nir_lower_io. The load/store/atomic cases all duplicated the get_io_offset code, with a few tiny differences: stores didn't bother checking for per-vertex inputs, because they can't be stored to, and atomics didn't check at all, since shared variables aren't per-vertex. However, it's harmless to check, and allows us to share more code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	7171a9a87d	nir: Make a 'var' temporary in nir_lower_io. Less typing and word wrapping issues than intrin->variables[0]->var. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	f05770121f	i965: Remove the emit_linterp() helper. Rather than computing the barycentric mode each time we emit a LINTERP, we can simply compute it once, as soon as we know we're doing non-flat interpolation. At that point, emit_linterp() doesn't do much, so fold it into the call sites and drop it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	203243f5ff	i965: Reduce the number of fs_reg(brw_reg) calls in LINTERP handling. A bit tidier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	eefbbb943e	i965: Make a barycentric_mode() helper function. This combines two copies of basically the same code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	783511e605	i965: Rename brw_wm_barycentric_interp_mode to brw_barycentric_mode. brw_wm_barycentric_interp_mode is wordy, brw_barycentric_mode is less typing and suffers from fewer line wrapping problems. The enum values themselves don't really benefit from "WM" in the name, either. Put "BARYCENTRIC" first instead of at the end and drop "WM". Generated by: for file in *.c *.cpp *.h; do sed -i \ -e 's/brw_wm_barycentric_interp_mode/brw_barycentric_mode/g' \ -e 's/BRW_WM_\([A-Z_]*\)_BARYCENTRIC/BRW_BARYCENTRIC_\1/g' \ -e 's/BRW_WM_BARYCENTRIC_INTERP_MODE_COUNT/BRW_BARYCENTRIC_MODE_COUNT/g' \ $file; done with a few whitespace changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	2d6dd30a9b	i965: Handle default interpolation modes and locations in NIR. This consolidates a bunch of hacks in a single place - by setting the interpolation modes and locations on variables appropriately, we can simply trust them in the rest of the code. This avoids having to handle INTERP_QUALIFIER_NONE, gl_Color overrides, sample-shading overrides, and Gen4-5 centroid-overrides in a bunch of places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Jason Ekstrand	745f5778f3	i965/context: Remove some unnecessary vfuncs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	305044c5b1	i965: Get rid of gen6_surface_state.c The only useful thing left was gen6_init_vtable_surface_functions which we can easily put in brw_wm_surface_state.c. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	16fb285946	i965: Use ISL for emitting buffer surface states Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	ee229d1b9c	i965/state: Account for the element size in emit_buffer_surface_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	69c0dc5c53	i965/gen4-6: Use the generic ISL-based path for texture surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	2d56959bf8	i965/gen6: Use the generic ISL-based path for renderbuffer surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	efa7668545	i965/gen7: Use the generic ISL-based path for renderbuffer surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	8521ce1a7e	i965/gen7: Use the generic ISL-based path for texture surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	26282a01f5	i965/gen8: Use the generic ISL-based path for renderbuffer surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	7e951cd562	i965/gen8: Use the generic ISL-based path for texture surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:41 -07:00
Jason Ekstrand	09b5a71517	i965/state: Add generic surface update functions based on ISL Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	1abb37baa0	i965/surface_state: Rename brw_update to gen4_update We're about to add generic versions which work across gens and those should have the brw name. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	5a8c89038a	i965/state: Use ISL for emitting image surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	7a21d1bfc3	i965/blorp: Use a generic ISL path for texture surfaces on gen8 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	5cf665afa1	i965/state: Add a helper for emitting a surface state using isl Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:24 -07:00
Jason Ekstrand	73ae4ec294	i965/blorp: Use the generic ISL path for texture surfaces on gen6 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:49 -07:00
Jason Ekstrand	cc78061003	i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen6 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:49 -07:00
Jason Ekstrand	366a6a659d	i965/blorp: Use the generic ISL path for texture surfaces on gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:49 -07:00
Jason Ekstrand	3339ef42cf	i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	16022352ea	i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen8-9 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	6553dc0d70	i965/blorp: Add a generic ISL-based surface state emit path Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	e974456d4f	i965/miptree: Add a helper for getting the aux isl_surf from a miptree Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	1e45349e82	i965/miptree: Add a helper for getting the ISL clear color from a miptree Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	f665a3da72	i965/miptree: Add a helper for getting an isl_surf from a miptree Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	e2dd3ce976	i965: Add an isl_device to the brw_context Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	4f282ff67e	isl/state: Add support for OffsetX/Y in surface state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	f8984b918a	isl: Add support for filling out surface states all the way back to gen4 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	815847e2b3	isl: Add an ISL_DEV_IS_G4X macro Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	27883f8cbc	genxml: Add macros and #includes for gens 4-6 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	ba798ac6b1	genxml: Make X/Y Offset field of SURFACE_STATE a uint The offset type has special implications that it's intended to be some form of aligned memory address. These assumptions allow it to handle the case where there is some alignment requirement on the offset and the bottom bits are used for other things. However, the offsets in the surface state field are really just unsigned integers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	9a999ceab8	genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:47 -07:00
Jason Ekstrand	0f6eb5dea0	isl/state: Divide the aux qpitch by 4 The field is in multiples of 4 like regular QPitch. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:47 -07:00
Jason Ekstrand	2c6ca658e7	isl: Fix the bs assertion in isl_tiling_get_info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:47 -07:00
Jason Ekstrand	593731ea3c	anv: Handle VK_WHOLE_SIZE properly for buffer views The old calculation, which used view->offset, incorporated buffer->offset into the size calculation where it doesn't belong. This meant that, if buffer->offset > buffer->size, you would always get a negative size. This fixes 170 dEQP-VK.renderpass.attachment.* Vulkan CTS tests on Haswell. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	827405f072	anv: Add an align_down_npot_u32 helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	f124f4a394	anv: Enable independentBlend on gen7 We can totally do it, we were just only setting up one BLEND_STATE and, now that the code is unified with gen8, we should be handling it correctly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	a2e7b2e653	anv/pipeline: Unify blend state setup between gen7 and gen8 This fixes all 674 broken dEQP-VK.pipeline.blend Vulkan CTS tests on Haswell. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	aaa202ebe7	genxml: Make gen6-7 blending look more like gen8 This renames BLEND_STATE to BLEND_STATE_ENTRY and adds a new struct BLEND_STATE which is just an array of 8 BLEND_STATE_ENTRYs. This will make it much easier to write gen-agnostic blend handling code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Eric Anholt	3bcd0f1912	vc4: Speed up glGenerateMipmaps by avoiding shadow baselevel. To support general GL_TEXTURE_BASE_LEVEL we have to copy to a temporary miptree. However, if a single level is being selected, we can use the existing miptree and force all the sampling to be from that particular level. This avoids a ton of software fallbacks in glGenerateMipmaps(), which uses base levels in the blit implementation in gallium. Improves "glmark2 -b terrain" from 2 fps to 3 (perhaps some more precision would be useful?), and cuts its CPU usage during the benchmarking from ~30% to ~10% (total CPU time from 8.8s to 7.6s).	2016-07-15 13:54:00 -07:00
Eric Anholt	88152d7dc0	vc4: Drop VC4_DIRTY_TEXSTATE in favor of the per-stage flags. The compiler uses the per-stage flags already, so it didn't need this. vc4_uniforms was using it, so just replace it with both of the stage flags for now.	2016-07-15 13:54:00 -07:00
Eric Anholt	5db82e0c89	vc4: Remove dead dirty_samplers field. We use a big VC4_DIRTY_FRAGTEX/VC4_DIRTY_VERTEX on the stage, instead.	2016-07-15 13:54:00 -07:00
Eric Anholt	219b75deb9	vc4: Turn on control flow support in the simulator environment. We can't merge the non-simulator support until we merge the kernel side and get a new libdrm release.	2016-07-15 13:54:00 -07:00
Brian Paul	9a23a177b9	mesa: handle numLevels, numSamples in _mesa_test_proxy_teximage() If numLevels > 0, we can compute the size of the whole mipmapped texture. That's the case for glTexStorage(GL_PROXY_TEXTURE_x). Also, multiply the texture size by numSamples for MSAA textures. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Brian Paul	39183ea971	mesa: add proxy texture targets in _mesa_next_mipmap_level_size() So we can use it for computing size of proxy textures. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Brian Paul	0ac9f25032	mesa: add numLevels, numSamples to Driver.TestProxyTexImage() So that the function can work properly with glTexStorage(), where we know how many mipmap levels there are. And so we can compute storage for MSAA textures. Also, remove the obsolete texture border parameter. A subsequent patch will update _mesa_test_proxy_teximage() to use these new parameters. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Brian Paul	e477d92c94	mesa: use _mesa_clear_texture_image() in clear_texture_fields() This avoids a failed assert(img->_BaseFormat != -1) in init_teximage_fields_ms() because the internalFormat argument is GL_NONE. This was hit when using glTexStorage() to do a proxy texture test. Fixes a failure with the updated Piglit tex3d-maxsize test. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Charmaine Lee	6b7923ee46	svga: avoid unbinding render targets that have already been unbound Fixed the remaining redundant SetRenderTargets command emission. Tested with lightsMark2008, Heaven, mtt piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-15 14:24:34 -06:00
Neha Bhende	4f633d110a	svga: dump code for GenMips. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-15 14:24:33 -06:00
Jon Turney	c7151401e0	Disable use of weak in threads_posix.h on Cygwin Weak doesn't work the same on PE/COFF as on ELF; there, weak symbols are only weak references. Specifically, since nothing else pulls in the object which contains pthread_mutexattr_init() (and, coming from the C library, that is the only thing that object contains), it ends up as 0. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Jon Turney	7d8edbaee7	configure: Don't require pthread-stubs on Cygwin Commit `1f4869a2` unconditionally requires pthread-stubs. Unfortunately, the cleverness that pthread-stubs is doesn't work with PE/COFF, and historically Cygwin doesn't have a pthread-stubs.pc. Don't require pthread-stubs on Cygwin. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Yaakov Selkowitz	5d303867f5	Use correct names for dlopen()ed files on Cygwin Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Yaakov Selkowitz	3c18c16ecf	configure: Define _GNU_SOURCE for Cygwin as well Cygwin headers are now a bit more correct in handling feature test macros, so use _GNU_SOURCE when building for Cygwin, as well. (Notwithstanding `f381c27c`, we should probably have always been using _GNU_SOURCE, since asprintf() is used by mesa in places) Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Nanley Chery	1fc739d28e	Revert "isl: Don't filter tiling flags if a specific tiling bit is set" This reverts commit `091f1da902`. Although a user may specify a specific tiling bit, ISL should still prevent incompatible tiling/surface combinations. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 10:35:40 -07:00
Nanley Chery	e179fee049	anv/blit2d: Copy with stencil sources when needed In the next patch, ISL will unconditionally perform verification of a surface's tiling and usage. Since it will require that w-tiled images be stencil buffers, create a stencil surface to copy from a w-tiled/stencil surface. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	1ef80b26d7	anv/image: Fix initialization of the ISL tiling If an internal user creates an image with Vulkan tiling VK_IMAGE_TILING_OPTIMAL and an ISL tiling that isn't set, ISL will fail to create the image as anv_image_create_info::isl_tiling_flags will be an invalid value. Correct this by making anv_image_create_info::isl_tiling_flags an opt-in, filtering bitmask, that allows the caller to specify which ISL tilings are acceptable, but not contradictory to the Vulkan tiling. Opt-out of filtering for vkCreateImage. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	00caba4152	isl: Fix isl_tiling_is_any_y() Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	a5748cb920	anv/device: Fix max buffer range limits Set limits that are consistent with ISL's assertions in isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType mapping in anv_isl_format_for_descriptor_type(). Fixes the following new crucible tests: * stress.limits.buffer-update.range.uniform * stress.limits.buffer-update.range.storage These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/ Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	028f6d8317	isl: Fix assert on raw buffer surface state size See inline PRM reference. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	96c664cd03	anv/cmd_buffer: Simplify range member assignment A ternary is clearer because the range member is assigned one of two values dependent on one condition. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	1a7344531f	anv/cmd_buffer: Remove unused variable This became unused due to commit `612e35b2c6`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	fd16e64321	anv/descriptor_set: Fix binding partly undefined descriptor sets Section 13.2.3. of the Vulkan spec requires that implementations be able to bind sparsely-defined Descriptor Sets without any errors or exceptions. When binding a descriptor set that contains a dynamic buffer binding/descriptor, the driver attempts to dereference the descriptor's buffer_view field if it is non-NULL. It currently segfaults on undefined descriptors as this field is never zero-initialized. Zero undefined descriptors to avoid segfaulting. This solution was suggested by Jason Ekstrand. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Brian Paul	50a669de4e	svga: handle mismatched number of samplers, sampler views in svga_init_shader_key_common(). Since the CSO module only tracks sampler views for fragment shaders, the number of samplers and sampler views can be mismatched for other types of shaders. This situation triggered an assertion in Chrome with maps.google.com This patch adds defensive code to handle that situation. Fixes VMware bug 1694027 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-07-15 11:05:18 -06:00
Leo Liu	b9d10e79c8	st/omx/enc: check uninitialized list from task release The uninitialized list should be checked and returned. Thanks to Julien for the notification and suggested fix. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 09:17:36 -04:00
Samuel Pitoiset	ea6b236ab1	nv50/ir: add missing string for SV_WORK_DIM Fixes: `2aa1197` ("nouveau: Add support for SV_WORK_DIM") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Hans de Goede <hdegoede@redhat.com>	2016-07-14 22:28:39 +02:00
Marek Olšák	f84e9d749f	Revert "radeon/llvm: Use alloca instructions for larger arrays" This reverts commit `513fccdfb6`. Bioshock Infinite hangs with that.	2016-07-14 22:15:08 +02:00
Jan Vesely	489bb5473b	r600,compute: Reserve vtx 3 for kernel arguments Using vtx 0 does not work for dynamic offsets. v2: add explanatory comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-07-14 16:04:50 -04:00
Marek Olšák	33eddde4a7	radeon/uvd: fail to create a decoder if RUVD_MSG_CREATE submission fails This is the bare minimum for reporting the error to the user. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 22:00:54 +02:00
Marek Olšák	85388652f9	winsys/amdgpu: return an error on IB submission failures Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 22:00:54 +02:00
Marek Olšák	a7d84f7731	gallium/radeon: add a return value to cs_flush Required by our UVD code. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 22:00:54 +02:00
Jason Ekstrand	b919100d61	glsl/types: Use _mesa_hash_data for hashing function types This is way better than the stupid string approach especially since you could overflow the string. Again, I thought I had something better at one point but it obviously got lost. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-14 10:48:25 -07:00
Jason Ekstrand	11ac1c4dbb	glsl/types: Fix function type comparison function It was returning true if the function types have different lengths rather than false. This was new with the SPIR-V to NIR pass and I thought I'd fixed it a while ago but it may have gotten lost in rebasing somewhere. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-14 10:48:11 -07:00
francians@gmail.com	3db7f3458f	freedreno/a4xx: Fix sign compare warnings Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-14 09:55:02 -04:00
francians@gmail.com	948822018f	freedreno/a3xx: Fix sign compare warnings Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-14 09:55:02 -04:00
francians@gmail.com	cf2f345356	freedreno/a2xx: Fix sign compare warnings Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-14 09:55:02 -04:00
Boyuan Zhang	23c5e8bc58	radeon/vce: handle newly added parameters Replace the previous hardcoded value with newly defined parameters Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:21 +02:00
Boyuan Zhang	5490068fb1	st/omx: assign previous values to new structure Assign previously hardcoded values for OMX to newly defined structure. As a result, OMX behaviour will not change at all. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:14 +02:00
Boyuan Zhang	b86bf4b568	vl: add parameters for VAAPI encode Allow specifying more parameters in the encoding interface, which were previously just hardcoded in the encoder Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:07 +02:00
Christian König	9ce52baf7f	st/mesa: fix reference counting bug in st_vdpau Otherwise we leak the resources created for the DMA-buf descriptors. Signed-off-by: Christian König <christian.koenig@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com> Ack-by: Tom St Denis <tom.stdenis@amd.com>	2016-07-14 09:33:44 +02:00
Eric Anholt	9194473dd2	vc4: Emit resets of the uniform stream at the starts of blocks. If a block might be entered from multiple locations, then the uniform stream will (probably) be at different points, and we need to make sure that it's pointing where we expect it to be. The kernel also enforces that any block reading a uniform resets uniforms, to prevent reading outside of the uniform stream by using looping.	2016-07-13 23:54:15 -07:00
Eric Anholt	44df061aaa	vc4: Add support for scheduling of branch instructions. For now we don't fill the delay slots, and instead just drop in NOPs.	2016-07-13 23:54:15 -07:00
Eric Anholt	a59da513d3	vc4: Move the QPU instructions to schedule into each block. We'll want to schedule them individually, to handle delay slots.	2016-07-13 23:54:15 -07:00
Eric Anholt	37ecc61662	vc4: Disable vc4_opt_vpm in the presence of control flow. It's a really valuable pass currently, but it will be a mess to rewrite for control flow. For now, just disable it if we have multiple blocks present.	2016-07-13 23:54:15 -07:00
Eric Anholt	ee69cfd11d	vc4: Convert vc4_opt_dead_code to work in the presence of control flow. With control flow, we can't be sure that we'll see the uses of a variable before its def as we walk backwards. Given that NIR is eliminating our long chains of dead code, a simple solution for now seems fine. This slightly changes the order of some optimizations, and so an opt_vpm happens before opt_dce, causing 3 dead MOVs to be turned into dead FMAXes in Minecraft: instructions in affected programs: 52 -> 54 (3.85%)	2016-07-13 23:54:15 -07:00
Eric Anholt	4e797bd98f	vc4: Update copy propagation for control flow. Previously, we could assume that a MOV from a temp was always an available copy, because all temps were SSA in NIR, and their non-SSA state in QIR was just due to the fact that they were from a bcsel or pack_unorm_4x8, so we could use the current value of the temp after that series of QIR instructions to define it. However, this is no longer the case with control flow. Instead, we track a new array of MOVs defined within the block that haven't had their source or dest killed yet, and use that primarily. We fall back to looking through the QIR defs array to handle across-block MOVs, but now require that copies from the SSA defs have an SSA src as well.	2016-07-13 23:54:15 -07:00
Samuel Iglesias Gonsálvez	94135e8736	i965/fs: emit DIM instruction to load 64-bit immediates in HSW v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:11:50 +02:00
Samuel Iglesias Gonsálvez	0534863c47	i965/eu: set DF imm value to the source of DIM According to HSW's PRM, vol02b, the DIM instruction has the following restriction: "Restriction : src0 must be immediate. src0 must specify the :f (F, Float) type encoding but is an immediate 64-bit DF (Double Float) value. dst must have type DF." This commit allows uploading the immediate 64-bit DF value to the source of a DIM instruction even when it is of float type encoding. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:06:01 +02:00
Samuel Iglesias Gonsálvez	6e28976d35	i965: enable the emission of the DIM instruction v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:06:01 +02:00
Jason Ekstrand	b9e99282a6	anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-13 20:31:27 -07:00
Timothy Arceri	a738732abf	i965: fix compiler warnings for 32bit build Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 12:03:59 +10:00
Tim Rowley	29f53d7937	Revert "gallium: Force blend color to 16-byte alignment" This reverts commit `d8d6091a84`. Heap allocations may be only 8-byte aligned on 32-bit systems, and so having members with 16-byte alignment (such as in the case where pipe_blend_color is embedded in radeonsi's si_context) is undefined behavior which indeed causes crashes when compiled with gcc -O3. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com> Acked-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-07-13 13:55:33 -05:00
Jason Ekstrand	48ed8b6f26	isl/state: Add support for handling auxiliary surfaces Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	76e2dcc131	isl: Add an auxiliary surface usage enum Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	3ab3d97ac9	isl: Add support for color control surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	219024b9a7	isl: Add support for multisample compression surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	33dc8549fb	isl: Add support for HiZ surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	fc3650a0a9	isl: Kill off isl_format_layout::bs Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	1f0433f075	isl: Take bpb rather than bs in tiling_get_info Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	01855d7331	isl: Use bpb in a few places where it's more natural than bs Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	8c76b9bdce	isl: Use bpb for determining YUV image padding When we initially dropped bpb in favor of bs, we accidentally didn't change this one line properly. This brings it back to what it should be. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	cf9ff082b4	isl: Bring back isl_format_layout::bpb A while ago we got rid of the bits-per-block because we thought we didn't need it. We're about to introduce some very useful 1 and 2-bit formats so we really should be able to handle them again. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	0bd3a7e931	isl: Change the physical size of a W-tile to 128x32 Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	4b62c19c32	isl: Rework the way we define tile sizes. This is based on a very long set of discussions between Chad and myself about how we should properly represent HiZ and CCS buffers. The end result of that discussion was that a tiling actually has two different sizes, a logical size in elements, and a physical size in bytes and rows. This commit reworks ISL's pitch and size calculations to work in terms of these two sizes. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	7270bd0607	isl: Rework the way we handle surface padding Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	a52f26d6e8	isl: Use ARRAY_PITCH_SPAN_FULL for depth/stencil surfaces on gen7 We helpfully inserted a PRM quotation about how we need to use ARRAY_PITCH_SPAN_FULL and then set it to COMPACT. Oops... Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	0d48ac627a	isl: Stop multiplying height by block size The row pitch already specifies the size of a row of elements. Multiplying by the block height simply causes us to allocate as much as 12 times more memory than needed for compressed textures. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
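The size arithmetic behind the commit above can be sketched as follows. This is an illustrative Python model rather than ISL code: the function names are hypothetical, and tiling and alignment padding are deliberately ignored.

```python
def surface_size_bytes(width_px, height_px, block_w, block_h, bytes_per_block):
    """Size of a block-compressed surface, counting rows of elements (blocks)."""
    blocks_per_row = -(-width_px // block_w)   # ceiling division
    block_rows = -(-height_px // block_h)      # rows of elements, not pixel rows
    row_pitch = blocks_per_row * bytes_per_block
    # The row pitch already covers a full row of blocks, so no extra
    # multiplication by the block height is needed.
    return row_pitch * block_rows


def overallocated_size_bytes(width_px, height_px, block_w, block_h, bytes_per_block):
    """The mistaken variant: multiplies by the block height a second time."""
    return surface_size_bytes(
        width_px, height_px, block_w, block_h, bytes_per_block) * block_h
```

With a 12x12 block format, the mistaken variant in this model allocates 12 times the needed size, which matches the worst case the message mentions.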
Jason Ekstrand	58c1b1088b	isl: Get rid of tiling_get_extent It was unused Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	49476576dd	nir/spirv: Don't multiply the push constant block size by 4 I have no idea why we were multiplying by 4 before. The offsets we get from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to do any adjustment whatsoever. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-13 11:35:29 -07:00
Jason Ekstrand	1eed753ee8	anv/pipeline: Assert that the number of uniforms from NIR fits	2016-07-13 11:35:24 -07:00
Marek Olšák	0f7a6ea5e7	radeonsi: report accurate SGPR and VGPR spills Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	d227dbe272	radeonsi: add a workaround for a compute VGPR-usage LLVM bug v2: use abort(), describe which LLVM version is affected Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	f4d1de7f86	radeonsi: use LLVMGetTypeKind to tell if an input is an array of descriptors just a cleanup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	785073ed0b	radeonsi: replace !tbaa with !invariant.load no change in generated code thanks to dereferenceable(n) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	348b9a5b1c	radeonsi: set dereferenceable attribute on descriptor arrays This allows moving the loads arbitrarily in the Sinking pass.
26002 shaders in 14643 tests
Totals:
SGPRS: 2080160 -> 2080160 (0.00 %)
VGPRS: 798875 -> 797826 (-0.13 %)
Spilled SGPRs: 108485 -> 79165 (-27.03 %)
Spilled VGPRs: 327 -> 327 (0.00 %)
Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
Code Size: 36127192 -> 35559780 (-1.57 %) bytes
LDS: 767 -> 767 (0.00 %) blocks
Max Waves: 212464 -> 212672 (0.10 %)
Wait states: 0 -> 0 (0.00 %)
PERCENTAGES / App       Shaders SGPRs     VGPRs     SpillSGPR SpillVGPR Scratch   CodeSize  MaxWaves  Waits
(unknown)               4       .         .         .         .         .         .         .         .
0ad                     6       .         .         .         .         .         .         .         .
alien_isolation         2938    .         0.04 %    -8.53 %   .         .         -0.71 %   -0.06 %   .
anholt                  10      .         .         .         .         .         .         .         .
batman_arkham_origins   589     .         -0.58 %   -79.54 %  .         .         -6.72 %   0.57 %    .
bioshock-infinite       1769    .         -0.65 %   -89.32 %  .         .         -4.73 %   0.48 %    .
borderlands2            3968    .         -0.31 %   -51.21 %  .         .         -4.09 %   0.22 %    .
brutal-legend           338     .         -0.03 %   -2.95 %   .         .         -0.06 %   .         .
civilization_beyond..   116     .         .         -14.17 %  .         .         -0.88 %   .         .
counter_strike_glob..   1142    .         .         .         .         .         .         .         .
dirt-showdown           541     .         -0.56 %   -40.14 %  .         -3.45 %   -1.82 %   0.35 %    .
dolphin                 22      .         .         .         .         .         0.16 %    .         .
dota2                   1747    .         .         .         .         .         0.01 %    .         .
europa_universalis_4    76      .         -0.23 %   -42.11 %  .         .         -0.96 %   .         .
f1-2015                 774     .         -0.09 %   -28.89 %  .         .         -2.60 %   0.09 %    .
furmark-0.7.0           4       .         .         .         .         .         .         .         .
gimark-0.7.0            10      .         .         .         .         .         .         .         .
glamor                  16      .         .         .         .         .         .         .         .
humus-celshading        4       .         .         .         .         .         .         .         .
humus-domino            6       .         .         .         .         .         .         .         .
humus-dynamicbranching  24      .         0.71 %    .         .         .         0.29 %    -0.45 %   .
humus-hdr               10      .         .         .         .         .         .         .         .
humus-portals           2       .         .         .         .         .         .         .         .
humus-volumetricfog..   6       .         .         .         .         .         .         .         .
left_4_dead_2           1762    .         .         .         .         .         .         .         .
metro_2033_redux        2670    .         -0.10 %   -7.15 %   .         .         -0.03 %   .         .
nexuiz                  80      .         .         .         .         .         .         .         .
pixmark-julia-fp32      2       .         .         .         .         .         .         .         .
pixmark-julia-fp64      2       .         .         .         .         .         .         .         .
pixmark-piano-0.7.0     2       .         .         .         .         .         .         .         .
pixmark-volplosion-..   2       .         .         .         .         .         .         .         .
plot3d-0.7.0            8       .         .         .         .         .         .         .         .
portal                  474     .         .         .         .         .         .         .         .
sauerbraten             7       .         .         .         .         .         .         .         .
serious_sam_3_bfe       392     .         .         -13.20 %  .         .         -1.81 %   .         .
supertuxkart            4       .         .         .         .         .         .         .         .
talos_principle         324     .         -0.21 %   -18.39 %  .         .         -2.73 %   0.14 %    .
team_fortress_2         808     .         .         .         .         .         .         .         .
tesseract               430     .         0.08 %    -68.57 %  .         .         -0.45 %   .         .
tessmark-0.7.0          6       .         .         .         .         .         .         .         .
thea                    172     .         .         .         .         .         0.03 %    .         .
ue4_effects_cave        299     .         -0.04 %   -10.15 %  .         .         -0.25 %   0.04 %    .
ue4_elemental           586     .         -0.02 %   -13.93 %  .         .         -0.13 %   0.02 %    .
ue4_lightroom_inter..   74      .         -0.17 %   -70.00 %  .         .         -1.27 %   .         .
ue4_realistic_rende..   92      .         .         -32.58 %  .         .         -0.35 %   .         .
unigine_heaven          322     .         0.12 %    -54.17 %  .         .         -1.42 %   -0.12 %   .
unigine_sanctuary       264     .         .         .         .         .         .         .         .
unigine_tropics         210     .         .         .         .         .         .         .         .
unigine_valley          278     .         -0.15 %   -40.74 %  .         .         -2.00 %   0.09 %    .
unity                   72      .         .         .         .         .         0.03 %    .         .
warsow                  176     .         .         .         .         .         .         .         .
warzone2100             4       .         .         .         .         .         0.13 %    .         .
witcher2                1040    .         -0.03 %   -86.28 %  .         .         -0.28 %   0.01 %    .
xcom_enemy_within       1236    .         -0.24 %   -63.54 %  .         .         -0.93 %   0.18 %    .
yofrankie               82      .         -0.61 %   -100.00 % .         .         -0.83 %   0.41 %    .
-----------------------------------------------------------------------------------------------------------
Total                   26002   .         -0.13 %   -27.03 %  .         -0.24 %   -1.57 %   0.10 %    .
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	6596ecf8c5	gallivm: add helper lp_add_attr_dereferenceable Not sure if this is the right way to do it, but it seems to work. v2: make it a no-op on LLVM <= 3.5 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	bccf9de4df	radeonsi: clean up shader value metadata code No change in behavior. BTW, tbaa_md_kind == 1, which was the magic number in the code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	d7d7e6adbe	radeonsi: remove LLVMNoUnwindAttribute uses always set by gallivm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	c4807505c0	radeonsi: fix a typo in SI_PARAM_LINEAR_* handling introduced in `476e9cee1d` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	f2f573e777	gallium/radeon: normalize the code style no change in behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	ed3912d0da	radeonsi: just save buffer sizes instead of buffers while recording IBs whole buffer objects are not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Jon Turney	fc8139b146	Add c99_alloca.h include to fix compilation on Cygwin Fix compilation on Cygwin, since `50b22354`, by adding c99_alloca.h include, which should know how to portably make the alloca() prototype available. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 16:11:36 +01:00
Topi Pohjolainen	7d29fee4a8	i965/blorp: Cleanup leftovers from push constant disabling Setup for pixel shader push constants is the same as for other stages. Note that on gen8+ the if-else branches were identical and the generation check for packet size redundant. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 12:10:03 +03:00
Topi Pohjolainen	26778da571	i965/blorp/gen7+: Bring back push constant setup This is a partial revert of commit `cc2d0e64`. It appears that even when blorp disables a stage, the corresponding 3DSTATE_CONSTANT_XS packet still needs to be programmed. Hardware seems to try to fetch the constants even for disabled stages. Therefore care needs to be taken that the constant buffer is set up properly. Blorp will continue to trash it into non-existing such as before. It is possible that this could be omitted on SKL where the constant buffer is considered when the corresponding binding table settings are changed. Bspec: "The 3DSTATE_CONSTANT_* command is not committed to the shader unit until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_* command is parsed." However, as the CONSTANT_XS packet itself does not seem to stall on its own, it is safer to emit the packets for SKL also. A possible alternative to blorp trashing could have been to set up defaults at the beginning of each batch buffer. However, hardware doesn't seem to tolerate these packets being programmed multiple times per primitive. Bspec for IVB: "It is invalid to execute this command more than once between 3D_PRIMITIVE commands." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96878 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 12:09:35 +03:00
Nicolai Hähnle	65d48fcf8c	radeonsi: silence Coverity warning Coverity's analysis is too weak to understand that r600_init_flushed_depth(_, _, NULL) only returns true when flushed_depth_texture was assigned a non-NULL value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 09:52:39 +02:00
Samuel Iglesias Gonsálvez	a2bd7334ed	i965/fs: do d2x lowering before simd splitting So that we can have gen7 split large writes produced by this lowering pass. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	376d7ee587	i965/fs: do pack lowering before simd splitting So that we can have gen7 split large writes produced by the pack lowering. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Samuel Iglesias Gonsálvez	9979a3f2ac	i965/fs: do not require force_writemask_all with exec_size 4 So far we only used instructions with this size in situations where we did not operate per-channel and we wanted to ignore the execution mask, but gen7 fp64 will need to emit code with a width of 4 that needs normal execution masking. v2: - Modify the assert instead of deleting it (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	aa4796ae81	i965/fs/gen7: split instructions that run into exec masking bugs In fp64 we can produce code like this: mov(16) vgrf2<2>:UD, vgrf3<2>:UD That our simd lowering pass would typically split into instructions with a width of 8, writing to two consecutive registers each. Unfortunately, gen7 hardware has a bug affecting execution masking and as a result, the second GRF register write won't work properly. Curro verified this: "The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is the 8-bit quarter of the execution mask signals specified in the instruction control fields) for the second compressed half of any single-precision instruction (for double-precision instructions it's hardwired to use NibCtrl+1, at least on HSW), which means that the EU will apply the wrong execution controls for the second sequential GRF write if the number of channels per GRF is not exactly eight in single-precision mode (or four in double-float mode)." In practice, this means that we cannot write more than one consecutive GRF in a single instruction if the number of channels per GRF is not exactly eight in single-precision mode (or four in double-float mode). This patch makes our SIMD lowering pass split this kind of instruction so that the split versions only write to a single register. In the example above this means that we split the write into 4 instructions, each one writing 4 UD elements (width = 4) to a single register. v2 (Curro): - Make explicit that the thing about hardwiring NibCtrl+1 for the second compressed half is known to happen in Haswell and the issue with IVB might not be exactly the same. - Assign max_width instead of returning early so that we can handle multiple restrictions affecting the same instruction. - Avoid division by 0 if the instruction does not write any registers. - Ignore instructions that have WE_all set. - Use the instruction execution type size instead of the dst type size. v3 (Curro): - Move the implementation down so it is not placed in the middle of another workaround. - Declare channels_per_grf as const. - Don't break the loop early if we find a BAD_FILE source. - Fix the number of channels that the hardware shifts for the second half of a compressed instruction to be 8 in single precision and 4 in double precision. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
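The width restriction described in the commit above can be modelled roughly as below. This is a Python sketch under stated assumptions, not the i965 implementation: `lowered_width` and its parameters are hypothetical names, and only the rule stated in the message is modelled (eight channels per GRF for single precision, four for double precision, instructions with WE_all ignored, no division when nothing is written).

```python
def lowered_width(exec_size, regs_written, exec_type_size,
                  force_writemask_all=False):
    """Return the widest split that avoids the gen7 exec-masking issue.

    exec_size: number of channels the instruction executes.
    regs_written: number of consecutive GRFs the instruction writes.
    exec_type_size: size in bytes of the execution type (8 for doubles).
    """
    width = exec_size
    # WE_all instructions ignore the execution mask, and an instruction
    # that writes no registers must not trigger a division by zero.
    if force_writemask_all or regs_written == 0:
        return width
    channels_per_grf = exec_size // regs_written
    expected = 4 if exec_type_size == 8 else 8
    if channels_per_grf != expected:
        # Limit each split instruction to a single GRF write.
        width = min(width, channels_per_grf)
    return width
```

For the `mov(16) vgrf2<2>:UD, vgrf3<2>:UD` case in the message, this model takes 16 channels over 4 registers with a 4-byte execution type and yields a width of 4, so the write is split into 4 instructions.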
Iago Toral Quiroga	87a13f598b	i965/fs: use the new helper function to create double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	9e196e907e	i965/fs: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2: - Define setup_imm_df() as an independent function (Curro) - Create a specific builder to get rid of some instruction field assignments (Curro). v3: - Get devinfo from builder (Kenneth) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 07:09:41 +02:00
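The 32-bit chunking described in the commit above can be sketched as follows. This is an illustrative Python model, not the Mesa helper: the function names are hypothetical, and placing the low dword first (little-endian order) is an assumption.

```python
import struct


def split_double_imm(value):
    """Split an IEEE 754 double into (low, high) 32-bit unsigned chunks."""
    raw = struct.pack("<d", value)          # 8 bytes, little-endian
    low, high = struct.unpack("<II", raw)   # two 32-bit dwords
    return low, high


def join_double_imm(low, high):
    """Reassemble the double from its two 32-bit chunks."""
    return struct.unpack("<d", struct.pack("<II", low, high))[0]
```

In this model each chunk would be written with its own 32-bit move, and reading the two dwords back as a single 64-bit value recovers the original constant.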
Eric Anholt	93794145dd	vc4: Validate QPU uniform pointer updates.	2016-07-12 17:42:42 -07:00
Eric Anholt	420845acb2	vc4: Add support for NIR loops and break/continue.	2016-07-12 17:42:42 -07:00
Eric Anholt	0adf2ec0ee	vc4: Add support for emitting NIR IF nodes.	2016-07-12 17:42:42 -07:00
Eric Anholt	f505f66cd5	vc4: Add support for storing to NIR registers in a non-SSA fashion. Previously, there were occasionally NIR registers in our programs, but they were always actually used SSA-only. Now that we're trying to support control flow, we need to actually conditionally move to registers based on whether channels are active or not.	2016-07-12 17:42:41 -07:00
Eric Anholt	ab1d40b84a	vc4: Add a flag in the screen to track control flow support. For now it's still always false, but I need it in place for kernel backwards compat support as I extend the backend for control flow.	2016-07-12 17:42:40 -07:00
Eric Anholt	05bcd9dd96	vc4: Define a QIR branch instruction This uses the branch condition code in inst->cond to jump to either successor[0] (condition matches) or successor[1] (condition doesn't match).	2016-07-12 17:42:40 -07:00
Eric Anholt	54800bb71c	vc4: Add kernel support for branching in shader validation. We're already checking that branch instructions are within the contents of the shader and the proper PROG_END sequence is present. The other thing we need in the presence of branching is to verify that the shader doesn't overflow past the end of the uniforms stream. To do that, we require that any basic block reading uniforms start with the following instructions: load_imm temp, <offset within uniform stream> add unif_addr, temp, unif The instructions are generated by userspace, and the kernel verifies that the load_imm is of the expected offset, and that the add adds it to a uniform. We track which uniform in the stream that is, and at draw call time fix up the uniform stream to have the address of the start of the shader's uniforms for that draw call. Signed-off-by: Eric Anholt <eric@anholt.net>	2016-07-12 17:42:39 -07:00
Eric Anholt	e2d7760df5	vc4: Add a bitmap of branch targets in kernel validation. This isn't used yet, it's just a first step toward loop validation. During the main parsing of instructions, we need to know when we hit a new basic block so that we can reset validated state.	2016-07-12 17:42:38 -07:00
Eric Anholt	24095c8b3b	vc4: Track the current instruction into the validation_state. This reduces how much we need to pass around as arguments, which was becoming more of a problem with looping validation.	2016-07-12 17:42:38 -07:00
Eric Anholt	c73aa0a09b	vc4: Add QPU support for generating BRANCH instructions.	2016-07-12 17:42:38 -07:00
Eric Anholt	6d34345001	vc4: Print live variable start/ends during QIR dumping. This only happens when live variables are set up, which is not in the normal dump, but is set up when we've failed to register allocate.	2016-07-12 17:42:37 -07:00
Eric Anholt	89918c1e74	vc4: Implement live intervals using a CFG. Right now our CFG is always a trivial single basic block, but that will change when we enable loops.	2016-07-12 17:41:59 -07:00
Eric Anholt	f2eb8e3052	vc4: Make vc4_qir_schedule handle each block in the program. Basically we just treat each block independently. The only inter-block scheduling I can think of that would be interesting would be to move texture result collection to after a short loop/if block that doesn't do texturing. However, the kernel disallows that as part of its security validation.	2016-07-12 15:47:26 -07:00
Eric Anholt	46ec025ba9	vc4: Convert uniforms lowering to work with multiple blocks. We still decide which uniform to lower based on how many instructions-that-need-lowering use that uniform, but now we emit a new temporary uniform load in each of the basic blocks containing an instruction being lowered. This commit is best reviewed with diff -b.	2016-07-12 15:47:26 -07:00
Eric Anholt	0c923e6c33	vc4: Convert vc4_opt_peephole_sf to work with control flow. We need to apply the peephole pass to each of the blocks in the program. We don't do dataflow analysis for SF across blocks, but we also don't generate code that would need us to do so.	2016-07-12 15:47:26 -07:00
Eric Anholt	6c1f834a23	vc4: Create a basic block structure and move the instructions into it. The optimization passes and scheduling aren't actually ready for multiple blocks with control flow yet (as seen by the "cur_block" references in them instead of iterating over blocks), but this creates the structures necessary for converting them.	2016-07-12 15:47:26 -07:00
Eric Anholt	d3cdbf6fd8	vc4: Add a "qir_for_each_inst_inorder" macro and use it in many places. We have the prior list_foreach() all over the code, but I need to move where instructions live as part of adding support for control flow. Start by just converting to a helper iterator macro. (The simpler "qir_for_each_inst()" will be used for the for-each-inst-in-a-block iterator macro later)	2016-07-12 15:47:25 -07:00
Eric Anholt	6858f05924	vc4: Also enable phi elimination. This avoids a bunch of code gen regressions when enabling loops in vc4. Prior to that, the GLSL that would have generated these optimizable phi nodes was being lowered to csels between either (undef, a) or (a, a), and those were being dealt with by nir_opt_undef and nir_opt_algebraic.	2016-07-12 15:47:25 -07:00
Eric Engestrom	e8959ba7af	vc4: fix memory leak The allocation has succeeded by that point, so it needs to be freed. CovID: 1358929 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-12 15:47:12 -07:00
Eric Anholt	c65a00eaff	vc4: Close our screen's fd on screen close. We're passed in a freshly dup()ed fd on screen create, so we should close it on exit. Debugged by Hugh Cole-Baker.	2016-07-12 15:46:09 -07:00
Eric Anholt	c93f6938d5	nir: Add optimization for (a || True == True) This was appearing in vc4 VS/CS in mupen64, due to vertex attrib lowering producing some constants that were getting compared. total instructions in shared programs: 112276 -> 112198 (-0.07%) instructions in affected programs: 2239 -> 2161 (-3.48%) total estimated cycles in shared programs: 283102 -> 283038 (-0.02%) estimated cycles in affected programs: 2365 -> 2301 (-2.71%) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-12 15:46:09 -07:00
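The identity behind the commit above, that OR-ing any value with a true constant is always true, can be shown with a small rewrite pass. This is a toy Python sketch, not the NIR rule itself: the tuple encoding `("ior", lhs, rhs)` and the function name `simplify_or` are assumptions made for illustration.

```python
def simplify_or(expr):
    """Fold ("ior", x, True) and ("ior", True, x) down to True.

    Expressions are either leaves (variable names, booleans) or
    tuples of the form ("ior", lhs, rhs).
    """
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "ior":
        lhs = simplify_or(expr[1])
        rhs = simplify_or(expr[2])
        # a || True is True regardless of a, so the OR can be dropped.
        if lhs is True or rhs is True:
            return True
        return ("ior", lhs, rhs)
    return expr
```

Folding the expression to a constant lets later passes remove the comparison and anything that depended only on it, which is consistent with the small instruction-count reduction the message reports.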
Tim Rowley	be126c8a2a	swr: [rasterizer core] correct MSAA behavior for conservative rasterization Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:55 -05:00
Tim Rowley	c6ca126591	swr: [rasterizer core] conservative rast backend changes Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:49 -05:00
Tim Rowley	b6dbb95dc9	swr: [rasterizer] buckets cleanup Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:44 -05:00
Tim Rowley	eb6b2b340e	swr: [rasterizer core] make all api functions call GetContext Small api cleanup. Make all api functions call GetContext instead of locally casting handle. Makes debugging easier by providing a single point to track context changes. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:36 -05:00
Tim Rowley	f810907669	swr: [rasterizer] add support for llvm-3.9 v2: use signed compare, remove unneeded vmask Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:09:49 -05:00
Tim Rowley	ae4f2c849a	swr: [rasterizer jitter] fix llvm-3.7 compile d3d97f8 broke llvm-3.7, which has a mismatched API for setDataLayout/getDataLayout. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 10:42:57 -05:00
Brian Paul	d46489ddea	docs: remove duplicated line in 12.0.1 release notes file Signed-off-by: Brian Paul <brianp@vmware.com>	2016-07-12 09:42:42 -06:00
Leo Liu	55f0b97b40	st/omx/dec: convert decoder video buffer to progressive with encode tunneling The idea of encode tunneling is to use the video buffer directly for the encoder, but currently the encoder doesn't support interlaced surfaces, so the OMX decoder previously set a progressive surface for that purpose. Now that we poll the driver for the decoder's interlacing information, we get interlaced as the preferred format, as with other APIs (VDPAU, VA-API), which breaks transcoding with tunneling. The solution is, when a tunnel is detected, to re-allocate progressive target buffers and then convert the interlaced decoder results into them. This has been tested, with transcode results matching bit for bit the earlier results with progressive-to-progressive surfaces. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	82f875f4d8	vl/compositor: set layer of y or uv to render Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	14761da9f9	vl/compositor: add weave to yuv shader This shader converts interlaced YUV to progressive YUV. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	2e18c2c6f8	vl/compositor: move weave shader out from rgb weaving We'll use the weave shader in a later patch. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Marek Olšák	ead7736821	glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast This bug is uncovered by glsl/lower_if_to_cond_assign. I don't know if it can be reproduced in any other way. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-12 11:58:53 +02:00
Francisco Jerez	e300696304	clover/api: Implement clLinkProgram per-device binary presence validation rule. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Serge Martin	f29ed2da24	clover: Add clLinkProgram (CL 1.2). [ Francisco Jerez: Use validate_build_common for error checking, simplify control flow slightly and handle additional exception types. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	c478db6c0a	clover: Trivial cleanups for api/program.cpp. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	9c7cda2792	clover/core: Remove compiler.hpp. header_map was the only definition left in compiler.hpp, move it into program.hpp which is its only user in clover/core. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	c2e37fe1f9	clover/llvm: Get rid of compile_program_llvm(). Superseded by compile_program() and link_program(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	010918f5aa	clover: Provide separate program methods for compilation and linking. [ Serge Martin: Fix inverted opts and log build ctor args. Keep the log related to the build. Fix indentation ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	1942490bae	clover: Unify program::build_* into a single method returning a struct. This gets rid of the program::build_* query methods and replaces them with the program::build() method that returns a single data structure containing all parameters for the last build done on the given target device (including build logs, options and the binary itself). [ Serge Martin: Fix inverted opts and log build ctor args ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Serge Martin	7f6a4a4342	clover: Change program::build opts argument to std::string. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	2a73ae662c	clover: Define error subclass to signal build option parse failure. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	4ef1c0918d	clover: Move back to using build_error to signal compilation failure. This partially reverts `7e0180d57d`. Having two different exception subclasses for compilation and linking makes it more difficult to share or move code between the two codepaths, because the exact same function under the same error condition would need to throw one exception or the other depending on what top-level API is being implemented with it. There is little benefit anyway because clCompileProgram() and clLinkProgram() can tell whether they are linking or compiling a program. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Serge Martin	70fe6267a3	clover: Override ret_object. Return an API object from an intrusive reference to a Clover object, incrementing the reference count of the object. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	85309e8b55	clover/tgsi: Add stub link_program() function. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	ba613636e8	clover/tgsi: Move compiler entry point declaration into tgsi directory and namespace. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	fb3eeb1314	clover/llvm: Implement the -create-library linker option. [ Serge Martin: disable internalize pass when building a library. Otherwise some functions may be inlined and removed ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	9de3f4a59f	clover/llvm: Implement linkage of multiple clover modules. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	132b6ccd4f	clover/llvm: Split compilation and linking. Split the work previously done by compile_program_llvm() into compile_program() (which simply runs the front-end and serializes the resulting LLVM IR) and link_program() (which takes care of everything else down to binary codegen). [ Serge Martin: allow LLVM IR dump after compilation ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	1a7d11aa3d	clover/llvm: Implement library bitcode codegen. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	86100e13ab	clover/llvm: Trivial assorted cleanups for invocation.cpp. Drop a few include and using directives which are no longer necessary. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	520cc26859	clover/llvm: Split native codegen into separate file. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	8195637363	clover/llvm: Split bitcode codegen into separate file. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	71ac9820d6	clover/llvm: Split shared codegen support code into separate file. This is the common part of the code used to generate a clover::module from LLVM bitcode, shared between the native and LLVM paths. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	26fa9bfd0d	clover/llvm: Define function for bitcode print-out. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	f0721020ad	clover/llvm: Split native codegen and assembly print-out into separate functions. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	1d042adc0a	clover/llvm: Clean up bitcode codegen. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	952d1e6fd6	clover/llvm: Use metadata introspection utils for kernel enumeration. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	d37d5842c1	clover/llvm: Use metadata introspection utils for kernel argument set-up. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	3ed31bbf05	clover/llvm: Add simplified utility functions for metadata introspection. v2: Fix for latest LLVM from SVN. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> (v1) Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:30 -07:00
Francisco Jerez	7da2c1ff0f	clover/llvm: Clean up codestyle of get_kernel_args(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:59 -07:00
Francisco Jerez	0601fe7438	clover/llvm: Fold compile_native() call into build_module_native(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:56 -07:00
Francisco Jerez	f98422eafd	clover/llvm: Factor out duplicated construction of clover::module. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:53 -07:00
Francisco Jerez	3ce6ab068c	clover/llvm: Clean up compile_native(). This switches compile_native() to the C++ API (which the rest of this file makes use of anyway, so there is little benefit from using the C API), which should get rid of some boilerplate and fix a leak of the TargetMachine object in the error path. v2: Additional fixes for LLVM 3.6. v3: Update for the latest LLVM SVN changes. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:50 -07:00
Francisco Jerez	7bcefa5903	clover/llvm: Clean up ELF parsing. This function was doing three separate things: - Initializing and releasing the ELF parsing state (the latter can be better done using RAII). - Searching for the symbol table in the ELF file. - Extraction of kernel symbol offsets from the symbol table. Split each one into a separate function for clarity and clean up the result slightly. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:48 -07:00
Francisco Jerez	574477e599	clover/llvm: Move a bunch of utility functions into separate file. Some of these will be useful from a different compilation unit in the same subtree so put them in a publicly accessible header file. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:43 -07:00
Francisco Jerez	92247cef3f	clover/llvm: Tidy debug handling. Most significant change is debugging flags are now a scoped enum and all debugging helpers live in the debug namespace. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:40 -07:00
Francisco Jerez	4614397ac2	clover/llvm: Use helper function to abort compilation with error message. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:37 -07:00
Francisco Jerez	423eecb76a	clover/llvm: Simplify diagnostic_handler(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:29 -07:00
Francisco Jerez	5884dfbc2a	clover/llvm: Trivial codestyle clean-up for optimize(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:21 -07:00
Francisco Jerez	bdc27f13d5	clover/llvm: Clean up compilation into LLVM IR. Some assorted and mostly trivial clean-ups for the source to bitcode compilation path. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:50 -07:00
Francisco Jerez	714b167f57	clover/llvm: Factor out LLVM context init. So it can be shared between the compilation and linking codepaths. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:30 -07:00
Francisco Jerez	fa94055d53	clover/llvm: Declare compiler instance at top level and pass down as argument. This allows simplifying the interface of compile_llvm() because it no longer needs to read out and return the optimization level and address space map from the compiler instance. Instead declare the compiler instance at the top level so that both properties are available directly. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:13 -07:00
Francisco Jerez	a27d4ec3b9	clover/llvm: Refactor compiler instance initialization. This will be shared between the compiler and linker codepaths. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:08 -07:00
Francisco Jerez	c2a167ad73	clover/llvm: Factor out compiler option tokenization. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:47 -07:00
Francisco Jerez	c513cfa747	clover/llvm: Factor out target string parsing. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:41 -07:00
Francisco Jerez	251054220e	clover/llvm: Collect #ifdef mess into a separate file. This gets rid of most ifdef's from the invocation.cpp code -- Only a couple of them are left which will be removed differently in the following commits. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:12 -07:00
Francisco Jerez	11afde89b8	clover/llvm: Drop dead code. This ifdef'ed out code was meant to handle compilation into TGSI, but it doesn't seem likely that it will ever be useful even if the TGSI back-end is resurrected because the TGSI bitcode can just be plumbed through in ELF format and dealt with as a regular "native" back-end. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:05 -07:00
Francisco Jerez	600ac51448	clover/llvm: Drop support for LLVM < 3.6. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:19:49 -07:00
Serge Martin	8624888d6f	clover: Bump required LLVM version to 3.6. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:19:14 -07:00
Ilia Mirkin	da7223ebdc	mesa: set _NEW_BUFFERS when updating texture bound to current buffers When a glTexImage call updates the parameters of a currently bound framebuffer, we might miss out on revalidating whether it is complete. Make sure to set _NEW_BUFFERS which will trigger the revalidation in that case. Also while we're at it, fix the fb parameter passed in to the eventual RenderTexture call. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94148 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>	2016-07-11 21:18:05 -04:00
Ilia Mirkin	8b7607d28a	meta/texsubimage: tex_image is always non-null, avoid confusing code Probably a copy-paste from mesa_meta_pbo_GetTexSubImage where tex_image may apparently be null. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-11 21:18:05 -04:00
Ilia Mirkin	00d4315d37	st/mesa: return appropriate mesa format for ETC texture formats Even when the backend driver does not support ETC formats, we handle the decoding into an uncompressed backing texture. However as far as core mesa is concerned, it's an ETC texture and we should return the relevant ETC mesa format. This condition can get hit when using glTexStorage to create the texture object. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-11 21:17:30 -04:00
Ilia Mirkin	8ee3cdde04	mesa: etc2 online compression is unsupported, don't attempt it Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-11 21:17:01 -04:00
Ben Skeggs	0d911a720d	nvc0: initial support for GP100 GPUs Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-07-12 10:56:35 +10:00
Samuel Pitoiset	9bc083284f	nvc0: use a define for the driver constant buffer size This might avoid mistakes if the size is bumped in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-11 22:30:41 +02:00
Samuel Pitoiset	31a615677b	nvc0: fix the driver cb size when draw parameters are used The size of the driver constant buffer for each stage should be 2048 and not 512 because it has been increased recently for buffers/images. While we are at it, do the same change for indirect draws. This fixes all ARB_shader_draw_parameters tests on GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-11 22:11:27 +02:00
Samuel Pitoiset	19d0450b27	nvc0/ir: fix images indirect access on Fermi This fixes the following piglits: arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-11 21:01:21 +02:00
Marek Olšák	33c8723980	st/mesa: remove st_dump_program_for_shader_db replaced by MESA_SHADER_CAPTURE_PATH in core Mesa Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-11 19:06:05 +02:00
Marek Olšák	d7b6f90684	gallivm: set LLVMNoUnwindAttribute on all intrinsics RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement: Application Files SGPRs VGPRs SpillSGPR SpillVGPR Code Size LDS Max Waves Waits unigine_valley 278 0.00 % -0.29 % 0.00 % 0.00 % 0.01 % 0.00 % 0.17 % 0.00 % Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-07-11 19:06:05 +02:00
Francesco Ansanelli	3c44629142	i965: fix ignored qualifiers warning Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-11 05:50:22 -07:00
Nicolai Hähnle	374aa2bb27	gallium/u_queue: assert that users must wait on fences before destroying them Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-11 11:04:44 +02:00
Nicolai Hähnle	a0a616720a	gallium/u_queue: guard fence->signalled checks with fence->mutex I have seen a hang during application shutdown that could be explained by the following race condition which this patch fixes: 1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true. 2. Main thread calls util_queue_job_wait, which returns immediately. 3. Main thread deletes the job and fence structures, leaving garbage behind. 4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because it is accessing garbage. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-11 11:03:59 +02:00
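The race described in the entry above can be illustrated with a minimal sketch in which both the signalling and the waiting side touch the flag only while holding the mutex. The `sketch_fence` type and its functions are hypothetical stand-ins written for this illustration, not the actual u_queue code, and plain pthreads are assumed in place of Mesa's own thread wrappers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a queue fence; not Mesa's actual struct. */
struct sketch_fence {
   pthread_mutex_t mutex;
   pthread_cond_t cond;
   bool signalled;
};

static void
sketch_fence_init(struct sketch_fence *fence)
{
   pthread_mutex_init(&fence->mutex, NULL);
   pthread_cond_init(&fence->cond, NULL);
   fence->signalled = false;
}

/* Worker side: set the flag and broadcast while holding the mutex, so a
 * waiter cannot see signalled == true and free the fence in between. */
static void
sketch_fence_signal(struct sketch_fence *fence)
{
   pthread_mutex_lock(&fence->mutex);
   fence->signalled = true;
   pthread_cond_broadcast(&fence->cond);
   pthread_mutex_unlock(&fence->mutex);
}

/* Waiter side: take the mutex before reading the flag, so this cannot
 * return while the signalling thread is still inside its critical section. */
static void
sketch_fence_wait(struct sketch_fence *fence)
{
   pthread_mutex_lock(&fence->mutex);
   while (!fence->signalled)
      pthread_cond_wait(&fence->cond, &fence->mutex);
   pthread_mutex_unlock(&fence->mutex);
}

/* Only safe to call after sketch_fence_wait() has returned. */
static void
sketch_fence_destroy(struct sketch_fence *fence)
{
   pthread_cond_destroy(&fence->cond);
   pthread_mutex_destroy(&fence->mutex);
}
```

With an unguarded check, step 2 of the sequence in the commit message (the wait returning immediately) can happen before the worker reaches its broadcast. With the lock held on both sides, the waiter can only proceed once the worker has left the critical section.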
Chad Versace	5c17fb2cd6	anv/dump: Fix post-blit memory barrier Swap srcAccessMask and dstAccessMask. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-09 20:58:33 -07:00
Chad Versace	bc33c9b455	anv/dump: Fix vkCmdPipelineBarrier flags 'true' is not valid for VkDependencyFlags. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	ac7eeebce4	anv/dump: Add support for dumping framebuffers Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	fad0b7b0b3	anv/dump: Add a barrier for the source image Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	6ad183bf89	anv/dump: Refactor the guts into helpers Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	adbed7ae7a	anv/dump: Use anv_minify instead of hand-rolling it Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	a26cda5ca5	anv/dump: Take an aspect in dump_image_to_ppm Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Nicolai Hähnle	b479c47a9c	radeonsi: fix bad assertion in si_emit_sample_mask The blitter sets mask == 1, which is fine since it doesn't use smoothing. Fixes a regression introduced in commit `5bcfbf91`. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-09 19:46:54 +02:00
Matt Turner	6624174c0a	glx: Fix for commit `2c86668694`. Ian suggested these changes in his review and I made them, but I pushed the old version of the patch.	2016-07-08 16:46:17 -07:00
Emil Velikov	83a782cd5e	docs: add news item and link release notes for 12.0.0/12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-09 00:09:51 +01:00
Emil Velikov	386ceb4c61	docs: add sha256 checksums for 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `edfc17a19a`)	2016-07-09 00:03:21 +01:00
Emil Velikov	c7c0adc7e6	docs: add release notes for 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `04277f058d`)	2016-07-09 00:03:16 +01:00
Emil Velikov	286a71b01f	docs: add sha256 checksums for 12.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `3a146a789c`)	2016-07-09 00:03:10 +01:00
Emil Velikov	4644908a9f	docs: Update 12.0.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `8b06176f31`)	2016-07-09 00:03:04 +01:00
Matt Turner	2c86668694	glx: Undo memory allocation checking damage. This partially reverts commit `d41f5396f3`. That untested commit broke the tex-skipped-unit piglit test and the arbvparray Mesa demo when run with indirect GLX. state->array_state is used during initialization, so its assignment cannot be moved to the end of the function. The backtrace looked like: Program received signal SIGSEGV, Segmentation fault. 0x00007ffff77c7a5c in __glXGetActiveTextureUnit (state=0x6270e0) at indirect_vertex_array.c:1952 1952 return state->array_state->active_texture_unit; (gdb) bt 0 0x00007ffff77c7a5c in __glXGetActiveTextureUnit (state=0x6270e0) at indirect_vertex_array.c:1952 1 0x00007ffff77cbf62 in get_client_data (gc=0x626f50, cap=34018, data=0x7fffffffd7a0) at single2.c:159 2 0x00007ffff77cce51 in __indirect_glGetIntegerv (val=34018, i=0x7fffffffd830) at single2.c:498 3 0x00007ffff77c4340 in __glXInitVertexArrayState (gc=0x626f50) at indirect_vertex_array.c:193 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-08 14:03:19 -07:00
Colin McDonald	b36644bae6	glx: Fix indirect multi-texture GL_DOUBLE coordinate arrays. There is no draw arrays protocol support for multi-texture coordinate arrays, so it is implemented by sending batches of immediate mode commands from emit_element_none in indirect_vertex_array.c. This sends the target texture unit (which has been previously set up in the array_state header field), followed by the texture coordinates. But for GL_DOUBLE coordinates the texture unit must be sent after the texture coordinates. This is documented in the glx protocol description, and can also be seen in the indirect.c immediate mode commands generated from gl_API.xml. Sending the target texture unit in the wrong place can crash the remote X server. Fixing this required some more extensive changes to indirect_vertex_array.c and indirect_vertex_array_priv.h, in order to move the texture unit value out of the array_state "header" field, and send it separately. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907	2016-07-08 14:03:16 -07:00
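The ordering rule in the entry above (texture unit first for most coordinate types, texture unit last for GL_DOUBLE) can be sketched as follows. The helper `sketch_pack_multitexcoord2` is hypothetical and only lays out the payload of one two-component command; the command header, opcode, and byte-order handling of the real protocol are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the payload of one two-component
 * multi-texture coordinate command into dst and returns the number of
 * bytes written. coords points at two floats or two doubles. */
static size_t
sketch_pack_multitexcoord2(uint8_t *dst, uint32_t unit,
                           const void *coords, bool is_double)
{
   const size_t coord_size = is_double ? 2 * sizeof(double)
                                       : 2 * sizeof(float);
   size_t offset = 0;

   if (!is_double) {
      /* Non-double types: target texture unit precedes the coordinates. */
      memcpy(dst + offset, &unit, sizeof(unit));
      offset += sizeof(unit);
   }

   memcpy(dst + offset, coords, coord_size);
   offset += coord_size;

   if (is_double) {
      /* GL_DOUBLE: target texture unit follows the coordinates. */
      memcpy(dst + offset, &unit, sizeof(unit));
      offset += sizeof(unit);
   }

   return offset;
}
```

Keeping the unit as a separate argument, rather than baking it into a prebuilt header, mirrors the direction of the fix: the unit has to be placed at a position that depends on the coordinate type.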
Colin McDonald	5ced100bf5	glx: Correct opcode typos in __indirect_glTexCoordPointer. At the same time, replace opcode numbers with names in __indirect_glVertexAttribPointer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907	2016-07-08 14:03:09 -07:00
Colin McDonald	d57c85c1bf	glx: Call __glXInitVertexArrayState() with a usable gc. For each indirect context the indirect vertex array state must be initialised by __glXInitVertexArrayState in indirect_vertex_array.c. As noted in the routine header it requires that the glx context has been set up prior to the call, in order to test the server version and extensions. Currently __glXInitVertexArrayState is called from indirect_bind_context in indirect_glx.c, as follows: state = gc->client_state_private; if (state->array_state == NULL) { glGetString(GL_EXTENSIONS); glGetString(GL_VERSION); __glXInitVertexArrayState(gc); } But, the gc context is not yet usable at this stage, so the server queries fail, and __glXInitVertexArrayState is called without the server version and extension information it needs. This breaks multi-texturing as __glXInitVertexArrayState doesn't get GL_MAX_TEXTURE_UNITS. It probably also breaks setup of other arrays: fog, secondary colour, vertex attributes. To fix this I have moved the call to __glXInitVertexArrayState to the end of MakeContextCurrent in glxcurrent.c, where the glx context is usable. Fixes a regression caused by commit `4fbdde889c`. Fixes ARB_vertex_program usage in the arbvparray Mesa demo when run with indirect GLX and also the tex-skipped-unit piglit test when run with indirect GLX. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907	2016-07-08 14:02:56 -07:00
Christian König	64ac4aef27	radeon/uvd: simplify sending context buffer message Just send it whenever it is allocated. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:32 +02:00
Christian König	6b474e06a2	radeon/uvd: fix context buffer destruction in the error path Destroying a not allocated buffer is harmless. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:32 +02:00
Christian König	36df04dac4	radeon/uvd: move polaris fw check into radeon_video.c v2 It's actually not very clever to claim to support H.264 and then fail to create a decoder. v2: prefix FW macro with UVD_. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:31 +02:00
Christian König	5290bf43c8	radeon/video: fix coding style in radeon_video.c v2 v2: fix other tabs as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:31 +02:00
Brian Paul	74163475b0	svga: simplify/fix 1D/2D array resource copies Fixes one of the piglit arb_copy_image-targets tests for 1D arrays. Previously, we were applying the 1D array z/face adjustment twice. Also simplify the copy_region_vgpu10() function. It never has to copy multiple array layers/slices. The Mesa code for glCopyImageSubData does the loop over slices/faces. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-07-08 12:53:21 -06:00
Brian Paul	0e23f370c9	mesa: print number of samples in renderbuffer_storage error msg Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-08 12:53:21 -06:00
Brian Paul	fb26317604	svga: remove unused variable Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-07-08 12:53:21 -06:00
Brian Paul	689293ad52	svga: add dumping for more device commands Signed-off-by: Brian Paul <brianp@vmware.com>	2016-07-08 12:53:21 -06:00
Brian Paul	599c333d07	svga: silence a couple unused variable warnings Signed-off-by: Brian Paul <brianp@vmware.com>	2016-07-08 12:53:20 -06:00
Charmaine Lee	c3c7ff014b	svga: rebind using render target surfaces in hw draw state Currently when we rebind framebuffer resources at the beginning of the command buffer, we use the color buffer surfaces saved in the context hw clear state. But the surfaces could be different from the actual emitted render target surfaces if any of the color buffer surfaces is also used as a shader resource; in that case, we create a backed surface for the collided render target surface. So to rebind the framebuffer resources correctly, use the render target surfaces saved in the context hw draw state. Tested with Heaven, Lightsmark2008, MTT piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-08 12:53:20 -06:00
Charmaine Lee	da98cee067	svga: invalidate gb surface before it is reused With this patch, a guest-backed surface will be invalidated using the SVGA_3D_CMD_INVALIDATE_GB_SURFACE command before the surface is reused. This fixes the updating dirty image error from the device when a surface is reused. v2: Instead of invalidating the surface when it is reused, send the invalidate command before the surface is put into the recycle pool. v3: (1) surface invalidate is a noop operation in Linux winsys, since surface invalidation is not needed for DMA path. (2) Instead of invalidating the surface content in svga_screen_surface_destroy() when a surface is to be destroyed, it is done in svga_screen_cache_flush() when the surface is no longer referenced in a command buffer and is ready to be moved to the unused list. At this point, the surface will be moved to the invalidate list. When the surface invalidation is submitted, the surface will be moved to the unused list. Tested with piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-07-08 12:53:20 -06:00
Brian Paul	ca531aeeb1	svga: fix use of provoking vertex control If the SVGA3D_DEVCAP_DX_PROVOKING_VERTEX query returns false, never define rasterizer state objects with provokingVertexLast set. Despite what the device reports, it may interpret the provokingVertexLast flag anyway. This fixes an issue when using capability clamping. Tested with piglit provoking-vertex and glsl-fs-flat-color tests. VMware bug 1550143. Reviewed-by: <charmainel@vmware.com>	2016-07-08 12:53:20 -06:00
Nayan Deshmukh	af18a04755	vl: add half pixel to v_tex before adding offsets Since pixel center lies at 0.5, add half_pixel to vtex before adding offsets to it. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-08 20:51:12 +02:00
Samuel Pitoiset	a0bf1768c7	nvc0/ir: remove unused resource info loading helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 19:12:23 +02:00
Samuel Pitoiset	ed3a284382	nvc0/ir: refactor the surfaces info loading logic Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 19:12:21 +02:00
Samuel Pitoiset	9cdbe80745	nvc0/ir: move the shift left op inside loadTexHandle() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 19:12:06 +02:00
Nicolai Hähnle	04d93ea619	radeonsi: disable multi-threading when shader dumps are enabled Otherwise, shader dumps can become interleaved and unusable. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:59:36 +02:00
Nicolai Hähnle	7ffc832ab8	radeonsi: use multi-threaded compilation in debug contexts We only have to stay single-threaded when debug output must be synchronous. This yields better parallelism in shader-db runs for me. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:59:32 +02:00
Nicolai Hähnle	084ca0d8e5	st/mesa: set debug callback async flag Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:59:29 +02:00
Nicolai Hähnle	2909e292fc	gallium: add async flag to pipe_debug_callback v2: fix typo db -> cb Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:58:52 +02:00
Nicolai Hähnle	5bcfbf91e5	radeonsi: catch a potential state tracker error with non-MSAA FBs At least st/mesa ensures this, so I'd rather not handle deviations in radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:53:05 +02:00
Nicolai Hähnle	d938b8c0bf	radeonsi: explicitly choose center locations for 1xAA on Polaris Unlike SC, the small primitive filter does not automatically use center locations in 1xAA mode, so this is needed to avoid artifacts caused by the small primitive filter discarding triangles that it shouldn't. As a side effect of how the effective number of samples is now calculated, this patch also avoids submitting the sample locations for line/poly smoothing when they're not really needed. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:52:50 +02:00
Nicolai Hähnle	7d2ce5258f	r600g: call cayman_emit_msaa_sample_locs only when needed In the case of nr_samples <= 1, that function is (currently) a no-op anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:52:45 +02:00
Kenneth Graunke	b3c5df3ca4	mesa: Mark R*32F formats as filterable when an extension is present. GL_OES_texture_float_linear marks R32F, RG32F, RGB32F, and RGBA32F as texture filterable. Fixes glGenerateMipmap GL errors when visiting a WebGL demo in Chromium: http://www.iamnop.com/particles Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 01:26:23 -07:00
Eric Engestrom	b7be23b6e1	i965/blorp: fix indentation level Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-08 11:07:36 +03:00
Francisco Jerez	37b901003b	i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush. This hardware race condition has caused problems several times already (see "i965: Fix cache pollution race during L3 partitioning set-up.", "i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and "i965: intel_texture_barrier reimplemented"). The problem is that whenever we attempt to both flush and invalidate multiple caches with a single pipe control command the flush and invalidation happen in reverse order, so the contents flushed from the R/W caches aren't guaranteed to become visible from the invalidated caches after the PIPE_CONTROL command completes execution if some concurrent rendering workload happened to pollute any of the invalidated R/O caches in the short window of time between the invalidation and flush. This makes sure that brw_emit_pipe_control_flush() has the effect expected by most callers of making the contents flushed from any R/W caches visible from the invalidated R/O caches. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-07 14:16:39 -07:00
Francisco Jerez	0bd3a121c6	i965: Make room in the batch epilogue for three more pipe controls. Review carefully, it sucks to have to keep track of the number of command packet dwords emitted in the batch epilogue manually. The MI_REPORT_PERF_COUNT_BATCH_DWORDS calculation was obviously wrong. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-07 14:16:39 -07:00
Francisco Jerez	a10879f48c	i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush. There were two places in the driver doing a pipe control VF cache flush, one of them was missing this workaround, move it down into brw_emit_pipe_control_flush to make sure we don't miss it again. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-07 14:16:39 -07:00
Francisco Jerez	04f74d6629	i965: Emit SNB write cache flush W/A from brw_emit_pipe_control_flush. Shouldn't cause any functional changes at this point, but we have forgotten to apply this workaround several times in the past, make sure it doesn't happen again. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-07 14:16:38 -07:00
Frank Binns	8fd5779da4	egl: restrict swap_available dri2_egl_display field to X11 This field is only ever set and read by the X11 platform. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 13:28:50 -07:00
Guillaume Charifi	9fea9d6f8e	egl: Fix the bad surface attributes combination checking for pbuffers. (v3) Fixes a regression induced by commit `a0674ce5c4`: When EGL_TEXTURE_FORMAT and EGL_TEXTURE_TARGET were both specified (and both != EGL_NO_TEXTURE), an error was instantly triggered, before the other one had even a chance to be checked, which is obviously not the intended behaviour. v2: Full commit hash, remove useless variables. v3: [chadv] Add Fixes footers. Fixes: piglit "spec/egl 1.4/eglcreatepbuffersurface and then glclear" Fixes: piglit "spec/egl 1.4/largest possible eglcreatepbuffersurface and then glclear" Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 11:28:55 -07:00
Eric Engestrom	7adb9b0948	egl/display: remove unnecessary code and make it easier to read Remove the two first level `if` as they will always be true, and flatten the two remaining `if`. No functional change. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 11:13:13 -07:00
Gurchetan Singh	2e6d35809b	mesa: Make single-buffered GLES representation internally consistent There are a few places in the code where clearing and reading are done on incorrect buffers for GLES contexts. See comments for details. This fixes 75 GLES3 dEQP tests on the surfaceless platform with no regressions. v2: Corrected unclear comment v3: Make the change in context.c instead of get.c v4: Removed whitespace Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 11:02:35 -07:00
Emil Velikov	f35f8464ec	bugzilla_mesa.sh: Drop "Bug " from sed command After a recent Bugzilla update the word is no longer in the title. Thus the script ended up producing bogus HTML. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-07 15:58:46 +01:00
Akihiko Odaki	42968424fb	mesa: don't install GLX files if GLX is not built Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Akihiko Odaki <akihiko.odaki.4i@stu.hosei.ac.jp> [Emil Velikov: Drop guards around dri_interface.h, add stable tag] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-07 15:58:11 +01:00
Timothy Arceri	7a9d6abcae	nir: add glsl_dvec_type() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-06 23:20:23 -07:00
Mathias Fröhlich	13affe0d3f	osmesa: Export OSMesaCreateContextAttribs. Since the function is exported like any other public api function and put in the header as if you could link against it, export it also from shared objects. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-07 06:19:13 +02:00
Timothy Arceri	7ed5bca21d	i965: consolidate generation check Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-07 12:29:21 +10:00
Timothy Arceri	e0dc3109d5	i965: don't copy VS attribute workarounds for HSW+ These workarounds are not required for HSW and above, so stop copying them at VS key generation, which is called at draw time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 12:29:12 +10:00
Timothy Arceri	27e28197e8	i965: add double packing support to tess stages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	8b80e9c31d	i965: add double packing support to gs inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	20e935e6f6	nir: add glsl_double_type() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	9d9b0b54cd	i965: add indirect packing support to gs load inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	2477e6cfad	i965: add indirect packing support for tcs and tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	2bda4b062f	i965: add component packing support for tcs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	cfff71a47a	i965: add component packing support for tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	a102ef2d4f	i965: add component packing support for gs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	448adfbc67	nir: use the same driver location for packed varyings Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	0eea6b3297	nir: add new intrinsic field for storing component offset This offset is used for packing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Eric Engestrom	771f6db76f	i965/docs: update Intel Linux Graphics URLs Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-06 13:18:23 -07:00
Chad Versace	8910de39c7	anv: gitignore anv_timestamp.h	2016-07-06 13:13:18 -07:00
Tom Stellard	513fccdfb6	radeon/llvm: Use alloca instructions for larger arrays We were storing arrays in vectors, which was leading to some really bad spill code for large arrays. allocas instructions are a better fit for arrays and LLVM optimizations are more geared toward dealing with allocas instead of vectors. For arrays that have 16 or less 32-bit elements, we will continue to use vectors, because this will force LLVM to store them in registers and use indirect registers, which is usually faster for small arrays. In the future we should use allocas for all arrays and teach LLVM how to store allocas in registers. This fixes the piglit test: spec/glsl-1.50/execution/geometry/max-input-component Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 19:47:38 +00:00
Tom Stellard	02873a7b0c	radeon/llvm: Add helpers for loading and storing data from arrays. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 19:47:38 +00:00
Tom Stellard	2dc48984b2	radeon/llvm: Remove uses_temp_indirect_addressing() function bld->indirect_files is never set, so this function always returns false. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 19:47:38 +00:00
Emil Velikov	9618e2a24c	anv: vulkan: remove the anv_device.$(OBJEXT) rule Atm the actual rule will expand to foo.o which is used for static libraries only. Thus the automake manual recommendation [to use OBJEXT] won't help us, since we're working with a shared library. Thus let's 'demote' the file and add it back to BUILT_SOURCES. This will manage all the complexity for us, at the (existing expense) of working only with the all, check and install targets. The crazy (why the issue was hard to spot): If the dependencies (.deps/*.Plo) are already created one can alter the anv_device.$(OBJEXT) line and/or nuke it altogether. That won't lead to any warnings/issues, even though the Makefile is regenerated. Moral of the story: Always rm -rf top_builddir or don't resolve the dependencies manually and use BUILT_SOURCES. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96825 Fixes: d7a604c3f7a ("anv: use cache uuid based on the build timestamp.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-07-06 10:19:19 -07:00
Rob Clark	64d35f817a	vbo: fix attr reset In `bc4e0c4` (vbo: Use a bitmask to track the active arrays in vbo_exec*.) we stopped looping over all the attributes and resetting all slots. Which exposed an issue in vbo_exec_bind_arrays() for handling GENERIC0 vs. POS. Split out a helper which can reset a particular slot, so that vbo_exec_bind_arrays() can re-use it to reset POS. This fixes an issue with 0ad (and possibly others). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-07-06 10:17:30 -04:00
Rob Clark	23dd9eaa94	list: fix list_replace() for empty lists Before, it would happily copy list_head next/prev (ie. pointer to the from list_head), leaving things in a confused state and causing much mayhem. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-06 10:17:30 -04:00
Rob Clark	09fe35b450	gallium: un-inline pipe_surface_desc Want to re-use this struct, so un-inline it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:17:30 -04:00
Rob Clark	def044376a	gallium/util: make util_copy_framebuffer_state(src=NULL) work Be more consistent with the other u_inlines util_copy_xyz_state() helpers and support NULL src. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:17:30 -04:00
Nicolai Hähnle	660cd3de4a	winsys/amdgpu: avoid flushed depth when possible If a depth/stencil texture has no mipmaps, we can always get a layout that is compatible with DB and TC. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:52 +02:00
Nicolai Hähnle	7000dfd5c3	gallium/radeon: add depth/stencil_adjusted output to surface computation This fixes a rare bug with stencil texturing -- seen on Polaris and Tonga, though it's basically a function of the memory configuration so could affect other parts as well. Fixes piglit "unaligned-blit * stencil downsample" and various "fbo-depth-array stencil" tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:52 +02:00
Nicolai Hähnle	68fe270e71	gallium/radeon: allocate only the required plane for flushed depth Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:52 +02:00
Nicolai Hähnle	1a0a8efcce	radeonsi: decompress to flushed depth texture when required v2: s/dirty_level_mask/stencil_dirty_level_mask/ in stencil case Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	4b7961da77	radeonsi: extract DB->CB copy logic into its own function Also clean up some of the looping. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	18cc825fb9	radeonsi: sample from flushed depth texture when required Note that this has no effect yet. A case where can_sample_z/s can be false in radeonsi will be added in a later patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	f2eb34f82f	gallium/radeon: replace is_flushing_texture with db_compatible This is a left-over of when I considered generalizing the separate stencil support. I do prefer the new name since it emphasizes what flushing vs. non-flushing means from a functional point-of-view, namely special handling of the texture format. v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:48 +02:00
Nicolai Hähnle	dd65126153	gallium/radeon: add can_sample_z/s flags for textures v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:43 +02:00
Nicolai Hähnle	065eeb79f7	radeonsi: correctly mark levels of 3D textures as fully decompressed Account for the fact that max_layer is minified for higher levels. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:49 +02:00
Nicolai Hähnle	19f8d2a843	gallium/radeon/winsyses: remove unused stencil_offset Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:49 +02:00
Nicolai Hähnle	3a1da559c5	gallium/radeon: remove redundant null-pointer check v2: keep using r600_texture_reference Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:48 +02:00
Nicolai Hähnle	5b87eef031	gallium/radeon: print StencilLayout only once It is the same for all levels. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:48 +02:00
Nicolai Hähnle	bae066c3f0	gallium/radeon: flush stdout after printing texture information Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:48 +02:00
Ilia Mirkin	a37e46323c	glsl: don't try to lower non-gl builtins as if they were gl_FragData If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-05 21:22:01 -04:00
Ian Romanick	795d8dff89	glsl: Document and enforce restriction on type values Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-07-05 17:55:29 -07:00
Ian Romanick	3119871bd9	glsl: Pack integer and double varyings as flat even if interpolation mode is none v2: Also update varying_matches::compute_packing_class(). Suggested by Timothy Arceri. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-05 16:58:27 -07:00
Ian Romanick	73a6a4ce49	mesa: Strip arrayness from interface block names in some IO validation Outputs from the vertex shader need to be able to match per-vertex-arrayed inputs of later stages. Accomplish this by stripping one level of arrayness from the names and types of outputs going to a per-vertex-arrayed stage. v2: Add missing checks for TESS_EVAL->GEOMETRY. Noticed by Timothy Arceri. v3: Use a slightly simpler stage check suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-05 16:58:27 -07:00
Charmaine Lee	32651c67d1	svga: avoid emitting redundant DXSetRenderTargets command Tested with Lightsmark2008, MTT piglit, glretrace, conform. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-05 16:58:29 -06:00
Leo Liu	aa7d42a5f9	radeon/vce: update encRefPic addr and array mode to tiled Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-05 09:15:50 -04:00
Leo Liu	e560a11b87	radeon/vce: increase cpb height alignment Height should be aligned with 2 macroblocks, thus making it safer for tiled mode Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-05 09:15:47 -04:00
Iago Toral Quiroga	fa0654fc3c	i965: Remove trailing whitespace Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-05 14:06:37 +02:00
Iago Toral Quiroga	d92ac67126	i965: Make inline function static Without this the i965 driver fails to load. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-05 14:05:58 +02:00
Emil Velikov	cbc37f72e3	anv: install the intel_icd.json to ${datarootdir} by default As mentioned by the spec (and used by Archlinux and Debian) default to ${datarootdir} as opposed to ${sysconfdir} for the default location. Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-05 12:17:34 +01:00
Emil Velikov	744d0d8f3b	swr: automake: don't ship LLVM version specific generated sources Otherwise things will fail to build, if the builder is using another version of LLVM. v2: annotate all the dependencies of builder_gen.h v3: clean the generated files as needed v4: comment cleanups (Tim) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Tested-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (v2) Reported-by: Chuck Atkins <chuck.atkins@kitware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-05 12:17:05 +01:00
Emil Velikov	22e9357028	automake: don't mandate git_sha1.h/MESA_GIT_SHA1 It has proven subtle to get it right both from the build side POV (see commit list below) and builders due to their varying workflows. Furthermore it does not fully fulfil the reason why it was enforced - to detect uniqueness between different builds, in order to distinguish and invalidate Vulkan/GL caches. With that having a much better solution (previous commit) we can drop this solution. This effectively reverts the following commits: `359d9dfec3` ("mesa: automake: add directory prefix for git_sha1.h") `2c424e00c3` ("mesa: automake: ensure that git_sha1.h.tmp has the right attributes") `b7f7ec7843` ("mesa: automake: distclean git_sha1.h when building OOT") `8229fe68b5` ("automake: get in-tree `make distclean' working again.") Cc: Timo Aaltonen <tjaalton@debian.org> Cc: Haixia Shi <hshi@chromium.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-05 12:16:20 +01:00
Emil Velikov	e5c1229a9a	anv: automake: indent with tabs and not spaces Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-05 12:16:06 +01:00
Emil Velikov	addb099ce8	anv: use cache uuid based on the build timestamp. Do not rely on the git sha1: - its current truncated form makes it less unique - it does not account for local (Vulkan or otherwise) changes Use a timestamp produced at the time of build. It's perfectly unique, unless someone explicitly tinkers with their system clock. Even then chances of producing the exact same one are very small, if not zero. v2: Remove .tmp rule. It's not needed since we want the header to be regenerated each time we call make (Eric). v3: - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel) - Replace the generated header with a define, to prevent needless builds on consecutive `make' and/or `make install' calls. (Dave) v4: - Keep the timestamp generation at make time. (Jason) v5: - Ensure that file is regenerated on incremental builds. Cc: Michel Dänzer <michel@daenzer.net> Cc: Dave Airlie <airlied@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-05 12:15:23 +01:00
Emil Velikov	f98530b739	clover: conditionally use MESA_GIT_SHA1 Considering how hard/annoying it was for many peoples' workflow to properly generate the macro, it will be demoted to conditionally available with follow-up commits. v2: Kill off gracious blank line (Vedran). Cc: mesa-stable@lists.freedesktop.org Cc: Vedran Miletić <vedran@miletic.net> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-05 12:14:34 +01:00
Timothy Arceri	9c9e3e7ee1	mesa: stop copying SamplerUnits twice The call to _mesa_update_shader_textures_used() already takes care of copying for us. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	25a32c2cbf	mesa: make attribute binding message more useful Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	8f1ca0ee3f	i965: make more effective use of SamplersUsed Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	51f912786f	glsl: stop allocating memory for UBOs during linking This just stops counting and assigning a storage location for these uniforms, the count is only used to create the uniform storage. These uniform types don't use this storage. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	549b9b12fc	glsl: mark link_uniform_blocks_are_compatible() as static Missed this when doing `6d1a59d15b`. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	30812e90d1	mesa: fix build error Fix build error caused by `6a524c76f5`.	2016-07-05 18:42:06 +10:00
Gregory Hainaut	6a524c76f5	mesa: faster validation of sampler unit mapping for SSO Code was inspired by _mesa_update_shader_textures_used. However, unlike _mesa_update_shader_textures_used, which only checks a single stage, it checks all stages. It avoids looping over all uniforms; only active samplers are checked. For my use case: high FS frequency switches with few samplers. Perf event (relative to nouveau_dri.so) goes from 5.01% to 1.68% for the _mesa_sampler_uniforms_pipeline_are_valid function. Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-05 16:44:31 +10:00
Dave Airlie	cb728df967	Revert "st/glsl_to_tgsi: don't increase immediate index by 1." This reverts commit `27d456cc87`. DOH, what seems right and what is right with fp64 are always two different things. This regressed: spec@arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-mixed-shader on radeonsi Reported-by: Michel Dänzer <michel@daenzer.net> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-05 10:25:29 +10:00
Samuel Pitoiset	c1fb3290a6	nvc0/ir: rename NVE4_SU_INFO_XXX to NVC0_SU_INFO_XXX While we are at it, fix a typo inside the comment which describes what those constants are for. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-05 01:44:15 +02:00
Samuel Pitoiset	f3b9fff3c3	nvc0/ir: reset the base offset for indirect images accesses In presence of an indirect image access, the base offset should be zeroed because the stride will be computed twice. This is a pretty rare situation but it can happen when tex.r > 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-05 01:44:12 +02:00
Samuel Pitoiset	cb828b7b18	gm107/ir: fix sign bit emission for FADD32I When emitting OP_SUB, the sign bit for FADD and FADD32I is not at the same position. It's at position 45 for FADD but 51 for FADD32I. This fixes the following piglit test: tests/spec/arb_fragment_program/fdo30337b.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-07-05 01:44:08 +02:00
Eric Anholt	ac772b24a1	vc4: Regularize instruction emit macros ALU0 didn't have the _dest variant, and ALU2 didn't unset the def the way ALU1 did. This should make the ALU[012] macros much clearer, by moving most of their contents to vc4_qir.c	2016-07-04 16:33:22 -07:00
Eric Anholt	8a52f03f5d	vc4: Enable dead CF elimination. Now that we're about to start generating control flow in our NIR, we want this in place. It optimizes things frequently in the CS, when the GL VS has control flow that doesn't affect the vertex position.	2016-07-04 16:33:22 -07:00
Eric Anholt	8f2af4763a	vc4: Optimize out redundant SF updates. Tiny change on shader-db currently, but it will be important when we start emitting a lot of SFs from the same variable as part of control flow support. total instructions in shared programs: 89463 -> 89430 (-0.04%) instructions in affected programs: 1522 -> 1489 (-2.17%) total estimated cycles in shared programs: 250060 -> 250015 (-0.02%) estimated cycles in affected programs: 8568 -> 8523 (-0.53%)	2016-07-04 16:33:22 -07:00
Eric Anholt	200b4e4bd5	vc4: Move SF removal to a separate peephole pass. The DCE pass is going to change significantly to handle control flow, while we don't really need to change it for the SF handling. We also need to add some more SF peephole optimization for SF updates generated by control flow support. No change on shader-db.	2016-07-04 16:33:22 -07:00
Eric Anholt	aa76ba6f2f	vc4: DCE instructions with a NULL destination. I'm going to add an optimization for redundant SF update removal, which will just remove the SF and leave us (in many cases) with an instruction with a NULL destination and no side effects. Rather than teaching that pass whether the whole instruction can be removed, leave that responsibility to this pass.	2016-07-04 16:33:22 -07:00
Eric Anholt	2a8973fb78	vc4: Mark texturing setup instructions as having side effects. We need to not DCE them even though they don't have a destination in QIR. We also shouldn't relocate them in vc4_opt_vpm. Neither of these things happen, but I'm about to make DCE consider instructions with a NULL destination.	2016-07-04 16:33:22 -07:00
Eric Anholt	44df374a9c	vc4: Fix a pasteo in scheduling condition flag usage. Noticed by code inspection. This hasn't been too big of a deal, because our cond usages all start out as adder ops, either MOVs or the FTOI for Z writes. MOVs can get converted to mul ops during scheduling, but apparently we hadn't hit this.	2016-07-04 16:33:22 -07:00
Eric Anholt	eaa53f80d9	vc4: Drop the dead QIR_PACK() macro. This isn't used since we switched to using the dst.pack field instead of custom instructions.	2016-07-04 16:33:18 -07:00
Marek Olšák	5c92c21369	radeonsi: do compilation from si_create_shader_selector asynchronously Main shader parts and geometry shaders are compiled asynchronously by util_queue. si_create_shader_selector doesn't wait and returns. si_draw_vbo(si_shader_select) waits for completion. This has the best effect when shaders are compiled at app-loading time. It doesn't help much for shaders compiled on demand, even though VS+PS compilation should take as much time as the bigger one of the two. If an app creates more shaders, at most 4 threads will be used to compile them. Debug output disables this for shader stats to be printed in the correct order. (We could go even further and build variants asynchronously too, then emit draw calls without waiting and emit incomplete shader states, then force IB chaining to give the compiler more time, then sync the compilation at the IB flush and patch the IB with correct shader states. This is great for compilation before draw calls, but there are some difficulties such as scratch and tess states requiring the compiler output, and an on-disk shader cache will likely be a much better and simpler solution.) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	84824935cf	radeonsi: don't lock shader cache mutex during compilation to allow multiple shaders to be compiled simultaneously. Also, shader-db can again use all 4 cores. v2: Remove the pipe_mutex_unlock call in the error path. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-07-05 00:47:13 +02:00
Marek Olšák	850cd953b1	radeonsi: separate the compilation chunk of si_create_shader_selector The function interface is ready to be used by util_queue. Also, si_shader_select_with_key can no longer accept si_context. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	6781a2a994	radeonsi: move LLVMTargetMachineRef creation to a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	8a4ace4a47	gallium/radeon: add and use radeon_info::max_alloc_size (v2) v2: - squashed the patches - use INT_MAX - clamp max_const_buffer_size - check the DRM version in radeon Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-05 00:47:13 +02:00
Marek Olšák	027ad71b57	radeonsi: print LLVM IRs to ddebug logs Getting LLVM IRs of hanging shaders has never been easier. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	28a03be06b	radeonsi: enable string markers and record apitrace call numbers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	642cf400aa	ddebug: add an option to dump info about a specific apitrace call Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	1daec2b795	ddebug: implement pipe_context::generate_mipmap Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	50b2235478	ddebug: record and dump apitrace call numbers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	861ecf1ca9	ddebug: implement emit_string_marker and remove some obsolete comments Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	a446c40e0a	gallium/radeon: remove unused code - radeon_llvm_util.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	eaccc4e8c8	radeonsi: keep using v_rcp_f32 for division in future LLVM (v2) This will be needed after some LLVM changes that haven't landed yet. v2: - use LLVMIsConstant to fix an LLVM assertion failure. LLVMSetMetadata doesn't work with constants. - don't set float metadata as string Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	1c00086746	radeonsi: remove an obsolete comment It's not true. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	4d1f32376d	radeonsi: don't interpolate colors if flatshading is enabled use v_interp_mov for those Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	4accb02d7a	radeonsi: enable the barycentric optimization in all cases Handle the bc_optimize SGPR bit if both CENTER and CENTROID are enabled. This should increase the PS launch rate for big primitives with MSAA. Based on discussion with SPI guys. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	476e9cee1d	radeonsi: compute only one set of interpolation (i,j) when MSAA is disabled This should increase the PS launch rate for shaders using at least 2 pairs of perspective (i,j) and same for linear. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	a675c6a000	radeonsi: split ps.prolog.force_persample_interp into persp and linear bits This reduces the number of v_mov's in the prolog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	61010cfac0	radeonsi: don't dump the shader key for non-monolithic shaders early It's always zero. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Jan Vesely	015e2e0fce	r600g: Add double precision FMA ops Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782 Fixes: `54c4d525da` ("r600g: Enable FMA on chips that support it") Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: James Harvey <lothmordor@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-07-05 00:47:12 +02:00
Francesco Ansanelli	9827fc3f03	r600: fix duplicate 'const' declaration Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-04 21:26:31 +02:00
Topi Pohjolainen	2a60654f56	i965/urb: Allow blorp to record current settings This makes it possible to skip urb re-configuration if the subsequent renders agree with the settings. Also allows blorp to allocate the maximum amount of vs entries available. Core upload logic already knows how to calculate this. Helps one synthetic benchmark. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	39fdee6b2d	i965/blorp/gen7+: Do not trigger push constant space reconfig Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	cc2d0e64c0	i965/blorp/gen7+: Stop trashing push constant allocation Packet 3DSTATE_CONSTANT_PS is still emitted explicitly as ps stage itself is enabled and hardware may try to prefetch constants from the buffer. From the BSpec: 3D Pipeline - Windower - 3DSTATE_PUSH_CONSTANT_ALLOC_PS "Specifies the size of the PS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full." This is not possible on gen6. From the BSpec about 3DSTATE_CONSTANT_PS: "This packet must be followed by WM_STATE." Binding table emissions for stages other than PS can be now dropped, they were only needed for the 3DSTATE_CONSTANT_XS to be effective: From the BSpec: "The 3DSTATE_CONSTANT_* command is not committed to the shader unit until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_* command is parsed." Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	175e095744	i965/blorp: Remove support for push constants Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	46e1132b80	i965/blorp: Use flat inputs instead of uniforms v2 (Jason): Use LOAD_INPUT() macro Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	07db95c24d	i965/blorp: Fix the size requirement for vertex elements v2: Rebased as this is needed before flat inputs are enabled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	741a245ae4	i965/blorp: Load transformation coordinates as vec4 In preparation for loading as flat vertex input. v2: Use LOAD_INPUT() macro Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	01f2f364d4	i965/blorp: Rename LOAD_UNIFORM to LOAD_INPUT Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	641868103c	i965/blorp: Organize pixel kill and blend/scaled inputs into vec4s In addition, as these are never used in parallel, add a few assertions. v2 (Jason): Skip some complexity by putting them into a union but pad rectangle grid into a vec4 instead. Also keep the LOAD_UNIFORM macro. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Lionel Landwerlin	dbbc4fb4cc	anv/wsi: create swapchain images using specified image usage The image usage specified by the caller of vkCreateSwapchainKHR should be passed onto the internal image creation. Otherwise the driver might later crash when the user tries to use the image as a combined sampler even though the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT. Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be expected even if the swapchain is created without any flag. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791 Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-04 10:15:48 -07:00
Indrajit Das	51227b41c6	radeon/uvd: fix overflow error while calculating bit stream buffer size Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-04 11:38:05 +02:00
Topi Pohjolainen	9e3774a460	i965/blorp: Prepare for more than two vertex attributes Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:05:02 +03:00
Topi Pohjolainen	e762354309	i965/blorp: Tell vertex fetcher about flat inputs Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:04:38 +03:00
Topi Pohjolainen	89e6b4ef5d	i965/blorp: Add support for flat input buffer Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:04:00 +03:00
Topi Pohjolainen	9b2fa17e97	i965/blorp: Store input read mask Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:03:41 +03:00
Topi Pohjolainen	73f78ab44b	i965/blorp: Rename push constants to inputs Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:37:51 +03:00
Topi Pohjolainen	f2c472fcb3	i965/blorp: Use core vertex buffer state setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:37:44 +03:00
Topi Pohjolainen	4f7e68799f	i965/blorp: Split vertex data and element setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:33:41 +03:00
Topi Pohjolainen	575c8cbb54	i965: Unify vertex buffer setup On gen >= 8 one doesn't provide ending address but number of bytes available. This is relative to the given offset. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:33:41 +03:00
Topi Pohjolainen	bdab945edd	i965/draw: Expose vertex buffer state setup Also change the interface to use start and end offsets. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:33:41 +03:00
Rob Clark	7295428e41	freedreno: fix crash on smaller gpus and higher resolutions Devices with smaller GMEM size need more tiles. On db410c at 2048x1152, glmark2 shadow needed ~330 tiles for fullscreen. Let's bump it up to 512. (Maybe with MRT you could end up needing more, but at that point things are probably going to be painfully slow.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-03 11:16:28 -04:00
Rob Clark	01ccb0d91e	i965: don't drop const initializers in vector splitting Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-02 09:00:19 -04:00
Rob Clark	f78a6b1ce3	glsl: add driconf to zero-init uninitialized vars Some games are sloppy, perhaps because it is defined behavior for DX or perhaps because the nv blob driver defaults things to zero. So add a driconf param to force uninitialized variables to default to zero. This issue was observed with Rust, from the Steam store, but has surfaced elsewhere in the past. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-02 09:00:19 -04:00
Rob Clark	202710d110	freedreno/ir3: support glsl linking for cmdline compiler For .vert/.frag, now multiple can be specified on the cmdline for purposes of linking, and the last one specified is the one that is fed into the ir3 backend (and dumped along the way if --verbose is specified) Without this, varyings in frag shaders would appear as undefined. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-07-02 09:00:19 -04:00
Rob Clark	07cfe4e6aa	glsl/standalone: initialize MaxUserAssignableUniformLocations Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-07-02 09:00:19 -04:00
Rob Clark	1759eb1d19	freedreno: update valid_buffer_range for SO buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	da39ac9c51	freedreno/ir3: support non-user_buffer consts Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	2081c1ecc0	freedreno/a2xx: move setup/restore cmds into binning pass Rather than doing a separate submit at context create, move these cmds to before first tile, as is done on a3xx/a4xx. Otherwise state can be overwritten by other contexts. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	2c3b54c278	freedreno: pass index buffer as a pipe_resource This will be useful in a following patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	88cc11e971	freedreno: switch emit_const_bo() to take prsc's We can push the unwrap of pipe_resource down. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Hans de Goede	d7dfd4cb51	nv30: Fix "array subscript is below array bounds" compiler warning gcc6 does not like the trick where we point to one entry before the array start and then start a while with a pre-increment. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-02 12:21:28 +02:00
Hans de Goede	110ef733dc	nouveau: Fix a couple of "foo may be used uninitialized" compiler warnings These are all new false positives with gcc6. In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer to a variable into a function initialises that variable. In nv50_ir_from_tgsi.cpp op and mode are not set if there are 0 enabled dst channels, this never happens, but gcc cannot know this. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-02 12:21:28 +02:00
Hans de Goede	1f3c8f3664	nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	2aa1197eee	nouveau: Add support for SV_WORK_DIM Add support for SV_WORK_DIM for nvc0 and nve4. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	3345f70f63	nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument This brings it in line with the other macros like NVC0_CB_AUX_UBO_INFO and NVC0_CB_AUX_TEX_INFO. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	ef8e50a841	clover: Pass work_dim parameter of clEnqueueNDRangeKernel() to driver In order to implement get_work_dim() the driver may need to know the clEnqueueNDRangeKernel() work_dim parameter, so pass it to the driver. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	d386cef246	tgsi: Add WORK_DIM System Value Add a new WORK_DIM SV type, this will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Alejandro Piñeiro	da7efadf04	mesa/main: fix error checking logic on CopyImageSubData For the case (both src or dst) where we had a texobject, but the texobject target was not the same as the method target, this spec paragraph was applied: /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core * Profile spec says: * * "An INVALID_VALUE error is generated if either name does not * correspond to a valid renderbuffer or texture object according * to the corresponding target parameter." */ But for that case, the correct spec paragraph should be: /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core * Profile spec says: * * "An INVALID_ENUM error is generated if either target is * not RENDERBUFFER or a valid non-proxy texture target; * is TEXTURE_BUFFER or one of the cubemap face selectors * described in table 8.18; or if the target does not * match the type of the object." */ specifically the last sentence: "or if the target does not match the type of the object". This patch fixes the error returned (s/INVALID/ENUM) for that case, and moves up the INVALID_VALUE spec paragraph, as that case (invalid texture object) was handled before. Fixes: GL44-CTS.copy_image.target_miss_match Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-02 11:54:40 +02:00
Dave Airlie	27d456cc87	st/glsl_to_tgsi: don't increase immediate index by 1. Immediates are stored into a separate table, and are consolidated, so if we get an immediate we don't need to offset it as the index it has is correct. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-02 17:01:25 +10:00
Ilia Mirkin	6f4d35212b	st/mesa: get max supported number of image samples from driver Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-01 23:01:03 -04:00
Ilia Mirkin	b2b5075e04	nvc0: fix up image support for allowing multiple samples Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages, this just hooks it all up. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 23:01:02 -04:00
Nicolai Hähnle	07cc838b10	st/mesa: check the texture image level in st_texture_match_image Otherwise, 1x1 images of arbitrarily high level are accepted. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 17:55:19 +02:00
Nicolai Hähnle	0ba053b34c	st/mesa: an incomplete texture may have a zero-size first image Fixes a regression introduced by commit `42624ea83` which triggered an assertion in dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0 While stImage must have a non-zero size as verified by the caller, we also look at the size of the base image in an attempt to make a better guess at the level0 size (this is important when the base image size is odd). However, the base image may have a zero size even when it exists. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 17:54:40 +02:00
Nayan Deshmukh	de772bc060	st/vdpau: use bicubic filter for scaling(v6.1) use bicubic filtering as high quality scaling L1. v2: fix a typo and add a newline to code v3: -render the unscaled image on a temporary surface (Christian) -apply noise reduction and sharpness filter on unscaled surface -render the final scaled surface using bicubic interpolation v4: support high quality scaling v5: set dst_area and dst_clip in bicubic filter v6: set buffer layer before setting dst_area v6.1: add PIPE_BIND_LINEAR when creating resource Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-01 12:54:58 +02:00
Nayan Deshmukh	872dd9ad15	vl: add a bicubic interpolation filter(v5) This is a shader based bicubic interpolator which uses the cubic Hermite spline algorithm. v2: set dst_area and dst_clip during scaling (Christian) v3: clear the render target before rendering v4: initialize offsets while initializing shaders use a constant buffer to send dst_size to frag shader small changes to reduce calculation in shader v5: send half pixel offset instead of sending dst_size Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-01 12:54:33 +02:00
Vinson Lee	3fea592c4e	mesa/st: Use 'struct nir_shader' instead of 'nir_shader'. Fix this build error with GCC 4.4. CC state_tracker/st_nir_lower_builtin.lo In file included from state_tracker/st_nir_lower_builtin.c:61: state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’ ../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ was here Suggested-by: Rob Clark <robdclark@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-07-01 00:19:24 -07:00
Alejandro Piñeiro	a97ee60926	docs: update MESA_DEBUG envvar documentation. silent, flush, incomplete_tex and incomplete_fbo flags were not documented (see src/mesa/main.debug.c for more info). FP is not checked anymore. v2 (Brian Paul): * MESA_DEBUG accepts a comma-separated list of parameters. * Clarify how MESA_DEBUG behaves with mesa debug and release builds. * Updated wording. v3: Better wording for one paragraph (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-01 08:15:15 +02:00
Alejandro Piñeiro	5e553a6bb3	i965: intel_texture_barrier reimplemented Fixes: GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass On Haswell, Broadwell and Skylake (note that in order to execute that test, it is needed to override GL and GLSL versions). On gen6 this test was already working without this change. It keeps working after it. This commit replaces the call to brw_emit_mi_flush for gen6+ with two calls to brw_emit_pipe_control_flush: * The first one with RENDER_TARGET_FLUSH and CS_STALL set to initiate a render cache flush after any concurrent rendering completes and cause the CS to stop parsing commands until the render cache becomes coherent with memory. * The second one has TEXTURE_CACHE_INVALIDATE set (and no CS stall) to clean up any stale data from the sampler caches before rendering continues. Didn't touch gen4-5, basically because I don't have a way to test them. More info on commits: `0aa4f99f56` `72473658c5` Thanks to Curro for helping to track this down, as the root cause was a hw race condition. v2: use two calls to pipe_control_flush instead of a combination of gen7_emit_cs_stall_flush and brw_emit_mi_flush calls (Curro) v3: no need to const cache invalidation (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-01 08:09:27 +02:00
Ilia Mirkin	51ca57df01	nv30: go back to not using viewport validate function for swtnl The output of draw requires a null viewport transform, which the regular code is ill-equipped to do. Reinstate the original settings in the render path, and add setting of the viewport clip polygon based on fb width/height (as that is all taken care of by draw). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 01:04:10 -04:00
Ilia Mirkin	71609c9954	nv30: fix viewport clipping settings to be based on viewport, not rt This fixes a ton of "clip" dEQP GLES2 tests, as well as triangle-guardband-viewport in piglit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 00:02:23 -04:00
Brian Paul	c823ff8dfb	gallium/util: check for window cliprects in util_can_blit_via_copy_region() We can't blit with resource_copy_region() if there are window clip rects. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 18:19:09 -06:00
Chuck Atkins	d8d6091a84	gallium: Force blend color to 16-byte alignment This aligns the 4-element color float array to 16 byte boundaries. This should allow compiler vectorizers to generate better optimizations. Also fixes broken vectorization generated by Intel compiler. v2: Fixed indentation and added a lengthy comment explaining the reason for the alignment. Cc: <mesa-stable@lists.freedesktop.org> Reported-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 17:04:41 -05:00
Chuck Atkins	c1bf6692be	swr: Refactor checks for compiler feature flags Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2. This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build. v2: Pass preprocessor-check argument as true-state instead of false-state for clarity. v3: Reduce AVX2 define test to just __AVX2__. Additional defines such as __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t their availability. v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@Intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-06-30 16:55:01 -05:00
Brian Paul	eb79b2b331	st/wgl: make own_mutex() non-static Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-30 15:29:07 -06:00
Andres Gomez	e0f4504adf	glsl: atomic counters are different than their uniforms The linker deals with atomic counters in terms of uniforms but the data structures are named after the atomic counters. Renamed the data structures used in the linker for disambiguation. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-06-30 23:55:32 +03:00
Andres Gomez	0f00c6dd77	glsl: count atomic counters correctly Currently the linker uses the uniform count for the total number of atomic counters. However uniforms don't include the innermost array dimension in their count, but atomic counters are expected to include them. Although the spec doesn't directly state this, it's clear how offsets will be assigned for arrays. From OpenGL 4.2 (Core Profile), page 98: " * Arrays of type atomic_uint are stored in memory by element order, with array element member zero at the lowest offset. The difference in offsets between each pair of elements in the array in basic machine units is referred to as the array stride, and is constant across the entire array. The stride can be queried by calling GetIntegerv with a pname of ATOMIC_COUNTER_ARRAY_STRIDE after a program is linked." From that it is clear how arrays of atomic counters will interact with GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE. For other kinds of uniforms it's also clear that each entry in an array counts against the relevant limits. Hence, although inferred, this is the expected behavior. Fixes GL44-CTS.arrays_of_arrays_gl.AtomicDeclaration Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-06-30 23:55:32 +03:00
Brian Paul	c84444ea85	svga: use SVGA3D_vgpu10_BufferCopy() for buffer copies So that we do copies host-side rather than in the guest with map/memcpy. Tested with piglit arb_copy_buffer-subdata-sync test and new arb_copy_buffer-intra-buffer-copy test. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 14:32:11 -06:00
Brian Paul	29a38f37ee	svga: add SVGA3D_vgpu10_BufferCopy() Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:10 -06:00
Brian Paul	88a344253c	svga: flush buffers when mapping for reading With host-side buffer copies (via SVGA3D_vgpu10_BufferCopy()) we have to make sure any pending map-write operations are completed before reading if the buffer is dirty. Otherwise the ReadbackSubResource operation could get stale data from the host buffer. This allows the piglit arb_copy_buffer-subdata-sync test to pass when we start using the SVGA3D_vgpu10_BufferCopy command. v2: check the sbuf->dirty flag in the outer conditional, per Charmaine. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:10 -06:00
Neha Bhende	fa2cdd973d	svga: enable ARB_copy_image extension in the driver Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:09 -06:00
Brian Paul	4a54514958	svga: try blitting with copy region in more cases We previously could do blits with util_resource_copy_region() when doing 'loose' format checking. Also do blits with util_resource_copy_region() when the blit src/dst formats (not the underlying resources) exactly match. Needed for GL_ARB_copy_image. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:08 -06:00
Brian Paul	92b44efef4	svga: use copy_region_vgpu10() for region copies when possible v2: remove extra svga_define_texture_level() call, per Charmaine. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:08 -06:00
Neha Bhende	1d0be402c7	svga: use vgpu10 CopyRegion command when possible Do texture->texture copies host-side with this command when possible. Use the previous software fallback otherwise. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	3a3c3d124a	svga: set render target flag for snorm surfaces We don't normally support rendering to SNORM surfaces, but with GL_ARB_copy_image we can copy to them if we treat them as typeless and use a UNORM surface view. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	46e7355a13	svga: add new svga_format_is_uncompressed_snorm() helper Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	68388043f3	svga: adjust sampler view format for RGBX We previously handled the case of a RGBX sampler view of a RGBA surface. Add the reverse case too. For GL_ARB_copy_image. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	1049002eae	svga: adjust render target view format for RGBX For GL_ARB_copy_image we may be asked to create an RGBA view of a RGBX surface. Use an RGBX view format for that case. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Neha Bhende	429ace2fbc	svga: don't advertise support for R32G32B32_UINT/SINT surface formats We want to be able to copy between different 32-bit, 3-channel surface formats for GL_ARB_copy_image but since we don't support R32G32B32_FLOAT for textures (it's not blendable and wouldn't work for render to texture) we can't support 32-bit, 3-channel integer formats. The state tracker will choose 4-channel formats instead. Fixes the piglit arb_copy_image-format test for several cases. Note: This change may need to be revisited if/when the texture_view extension is enabled in the driver. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	eb0ced74f6	svga: use untyped surface formats in most cases This allows us to do copies between different, but compatible, surface formats such as RGBA8_UNORM, RGBA8_SINT, RGBA8_UINT, etc. for GL_ARB_copy_image. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	5f1335878e	gallium/util: add tight_format_check param to util_can_blit_via_copy_region() The VMware driver will use this for implementing GL_ARB_copy_image. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	a029d9f074	gallium/util: simplify a few things in util_can_blit_via_copy_region() Since only the src box can have negative dims for flipping, just comparing the src/dst box sizes is enough to detect flips. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	5d31ea4b8f	gallium/util: new util_can_blit_via_copy_region() function Pulled out of the util_try_blit_via_copy_region() function. Subsequent changes build on this. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Neha Bhende	7988513ac3	svga: Fix failures caused in fedora 24 SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands are not present in the fedora 24 kernel module. Because of this, applications like supertuxkart do not run. v2: Add a few comments and code modifications suggested by Brian P. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 12:45:09 -06:00
Brian Paul	52f297d144	st/wgl: remove unneeded inline qualifiers No effect on size of the .o files (optimized build). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:50 -06:00
Brian Paul	395ee18bac	st/wgl: add a stw_device::initialized field Set when the stw_dev object's initialization is completed. We test for this in the window callback function to avoid potential crashes on start-up in multi-threaded applications. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:50 -06:00
Brian Paul	128feef40e	st/wgl: refactor framebuffer locking code Split the old stw_framebuffer_reference() function into two new functions: stw_framebuffer_reference_locked() which increments the refcount and stw_framebuffer_release_locked() which decrements the refcount and destroys the buffer when the count hits zero. Original patch by Jose. Modified by Brian (clean-ups, lock assertion checks, etc). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:50 -06:00
José Fonseca	25cccb5bec	st/wgl: rename curctx to old_ctx in stw_make_current() Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-30 12:43:49 -06:00
Brian Paul	24004a2435	st/wgl: release the pbuffer DC at the end of wglBindTexImageARB() Otherwise we were leaking DC GDI objects and if wglBindTexImageARB() was called enough we'd eventually hit the GDI limit of 10,000 objects. Things started failing at that point. v2: also release DC if we return early, per Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:49 -06:00
Matt Turner	058c70bae1	mesa: Close fp on error path. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-30 11:08:39 -07:00
Matt Turner	e3d9125b77	i965: Simplify foreach_inst_in_block_safe() macro. We know what the end looks like without examining .tail: it's NULL. It's always NULL.	2016-06-30 11:08:39 -07:00
Andres Gomez	c4e47ab971	Revert "i965: get PrimitiveMode from the program rather than the shader struct" This reverts commit `644e015f0b`. PrimitiveMode from the program doesn't always hold a valid value that is neither of GL_TRIANGLES, GL_QUADS nor GL_ISOLINES when reaching this code. This caused regressions in the following CTS tests: GL44-CTS.stencil_texturing.functional GL44-CTS.shading_language_420pack.binding_images GL44-CTS.shading_language_420pack.binding_samplers GL44-CTS.shading_language_420pack.binding_uniform_single_block GL44-CTS.shading_language_420pack.implicit_conversions GL44-CTS.shading_language_420pack.initializer_list GL44-CTS.shading_language_420pack.length_of_vector_and_matrix GL44-CTS.shading_language_420pack.line_continuation Hence, we rather take it from the linked shader. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-06-30 16:20:22 +03:00
Timothy Arceri	1591e668e1	glsl/mesa: move duplicate shader fields into new struct gl_shader_info Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	fd2b3da5c8	glsl/main: remove unused params and make function static Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	32c410d2df	glsl: simplify link_uniform_blocks() There is only ever one shader so simplify the input params. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	1fb8c6df88	glsl/mesa: split gl_shader in two There are two distinctly different uses of this struct. The first is to store GL shader objects. The second is to store information about a shader stage that's been linked. The two uses actually share few fields and there is clearly confusion about their use. For example the linked shaders map one to one with a program so can simply be destroyed along with the program. However previously we were calling reference counting on the linked shaders. We were also creating linked shaders with a name even though it is always 0 and called the driver version of the _mesa_new_shader() function unnecessarily for GL shader objects. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	378f07ccb5	mesa: don't print name in _mesa_append_uniforms_to_file() This is only used to print linked shaders which always have a name of 0 so this was pointless. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	e8c8aa0320	mesa: remove unreachable code from _mesa_write_shader_to_file() _mesa_write_shader_to_file() is only used to print gl shader objects so Program should never be set as it only gets set for linked shaders. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	9b41c743cc	glsl: pass symbols to find_matching_signature() rather than shader This will allow us to later split gl_shader into two structs. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	47f8381730	glsl: pass symbols rather than shader to _mesa_get_main_function_signature() This will allow us to split gl_shader into two different structs, one for shader objects and one for linked shaders. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	9e9d01cbe8	mesa: don't use driver's NewShader function when creating shader objects The driver's function only needs to be used when creating a struct for linked shaders. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	962933b6d4	glsl: make cross_validate_globals() more generic Rather than passing in gl_shader we now pass in the IR. This will allow us to later split gl_shader into two structs. One for use as a linked per stage shader struct and one for use as a GL shader object. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Ian Romanick	5921f372c8	mapi: Export all GLES 3.1 functions in libGLESv2.so Khronos recommends that the GLES 3.1 library also be called libGLESv2. It also requires that functions be statically linkable from that library. NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension since at least Mesa 10.5, so applications targeting Linux should use eglGetProcAddress to avoid problems running binaries on systems with older, non-GLES 3.1 libGLESv2 libraries. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Mike Gorchak <mike.gorchak.qnx@gmail.com> Reported-by: Mike Gorchak <mike.gorchak.qnx@gmail.com> Acked-by: Chad Versace <chad.versace@intel.com>	2016-06-29 14:28:59 -07:00
Chad Versace	d3a147ba40	i965: Use drmIoctl for DRM_I915_GETPARAM (v2) Stop using drmCommandWriteRead for such a simple ioctl. v2: Handle errno correctly. [ickle] Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-06-29 13:44:23 -07:00
sonjiang	b928ff6f62	radeon/uvd: fix a h265 context size bug Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-29 15:30:25 -04:00
sonjiang	5c80354a23	radeon/uvd: separate uvd context buffer from DPB Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-29 15:30:20 -04:00
sonjiang	28f85eab49	radeon uvd add uvd fw version for amdgpu Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-29 15:30:14 -04:00
Samuel Pitoiset	fa10d1d674	nv50/ir: print EMIT subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:38 +02:00
Samuel Pitoiset	a6d3b2e176	nv50/ir: print RSQ/RCP subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:36 +02:00
Samuel Pitoiset	908ba19554	nv50/ir: print PIXLD subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:33 +02:00
Samuel Pitoiset	c0d92078bb	nv50/ir: print SHFL subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:18 +02:00
Rodrigo Vivi	85ea8deb26	i965: Removing PCI IDs that are no longer listed as Kabylake. This is unusual. Usually IDs listed on early stages of platform definition are kept there as reserved for later use. However these IDs here are not listed anymore in any of steppings and devices IDs tables for Kabylake on configurations overview section of BSpec. So it is better removing them before they become used in any other future platform. Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2016-06-29 11:14:19 -07:00
Rodrigo Vivi	bdff2e5547	i965: Add more Kabylake PCI IDs. The spec has been updated adding new PCI IDs. Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2016-06-29 11:14:19 -07:00
Marek Olšák	63f8d648f0	gallium/radeon: remove zombie textures kept alive by DCC stat gathering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	44906101c4	gallium/radeon: don't re-create queries for DCC stat gathering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	82b39f3521	gallium/radeon: assume X11 DRI3 can use at most 5 back buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	9ae41227c2	gallium/radeon: separate DCC starts as disabled (ps_draw_ratio = 0) DRI3: - Only slow clears can enable it for the first frame. - A good PS/draw ratio can enable it for other frames. DRI2: - Only slow clears can enable it for a frame. - Page-flipped color buffers are unref'd at the end of each frame, so it can't be enabled in any other way. - Relying on slow clears is sufficient for our synthetic benchmarks. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	9fd4eff43c	gallium/radeon: R600_DEBUG=nodccfb disables separate DCC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	36cf5a57c2	gallium/radeon: add and use r600_texture_reference Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	6da92df538	gallium/radeon: add a HUD query for PS draw ratio stats from separate DCC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	49e3c74cdd	gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2) DCC for displayable surfaces is allocated in a separate buffer and is enabled or disabled based on PS invocations from 2 frames ago (to let queries go idle) and the number of slow clears from the current frame. At least an equivalent of 5 fullscreen draws or slow clears must be done to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5) Pipeline statistic queries are always active if a color buffer that can have separate DCC is bound, even if separate DCC is disabled. That means the window color buffer is always monitored and DCC is enabled only when the situation is right. The tracking of per-texture queries in r600_common_context is quite ugly, but I don't see a better way. The first fast clear always enables DCC. DCC decompression can disable it. A later fast clear can enable it again. Enable/disable typically happens only once per frame. The impact is expected to be negligible because games usually don't have a high level of overdraw. DCC usually activates when too much blending is happening (smoke rendering) or when testing glClear performance and CMASK isn't supported (Stoney). v2: rename stuff, add assertions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	9124457bff	gallium/radeon: add state setup for a separate DCC buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	fa7c927625	radeonsi: always calculate DCC info even if it's not used immediately for a later use Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	ebb9c7d7c4	radeonsi: unreference framebuffer state with set_framebuffer_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	e607a6be2b	gallium/radeon: add flag R600_QUERY_HW_FLAG_BEGIN_RESUMES Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Chad Versace	a2ae888929	i965: Use intel_get_param() more often Replace some open-coded ioctls with intel_get_param(). This is just a cleanup. No change in behavior. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-29 09:34:21 -07:00
Chad Versace	844e0bd946	i965: Refactor intel_get_param() Replace the function's __DRIscreen parameter with struct intel_screen. The callsites feel more natural that way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-29 09:34:21 -07:00
Marek Olšák	0c135a773f	radeonsi: don't advertise multisample shader images Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	eff81cbc81	radeonsi: enable distributed tess on multi-SE parts only ported from Vulkan Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	dd56d04568	radeonsi: set optimal VGT_HS_OFFCHIP_PARAM ported from Vulkan Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	9a71bf8858	radeonsi: enable CU0 in each SE for LS-HS execution Offchip-only tessellation allows this. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	4b11ef23b4	radeonsi: use conformant line rasterization AA lines are not completely correct (see TODO), but everything else should be. + 3 linestipple piglits Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Rob Herring	789ed13284	Android: add missing u_math.h include path for libmesa_isl Commit `87d062a940` ("i965: Fix shared local memory size for Gen9+.") added u_math.h include which broke the Android build: In file included from external/mesa3d/src/intel/isl/isl_storage_image.c:25: In file included from external/mesa3d/src/mesa/drivers/dri/i965/brw_compiler.h:29: external/mesa3d/src/mesa/main/macros.h:35:10: fatal error: 'util/u_math.h' file not found ^ Add the missing include paths for libmesa_isl. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-28 12:48:46 -07:00
Charmaine Lee	6397c12f32	svga: force direct map for transferring multiple slices With commit `fb9fe35`, we start using transfer_inline_write for memcpy of TexSubImage. But SurfaceDMA command does not work well with texture array. This patch forces direct map when transferring multiple slices of a texture array. Fixes piglit regression "texelFetch fs sampler1DArray" Tested with MTT piglit, glretrace, conform. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-06-28 13:43:23 -06:00
Brian Paul	d65c4e22a8	svga: whitespace, line wrapping fixes in svga_surface.c	2016-06-28 13:43:23 -06:00
Samuel Pitoiset	cc97b6a34a	gm107/ir: make sure that flagsDef is set when emitting setcond Relying on the existence of a second destination when emitting a setcond flag is dangerous, because this doesn't mean that the flag has been correctly set. Instead rely on flagsDef like what emitX() does for flagsSrc. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-28 18:38:56 +02:00
Grazvydas Ignotas	234323558d	doc: improve INTEL_DEBUG documentation Remove 'reg' option that does not actually exist, elaborate more about 'sync' and add the missing options. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-28 07:21:07 -07:00
Marek Olšák	c1dbc563f4	radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris This was missing. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-06-28 15:47:13 +02:00
Boyuan Zhang	06f0a4d9ed	radeon/vce: use vce structure for vce 52 firmware Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-06-28 08:58:03 -04:00
Boyuan Zhang	533bd6ae17	radeon/vce: add vce structures Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-06-28 08:58:00 -04:00
Leo Liu	05d302ffe2	st/omx: fix decoder fillout for the OMX result buffer vl_video_buffer_adjust_size is called with its arguments in the wrong order, which apparently causes problems when interlaced is false. The size of the OMX result buffer depends on the real size of the clip, while the vl buffer dimensions are aligned to 16, so a 1080p (1920*1080) video will overflow the OMX buffer. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-06-28 08:57:56 -04:00
Hans de Goede	459cc94507	pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms Make pipe_loader_sw_probe_kms take ownership of the passed in fd, like pipe_loader_drm_probe_fd does. The only caller is dri_kms_init_screen which passes in a dupped fd, just like dri2_init_screen passes in a dupped fd to pipe_loader_drm_probe_fd. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-28 12:29:54 +02:00
Jan Vesely	87787e9079	clover: Fix kernel metadata retrieval after clang r273425 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Francisco Jerez <currojerez@riseup.net>	2016-06-27 23:12:37 -07:00
Francisco Jerez	a8a966ddb5	clover/llvm: Fix copyright attribution of invocation.cpp. This file still only has my name on the copyright notice even though most of the code (likely more than 90% of it) was authored by various contributors -- It doesn't seem right to have the whole file attributed to myself. Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Serge Martin <edb+mesa@sigluy.net>	2016-06-27 23:12:35 -07:00
Kenneth Graunke	034bd25327	i965: Print EOT in fs_visitor::dump_instruction(). This was useful when debugging the previous commit's issue. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-27 16:36:57 -07:00
Kenneth Graunke	7e7e501acf	i965: Make emit_urb_writes() not produce an EOT message for GS. emit_urb_writes() contains code to emit an EOT write with no actual data when there are no output varyings. This makes sense for the VS and TES stages, where it's called once at the end of the program. However, in the geometry shader stage, emit_urb_writes() is called once for every EmitVertex(). We explicitly emit a URB write with EOT set at the end of the shader, separately from this path. So we'd better not terminate the thread. This could get us into trouble for shaders which do EmitVertex() with no varyings followed by SSBO/image/atomic writes. It also caused us to emit multiple sends with EOT set, which apparently confuses the register allocator into not using g112-g127 for all but the first one. This caused EU validation failures in OglGSCloth shaders in shader-db. (The actual application was fine, but shader-db thinks there are no outputs because it doesn't understand transform feedback.) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-27 16:36:51 -07:00
Kenneth Graunke	a36a73a7b8	glsl: Ignore ir_texture in lower_const_arrays_to_uniforms. The only part of an ir_texture which can be an array is the offsets array in textureGatherOffsets() calls. We don't want to lower those, because they're required to remain constants. Fixes textureGatherOffsets with Gallium drivers such as llvmpipe, which commit `ef78df8d3b` regressed. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-27 16:36:30 -07:00
Samuel Pitoiset	7b9b096775	gm107/ir: add missing setcond flags for LOP variants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-28 00:30:01 +02:00
Samuel Pitoiset	83a4f28dc2	gm107/ir: make use of LOP32I for all immediates LOP only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-28 00:29:53 +02:00
Dave Airlie	c7cc264ca9	virgl: reduce some limits for now These need to be passed from the host in caps structure if they are larger, this fixes a bunch of tests on Intel hw, that I'd put the limits too high for. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-28 06:49:26 +10:00
Julien Isorce	6e4cf937f8	st/omx: count number of slices Used by nouveau driver. Similar patch was done for st/va: `851e7e12aa` Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-27 17:52:15 +01:00
Julien Isorce	e10f1fcebe	st/omx: add support for nouveau / interlaced Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-27 17:52:15 +01:00
Julien Isorce	23b7a83cc1	st/omx: retrieve preferred interlaced and buffer_formats Interlaced can be true for nouveau driver. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-27 17:52:15 +01:00
Marek Olšák	f6ff483646	radeonsi: use optimal WD settings for primitive restart on Polaris ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-27 13:54:39 +02:00
Gurkirpal Singh	46dba701d8	st/va: Check NULL pointer Call to handle_table_get in vlVaDestroySurfaces can return NULL on failure. CID: 1243522 Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2016-06-27 08:09:08 +01:00
Eric Anholt	d20b89e928	nir: Fix copy_prop_src when src is an indirect access on a reg. The intent was to continue down the indirect chain, not to call ourselves with unchanged input arguments. Found by code inspection, and comparison to copy_prop_alu_src(). We haven't hit this because callers of NIR's copy prop are doing so in SSA, before indirect variable dereferences have been lowered to registers. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-26 15:38:09 -07:00
Samuel Pitoiset	c7fa3c92f8	gm107/ir: make use of MOV32I for all immediates MOV only allows to emit 19-bits immediates. This is similar to the previous fix I did for IMUL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-27 00:28:02 +02:00
Jordan Justen	367cf3a2e3	i965: Use miptree to decide format on multi-plane images for gen < 7 This wasn't handled correctly for multi-plane images on gen < 7 in `727a9b2493`. Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96674 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-26 10:49:34 -07:00
Ilia Mirkin	1f5f64b91f	nvc0: update "derived" state function names derived_1/2/etc aren't too informative. Instead name them based on the state they're derived from. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-26 12:04:55 -04:00
Ilia Mirkin	89a7496b9d	nvc0: provide support for unscaled poly offset units On at least Kepler hardware, the units differ based on RT format. Emit a properly scaled value for Z16 depth buffers vs other formats, to help out st/nine. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-26 12:04:55 -04:00
Samuel Pitoiset	b84c97587b	gm107/ir: make use of IMUL32I for all immediates IMUL only allows to emit 19-bits immediates. This is similar to `d30768025a` which fixed the same thing for the GK110 emitter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-26 17:33:06 +02:00
Marek Olšák	d93bacc1fa	radeonsi: make si_is_format_supported static Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	3eacbc52d5	radeonsi: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	7db10093d3	gallium/radeon: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	1c5a10497a	gallium/radeon/winsyses: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	d5383a7d31	gallium/radeon: use r600_resource_reference Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Jason Ekstrand	81978c6feb	nir: Add a NIR_VALIDATE environment variable It defaults to true so default behavior doesn't change but it allows you to do NIR_VALIDATE=false if you don't want validation. Disabling validation can substantially speed up shader compiles so you frequently want to turn it off if compiler invariants aren't in question. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-25 07:34:20 -04:00
Axel Davy	b76fa56739	st/nine: Use offset_units_unscaled offset_units_unscaled enables proper support for depth bias for gallium nine. Use it if available. Solves issues with some games using depth bias. For example: https://github.com/iXit/Mesa-3D/issues/220 Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-06-25 10:16:15 +02:00
Axel Davy	f6704f2a4d	r600g: Implement POLYGON_OFFSET_UNITS_UNSCALED Empirical tests show that the polygon offset behaviour is entirely determined by the content of the PA_SU_POLY_OFFSET states, and not by the depth buffer format bound. PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL (ie setting the unorm depth bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	be7957b156	radeonsi: Implement POLYGON_OFFSET_UNITS_UNSCALED Empirical tests show that the polygon offset behaviour is entirely determined by the content of the PA_SU_POLY_OFFSET states, and not by the depth buffer format bound. PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL (ie setting the unorm depth bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	c2b7b48a54	radeon: Remove useless pa_su_poly_offset_db_fmt_cntl pa_su_poly_offset_db_fmt_cntl usages were removed in previous patches. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	fe2ec50d75	r600g: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states for evergreen Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. v2: Increase the num_dw field for the poly offset atom Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	400e8d8c40	r600g: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states for r600 Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. v2: Increase the num_dw field for the poly offset atom Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	ff5abe9d90	radeonsi: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with rasterizer poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	59a692916c	gallium: Add a cap for offset_units_unscaled D3D9 has a different behaviour for depth bias. For OGL/D3D1X, the depth bias unit is the minimal resolvable value for the depth buffer, which depends on the format (and has different behaviour for float depth buffers). For D3D9, the depth bias unit is 1.0f. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Jordan Justen	727a9b2493	i965: Skip update_texture_surface when the plane doesn't exist Reported-by: Grazvydas Ignotas <notasas@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96607 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-24 18:13:18 -07:00
Kenneth Graunke	c4a6b0d2d2	i965: Validate a few SEND-from-GRF requirements. We recently had a mistake where we emitted SEND instructions with EOT set, but from g107 rather than g112-g127. Adding validation code should prevent these sorts of problems from slipping back in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	192813e50e	i965: Delete send-from-GRF only opcodes from implied_mrf_writes(). These only exist post-Sandybridge, and always use send-from-GRF. So inst->base_mrf will be -1, and we will have already returned 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	255cff76d9	i965: Drop unnecessary inst->base_mrf = -1 assignments. These are now unnecessary, as base_mrf is -1 by default. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	3e04e3758e	i965: Set fs_inst::base_mrf = -1 by default. On MRF platforms, we need to set base_mrf to the first MRF value we'd like to use for the message. On send-from-GRF platforms, we set it to -1 to indicate that the operation doesn't use MRFs. As MRF platforms are becoming increasingly a thing of the past, we've forgotten to bother with this. It makes more sense to set it to -1 by default, so we don't have to think about it for new code. I searched the code for every instance of 'mlen =' in brw_fs*cpp, and it appears that all MRF-based messages correctly program a base_mrf. Forgetting to set base_mrf = -1 can confuse the register allocator, causing it to think we have a large fake-MRF region. This ends up moving the send-with-EOT registers earlier, sometimes even out of the g112-g127 range, which is illegal. For example, this fixes illegal sends in Piglit's arb_gpu_shader_fp64-layout-std430-fp64-shader, which had SSBO messages with mlen > 0 but base_mrf == 0. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	3e258f7e31	i965: Drop unused return value from intel_finalize_mipmap_tree(). The old return type of GLuint was wonky - it should have been bool. But nothing actually uses the return value anyway, so we can just drop that and make it a void function. In theory, it might make sense to ask whether the texture validated successfully, but just checking intel_obj->mt != NULL works for that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:44 -07:00
Kenneth Graunke	8ee23d6866	i965: Move contents of brw_tex.c into intel_tex_validate.c. brw_tex.c is a tiny file containing a single function. It's closely tied to the validation logic in intel_tex_validate.c, so it makes sense to put both in the same file. While we're at it, update the function to our modern style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:44 -07:00
Marek Olšák	28d0d0c5b4	radeonsi: fix fractional odd tessellation spacing for Polaris ported from Vulkan (and no source explains why this is needed) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 17:36:43 +02:00
Marek Olšák	0d638f4b3d	radeonsi: set some VGT context registers on SI-CI the kernel sets them, but other UMDs can change them Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	8f3ef4e8b8	radeonsi: optimize rendering to linear color buffers loosely ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	e4b22c9fa1	radeonsi: set almost optimal settings in SC_MODE_CNTL_1 ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	603c073ec2	gallium/radeon: let drivers specify SC_MODE_CNTL_1 fields radeonsi will set more fields Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	ae0d2d15cc	gallium/radeon: disable complicated point clipping against user clip planes Nothing in the GL spec says that we should expand points to triangles. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	1e8adb0ee4	radeonsi: fix a compute shader hang with big threadgroups on SI & CI ported from Vulkan Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Ilia Mirkin	b433cb51e5	nvc0: when mapping directly, provide accurate xfer info + start We were ignoring the incoming box parameters, and were providing totally bogus stride/layer stride, and other bits, for when a non-full-surface map was requested. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-24 09:53:13 -04:00
Ilia Mirkin	3f0fa3b32d	st/mesa: don't assume that the whole surface gets mapped Under some circumstances, the driver may choose to return a temporary surface instead of a pointer to the original. Make sure to pass the actual view volume to be mapped to the transfer function rather than adjusting the map pointer after-the-fact. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 09:53:13 -04:00
Nicolai Hähnle	0da890e62c	radeonsi: drop the DRAW_PREAMBLE packet on Polaris It will be removed from the firmware for the Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 13:28:46 +02:00
Nicolai Hähnle	2aa0485902	radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris The non-MULTI variants will be removed in Polaris firmware. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 13:28:32 +02:00
Francesco Ansanelli	82ab3f27ff	st/mesa: handle negative _ColorDrawBufferIndexes values correctly Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:41:22 +02:00
Nicolai Hähnle	bc4b7ebbfd	winsys/radeon: add guard pages when R600_DEBUG=check_vm is enabled This should help flush out GPU VM faults. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	49c0b4a0db	winsys/amdgpu: add guard pages when R600_DEBUG=check_vm is enabled This should help flush out GPU VM faults. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	dbac88a839	radeonsi: report a failure to parse dmesg instead of asserting Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	d46a9db840	radeon: check VM faults from DMA flush Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	80dd7870fe	radeonsi: move gfx fence wait out of si_check_vm_faults Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	ad8438403b	radeonsi: extract IB and bo list saving into separate functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:02 +02:00
Nicolai Hähnle	b3de274b05	st/mesa: fix readpixels regression with MESA_pack_invert Fixes an error introduced in commit `3948cd3797`. Reported-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:02 +02:00
Marek Olšák	05e741c6d6	radeonsi: set LLVM denormal flags - make sure FP32 denormals will stay disabled in LLVM in the future (the current default is disabled) - tell LLVM that FP64 denormals are enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-24 12:31:03 +02:00
Marek Olšák	0e1fefa722	radeonsi: emit 1/sqrt for RSQ We don't need the clamped version and we don't have to use any intrinsic. Stats on Tonga: 15382 shaders in 9128 tests Totals: SGPRS: 1230560 -> 1230560 (0.00 %) VGPRS: 469577 -> 462504 (-1.51 %) Code Size: 22089908 -> 21730052 (-1.63 %) bytes LDS: 598 -> 598 (0.00 %) blocks Scratch: 283648 -> 281600 (-0.72 %) bytes per wave Max Waves: 125664 -> 126969 (1.04 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 547280 -> 547280 (0.00 %) VGPRS: 269132 -> 262059 (-2.63 %) Code Size: 15709604 -> 15349748 (-2.29 %) bytes LDS: 198 -> 198 (0.00 %) blocks Scratch: 74752 -> 72704 (-2.74 %) bytes per wave Max Waves: 47840 -> 49145 (2.73 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-24 12:31:03 +02:00
Jan Vesely	54c4d525da	r600g: Enable FMA on chips that support it v2: Merge with PIPE_SHADER_CAP_DOUBLES Add CHIP_HEMLOCK v3: only set the instruction on EG and CM Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:30:59 +02:00
Marek Olšák	cbb5adb908	gallium/u_queue: allow the execute function to differ per job so that independent types of jobs can use the same queue. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	4a06786efd	gallium/u_queue: reduce the number of mutexes by 2 by converting semaphores to condvars and using the main mutex Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	2fba0aaa70	gallium/u_queue: add an option to name threads for debugging v2: correct the snprintf use Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	404d0d50d8	gallium/u_queue: add an option to have multiple worker threads independent jobs don't have to be stuck on only one thread v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	4358f6dd13	gallium/u_queue: rewrite util_queue_fence to allow multiple waiters Checking "signalled" is first done without a mutex, then with a mutex. Also, checking without waiting doesn't lock the mutex. This is racy, but should be safe. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	d8367e91f2	gallium/u_queue: use a ring instead of a stack and allow specifying its size in util_queue_init. v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Jordan Justen	c36a363a2d	i965: Preserve the internal format of the dri image Since the OpenGLES API is strict about the internal format matching for many operations, we need to preserve it. See _mesa_es3_error_check_format_and_type in src/mesa/main/glformats.c. Fixes ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96351 Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 20:44:00 -07:00
Chad Versace	a0f3c3c9d4	anv: Add anv_render_pass_attachment::store_op Will be needed for resolving auxiliary surfaces. I didn't add anv_render_pass_attachment::stencil_store_op, as the driver would likely never use it, as stencil surfaces never have auxiliary surfaces. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-23 16:10:25 -07:00
Gurkirpal Singh	15d3777b74	gbm: Fix comments Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 13:55:03 -07:00
Eric Engestrom	b293e8b470	gbm: doc fixes Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 13:55:03 -07:00
Giuseppe Bilotta	60a27ad122	Remove wrongly repeated words in comments Clean up misrepetitions ('if if', 'the the' etc) found throughout the comments. This has been done manually, after grepping case-insensitively for duplicate if, is, the, then, do, for, an, plus a few other typos corrected in fly-by v2: * proper commit message and non-joke title; * replace two 'as is' followed by 'is' to 'as-is'. v3: * 'a integer' => 'an integer' and similar (originally spotted by Jason Ekstrand, I fixed a few other similar ones while at it) Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 13:55:03 -07:00
Brian Paul	5d07998317	svga: update some comments in svga_buffer_handle() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Brian Paul	fe76212873	svga: add a const qualifier in svga_buffer_upload_piecewise() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Brian Paul	e82fa96d19	svga: minor code refactor for svga_buffer_upload_command() Put the HBS code into a separate function. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Brian Paul	db721da5a3	svga: minor code simplification in svga_context_finish() Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Kenneth Graunke	b0629e6894	i965: Implement rasterizer discard via SOL unless required for queries. We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED query, which involves passing all primitives to the clipper. When rasterizer discard is enabled, we program the clipper in REJECT_ALL mode, rather than using the SOL stage's "Rendering Disable" feature. See commit `f09b91f782` for an explanation of why we implement GL_PRIMITIVES_GENERATED this way. Apparently the SOL stage's "Rendering Disable" feature is a lot faster than having the clipper reject all primitives. It's safe to use when no GL_PRIMITIVES_GENERATED query is active, as we don't care about CL_INVOCATION_COUNT incrementing. This patch makes us use SO_RENDERING_DISABLE when no query is active, but continues falling back to the clipper in REJECT_ALL mode when the queries are enabled. It brings back the perf_debug for the clipper case (which I removed in commit `1f9445ff57`, thinking it wasn't useful). Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10) on my Broadwell GT2 laptop. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	4db98f8beb	i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms. They're basically the same. Let's avoid the code duplication. v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught by Jason Ekstrand). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	fb857b5eea	glsl: Don't constant propagate arrays. Constant propagation on arrays doesn't make a lot of sense. If the array is only accessed with constant indexes, then opt_array_splitting would split it up. Otherwise, we have variable indexing. If there's multiple accesses, then constant propagation would end up replicating the data. The lower_const_arrays_to_uniforms pass creates uniforms for each ir_constant with array type that it encounters. This means that it creates redundant uniforms for each copy of the constant, which means uploading too much data. It can even mean exceeding the maximum number of uniform components, causing link failures. We could try and teach the pass to de-duplicate the data by hashing constants, but it makes more sense to avoid duplicating it in the first place. We should promote constant arrays to uniforms, then propagate the uniform access. Fixes the TressFX shaders from Tomb Raider, which exceeded the maximum number of uniform components by a huge margin and failed to link. On Broadwell: total instructions in shared programs: 9067702 -> 9068202 (0.01%) instructions in affected programs: 10335 -> 10835 (4.84%) helped: 10 (Hoard, Shadow of Mordor, Amnesia: The Dark Descent) HURT: 20 (Natural Selection 2) loops in affected programs: 4 -> 0 The hurt programs appear to no longer have a constarray uniform, as all constants were successfully propagated. Apparently before this patch, we successfully unrolled a loop containing array access, but only after promoting constant arrays to uniforms. With this patch, we unroll it first, so all array access is direct, and the array is split up, and individual constants are propagated. This seems better. Cc: mesa-stable@lists.freedesktop.org Reported-by: Karol Herbst <nouveau@karolherbst.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	ef78df8d3b	glsl: Make lower_const_arrays_to_uniforms work directly on constants. There's really no point in looking at ir_dereference_array of a constant. It also misses cases like: (assign () (var_ref tmp) (constant (array ...) ...)) No changes in shader-db, but keeps it working after the next commit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	f7741c5211	i965: Copy propagate before doing variable index lowering. The scalar backend currently doesn't support variable indexing on temporary arrays, but it does support it on uniform arrays, and some stages support it for input arrays. Make sure these are propagated through before exploding indirects into piles of if-ladders unnecessarily. On Broadwell, no instruction count change in shader-db. total cycles in shared programs: 80675652 -> 80674928 (-0.00%) cycles in affected programs: 649972 -> 649248 (-0.11%) helped: 386 HURT: 165 This will help avoid code quality regressions in a future commit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	586f4a42e7	glsl: Propagate invariant/precise after lowering const arrays. The new uniform may need precise as well. Fixes copy propagation of constant array uniforms in Tomb Raider shaders. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	c264fdbc07	glsl: Split arrays even in the presence of whole-array copies. Previously, we failed to split constant arrays. Code such as int[2] numbers = int[](1, 2); would generate a whole-array assignment: (assign () (var_ref numbers) (constant (array int 4) (constant int 1) (constant int 2))) opt_array_splitting generally tried to visit ir_dereference_array nodes, and avoid recursing into the inner ir_dereference_variable. So if it ever saw an ir_dereference_variable, it assumed this was a whole-array read and bailed. However, in the above case, there's no array deref, and we can totally handle it - we just have to "unroll" the assignment, creating assignments for each element. This was mitigated by the fact that we constant propagate whole arrays, so a dereference of a single component would usually get the desired single value anyway. However, I plan to stop doing that shortly; early experiments with disabling constant propagation of arrays revealed this shortcoming. This patch causes some arrays in Gl32GSCloth's geometry shaders to be split, which allows other optimizations to eliminate unused GS inputs. The VS then doesn't have to write them, which eliminates the entire VS (5 -> 2 instructions). It still renders correctly. No other change in shader-db. v2: Drop !AOA check and improve a comment (feedback from Tim Arceri). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	acf5444044	glsl: Make constant propagation's folder not propagate into an LHS. opt_constant_propagation.cpp contains constant folding code which can actually do constant propagation in some cases. It was happily propagating constants into the left-hand-side of assignments. For example, (assign () (var_ref temp) (constant ...)) would brilliantly be turned into: (assign () (constant ...) (constant ....)) This is a bigger hammer than necessary - it prevents propagation into the left-hand-side altogether. We could certainly do better someday. Notably, the constant propagation pass itself already takes this approach - it's just the constant propagation pass's built-in constant folding code (which actually propagates, too) that was broken. No change in shader-db, but prevents regressions after future commits. It seems plausible that this could be hit today, but I haven't seen it happen. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Topi Pohjolainen	3487d2e7bf	i965/blorp: Disable vertex element swizzling Without this, vertex elements originating directly from the vertex fetcher are not passed to wm-state correctly. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Topi Pohjolainen	12783aac50	i965/blorp: Let program data tell if push constants are needed Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Topi Pohjolainen	874f2e9523	i965/blorp: Use prog data counters to guide wm/ps setup just as core upload logic does. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Topi Pohjolainen	f5e8575ab4	i965/blorp: Use prog data counters to guide sf/sbe setup just as core upload logic does. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Ardinartsev Nikita	01c89ccc5d	i965: Avoid division by zero. Fixes regression introduced by `af5ca43f26` Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419	2016-06-23 10:08:58 -07:00
Tim Rowley	a16d274032	swr: [rasterizer core] fix dependency bug Never be dependent on "draw 0", instead have a bool that makes the draw dependent on the previous draw or not dependent at all. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:51:11 -05:00
Tim Rowley	73a9154bde	swr: [rasterizer core] use wrap-around safe compares for dependency checking Move drawIDs from 64-bit to 32-bit to increase perf. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:51:06 -05:00
Tim Rowley	dd189536dc	swr: [rasterizer jitter] add support for component packing for 'odd' formats Add early-out if no components are enabled. Add asserts. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:51:00 -05:00
Tim Rowley	35935ca4f2	swr: [rasterizer core] track whether GS outputs viewport array index So we can skip the index gather in PA. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:55 -05:00
Tim Rowley	2d80295a6e	swr: [rasterizer core] GS viewport array index attribute Only adds the attribute mapping to the jitter; no implementation yet. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:47 -05:00
Tim Rowley	c7cd33b605	swr: [rasterizer core] conservative rasterization frontend support Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:41 -05:00
Tim Rowley	c867c22d85	swr: [rasterizer core] stop single-threaded crash on exit Function static destructors were getting called by exit handlers before context teardown. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:36 -05:00
Tim Rowley	0f025eb478	swr: [rasterizer jitter] small fetch jit cleanup Handle SGV stores separate from the stream fetch code. Because of this change, there is a potential to jit an extra unused store. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:30 -05:00
Tim Rowley	eca877f27b	swr: [rasterizer core] remove old comment Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:25 -05:00
Tim Rowley	d3d97f8395	swr: [rasterizer jitter] cleanup supporting different llvm versions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:19 -05:00
Tim Rowley	42215e6116	swr: [rasterizer jitter] uninitialized component fix in fetch jit Was trying to store an extra uninitialized component. Only affects component packing, which isn't enabled (yet). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:12 -05:00
Tim Rowley	b6d2c96851	swr: [rasterizer] add support for building avx512 version Currently, most code paths between AVX2 and AVX512 are identical (see changes to knobs.h). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:05 -05:00
Tim Rowley	695af2a7e2	swr: [rasterizer common] fix include for Intel compiler Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:49:59 -05:00
Tim Rowley	95f21a9766	swr: [rasterizer common] workaround clang for windows __cpuid() bug Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:49:46 -05:00
Tim Rowley	9ca741c645	swr: push/pop DEBUG macro around llvm includes llvm redefines DEBUG; adding push/pop prevents an undefined reference to debug_refcnt_state in llvm-3.7+. v2: add undef DEBUG Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 09:58:08 -05:00
Jose Fonseca	805dbdf06d	include: Require MSVC 2013 Update 4. Earlier MSVC 2013 releases have trouble compiling some of our C99 code, so make sure we have Update 4 to avoid confusion. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 15:07:19 +01:00
Brian Paul	4f5d513755	svga: rename svga_surface_copy() to svga_resource_copy_region() To be consistent with the pipe_context function name. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Brian Paul	743ff588f2	svga: don't copy blit_info into local var There's no reason for doing so. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Brian Paul	e0dc3c5f19	gallium/util: fix some 4-space indentation in blitter code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	2aa9ff0cda	svga: fix texture array update regression With commit `fb9fe35`, we start using transfer_inline_write for memcpy TexSubImage path, but that triggers a regression with texture array in the svga driver. With this patch, the direct map code will update the texture array correctly. Fixes VMware bug 1679293. Tested with MTT piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	d4a77254cb	svga: fix index/vertex buffer surface reference at draw Currently with the SetVertexBuffers optimization, we avoid emitting redundant DXSetVertexBuffers commands. However, these buffer surfaces will still need to be referenced; otherwise, in the case of Linux, the subsequent surface discard map will map to the existing mob instead of a new one, causing rendering artifacts. With this patch, we'll call resource_rebind() to reference the resources even if we are avoiding the actual set command. This fixes the rendering artifacts in the window title area running with Unity in Ubuntu 14.04. Tested with piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	2b81e31d44	svga: fix vertex buffer references in the hw state This patch fixes three issues with vertex buffer references: (1) Instead of copying the vertex buffer resource handles to the hw state in the context structure, use pipe_resource_reference to properly reference the vertex buffer resources in the context. (2) Make sure to unbind unused vertex buffer resources. (3) Force a rebind of the vertex buffer resources at the first draw of each command buffer to make sure the vertex buffer resources are paged in. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	a1d74f5528	svga: fix index buffer reference in the hw state Instead of copying the index buffer resource handle to the hw state in the context structure, use pipe_resource_reference to properly reference the index buffer resource in the context. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 07:31:19 -06:00
Timothy Arceri	ab99196b6b	glsl/mesa: stop duplicating geom and tcs layout values We already store these in gl_shader and gl_program; here we remove them from gl_shader_program and just use the values from gl_shader. This will allow us to keep the shader cache restore code as simple as it can be while making it somewhat clearer where these values originate from. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-23 11:01:46 +10:00
Timothy Arceri	24b3be0938	glsl/mesa: stop duplicating tes layout values We already store these in gl_shader and gl_program; here we remove them from gl_shader_program and just use the values from gl_shader. This will allow us to keep the shader cache restore code as simple as it can be while making it somewhat clearer where these values originate from. V2: remove unnecessary NULL check Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral <itoral@igalia.com>	2016-06-23 11:01:36 +10:00
Edward O'Callaghan	f3ae370a36	.mailmap: Fixup my email address Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-06-23 00:00:46 +02:00
Christian Gmeiner	22304554a2	st/mesa: expose EXT_vertex_array_bgra when supported by backend Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-22 12:46:08 -07:00
Jason Ekstrand	c2f2c8e407	anv: Use different BOs for different scratch sizes and stages This solves a race condition where we can end up having different stages stomp on each other because they're all trying to scratch in the same BO but they have different views of its layout. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:39:45 -07:00
Jason Ekstrand	45c0f60999	genxml: Make ScratchSpaceBasePointer an address instead of an offset While we're here, we also fixup MEDIA_VFE_STATE and rename the field in 3DSTATE_VS on gen6-7.5 to be consistent with the others. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:39:42 -07:00
Jason Ekstrand	966bed17c1	anv: Add an allocator for scratch buffers Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:39:20 -07:00
Jason Ekstrand	89ded099f8	genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE on gen7 The pack header generation scripts can't handle the case where you have two addresses in the same dword; they just take whatever is the last one. This meant that the MCS address wasn't properly getting handled. Since we don't care about append counters, we can just re-arrange the XML for now. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	d82322eb18	anv,isl: Lower storage image formats in anv ISL was being a bit too clever for its own good and lowering the format for us. This is all well and good if we always want to lower it. However, the GL driver selectively lowers the format depending on whether the surface is write-only or not. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	97f12773b8	isl/state: Allow for full 31-bit buffer texture sizes Ivy Bridge and above can handle up to 2^31 elements for RAW buffer surfaces. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	bb64e666ba	isl/state: Don't use designated initializers for buffer surface state Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	4061fde66e	isl/state: Add assertions for buffer surface restrictions Acked-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	ce24097abe	isl/state: Don't set SurfacePitch for gen9 1-D textures This field is ignored by the hardware in this case and, on very large 1-D textures, it can end up being larger than the maximum allowed value. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	f47e23a8b6	isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7 This matches better what happens on gen8 where the "Tiled Surface" and "Tile Walk" bits are combined into a single two-bit value. This is also more consistent with what the GL driver does. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	96706bad5f	isl/state: Emit no-op mip tail setup on SKL This hasn't ever been a problem in the past but it is recommended by the hardware docs. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	14d7c16e50	isl/state: Only set cube face enables if usage includes CUBE_BIT It seems safe to set it all the time, but this reduces the diff between the way i965 does it and what ISL does. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	5d24e9cfa1	isl/state: Use the layout for computing qpitch rather than dimensions For depth/stencil 1-D textures on SKL, we want them laid out in the old format that has been used since gen4. In order for the surface state fill-out code to handle this, it needs to distinguish based on layout rather than just dimensionality. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	6a43204afa	isl/state: Set the IntegerSurfaceFormat bit on Haswell This fixes 688 Vulkan CTS tests on Haswell. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	324103da75	isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	215282c9f4	isl/state: Don't set RenderTargetViewExtent for texture surfaces The docs specify that this only matters for render targets and surfaces used with typed dataport messages. On some platforms (gen4-6) the Depth field has more bits than RenderTargetViewExtent so we can have textures with more levels than we can render to. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	bb326f7b01	isl/state: Set SurfaceArray based on the surface dimension According to the PRM, you can't set SurfaceArray for 3D or buffer textures. There doesn't seem to be a good reason not to set it when we can. On the other hand, if we don't set it we can end up getting strange results for 1-layer array textures such as textureSize() returning the wrong results. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	d050ffbce9	isl/state: Don't force-disable L2 bypass for everything We already set the bit in the few cases where it's required by the docs so there's no need to set it all the time. This has no noticeable perf impact for Dota 2 on Vulkan with the time demo I have. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	87f0ffa646	isl/state: Refactor the setup of clear colors This commit switches clear colors to use #if's instead of a C if. This lets us properly handle SNB where the clear color field doesn't exist. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	62a5e6e031	isl/state: Refactor the per-gen isl_to_gen_h/valign tables This moves the #if's around so that halign and valign have different sets of #if conditions. This also prepares us for SNB because isl_to_gen_halign is not defined at all on gen6. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	b1b0d6fb54	isl/state: Return an extent3d from the halign/valign helper Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	a60ae9e10a	isl/state: Put pitch calculations together This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	70c8afc0c8	isl/state: Put all dimension setup together and towards the top This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	e66e70ef47	isl/state: Put surface format setup at the top This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	39baea551f	isl/state: Remove some unused fields They're already zero-initialized and we have no plans of doing anything more interesting with them. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	caf2af4181	isl/state: Don't use designated initializers for the surface state While designated initializers are nice, they also force us to put some things in the initializer and some things later. Surface state setup is complicated enough that this really hurts readability in the long run. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	de1d194856	genxml/gen8,9: Prefix the multisample format enum with MSFMT This is what gen7 does and it's nice to have a prefix Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	320de71858	i965/blorp: Only set src_z for gen8+ 3D textures Otherwise, we end up with a bogus value in the third component. On gen6-7 where we always use 2D textures, this can cause problems if the SurfaceArray bit is set in the SURFACE_STATE. Acked-by: Chad Versace <chad.versace@intel.com>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	664dc89a1b	i965/gen7,8: Set SURFACE_IS_ARRAY for all non-3D texture types There's no real reason why we shouldn't set this bit. It does affect how the sampler operates a bit but since you can have a 2D non-array view of a 2D_ARRAY texture that distinction is very weak. Also, this is what ISL will do and we would like this change to be isolated from using ISL. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	2a1cc94d27	i965/gen4: Subtract 1 from buffer sizes The PRM states that the values put in Width, Height, and Depth should be various bits from the value size - 1. We seem to have done this wrong more-or-less from the start. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	e8580b8f98	i965: Remove fake W-tiled render target support This hasn't been used since `1cfb4bc890` where we deleted the meta stencil blit path. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	0195299c86	i965/fs: Use a default Y coordinate of 0 for TXF on gen9+ Previously, we were incrementing length but not actually putting anything in the Y coordinate. This meant that 1-D TXF operations had a garbage array index. If the surface is emitted as 1-D non-array, the coordinate gets discarded and it works fine. If it happens to be bound as an array surface, it may count as an out-of-bounds array access and you get zero. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	1436238b75	i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	620f81d2ed	i965/blorp/gen8: Use the correct max level and layer in emit_surface_states We were adding in the base which is wrong because the values given in the miptree are relative to zero and not the base layer/level. Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	6ba88bce64	i965: Drop the maximum 3D texture size to 512 on Sandy Bridge The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be set to the depth of a 3-D texture when rendering. Unfortunately, that field is only 9 bits on Sandy Bridge and prior so we can't actually bind a 3-D texture for rendering if it has depth > 512. On Ivy Bridge, this field was bumped to 11 bits so we can go all the way up to 2048. On Iron Lake and prior, we don't support layered rendering and we use OffsetX/Y hacks to render to particular layers so 2048 is ok there too. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	0f9cd74aab	i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctly This is basically a direct translation of what we do for gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	ee39d3ba91	i965/gen4: Pull texture formats from the texture object not the miptree This makes texture views sort-of work. It doesn't add full texture view support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats piglit test on Iron Lake. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Kenneth Graunke	77d6add00d	i965: Fix point size with tessellation/geometry shaders in GLES. Our previous code worked for desktop GL, and ES without geometry or tessellation shaders. But those features require fancier point size handling. Fortunately, we can use one rule for all APIs. Fixes a number of dEQP tests with EXT_tessellation_shader enabled: dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.* Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-22 12:22:50 -07:00
Marek Olšák	5d85a21fee	.mailmap: fix my main address	2016-06-22 14:45:52 +02:00
Timothy Arceri	356ea9a8da	i965: move vs outputs written into a helper We will reuse this for fs key generation for the on disk shader cache. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-22 20:59:26 +10:00
Nicolai Hähnle	3948cd3797	st/mesa: use a single memcpy in st_ReadPixels when possible This avoids costly address recomputations, function overhead, and may trigger large copy optimizations. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-22 11:44:03 +02:00
Ilia Mirkin	36ed1b695e	glsl: only match gl_FragData and not gl_SecondaryFragDataEXT There's special logic around finding gl_FragData. It latches onto any array with FRAG_RESULT_DATA0. However gl_SecondaryFragDataEXT[], added by GL_EXT_blend_func_extended, fits those parameters as well. The real frag data array should have index 0 though, so we can use that to distinguish them. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96617 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-21 21:58:34 -04:00
Ilia Mirkin	1f4bca798d	nv50,nvc0: fix start_instance in manual push path The start instance is applied as an offset into the buffer directly, ignoring the divisor, not as an instance id offset that respects the divisor. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-21 21:50:16 -04:00
Ilia Mirkin	5b0d64886d	translate: fix start_instance parameter in sse version The generic version gets this right already, but this was using an incorrect formula in SSE. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-21 21:50:16 -04:00
Jason Ekstrand	35b53c8d47	anv/cmd: Dirty descriptor sets when a new pipeline is bound Ever since `c2581a9375`, the binding table layout has depended on the pipeline. This means that whenever we change pipelines we also need to re-emit binding tables for the new layout. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Jason Ekstrand	2bfe0c3374	anv/cmd: Move emit_descriptor_pointers to genX_cmd_buffer.c It's tiny and fully generic so there's really no reason for it to be in a gen7-specific file. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Jason Ekstrand	9df4d6bb36	anv/cmd: Move flush_descriptor_sets to anv_cmd_buffer.c There's no good reason for recompiling it Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Jason Ekstrand	295e03c980	spirv: Use the system value version of gl_FrontFace SPIR-V treats it as an input but NIR wants the system value. This shouldn't have been too much of a surprise given that we have to do the same conversion in the GLSL IR to NIR pass. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Kenneth Graunke	40013c5033	i965: Reorganize prog_data->total_scratch code a bit. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-21 10:24:45 -07:00
Marek Olšák	b16d21270f	radeonsi: add a debug flag for unsafe math LLVM optimizations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Marek Olšák	70a25478fe	radeonsi: use u_blitter for mipmap generation This reduces time spent in glGenerateMipmap by half. v2: don't decompress the levels to be overwritten Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Marek Olšák	5fed1122e8	gallium/u_blitter: implement mipmap generation for pipe_context::generate_mipmap first move some of the blit code from util_blitter_blit_generic to a separate function, then use it from util_blitter_generate_mipmap Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Nicolai Hähnle	3735a925ef	st/mesa: cache staging texture for glReadPixels v2: add ST_DEBUG flag for disabling (suggested by Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-06-21 11:02:41 +02:00
Nicolai Hähnle	a571859fc4	st/mesa: invalidate readpixels cache Whenever a draw happens or some other function call might change the result of future glReadPixels calls, we must invalidate the cache. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:19 +02:00
Nicolai Hähnle	615ba11563	st/mesa: add readpix_cache structure Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:16 +02:00
Nicolai Hähnle	b74c23138c	st/mesa: move ReadPixels blit into a separate function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:12 +02:00
Nicolai Hähnle	f9ddd52317	st/mesa: flush bitmap cache before CopyImageSubData Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:10 +02:00
Nicolai Hähnle	e7fff3cfe1	st/mesa: flush bitmap cache before texture functions As far as I can tell, a sequence of glBitmap followed by texture functions that refer to a texture bound as the framebuffer is well within what should be allowed. Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:08 +02:00
Nicolai Hähnle	c542b7e43d	st/mesa: flush bitmap cache before compute dispatch In the unlikely case that a program uses glBitmap to render to a framebuffer whose texture is bound in a compute shader. Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:00 +02:00
Timothy Arceri	644e015f0b	i965: get PrimitiveMode from the program rather than the shader struct This is more consistent with what we do elsewhere and will allow us to only cache one of the values in the shader cache. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-21 12:43:18 +10:00
Vedran Miletić	82e0bbd01a	clover: Fix build against clang SVN >= r273191 setLangDefaults() now requires PreprocessorOptions as an argument. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-21 10:08:57 +09:00
Kenneth Graunke	cd89c834a8	i965: Fix multiplication of immediates on Cherryview/Broxton. Cherryview and Broxton don't support DW x DW multiplication. We have piles of code to handle this, but apparently weren't retyping in the immediate case. For example, tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes makes the simulator angry about instructions such as: mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D Just retype to W or UW. It should be safe on all platforms. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-20 17:48:03 -07:00
Jason Ekstrand	eb6764c4a7	anv: Add proper support for depth clamping Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:04:08 -07:00
Jason Ekstrand	8a46b505cb	anv/cmd_buffer: Split emit_viewport in two Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:03:09 -07:00
Jason Ekstrand	20e95a746d	anv/cmd_buffer: Set depth/stencil extent based on the image It used to be based on the framebuffer which isn't quite right. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:03:05 -07:00
Jason Ekstrand	b65f2e4163	anv/cmd_buffer: Don't crash if push constants are provided for missing stages Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:03:02 -07:00
Jason Ekstrand	e6c2fe4519	anv/pipeline: Do invariance propagation on SPIR-V shaders Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:02:58 -07:00
Jason Ekstrand	bec07b7292	nir/alu_to_scalar: Respect the exact ALU operation qualifier Just setting builder->exact isn't sufficient because that only applies to instructions that are built with the builder but instructions created manually and only inserted using the builder are left alone. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:02:55 -07:00
Jason Ekstrand	202751fbb7	nir: Add a pass for propagating invariant decorations This pass is similar to propagate_invariance in the GLSL compiler. The real "output" of this pass is that any algebraic operations which are eventually consumed by an invariant variable get marked as "exact". Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:02:45 -07:00
Jason Ekstrand	68e308d853	nir/algebraic: Remove imprecise flog2 optimizations While mathematically correct, these two optimizations result in an expression with substantially lower precision than the original. For any positive finite floating-point value, log2(x) is well-defined and finite. More precisely, it is in the range [-150, 150] so any sum of logarithms log2(a) + log2(b) is also well-defined and finite as long as a and b are both positive and finite. However, if a and b are either very small or very large, their product may get flushed to infinity or zero causing log2(a * b) to be nowhere close to log2(a) + log2(b). This imprecision was causing incorrect rendering in Talos Principle because part of its HDR rendering process involves doing 8 texture operations, clamping the result to [0, 65000], taking a dot-product with a constant, and then taking the log2. This is done 6 or 8 times and summed to produce the final result which is written to a red texture. In cases where you have a region of the screen that is very dark, it can end up getting a result value of -inf which is not what is intended. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96425 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 11:56:57 -07:00
Ian Romanick	895f7ddfb5	i965: Delete redundant extension enables A nearly identical block already exists in the gen >= 6 block above. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-20 11:18:39 -07:00
Ian Romanick	d3a5cae60a	mesa: Fix incorrect "see also" comments Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-20 11:18:39 -07:00
Ian Romanick	08cd234db8	mesa: Silence unused parameter warning main/pipelineobj.c: In function ‘delete_pipelineobj_cb’: main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter] delete_pipelineobj_cb(GLuint id, void *data, void *userData) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-20 11:18:38 -07:00
Rob Clark	64180de1bf	gallium: make image_view const Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-20 12:36:20 -04:00
Rob Clark	ef534b9389	gallium: make constant_buffer const Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-20 12:36:20 -04:00
Rob Clark	e1c1c40cbc	gallium: make shader_buffers const Be consistent with the rest of the "set_xyz" state interfaces. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-20 12:36:20 -04:00
Nicolai Hähnle	1167905c41	radeonsi: use trapezoid distribution for tess on Fiji and Polaris This yields a small performance improvement in Unigine Heaven. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:29:55 +02:00
Nicolai Hähnle	650137a9c8	radeonsi/sid: add Fiji+ tessellation distribution mode Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:29:15 +02:00
Nicolai Hähnle	32fd92e028	radeonsi: emit PA_SC_RASTER_CONFIG_1 only once It is the same for all SEs. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:28:34 +02:00
Nicolai Hähnle	c95175581e	radeonsi: fix calculation of valid RB mask per SE The old calculation treated too many RBs as disabled. Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:28:31 +02:00
Nicolai Hähnle	6c2e636982	radeonsi: raise SI_PM4_MAX_DW The old limit, introduced in commit `afa752d3f0`, was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:28:17 +02:00
Roland Scheidegger	b0cf99165a	gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9 Apparently, these are deprecated. There's some AutoUpgrade feature which is supposed to promote these to cmp/select, which apparently doesn't work with jit code. It is possible it's not actually even meant to work (see the bug filed against llvm which couldn't provide an answer either) but in any case this is meant to be only temporary unless the intrinsics are really illegal. So, just use the fallback code (which should be cmp/select, we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages to optimize this back to pmin/pmax in the end). This addresses https://llvm.org/bugs/show_bug.cgi?id=28176 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Aaron Watry <awatry@gmail.com>	2016-06-20 17:19:03 +02:00
Ilia Mirkin	154c0a42a2	nvc0: don't make use of push hint if there are no non-const user vbos This makes the check match up with what we do on nv50 as well - there's no point in switching over to the push path if everything's in managed buffers. This can happen when a shader uses a vertex without an enabled array - we end up passing it a constant attribute. This also has the effect of "fixing" some flickering in Talos. I have no idea why. I've stared at the push logic forwards, backwards, and sideways. By always forcing the push path (which is slow), the flickering also goes away, but other rendering is still wrong (specifically draw 383068 as identified in the bug). However by not switching over to the push path, draw 383068 is correct. Note that other flickering remains in Talos, like the red/green walls/floors. This takes care of the shadow flickering though. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-19 10:14:57 -04:00
Ilia Mirkin	1804aa0b80	gk104/ir: fix tex use generation to be more careful about eliding uses If we have a loop, instructions before the tex might be added as tex uses, and those may in fact dominate all other uses of the tex results. This however doesn't mean that we don't need a texbar after the tex. Only check if uses dominate each other if they are dominated by the tex. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565 Fixes: `7752bbc44` (gk104/ir: simplify and fool-proof texbar algorithm) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-19 10:14:46 -04:00
Ilia Mirkin	194bcb49d1	nv50: add support for GL_EXT_window_rectangles Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-18 13:38:30 -04:00
Ilia Mirkin	b21a00d129	nvc0: add support for GL_EXT_window_rectangles Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-18 13:38:30 -04:00
Ilia Mirkin	d1bdc1238a	st/mesa: add support for GL_EXT_window_rectangles Make sure to pass the requisite information in draws, blits, and clears that work on the context's draw buffer. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-18 13:38:30 -04:00
Ilia Mirkin	07fcb06fe0	gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all drivers This says how many window rectangles are supported by the implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-18 13:38:29 -04:00
Ilia Mirkin	82fab73246	gallium: add API for setting window rectangles Window rectangles apply to all framebuffer operations, either in inclusive or exclusive mode. They may also be specified as part of a blit operation. In exclusive mode, any fragment inside any of the specified rectangles will be discarded. In inclusive mode, any fragment outside every rectangle will be discarded. The no-op state is to have 0 rectangles in exclusive mode. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-18 12:59:12 -04:00
Ilia Mirkin	d68c1e2ac2	mesa: add GL_EXT_window_rectangles state storage/retrieval functionality Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-18 12:51:55 -04:00
Ilia Mirkin	78506ad246	glapi: add GL_EXT_window_rectangles entrypoints Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-18 12:51:55 -04:00
Samuel Pitoiset	b214e0d2fb	nv50/ir: add missing strings for some recent sysvals This is pretty useful for debugging purposes and those should not be omitted. Fixes: `517a93b3` ("nvc0: add ARB_shader_draw_parameters support") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-18 18:34:50 +02:00
Bruce Cherniak	6b0ac95c28	swr: Update screen->context pointer with multiple contexts. A pipe pointer in the screen allows for access to current device context in flush_frontbuffer and resource_destroy. This wasn't tracking current context in multi-context situations. v2: More caffeine. Corrected compare, removed unnecessary set of screen-pipe in create_context, and added a few comments.	2016-06-17 13:56:03 -05:00
Brian Paul	ace3124f22	scons: put the generated git_sha1.h file in top-level src/ directory To match what's done in the automake build. v2: Use git rev-parse to get a 10-character hash ID Fix Python imports Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-17 10:33:00 -06:00
Tim Rowley	5a64549f54	swr: switch from overriding -march to selecting features Acked-by: Chuck Atkins <chuck.atkins@kitware.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-06-17 10:34:17 -05:00
Timothy Arceri	481e924951	mesa: remove remaining tabs in api_validate.c Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-06-17 22:07:21 +10:00
Samuel Iglesias Gonsálvez	bdab572a86	i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions, page 844: "When source or destination datatype is 64b or operation is integer DWord multiply, indirect addressing must not be used." v2: - Fix it for Broxton too. v3: - Simplify code by using subscript() and not creating a new num_components variable (Kenneth). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-17 11:33:18 +02:00
Iago Toral Quiroga	0177dbb6c2	i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions: "When source or destination is 64b (...), regioning in Align1 must follow these rules: 1. Source and destination horizontal stride must be aligned to the same qword. (...)" v2: - Fix it for Broxton too. v3: - Remove inst->regs_written change as it is not necessary (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-17 08:46:02 +02:00
Kenneth Graunke	48593eaf2d	docs: Mention GL_ARB_ES3_1_compatibility in release notes. Ilia reminded me that I forgot this.	2016-06-16 17:10:35 -07:00
Kenneth Graunke	a08a16541b	i965: Fix comment about CS scratch space encodings on Broadwell+. I typo'd this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 16:11:35 -07:00
Kenneth Graunke	93d8f80a9a	docs: Update ARB_ES3_1_compatibility status for i965.	2016-06-16 14:39:44 -07:00
Kenneth Graunke	1f9445ff57	i965: Drop perf_debug about rasterizer discard in SOL vs. clipper. I recently experimented with performing rasterizer discard in the SOL unit instead of the clipper, and as far as I can tell, it's basically the same performance. The clipper comes directly after SOL anyway, and setting the clipper to REJECT_ALL should be pretty darn cheap. Keep the perf_debug on Sandybridge, where the GS actually does work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 14:37:07 -07:00
Kenneth Graunke	32b1c0b694	i965: Enable GL_ARB_ES3_1_compatibility on Gen8+ if CS are available. There are almost no tests in any test suite, but what little I've found seems to work. Ilia believes everything is in place. v2: Predicate the enable on ES 3.1 being available (Gen8+) and also ARB_compute_shader being available (requested by Ilia). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-16 14:33:24 -07:00
Ian Romanick	6bec55a780	mesa: If validation fails in a debug context just emit a debug message There are quite a few pipelines that desktop applications (including a bunch of piglit tests) can expect to have run but don't meet the GLES requirements. Instead of failing validation, just emit a debug message. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-16 09:33:54 -07:00
Ian Romanick	9c87282041	glsl: Always strip arrayness in precision_qualifier_allowed Previously some callers of precision_qualifier_allowed would strip the arrayness from the type and some would not. As a result, some places would not notice that float[6], for example, needed a precision qualifier. Fixes the new piglit test no-default-float-array-precision.frag. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-16 09:33:53 -07:00
Jose Fonseca	d04f652b75	mesa/main: Update _mesa_new_shader. Left over from `31dee99e05`. It should fix Clang Windows build. Trivial.	2016-06-16 15:22:37 +01:00
Christian König	6d877d7121	st/vdpau: we support lumakeying now Signed-off-by: Christian König <christian.koenig@amd.com>	2016-06-16 09:41:13 +02:00
Christian König	bf89e672cf	vl: support luma keying for interlaced surfaces as well We had the CSC code twice in there, factor it out into a separate function. Signed-off-by: Christian König <christian.koenig@amd.com>	2016-06-16 09:41:12 +02:00
Timothy Arceri	456b5d9ac9	i965: remove remaining tabs in brw_link.cpp Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 16:24:19 +10:00
Mathias Fröhlich	0e73d9454d	vbo: Use a bitmask to track the active arrays in vbo_save*. The use of a bitmask makes functions iterating only active attributes less visible in profiles. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	bc4e0c4868	vbo: Use a bitmask to track the active arrays in vbo_exec*. The use of a bitmask makes functions iterating only active attributes less visible in profiles. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	22e5d4a1ee	mesa: Use bitmask/ffs to iterate the active_samplers bitmask. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	34f741b080	mesa: Use bitmask/ffs to iterate the enabled textures. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	11a5b776c2	mesa: Use designated bool value to check texture unit completeness. The change helps to use the bitmask/ffs in the next change. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	c14ec9aafa	mesa: Use bitmask/ffs to iterate SamplersUsed Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	53691b7cb1	i965: Use bitmask/ffs to iterate used vertex attributes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	b670f0d1d7	i965: Use bitmask/ffs to iterate enabled clip planes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	a0fe569e53	radeon/r200: Use bitmask/ffs to iterate enabled clip planes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	dc9e604ef1	mesa: Use bitmask/ffs to iterate enabled clip planes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	d8a3ac90df	mesa: Use bitmask/ffs to iterate color material attributes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	d4eb2f9cda	mesa: Use bitmask/ffs to build ff fragment shader keys. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. The bitmask used here for iteration is a combination of different enabled masks present for texture units. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	3ee409bebf	mesa: Use bitmask/ffs to build ff vertex shader keys. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. The bitmask used here for iteration is a combination of different enabled masks present for texture units. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	b5820759de	mesa: Remove the linked list of enabled lights Clean up after conversion to bitmasks. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	21f7f67685	mesa: Switch to bitmask based enabled lights in gen_matypes.c Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	f0391ba6c1	radeon/r200: Use bitmask/ffs to iterate enabled lights Replaces a loop that iterates all lights and tests which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	f69a400513	nouveau: Use bitmask/ffs to iterate enabled lights Replaces a loop that iterates all lights and tests which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	9a3fcb010c	tnl: Use bitmask/ffs to iterate enabled lights Replaces loops that iterate all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	664aec4370	mesa: Use bitmask/ffs to iterate enabled lights for ff shader keys. Replaces a loop that iterates all lights and tests which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	ccb1be2fab	mesa: Use bitmask/ffs to iterate enabled lights Replaces loops that iterate all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	b60c730235	mesa: Track enabled lights in a bitmask This enables some optimizations afterwards. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	6749d77c69	mesa: Rename CoordReplaceBits back to CoordReplace. It used to be called like that and fits better with 80 columns. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	291f00fa12	mesa: Remove the now unused CoordsReplace array. Now that all users are converted, remove the array. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	d19c69659a	i965: Convert i965 to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	97f67be0a7	i915: Convert i915 to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	8e01fd6396	r200: convert r200 to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	da79d76503	gallium: Convert the state_tracker to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	664ba9ccc9	swrast: Convert swrast to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	1c78515d93	mesa: Add gl_point_attrib::CoordReplaceBits bitfield. The aim is to replace the CoordReplace array by a bitfield. Until all drivers are converted, establish the bitfield in parallel to the CoordReplace array. v2: Fix bitmask logic. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Timothy Arceri	31dee99e05	mesa/glsl: stop using GL shader type internally Instead use the internal gl_shader_stage enum everywhere. This makes things more consistent and gets rid of unnecessary conversions. Ideally it would be nice to remove the Type field from gl_shader altogether but currently it is used to differentiate between gl_shader and gl_shader_program in the ShaderObjects hash table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 10:45:35 +10:00
Brian Paul	bb1292e226	auxiliary/os: allow appending to GALLIUM_LOG_FILE If the log file specified by the GALLIUM_LOG_FILE begins with '+', open the file in append mode. This is useful to log all gallium output for an entire piglit run, for example. v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-15 17:16:42 -06:00
Chad Versace	c99a0a8bce	anv: Fix a harmless overflow warning anv_pipeline_binding::index is a uint8_t, but some code assigned UINT16_MAX to it. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-15 15:34:13 -07:00
Rob Herring	067c5b10b6	vc4: fix vc4_resource_from_handle() stride calculation The expected stride calculation is completely wrong. It should ultimately be multiplying cpp and width rather than dividing. The width also needs to be aligned to the tiling width first before converting to stride bytes. The whole stride check here is possibly pointless. Any buffers which were allocated outside of vc4 may have strides with larger alignment requirements. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-06-15 14:54:38 -07:00
Kenneth Graunke	c319512e16	i965: Use a uniform for gl_PatchVerticesIn in the TCS on Gen8+. We still need to recompile the passthrough shader when this value changes, as it also affects the output vertex count. But otherwise, we can eliminate recompiles on Gen8+. We probably want to do this for Gen7 as well, but that requires rewriting the input release code to use a loop, which is a trade-off I'd need to consider in more detail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:47:37 -07:00
Kenneth Graunke	2b867264d2	glsl: Optionally lower TCS gl_PatchVerticesIn to a uniform. i965 has no special hardware for this, so the best way to implement this is to pass it in via a uniform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:47:37 -07:00
Kenneth Graunke	1bc194cd64	i965: Use a uniform for gl_PatchVerticesIn in the TES. Fixes three GL44-CTS.tessellation_shader subtests: - max_patch_vertices - single.max_patch_vertices - tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn These use gl_PatchVerticesIn in the TES, but don't link against a TCS (which would allow the linker to lower it to a constant). We had no handling for the system value in the backend, so it would just assert fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:44:44 -07:00
Kenneth Graunke	0be2105137	glsl: Optionally lower TES gl_PatchVerticesIn to a uniform. i965 has no special hardware for this, so we need to pass this value in as a uniform (unless the TES is linked against a TCS, in which case the linker can just replace this with a constant). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:44:09 -07:00
Marek Olšák	d794072b3e	winsys/radeon: use the common job queue for multithreaded command submission v2 v2: fixup after renaming to util_queue_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-15 21:07:34 +02:00
Marek Olšák	562cb03d76	gallium/util: import the multithreaded job queue from amdgpu winsys (v2) v2: rename the event to util_queue_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-15 21:07:34 +02:00
Nicolai Hähnle	44e0c0e6ec	radeonsi: fix undefined left-shift into sign bit Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-15 09:27:56 +02:00
Nicolai Hähnle	494e4b8976	st_glsl_to_tgsi: don't read potentially uninitialized buffer variable Found by -fsanitize=undefined. Note that this should be a harmless issue in practice because the inst->op check always dominates anyway. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-15 09:27:40 +02:00
Nicolai Hähnle	6510e07345	mesa/main: fix integer overflows in _mesa_image_offset Found using -fsanitize=undefined. Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-15 09:27:30 +02:00
Timothy Arceri	a8a9d1bf41	i965: remove type_size_vec4_times_4() type_size_vec4_times_4() was introduced as a fix in `8dcf807cb4` however since `3810c1561` we can just use type_size_scalar() and get the actual number of outputs we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-15 15:01:10 +10:00
Kenneth Graunke	8b408972ff	mesa: Pass gl_constant_value union into _mesa_fetch_state(). We've had some trouble in the past with copying integers around via float pointers, as the C compiler sometimes uses x87 floating point registers to load values on 32-bit systems. Passing the gl_constant_value union should be safer. To avoid churn, this patch creates a "GLfloat *value" variable so existing uses can stay the same. Not observed to fix anything, but I was in the area adding more integer state vars, and thought it'd be wise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-14 16:09:57 -07:00
Marek Olšák	6ef50efc10	gallium/radeon: num-cs-flushes query should display per-frame average Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	4140afd04b	gallium/radeon: add driver queries for compute/dma call stats and spills also print the average count per frame Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	8fc688c303	radeonsi: don't generate "ret void undef" Use LLVMBuildRetVoid in epilogs and the GS copy shader and si_llvm_build_ret otherwise. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-14 20:22:16 +02:00
Marek Olšák	4eea710b0d	radeonsi: try to hit direct hw MSAA resolve by changing micro mode in clear We could also do MSAA resolve in a compute shader like Vulkan and remove these workarounds. v2: comment the magic numbers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	373060652c	radeonsi: clarify the MSAA resolve limitation with scanout this is the correct hw requirement Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	789618e3b4	gallium/radeon: add micro_tile_mode to radeon_surf for easier access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Gurchetan Singh	63c5d5c6c4	Added pbuffer hooks for surfaceless platform This change enables the creation of pbuffer surfaces on the surfaceless platform. v3: Going back to single-buffered pbuffer plus additional code review changes Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-14 08:51:02 -07:00
Roland Scheidegger	afbf5888f5	gallium/util: don't use blocksize for minify for assertions The previous assertions required for texture sizes smaller than block_size that src_box.x + src_box.width still be block size. (e.g. for a texture with width 3, and src_box.x = 0, src_box.width would have to be 4 to not assert.) This caused some assertions with some other state tracker. It looks though like callers aren't expected to round up widths to block sizes (for sizes larger than block size the assertion would still have verified it wouldn't have been rounded up) so we simply shouldn't use a minify which rounds up to block size. (No piglit change with llvmpipe.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-14 17:03:34 +02:00
Roland Scheidegger	f4184d5450	llvmpipe: hack-fix bugs due to bogus bind flags The gallium contract would be that bind flags must indicate all possible bindings a resource might get used, but fact is the mesa state tracker does not set bind flags correctly, and this is more or less unfixable due to GL. This caused a bug with piglit arb_uniform_buffer_object-rendering-dsa since `6e6fd911da` - the commit is correct, but it caused us to miss updates to fs UBOs completely, since the corresponding buffer didn't have the appropriate bind flag set (thus we wouldn't check if it is indeed currently bound). See the discussion about this starting here: https://lists.freedesktop.org/archives/mesa-dev/2016-June/119829.html So, update the bind flags when we detect such usage. Note we update this value for now only in places which matter for us - that is creating sampler/surface view, or binding constant buffer. There's plenty more places (setting streamout buffers, vertex/index buffers, ...) where things can be set with the wrong bind flags, but the bind flags there never matter. While here also make sure we only set dirty constant bit when it's a fs constant buffer - totally doesn't matter if it's vs/gs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-14 17:03:34 +02:00
Rob Clark	243417810b	freedreno: support start param for sampler views/states Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-14 11:00:59 -04:00
Rob Clark	b8eb1493a9	freedreno: only do extra vertex-buffer state logic on a2xx Possibly this should move into an fd2 wrapper fxn, similar to the texture state tracking done for fd3/fd4 (clamp emulation, etc) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-14 11:00:59 -04:00
Rob Clark	26d0efa9ce	freedreno: use util_copy_constant_buffer() helper Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-14 11:00:59 -04:00
Nayan Deshmukh	fdec8f9e42	st/vdpau: replace 0.f and 1.f with 0.0f and 1.0f respectively Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-14 15:32:04 +01:00
Tomasz Figa	e7ab358e81	i965: Check return value of screen->image.loader->getBuffers (v2) The images struct is an uninitialized local variable on the stack. If the callback returns 0, the struct might not have been updated and so should be considered uninitialized. Currently the code ignores the return value, which (depending on stack contents) might end up in reading a non-zero value from images.image_mask and dereferencing further fields. Another solution would be to initialize image_mask with 0, but checking the return value seems more sensible and it is what Gallium is doing. v2: fix typos in commit message, fix indentation, remove unnecessary parentheses and pointer dereference to keep line length reasonable. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-14 15:32:04 +01:00
Michel Dänzer	9ee3f097b6	st/dri: Clear drawable texture_mask in dri2_invalidate_drawable This makes sure that dri_set_tex_buffer2 -> dri_drawable_validate_att will re-create the front left attachment buffer after the drawable got invalidated. Fixes window contents not updating until the window is resized when using DRI2 PRIME. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-14 18:16:54 +09:00
Eduardo Lima Mitev	a93bb2e33f	glsl/builtin_variables: Populate MaxCombinedShaderStorageBlocks on GLSL 4.40 Built-in variable "MaxCombinedShaderStorageBlocks" was added to GLSL 4.40 revision 9. Section "1.2.1 Changes since revision 8 of GLSL version 4.40", page 3 of the PDF states: "Bug 11734: Add gl_MaxCombinedShaderOutputResources and mark gl_MaxCombinedImageUnitsAndFragmentOutputs as deprecated." Fixes: GL44-CTS.shader_image_load_store.basic-glsl-const Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-14 10:21:26 +02:00
Julien Isorce	1cdb4da1d6	st/va: ensure linear memory for dmabuf In order to do zero-copy between two different devices the memory should not be tiled. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-14 08:40:33 +01:00
Dylan Baker	5a87bc7181	isl: Replace bash generator with python generator This replaces the current bash generator with a python based generator using mako. It's quite fast and works with both python 2.7 and python 3.5, and should work with 3.3+ and maybe even 3.2. It produces an almost identical file except for minor layout changes and the addition of a "generated file, do not edit" warning. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 22:40:52 -07:00
Mathias Fröhlich	ed2dae86ae	mesa: Make use of u_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-14 05:19:10 +02:00
Mathias Fröhlich	c3b6656676	mesa/gallium: Move u_bit_scan{,64} from gallium to util. The functions are also useful for mesa. Introduce src/util/bitscan.{h,c}. Move ffs function implementations from src/mesa/main/imports.{h,c}. Move bit scan related functions from src/gallium/auxiliary/util/u_math.h. Merge platform handling with what is available from within mesa. v2: Try to fix MSVC compile. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-14 05:19:10 +02:00
Aaron Watry	fafe026dbe	clover: Include generated sources in AM_CPPFLAGS git_sha1.c is generated in $(top_builddir)/src. Fixes out-of-tree builds since `4825264f75`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-14 12:04:42 +09:00
Stephan Bergmann	0140938b26	nv50/ir: make Graph destructor virtual Avoid ASan new-delete-type-mismatch when Function::domTree is created as DominatorTree in Function::convertToSSA but destroyed only as base Graph in ~Function. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-13 22:55:11 -04:00
Jason Ekstrand	be32a21327	i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable This was removed in `d9546b0c5d` and replaced with the precise_trig driconf option. However, we still need precise trig in the Vulkan driver so this commit brings back the environment variable and compiler->precise_trig is effectively the logical OR of the two. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96484 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 19:54:47 -07:00
Samuel Iglesias Gonsálvez	a0ed8503b7	i965: Defeat the register stride checker in pull uniform messages. Pulling DF uniforms from pull constant buffer generates messages like: send(4) g12<1>DF g12<0,1,0>F sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1 which produces GPU hangs in Cherryview/Braswell: "For 64-bit Align1 operation or multiplication of dwords in CHV, source horizontal stride must be aligned to qword." This seems to be documented in the Cherryview PRM, Volume 7, Page 843: "When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: 1. Source and Destination horizontal stride must be aligned to the same qword." We should set the destination type to UD, D, or F so that the register stride checker doesn't notice. The destination type of send messages is basically irrelevant anyway. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 19:36:59 -07:00
Kenneth Graunke	ed3ba651f6	i965: Defeat the register stride checker in URB reads. Pulling DF inputs from the URB generates messages like: send(8) g23<1>DF g1<8,8,1>UD urb 3 SIMD8 read mlen 1 rlen 2 { align1 1Q }; which makes the simulator angry: "For 64-bit Align1 operation or multiplication of dwords in CHV, source horizontal stride must be aligned to qword." This seems to be documented in the Cherryview PRM, Volume 7, Page 823: "When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: 1. Source and Destination horizontal stride must be aligned to the same qword." Setting the source horizontal stride to QWord is insane, as it's the message header containing 8 URB handles in a single 32-bit DWord. Instead, we should whack the destination type to UD, D, or F so that the register stride checker doesn't notice. The destination type of send messages is basically irrelevant anyway. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 19:36:46 -07:00
Kenneth Graunke	9f37df06da	i965: Fix issues with number of VS URB entries on Cherryview/Broxton. Cherryview/Broxton annoyingly have a minimum number of VS URB entries of 34, which is not a multiple of 8. When the VS size is less than 9, the number of VS entries has to be a multiple of 8. Notably, BLORP programmed the minimum number of VS URB entries (34), with a size of 1 (less than 9), which is invalid. It seemed like this could be a problem in the regular URB code as well, so I went ahead and updated that to be safe. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 19:35:52 -07:00
Timothy Arceri	b010fa8567	glsl: make sure UBO arrays are sized in ES This check was removed in `5b2675093e`; add it back in. Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> https://bugs.freedesktop.org/show_bug.cgi?id=96349	2016-06-14 11:33:24 +10:00
Vedran Miletić	4825264f75	clover: Update OpenCL version string to match OpenGL Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For both, always append git version suffix from git_sha1.h. v5: move semicolon to same line as MESA_GIT_SHA1. v4: drop #ifdef guards. v3: add missing include. v2: change CL_DEVICE_VERSION as well. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-06-13 15:55:59 -07:00
Francisco Jerez	bd9f972651	i965/fs: Fix regs_written for SIMD-lowered instructions some more. ISTR having suggested this during review of the recent FP64 changes to the SIMD lowering pass, but it doesn't look like it was taken into account in the end. Using the fs_reg::component_size helper instead of this open-coded variant makes sure that the stride is taken into account correctly. Fixes at least the following piglit tests with spilling forced on (since otherwise regs_written would be calculated incorrectly and the spilling code would be rather confused about how much data needs to be spilled): spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 15:55:59 -07:00
Francisco Jerez	a84b5d43e2	i965: Fix cross-primitive scratch corruption when changing the per-thread allocation. I haven't found any mention of this in the hardware docs, but experimentally what seems to be going on is that when the per-thread scratch slot size is changed between two pipelined draw calls, shader invocations using the old and new scratch size setting may end up being executed in parallel, causing their scratch offset calculations to be based in a different partitioning of the scratch space, which can cause their thread-local scratch space to overlap leading to cross-thread scratch corruption. I've been experimenting with alternative workarounds, like emitting a PIPE_CONTROL with DC flush and CS stall between draw (or dispatch compute) calls using different per-thread scratch allocation settings, or avoiding reuse of the scratch BO if the per-thread scratch allocation doesn't exactly match the original. Both seem to be as effective as this workaround, but they have potential performance implications, while this should be basically for free. Fixes over 40 failures in our CI system with spilling forced on (including CTS, dEQP and Piglit failures) on a number of different platforms from Gen4 to Gen9. The 'glsl-max-varyings' piglit test seems to be able to reproduce this bug consistently in the vertex shader on at least Gen4, Gen8 and Gen9 with spilling forced on. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 15:55:58 -07:00
Francisco Jerez	d960284e44	i965: Keep track of the per-thread scratch allocation in brw_stage_state. This will be used to find out what per-thread slot size a previously allocated scratch BO was used with in order to fix a hardware race condition without introducing additional stalls or memory allocations. Instead of calling brw_get_scratch_bo() manually from the various codegen functions, call a new helper function that keeps track of the per-thread scratch size and conditionally allocates a larger scratch BO. v2: Handle BO allocation manually instead of relying on brw_get_scratch_bo (Ken). Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 15:55:58 -07:00
Francisco Jerez	013ae4a70a	i965: Fix scratch overallocation if the original slot size was already a power of two. The bitwise arithmetic trick used in brw_get_scratch_size() to clamp the scratch allocation to 1KB has the unintended side effect that it will cause us to allocate 2x the required amount of scratch space if the original per-thread scratch size happened to be already a power of two. Instead use the obvious MAX2 idiom to clamp the scratch allocation to the expected range. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 15:55:58 -07:00
Kenneth Graunke	2df8f4a253	mesa: Make TexSubImage check negative dimensions sooner. Two dEQP tests expect INVALID_VALUE errors for negative width/height parameters, but get INVALID_OPERATION because they haven't actually created a destination image. This is arguably not a bug in Mesa, as there's no specified ordering of error conditions. However, it's also really easy to make the tests pass, and there's no real harm in doing these checks earlier. Fixes: dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_width_height dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texsubimage3d_neg_width_height v2: Drop redundant check (caught by Anuj Phogat). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-06-13 15:38:47 -07:00
Brian Paul	cf9bb9acac	util: update some assertions in util_resource_copy_region() To cope with copies of compressed images which are not multiples of the block size. Suggested by Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-13 13:30:19 -06:00
Kenneth Graunke	5a0d294d38	i965: Fix encode_slm_size() to take a generation, not a device info. In the Vulkan driver, we have the generation number (a compile time constant) but not necessarily the brw_device_info struct. I meant to rework the function to take a generation number instead of a brw_device_info pointer to accommodate this. But I forgot, and left it taking a brw_device_info pointer, while making Vulkan pass the generation number (8, 9, ...) directly. This led to crashes. Brown paper bag fix for commit `87d062a940`. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 12:23:11 -07:00
Kenneth Graunke	667e5cec76	i965: Don't leak scratch BOs for TCS/TES. These need to be freed too. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 12:22:06 -07:00
Nanley Chery	a4a5917248	anv/pipeline: Don't dereference NULL dynamic state pointers Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts of pCreateInfo members are moved to the earliest points at which they should not be NULL. This fixes a segfault seen in the McNopper demo, VKTS_Example09. v3 (Jason Ekstrand): - Fix disabled rasterization check - Revert opaque detection of color attachment usage Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 11:35:45 -07:00
Nanley Chery	a0d84a9ef9	anv: Document and rename anv_pipeline_init_dynamic_state() To reduce confusion, clarify that the state being copied is not dynamic. This agrees with the Vulkan spec's usage of the term. Various sections specify that the various pipeline state which have VkDynamicState enums (e.g. viewport, scissor, etc.) may or may not be dynamic. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 11:35:45 -07:00
Samuel Pitoiset	7f257abc1b	nvc0/ir: clamp the UBO index for compute on Kepler We already check that the address is not "too far", but we should also clamp the UBO index in order to avoid looking at the wrong place in the driver cb. This is a pretty rare situation though. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 20:12:48 +02:00
Marek Olšák	6e1b12c788	radeonsi: enable scratch coalescing This makes one particular compute shader 8x faster. Latest LLVM git is required. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-13 18:13:51 +02:00
Jimmy Berry	0c0f841e5d	st/va: hardlink driver instances to gallium_drv_video.so Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is consistent with vdpau and general gallium drivers. Note: some versions of libva can detect the gallium name and use the backend, although that behaviour seems inconsistent since it only works for some platforms/backends. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Jan Vesely	1fb4179f92	vl: Fix trivial sign compare warnings v2: add whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Jose Fonseca <jfonseca@vmware.com> [Emil Velikov: squash a few more whitespace issues] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Rob Herring	112e988329	Android: move libdrm settings to top-level Android.common.mk Fix warnings like these due to HAVE_LIBDRM being inconsistently defined: external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition] typedef struct drm_clip_rect drm_clip_rect_t; HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Rob Herring	54e550ab8a	Android: disable some noisy warnings Turn off warnings for -Wpointer-arith, -Wno-missing-field-initializers, -Wno-initializer-overrides, and -Wno-mismatched-tags. These are all deemed pointless, on purpose or no plans to fix. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Emil Velikov	db8790c0da	st/mesa: inline _mesa_create_context() into its only caller Inline the function into its only caller. This way it's more obvious how the classic and gallium drivers (st/mesa) use _mesa_initialize_context. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:29 +01:00
Emil Velikov	a4fa8bf819	st/mesa: remove unneeded break from st_api_create_context() We have return on the previous line, thus the break will never be reached. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	6406bc1592	st/mesa: use c99 initializer for st_gl_api Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	15bc7856bf	gallium: remove st_api::get_proc_address hook It has been unused for a long time, plus makes the gallium dri modules require an extra glapi symbol relative to their classic counterparts. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	23a7fca6aa	mesa: remove _mesa_init_get_hash() The actual code of the function print_table_stats() is guarded by an ifdef GET_DEBUG, which has not been defined in years. The last fix in 2013 (`7db6b5aa91`) indicates that it's rarely used/tested. Since the issue has gone unnoticed for a whole year (broken with `2ad4a47547`), let's remove it for now. We can always revive it at a later stage. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	b81685eb32	mesa: kill off _mesa_do_init_remap_table() ... and inline its contents in _mesa_init_remap_table(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	bfbf286f7d	mesa: use native types when possible All of the functions and related data are internal, so there's no point in using the GL types. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	3f80c95f35	mesa: make _mesa_map_function_spec() static Used only locally. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	390678f27d	mesa: remove unused _mesa_get_function_spec() and gl_function_remap Final user was killed with last commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	5b700059a8	mesa: remove unused _mesa_map_function_array() Unused as of commit `5a175127f3` ("dri: Remove all extension enabling utility functions") and the patch before the previous patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	5378ee8187	glapi: remap_helper.py: remove MESA_alt_functions The final user was nuked with last commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	b5dd8e0cf8	mesa: remove unused function _mesa_map_static_functions() Unused as of commit `5a175127f3` ("dri: Remove all extension enabling utility functions") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	07ae8c7df7	dri/common: remove unused libdri_test_stubs.la ... and associated file(s). No longer needed since commit `057259655e` ("i965: Don't link libmesa or libdri_test_stubs into tests") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-13 15:31:27 +01:00
Emil Velikov	fcb5a75a66	swr: automake: add missing -I flag When building from a release tarball (where the generated/built files are in srcdir) in an OOT fashion we need to have both builddir and srcdir in the includes list. Otherwise we'll error out, as the file (header gen_knobs.h in this case) won't be in the location where we are looking. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:24 +01:00
Emil Velikov	f4d26856df	automake: add SWR to `make distcheck' gallium drivers This will allow us to catch missing files and build issues before getting the tarball out for general consumption. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:24:44 +01:00
Emil Velikov	bab5ab6940	configure.ac: strip out the llvm-config -march/mtune flags Otherwise drivers such as SWR that depend on providing their own values will fail to build. v2: Add -mcpu for good measure (Chuck) Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-06-13 15:24:44 +01:00
Chuck Atkins	c86fcaca72	swr: Add missing headers for package inclusion CC: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:24:44 +01:00
Emil Velikov	8229fe68b5	automake: get in-tree `make distclean' working again. With an earlier commit we handled `make distclean' for out-of-tree builds, yet we failed to account for the fact that for in-tree builds the test condition will return 1. Thus the target will effectively be considered as "failed". Fixes: `b7f7ec7843` ("mesa: automake: distclean git_sha1.h when building OOT") Cc: <mesa-stable@lists.freedesktop.org> Tested-by: Andy Furniss <adf.lists@gmail.com> Reported-by: Andy Furniss <adf.lists@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-13 15:24:44 +01:00
Jan Vesely	ace70aedcf	gallivm: Fix trivial sign warnings v2: include whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-13 09:23:09 -04:00
Julien Isorce	a04804746f	st/va: use proper temp pipe_video_buffer template Instead of changing the format on the existing template which makes error handling not nice and confuses coverity. CoverityID: 1337953 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-13 09:14:32 +01:00
Julien Isorce	6c43e0016e	st/va: it is valid to release the VABuffer of an exported resource pipe_resource_reference(&res, NULL) will decrement reference counting, i.e. p_atomic_dec(res->count). But the va surface still has the initial reference since it has created the resource. So calling vaDestroyImage on a derived image calls VaDestroyBuffer but the decrementation won't reach 0. It is just wrong for vlVaDestroyBuffer to rely on the export_refcount flag. Finally the vaapi intel driver has the same logic. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-13 09:14:32 +01:00
Timothy Arceri	30df78236c	glsl: fix component overlap validation for doubles This change makes sure to remove arrays when checking if type is a double. The check for the end of the first slot of a multi-slot double is also fixed by bumping the check to 4 rather than 3. Previously we were not reserving the last component. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-12 21:56:32 +10:00
Timothy Arceri	ad3def919e	glsl: fix max varyings count for ARB_enhanced_layouts Since this extension allows more than one varying to share a single location we can't just count the number of slots a varying takes and add it to the total. Instead we now reuse the reserved varyings bitfield to determine how many slots are reserved for explicit locations instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-12 21:56:28 +10:00
Kenneth Graunke	0fb85ac08d	i965: Use the correct number of threads for compute shaders. We were programming the number of threads per subslice, when we should have been programming the total number of threads on the GPU as a whole. Thanks to Curro and Jordan for helping track this down! On Skylake GT3e: - Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x. - Improves performance in Synmark's Gl43CSDof by roughly 3.7x. - Improves performance in Synmark's Gl43GSCloth by roughly 1.18x. On Broadwell GT2: - Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x. - Improves performance in Synmark's Gl43CSDof by roughly 2.0x. - Improves performance in Synmark's Gl43GSCloth by 1.47035% +/- 0.255654% (n=25). On Haswell GT3e: - Improves performance in Unreal's Elemental Demo (in GL 4.3 mode) by roughly 1.10x. - Improves performance in Synmark's Gl43CSDof by roughly 1.18x. - Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/- 0.432771% (n=64). On Ivybridge GT2: - Improves performance in Unreal's Elemental Demo (in GL 4.2 mode) by roughly 1.03x. - Improves performance in Synmark's Gl43CSDof by roughly 1.25x. - No change in Synmark's Gl43CSCloth (n=28). Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:15 -07:00
Kenneth Graunke	1db37ebecf	i965: Assert that the scratch spaces are in range. I don't know that anything actually guarantees this, but if we exceed the limits, we may end up overflowing and trashing random buffers that happen to be nearby in the VMA space, leading to rendering corruption, hangs, or worse. We should really fix this properly. However, the pitfall has existed for ages, so for now we should at least detect it. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:15 -07:00
Kenneth Graunke	a42a93dc12	i965: Fix CS scratch size calculations on Ivybridge and Baytrail. These are linear, not powers of two, and much more limited. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:14 -07:00
Kenneth Graunke	147a90d82a	i965: Fix Haswell CS per-thread scratch space encoding. Most scratch stages use power of two sizes, in kilobytes, where 0 means 1kB. But compute shaders on Haswell have a minimum of 2kB, and use a representation where 0 = 2kB. This meant that we were effectively telling the hardware to allocate each thread twice as much space as we meant to, while simultaneously not allocating that much space in the buffer, leading to overflows. Note that the existing code is completely wrong for Ivybridge, but that will take additional work to sort out, so I've left it as is for now. A subsequent commit will take care of that. Together with the previous patches, this fixes rendering corruption on Synmark's Gl43CSDof on Haswell. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:14 -07:00
Kenneth Graunke	a7d029d3df	i965: Account for poor address calculations in Haswell CS scratch size. Curro figured this out by investigating the simulator. Apparently there's also a workaround in the Windows driver. I'm not sure it's actually documented anywhere. We were underallocating the scratch buffer by a factor of 128/70. v2: Rename threads_per_subslice to scratch_ids_per_subslice (suggested by Jordan Justen). Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:39:45 -07:00
Kenneth Graunke	2213ffdb4b	i965: Allocate scratch space for the maximum number of compute threads. We were allocating enough space for the number of threads per subslice, when we should have been allocating space for the number of threads in the entire GPU. Even though we currently run with a reduced thread count (due to a bug), we might still overflow the scratch buffer because the address calculation is based on the FFTID, which can depend on exactly which threads, EUs, and threads are executing. We need to allocate enough for every possible thread that could run. Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+. Earlier platforms need additional bug fixes. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:38:50 -07:00
Kenneth Graunke	9cd8f95809	i965: Set subslice_total on Gen7/7.5 platforms. We'll use this for compute shader thread counts and scratch space calculations shortly. Note that subslices are referred to as "half slices" on Ivybridge. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:38:47 -07:00
Kenneth Graunke	87d062a940	i965: Fix shared local memory size for Gen9+. Skylake changes the representation of shared local memory size. Size: 0 kB, 1 kB, 2 kB, 4 kB, 8 kB, 16 kB, 32 kB, 64 kB. Gen7-8 encoding: 0, none, none, 1, 2, 4, 8, 16. Gen9+ encoding: 0, 1, 2, 3, 4, 5, 6, 7. The old formula would substantially underallocate the amount of space. This fixes GPU hangs on Skylake when running with full thread counts. v2: Fix the Vulkan driver too, use a helper function, and fix the table in the comments and commit message. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:38:26 -07:00
Ilia Mirkin	3f48548a6f	nv50: reinstate dedicated constbuf push path This was disabled due to occasionally incorrect behavior when trying to upload data. It later became apparent that nvc0 also had a similar but slightly different issue, which was resolved in commit `e50c01d5`. This takes the same logic as nvc0 and applies it to nv50 (which has somewhat different interfaces). Unfortunately I did not note down precisely what was broken with UBOs when removing the support from nv50, but I've tested a bunch of local traces, and none of them appear to regress. This should hopefully improve performance when UBOs are used, but this was not directly verified. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-11 12:18:43 -04:00
Ilia Mirkin	f47845596b	nv50: enable indirect addressing of fragment shader inputs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-11 11:50:42 -04:00
Ilia Mirkin	7d7e015381	mesa: add drawbuffer argument to ClearNamedFramebufferfi This was fixed in revision 47 of the ARB_dsa spec on Oct 22, 2015. Since it's horrible to have differing APIs across library versions, we should attempt to minimize the impact by backporting it as far as possible and hope no one notices. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 20:32:03 -04:00
Ilia Mirkin	92351a71a8	GL: update glcorearb.h to svn 32433 This brings in the fixed glClearNamedFramebufferfi definition, as well as a lot of GLsizei -> GLsizeiptr changes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 20:31:53 -04:00
Ilia Mirkin	f81374fd3e	GL: update glext to svn 32957 This brings in defines from GL_EXT_window_rectangles and fixes the glClearNamedFramebufferfi definition. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 20:24:53 -04:00
Brian Paul	5cfc91624c	docs: GL_ARB_copy_image done for softpipe, llvmpipe Signed-off-by: Brian Paul <brianp@vmware.com>	2016-06-10 15:50:55 -06:00
Brian Paul	e9b86bb92c	llvmpipe: turn on pipe cap for GL_ARB_copy_image support Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	2db747cf26	llvmpipe: don't use 3-component formats, except 32-bit x 3 formats This basically disallows all 8-bit x 3 and 16-bit x 3 formats for textures and render targets. Some 3-component formats were already disallowed before. This avoids problems with GL_ARB_copy_image. v2: the previous version of this patch disallowed all 3-component formats Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	672e92a146	softpipe: turn on pipe cap for GL_ARB_copy_image support Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	d8fe6332d8	softpipe: don't use 3-component formats Mesa and gallium don't have a complete set of matching 3-component texture formats. For example, 8-bit sRGB unorm. To fully support the GL_ARB_copy_image extension we need to have support for all of these formats: RGB8_UNORM, RGB8_SNORM, RGB8_SRGB, RGB8_UINT, and RGB8_SINT using the same component order. Since we don't have that, disable the 3-component formats for now. v2: Simplify 3-component format check, per Marek. Also check that target != PIPE_BUFFER. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	e295b4e800	st/mesa: tweak surface format mapping table 1. Try to choose R8G8B8A8 unorm/srgb formats before others in an effort to try to match component ordering for UINT/SINT/etc. 2. If we can't get a format such as PIPE_FORMAT_A16_UNORM, try PIPE_FORMAT_R16G16B16A16_UNORM before shallower formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	dd4be2e19a	util: update util_resource_copy_region() for GL_ARB_copy_image This primarily means added support for copying between compressed and uncompressed formats. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Anuj Phogat	466b320163	gallium: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1) to the destination rectangle bounded by the locations (dstX0, dstY0) and (dstX1, dstY1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. [ASCII diagram: adjacent rectangles that share only an edge] Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 14:35:21 -07:00
Anuj Phogat	f8679badd4	mesa: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1) to the destination rectangle bounded by the locations (dstX0, dstY0) and (dstX1, dstY1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. [ASCII diagram: adjacent rectangles that share only an edge] Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 14:35:21 -07:00
Dave Airlie	1584918996	gallivm: more 64-bit integer prep work. This converts one other place to using the new helper. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:30 +10:00
Dave Airlie	f550b6d296	radeonsi: convert to 64-bitness checks instead of doubles. This converts to testing for 64-bit types and renames some things in anticipation of 64-bit integer support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:21 +10:00
Dave Airlie	e5c57824ec	gallivm: make non-float return code bitcast consistent. This just uses the same form across the fetches. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:17 +10:00
Dave Airlie	3b97e50b9a	gallium/gallivm: use 64-bit test instead of doubles. This just makes some generic code that currently emits double suitable for emitting 64-bit values. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:13 +10:00
Dave Airlie	213ab8db87	gallium/tgsi: add 64-bitness type check function. Currently this just checks for doubles, but we'll convert users to this to make adding 64-bit integers easier. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:43:45 +10:00
Jason Ekstrand	8d37556ec9	anv/entrypoints: Rework #if guards This reworks the #if guards a bit. When Emil originally wrote them, he just guarded everything. However, part of what anv_entrypoints_gen.py generates is a hash table for looking up entrypoints based on their name. This table cannot get out of sync between C and python regardless of preprocessor flags. In order to prevent this, this commit makes us use void pointers in the dispatch table for those entrypoints which aren't available. This means that the dispatch table size and entry order is constant and it should never get out-of-sync with the python. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 13:21:07 -07:00
Jason Ekstrand	9ed0d9dd06	anv/entrypoints: Use the function pointer types provided by vulkan.h This is a bit cleaner than generating the types ourselves when making the table. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 13:21:07 -07:00
Nicolai Hähnle	42624ea837	st/mesa: use base level size as "guess" when available When an application specifies mip levels _before_ setting a mipmap texture filter, we will initially guess a single texture level. When the second level image is created, we try to allocate the full texture -- however, we get the base level size guess wrong if that size is odd. This leads to yet another re-allocation of the texture later during st_finalize_texture. Even worse, this re-allocation breaks a (reasonable) assumption made by st_generate_mipmaps, because the re-allocation in the finalization call will again allocate a single-level pipe texture (based on the non-mipmap texture filter!). As a result, mipmap generation fails in interesting ways. All of this can be avoided by just using the fact that we already know the size of the base level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-10 20:20:39 +02:00
Jason Ekstrand	a1e69930e4	anv: Remove the PhysicalDeviceLimits FINISHME At this point, the limits are probably more-or-less correct. If there is an invalid limit, that's a bug not a FINISHME. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:45 -07:00
Jason Ekstrand	4f5bbf804b	anv/pipeline_cache: Allow for a zero-sized cache This gets ANV_ENABLE_PIPELINE_CACHE=false working again. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:10 -07:00
Jason Ekstrand	a1a25db699	anv/pipeline: Store the (set, binding, index) triple in the bind map This way the bind map (which we're caching) is mostly independent of the pipeline layout. The only coupling remaining is that we pull the array size of a binding out of the layout. However, that size is also specified in the shader and should always match so it's not really coupled. This fixes rendering issues in Dota 2. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:07 -07:00
Jason Ekstrand	c13c5ac561	anv/descriptor_set: Ensure that bindings are always in increasing order Since applications are allowed to specify some set of bindings which need not be dense they also need not be in order. For most things, this doesn't matter, but it could result in getting the wrong dynamic offsets. This adds a quick-and-dirty sort to ensure that everything is always in increasing order of binding index. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:03 -07:00
Jason Ekstrand	e2265926f2	anv/descriptor_set: Add a type field in debug builds This allows for some extra validation and makes it easier to see what's going on when poking around in gdb. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:42:59 -07:00
Jason Ekstrand	cd21015abd	anv/descriptor_set: Set array_size to zero for non-existent descriptors Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:42:45 -07:00
Leo Liu	2ad443e4cc	vl/dri3: support receiving new pixmap for front buffer With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets renewed in each frame, so when we receive a new pixmap, should get a new front buffer for it. This also fixes Totem player playback corruption. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 11:24:24 -04:00
Leo Liu	0ef8500aab	vl/dri3: get Makefile properly In the original commit, the macro "if HAVE_DRI3" was in Makefile.sources. This file is shared with SCons, and SCons is not able to parse this macro, so the SCons build failed. Jose quickly proposed two approaches and pushed a quick fix using the second one; thanks Jose for the solutions and fixes. This patch implements Jose's first approach, which is more proper because the dri3 C file should not be included in the build when DRI3 is not enabled. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 11:24:19 -04:00
Jose Fonseca	2b4cee0571	gallivm: Never emit llvm.fmuladd on LLVM 3.3. Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't handle llvm.fmuladd and instead falls back to a C function, which in turn causes a segfault on Windows. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 16:17:04 +01:00
Jose Fonseca	320d1191c6	gallivm: Use llvm.fmuladd.*. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 13:47:35 +01:00
Jose Fonseca	9e8edfa190	util,gallivm: Explicitly enable/disable fma attribute. As suggested by Roland Scheidegger. Use the same logic as f16c, since fma requires VEX encoding. But disable FMA on LLVM 3.3 without MCJIT. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 13:47:35 +01:00
Bas Nieuwenhuizen	54f755fa0f	radeonsi: Reinitialize all descriptors in CE preamble. This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed. To illustrate, suppose we have two graphics IBs, 1 and 2, which are submitted in that order. Furthermore suppose IB 1 does not use CE RAM, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2. The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not loaded into CE RAM. Fix this by always restoring the entire CE RAM. v2: - Just load all descriptor set buffers instead of loading and storing the entire CE RAM. - Leave the ce_ram_dirty tracking in place for the non-preamble case. v3: - Fixed parameter alignment. - Rebased to master (Nicolai's descriptor series). Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 12:18:29 +02:00
Jose Fonseca	f93c22109e	mesa: Wrap extensions.h declarations with extern "C". This should fix the MSVC linker failures that arose with commit `5e2d25894b`. Trivial.	2016-06-10 11:00:42 +01:00
Ilia Mirkin	f48f344700	st/mesa: fix type confusion with reladdrs The reality is that this doesn't matter, because we manually emit the ARL to the sampler reladdr, and those arguments don't get an extra load later, so it's effectively just a boolean. However having the types be wrong is confusing and could trigger very odd bugs should usage change down the line. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-09 21:01:53 -04:00
Dave Airlie	f140ed6d95	glsl/ir: remove TABs in ir_constant_expression.cpp Adding 64-bit integers support was going to make this file worse, just remove the tabs from it now. Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-10 10:30:18 +10:00
Anuj Phogat	73a54e4892	i965/gen9: Don't change halign and valign to fit in fast copy blit An update in graphics specs has deleted the halign and valign fields from XY_FAST_COPY_BLT command. See mesa commit `97f0f91`. Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-06-09 15:50:07 -07:00
Anuj Phogat	46c8967813	mesa: Add a helper function for shared code in get_tex_rgba_{un}compressed Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-09 15:50:07 -07:00
Samuel Pitoiset	5e2d25894b	mesa: Let compute shaders work in compatibility profiles The extension is already advertised in compatibility profile, but the _mesa_has_compute_shaders only returns true in core profile. If we advertise it, we should allow it to work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-06-09 21:03:28 +02:00
Tim Rowley	2c85128e01	swr: implement clipPlanes/clipVertex/clipDistance/cullDistance v2: only load the clip vertex once v3: fix clip enable logic, add cullDistance v4: remove duplicate fields in vs jit key, fix test of clip fixup needed v5: fix clipdistance linkage for slot!=0,4 v6: support clip+cull; passes most piglit clip (failures understood) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-09 13:28:35 -05:00
Daniel Czarnowski	cf804b4455	glx: fix crash with bad fbconfig GLX documentation states: glXCreateNewContext can generate the following errors: (...) GLXBadFBConfig if config is not a valid GLXFBConfig Function checks if the given config is a valid config and sets proper error code. Fixes currently crashing glx-fbconfig-bad Piglit test. v2: coding style cleanups (Emil, Topi) use DefaultScreen macro (Emil) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-06-09 17:55:44 +03:00
Nayan Deshmukh	2d140ae70a	st/vdpau: implement luma keying Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-09 14:23:24 +02:00
Nayan Deshmukh	f24eb5a178	vl: Apply luma key filter before CSC conversion Apply the luma key filter to the YCbCr values during the CSC conversion in video buffer shader. The initial values of max and min luma are set to opposite values to disable the filter initially and will be set when enabling it. Add extra parameters min and max luma for the luma key filter in vl_compositor_set_csc_matrix in va, xvmc. Setting them to opposite value 1.f and 0.f respectively won't affect the CSC conversion. v2: -Squash 1,2 and 3 into one patch to avoid breaking build of other components. (Christian) -use ureg_swizzle. (Christian) -change name of the variables. (Christian) v3: -Squash all patches in one to avoid breaking of build. (Emil) -wrap functions properly. (Emil) -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil) v4: -Divide it in two patches one which introduces the functionality and assigns dummy values to the changed functions and second which implements the lumakey filter. (Christian) -use ureg_scalar instead ureg_swizzle. (Christian) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-09 14:23:07 +02:00
Jason Ekstrand	037ce5d734	i965: Emit surface states for extra planes prior to gen8 When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8 but not for gen7 or earlier. It all works, we just need to emit the states for the extra planes. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-08 21:57:57 -07:00
Marc-André Lureau	dc81b3ad43	virgl: fix checking fences When calling virgl_fence_wait() with timeout=0, virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE for a busy resource, whereas virgl_fence_wait() should return TRUE for a completed (non-busy) resource. This fixes running supertuxkart in a VM (I could not reproduce locally with vtest though there is a similar fix) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 14:07:53 +10:00
Dave Airlie	15896a470b	glsl/types: rename is_dual_slot_double to is_dual_slot_64bit. In the future int64 support will have the same requirements. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 09:17:24 +10:00
Dave Airlie	45c901f7a3	st/glsl_to_tgsi: move to checking 64-bitness instead of double This uses the new types interfaces to check for 64-bit types, as futureproofing against int64 support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:49 +10:00
Dave Airlie	bbbc45b8e1	st/glsl_to_tgsi: use enum glsl_base_type instead of unsigned This is just some better type safety that I noticed while working on 64-bit integer support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:49 +10:00
Dave Airlie	152f5eea62	mesa: use new 64-bit checks instead of explicit double checks. This just moves to the new interfaces in advance of int64. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:47 +10:00
Dave Airlie	2df46519e4	glsl/link_varyings: switch to 64bit check instead of double. This is prep work for int64 support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:43 +10:00
Dave Airlie	35616a9e0e	glsl: use new interfaces for 64-bit checks. This is just prep work for int64 support, changing places where 64-bit matters, not doubles. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:19 +10:00
Dave Airlie	a82b8e8b36	compiler: use 64bit check for sizing instead of double check. This just moves code to the new check in advance of int64 support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:15 +10:00
Dave Airlie	246518154e	compiler/types: add 64-bitness queries. This adds an inline and type query for if a type is 64-bit. For now this is equivalent to double, but int64 will change this. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:04 +10:00
Adam Jackson	a1c5cd426c	glapi/glx: Add overflow checks to the client-side indirect code Coverity complains that the computed sizes can lead to negative lengths passed to memcpy. If that happens we've been handed invalid arguments anyway, so just bomb out. The funky "0%s" is because the size string for the variable-length part of the request is of the form "+ safe_pad() ...", and a unary + would coerce the result to always be positive, defeating the overflow check. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-08 14:39:46 -04:00
Marek Olšák	26b69ad250	radeonsi: improve the computation and comment of scratch_waves 2% isn't much. If you think the number should be decreased, please speak up. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:28:25 +02:00
Marek Olšák	1d9c1d9386	radeonsi: print the number of spilled VGPRs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:28:25 +02:00
Marek Olšák	2b18d67a1e	gallium/radeon: remove dead code creating LLVMTargetMachine This was for some old unsupported LLVM version. Only si_create_context creates the target machine now. r600g doesn't use this function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:23:42 +02:00
Marek Olšák	a343ab55f7	radeonsi: don't enable scratch just for SGPR spills Diff from shader-db: Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave v2: add "break;" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:23:41 +02:00
Marek Olšák	55b097d004	st/mesa: try not to compile compute shader on the first use Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-08 19:23:41 +02:00
Marek Olšák	95288277d5	Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces" This reverts commit `ffd54d1936`. No, it doesn't work. The test case is "glxgears -samples 2".	2016-06-08 19:21:55 +02:00
Nicolai Hähnle	bd5c41fe5f	st/mesa: directly compute level=0 texture size in st_finalize_texture The width0/height0/depth0 on stObj may not have been set at this point. Observed in a trace that set up levels 2..9 of a 2d texture, and set the base level to 2, with height 1. This made the guess logic always bail. Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat redundant storage of width0/height0/depth0 and makes sure we always compute pipe texture sizes that are compatible with the base level image of the GL texture. Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul. v2: - try to re-use an existing pipe texture when possible - handle a corner case where the base level is not level 0 and it is of size 1x1x1 v3: - ptHeight = ptWidth in cube map 1x1 case (suggested by Brian) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-08 19:12:07 +02:00
Timothy Arceri	8c3ecde0e1	glsl: stop allocating memory for SSBOs and builtins This just stops counting and assigning a storage location for these uniforms, the count is only used to create the uniform storage. These uniform types don't use this storage. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-08 13:19:32 +10:00
Ilia Mirkin	6e6fd911da	st/mesa: use buffer usage history to set dirty flags for revalidation We were previously unconditionally doing this for arrays and ubo's, and ignoring texture/storage/atomic buffers. Instead use the usage history to determine which atoms need to be revalidated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-07 22:27:04 -04:00
Gurchetan Singh	d9546b0c5d	i965: Integrate precise trig into configuration infrastructure With this change, to enable precise SIN and COS instructions on Intel hardware, one can put <option name="precise_trig" value="true"/> in the proper drirc file. V2: Make option name more generic Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>	2016-06-07 15:42:21 -07:00
Marek Olšák	f39439d166	radeonsi: re-enable PBO ReadPixels acceleration disabled by `4f1cccf570` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 00:22:45 +02:00
Marek Olšák	7c6e88b643	radeonsi: allow MSAA resolving into a texture that has DCC enabled Since DCC is enabled almost everywhere now, it's important not to disable this fast path. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	9a472a3e0b	gallium/radeon: move DCC clearing into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	ffd54d1936	radeonsi: allow direct hw MSAA resolve for scanout surfaces No idea why this was disabled, but it works fine. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	4be46c7d9d	radeonsi: don't allocate DCC for the temporary MSAA resolve surface Allocating it has no effect, but it adds overhead (useless DCC clear). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	c06246501e	radeonsi: don't enable DCC in the sampler if first_level doesn't have it If first_level > 0 and DCC is disabled for that level, let's skip DCC reads entirely. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	00389100b6	winsys/amdgpu: enable DCC for mipmapped textures Also add dcc_fast_clear_size for clearing only the necessary subset of DCC. For no AA, it's equal to the size of the whole DCC level. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	c65361763c	gallium/radeon: don't disable DCC because of SDMA We want to keep DCC enabled to save bandwidth. It was a bad idea to disable it here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	2fd74a05bb	radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	aa7fe70443	radeonsi: add per-level dcc_enabled flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	60e93ddd06	radeonsi: compute DCC register parameters in si_emit_framebuffer_state This will get more complicated with mipmapped DCC or when DCC is enabled after allocation. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	a01536a29f	gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUT Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	d4d733e39d	gallium/radeon: don't allocate DCC for non-renderable texture formats R9G9B9E5 is the only uncompressed one hopefully. This fixes incorrect rendering not discovered (due to a lack of tests) until DCC mipmapping was enabled. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Nicolai Hähnle	b42bc90b6a	radeonsi: enable WQM in PS prolog when needed WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives. Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Cc: 12.0 <mesa-dev@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 23:46:02 +02:00
Nicolai Hähnle	d3a584defe	tgsi/scan: add uses_derivatives (v2) v2: - TG4 does not calculate derivatives (Ilia) - also handle SAMPLE* instructions (Roland) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-07 23:45:17 +02:00
Nanley Chery	b7a0c0ec7f	docs/devinfo: Expound on helpful extension tips Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-07 11:16:23 -07:00
Nanley Chery	9e7de50cab	docs/devinfo: Update bullet in stale extension guide Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-07 11:16:23 -07:00
Nanley Chery	26b0f023d7	docs/devinfo: Add closing paragraph tag Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-07 11:16:23 -07:00
Tim Rowley	87f0a0448f	swr: fix provoking vertex Use rasterizer provoking vertex API. Fix rasterizer provoking vertex for tristrips and quad list/strips. v2: make provoking vertex tables static const Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-07 11:47:52 -05:00
Ilia Mirkin	c81b090c92	st/mesa: revalidate image atoms when a texture is updated A texture may be redefined with _NEW_TEXTURE, which might have been bound to a shader image slot. We have to revalidate the image atoms to pick up on the new resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-07 10:18:34 -04:00
Ilia Mirkin	71ad8a173f	gk104/ir: fix conditions for adding a texbar Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account. Based on an earlier patch by Samuel Pitoiset. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>	2016-06-07 10:18:13 -04:00
Nicolai Hähnle	8239da28e8	radeonsi: keep track of dirty descriptor sets Reduces CPU load for draw calls that change none or few of the descriptors. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:10 +02:00
Nicolai Hähnle	d152c73712	radeonsi: move si_descriptors into a per-context array Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:07 +02:00
Nicolai Hähnle	a29c4f9ebd	radeonsi: pass shader stage to si_disable_shader_image Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:05 +02:00
Nicolai Hähnle	4e0fb72786	radeonsi: access descriptor sets via local variables This will simplify moving them to a per-context array. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:02 +02:00
Nicolai Hähnle	ba4a2840c7	radeonsi: add si_set_rw_buffer to be used for internal descriptors So that callers outside of si_descriptors.c need to worry less about the details of descriptor handling. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:59 +02:00
Nicolai Hähnle	c615a055f4	radeonsi: pass shader stage to si_set_shader_image Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:57 +02:00
Nicolai Hähnle	e6612a3e68	radeonsi: pass shader stage to si_set_sampler_view Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:55 +02:00
Nicolai Hähnle	c32cd4b78d	radeonsi: move descriptor set begin_new_cs handling into a separate function Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:39 +02:00
Nicolai Hähnle	031b57bc2f	radeonsi: move enabled_mask out of si_descriptors This mask is irrelevant for the generic descriptor set handling, and having it outside simplifies subsequent changes slightly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:23 +02:00
Jason Ekstrand	d1e141a661	anv/entrypoints: Stop using the C preprocessor Now that we emit guards for everything, we can just generate the files and trust build flags to keep us safe. This should also fix the tarball problems. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:25 +01:00
Jason Ekstrand	d1a53f91ee	anv/entrypoints: Emit #if guards for all platforms Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:25 +01:00
Haixia Shi	1ea233c6f3	platform_android: prevent deadlock in droid_swap_buffers To avoid blocking other EGL calls, release the display mutex before we enqueue buffer to android frameworks and re-acquire the mutex upon return. v2: moved lock/unlock inside droid_window_enqueue_buffer(). TEST=verify pinch zoom in Photos app no longer causes hangs Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:25 +01:00
Emil Velikov	b7f7ec7843	mesa: automake: distclean git_sha1.h when building OOT In the case of out-of-tree (OOT) builds, in particular when building from tarball, we'll end up with the file in both srcdir and builddir. We want the former to remain intact (since we need it on rebuild) while the latter should be removed otherwise `make distclean' gets angry at us. Ideally there'll be a solution that feels a bit less like a hack. Until then this does the job exactly as expected. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:23 +01:00
Emil Velikov	2c424e00c3	mesa: automake: ensure that git_sha1.h.tmp has the right attributes ... when copied from git_sha1.h. As the latter file can be lacking the write attribute, one should set it explicitly. Otherwise we'll get a warning/failure at cleanup stage. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:21:46 +01:00
Emil Velikov	359d9dfec3	mesa: automake: add directory prefix for git_sha1.h Otherwise the build will assume that we're talking about builddir, which is not the case in the else statement. Here the file is already generated and is part of the tarball. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:21:45 +01:00
Emil Velikov	1816c837c1	egl: android: don't add the image loader extension for !render_node With earlier commit we introduced support for render_node devices, which was coupled with the use of the image loader extension. As the work was inspired by egl/wayland we (erroneously) added the extension for the !render_node path as well. That works for wayland, as the implementations of the DRI2 and IMAGE loader extensions converge behind the scenes. As that is not yet the case for Android we shouldn't expose the extension. Fixes: `34ddef39ce` ("egl: android: add dma-buf fd support") Cc: <mesa-stable@lists.freedesktop.org> Reported-by: Mauro Rossi <issor.oruam@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-07 12:21:45 +01:00
Marek Olšák	095803a37a	gallium/radeon: add support for sharing textures with DCC between processes v2: use a function for calculating WORD1 of bo metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-07 11:12:26 +02:00
Marek Olšák	9e5b5fbde0	gallium/radeon: don't discard DCC if an external user can write to it We don't import textures with DCC now, but soon we will. v2: if we can't disable DCC for image writes, at least decompress DCC at bind time Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-07 11:12:26 +02:00
Dave Airlie	c6b14bafa4	i915: fix typo CAP. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-07 18:31:14 +10:00
Jakob Sinclair	b450f29073	glsl: initialise pointer to NULL Could cause issues if you tried to read from an uninitialised pointer. This just initialises the pointer to null to avoid that being a problem. Discovered by Coverity. CID: 1343616 Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-07 08:13:25 +02:00
Dave Airlie	c295923d13	i965/gen8: fix cull distance emission for tessellation shaders. This fixes some cases of: GL45-CTS.cull_distance.functional on Skylake. Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-07 11:52:17 +10:00
Ilia Mirkin	704bc0f0e9	nvc0: add support for VOTE tgsi opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-06 20:49:29 -04:00
Ilia Mirkin	f64c36e2d7	st/mesa: expose GL_ARB_shader_group_vote when supported by backend Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:29 -04:00
Ilia Mirkin	edfa7a4b25	gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:29 -04:00
Ilia Mirkin	30684b50d7	gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:28 -04:00
Ilia Mirkin	5189f0243a	mesa: hook up core bits of GL_ARB_shader_group_vote Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:48:46 -04:00
Kenneth Graunke	13b859de04	glsl: Make opt_copy_propagation_elements actually propagate into loops. We've had a FINISHME here since Eric originally wrote the code in 2011. This patch implements his suggested approach, which makes us actually able to copy propagate into the loops, at the unfortunate cost of making this pass even more expensive. The shader-db statistics are basically a wash: No change in instruction counts. total cycles in shared programs: 78685980 -> 78680730 (-0.01%) cycles in affected programs: 2102646 -> 2097396 (-0.25%) helped: 48 HURT: 83 I figured if we're going to do this for one copy propagation pass, we may as well do it in both. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-06 14:14:31 -07:00
Kenneth Graunke	0756e3a25c	glsl: Make opt_copy_propagation actually propagate into loops. We've had a FINISHME here since Eric originally wrote the code in 2010. This patch implements his suggested approach, which makes us actually able to copy propagate into the loops, at the unfortunate cost of making this pass even more expensive. The shader-db statistics are not terribly impressive: total instructions in shared programs: 9008589 -> 9008613 (0.00%) instructions in affected programs: 4293 -> 4317 (0.56%) helped: 0 HURT: 10 total cycles in shared programs: 78550978 -> 78575760 (0.03%) cycles in affected programs: 655426 -> 680208 (3.78%) helped: 75 HURT: 88 GAINED: 2 Most of the "regressions" appear to be us successfully copy propagating uniforms, which i965 is loading as pull constants instead of push, so we occasionally have two pulls instead of one. That doesn't seem like this pass's job - it's propagating correctly, and we should be smarter about pull loads in the backend. This patch is also useful for a couple of reasons: 1. It can clean up copies created by varying packing (previously, we couldn't if the uses were inside a loop). This fixes a bug when interpolateAt*() is used on a packed varying inside a loop: glsl_to_nir struggles to see through the extra copy and mistakenly believed the variable was not an input. 2. It will help propagate uniform array access created by lower_const_array_to_uniforms(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-06 14:14:31 -07:00
Samuel Pitoiset	08ddfe7b2f	nv50/ir: use round toward 0 when converting doubles to integers Like floats, we should use the round toward 0 mode instead of the nearest one (which is the default) for doubles to integers. This fixes all arb_gpu_shader_fp64 piglits which convert doubles to integers (16 tests). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 22:56:04 +02:00
Marek Olšák	00e6899ae5	gallium/radeon: don't re-set BO metadata after CMASK deallocation CMASK has no effect on metadata, because it's not sharable. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	589d6b58c3	st/mesa: change SQRT lowering to fix the game Risen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94627 (against nouveau) Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-06 22:50:55 +02:00
Marek Olšák	991cbfcb14	radeonsi: add a performance tweak for 4 SE parts Ported from Vulkan. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	2802310c25	radeonsi: simplify PRIMGROUP_SIZE computation for tessellation Ported from Vulkan. v2: keep the comment Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	014c8ec770	r600g: use hw MSAA resolve for non-trivial resolves This improves MSAA resolve performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	6b449783f6	radeonsi: use hw MSAA resolve for non-trivial resolves This improves MSAA resolve performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Dave Airlie	07403014c3	mesa/program_resource: return -1 for index if no location. The GL4.5 spec quote seems clear on this: "The value -1 will be returned by either command if an error occurs, if name does not identify an active variable on programInterface, or if name identifies an active variable that does not have a valid location assigned, as described above." This fixes: GL45-CTS.program_interface_query.output-built-in [airlied: use _mesa_program_resource_location_index as suggested by Eduardo] Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-07 06:10:19 +10:00
Nicolai Hähnle	ec2b52e2d9	radeonsi: set descriptor dirty mask on shader buffer unbind Found randomly while skimming the code. This might have caused VM faults in robustness tests. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-06 21:43:18 +02:00
Nicolai Hähnle	0f916d4ca7	st/mesa: fix resource leak in try_pbo_readpixels Found by inspection after seeing https://bugs.freedesktop.org/show_bug.cgi?id=96343 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-06 21:42:27 +02:00
Charmaine Lee	627e975896	tgsi: fix mixed data type comparison in tgsi_point_sprite.c Cast the unsigned semantic index to integer datatype before comparing to max_generic, otherwise, max_generic which is initialized to -1 will be converted to unsigned int before the comparison, causing a wrong semantic index to be assigned to a shader output. Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265) Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord. v2: use the original max_generic variable but add the (int) cast to the semantic index, as suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-06 10:20:45 -06:00
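The comparison pitfall described in the tgsi_point_sprite.c fix above can be sketched in isolation. This is a minimal illustration, not the Mesa code: the helper names `exceeds_max_buggy` and `exceeds_max_fixed` are hypothetical, and only the names `max_generic` and "semantic index" come from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the buggy pattern. When an unsigned value is
 * compared with an int, C converts the int to unsigned first. The explicit
 * cast below spells out that implicit conversion: -1 becomes UINT_MAX. */
static bool exceeds_max_buggy(unsigned semantic_index, int max_generic)
{
    return semantic_index > (unsigned)max_generic;
}

/* Hypothetical helper mirroring the fix: cast the unsigned index to int so
 * the comparison is done in signed arithmetic. */
static bool exceeds_max_fixed(unsigned semantic_index, int max_generic)
{
    assert(semantic_index <= INT_MAX); /* keep the cast well defined */
    return (int)semantic_index > max_generic;
}
```

With `max_generic` still at its initial value of -1, the buggy form reports that index 0 does not exceed the maximum, while the fixed form reports that it does.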
Charmaine Lee	304b5a1446	svga: print shader linkage info when tgsi debug bit is on When TGSI debug flag is enabled, print the shader linkage info as well. Tested with mesa demos with SVGA_DEBUG=tgsi Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-06 10:20:45 -06:00
Ilia Mirkin	4f1cccf570	st/mesa: check shader image format support before using PBO download ARB_shader_image_load_store only requires a very fixed list of formats to be supported, while textures may be in all kinds of formats, like BGRA which are presently not supported on at least Kepler. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-06 12:05:59 -04:00
Lars Hamre	4163c71010	tgsi: use truncf in micro_trunc Switches to using truncf in micro_trunc. Fixes the following piglit tests (for softpipe): /spec/glsl-1.30/execution/built-in-functions/... fs-trunc-float fs-trunc-vec2 fs-trunc-vec3 fs-trunc-vec4 vs-trunc-float vs-trunc-vec2 vs-trunc-vec3 vs-trunc-vec4 /spec/glsl-1.50/execution/built-in-functions/... gs-trunc-float gs-trunc-vec2 gs-trunc-vec3 gs-trunc-vec4 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-06 15:56:28 +02:00
Samuel Iglesias Gonsálvez	2b648ec17c	i965/gs/scalar: Fix load input for doubles Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 12:37:16 +02:00
Samuel Iglesias Gonsálvez	2d6f82a294	i965/fs: fix offset when loading double vector input varyings When we are not packing a double input varying, we might need to read its data in a non-aligned to 64-bit offset, so we read the wrong data. This is happening when using explicit locations in varyings because Mesa disables packing varying for that case. const_index is in 32-bit size units but offset() is multiplying it by destination type size units. When operating with double input varyings, const_index value could be not aligned to 64 bits. To fix it, we load the double vector as if it was a float based vector with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 12:37:16 +02:00
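The idea in the fix above, reading a double as two 32-bit components when its offset is only 32-bit aligned, can be shown with a CPU-side analogy. This is a rough sketch and not the i965 backend code: `load_double_at_dword`, `payload` and `dword_index` are hypothetical names, with `dword_index` playing the role of a `const_index` counted in 32-bit units.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch a double that starts at an arbitrary 32-bit
 * slot of a payload. The value is gathered as two 32-bit components (as if
 * it were a float vector with twice the component count) and reassembled,
 * so the slot index does not need to be 64-bit aligned. */
static double load_double_at_dword(const uint32_t *payload, unsigned dword_index)
{
    uint32_t halves[2] = { payload[dword_index], payload[dword_index + 1] };
    double value;

    assert(sizeof(value) == sizeof(halves));
    memcpy(&value, halves, sizeof(value));
    return value;
}
```

A double stored at slot 3 (byte offset 12) is read back correctly even though that offset is not a multiple of 8.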
Samuel Iglesias Gonsálvez	cb30727648	i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings Data starts at suboffset 3 in 32-bit units (12 bytes), so it is not 64-bit aligned and the current implementation fails to read the data properly. Instead, when there is a double input varying, read it as a vector of floats with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 12:37:16 +02:00
Dave Airlie	4c86399378	glsl: geom shader max_vertices layout must match. From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs". "all geometry shader output vertex count declarations in a program must declare the same count." Fixes: GL45-CTS.geometry_shader.output.conflicted_output_vertices_max Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 18:02:19 +10:00
Jason Ekstrand	ffcef720b7	anv/pipeline: Add support for caching the push constant map Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-06-06 00:44:32 -07:00
Dave Airlie	78659ade40	glsl: use enum glsl_interface_packing in more places. (v2) Although the glsl_types.h stores this in a bitfield, we should hide that from everyone else. Hide the cast in an accessor method and use the enum everywhere. This makes things a bit nicer in gdb, and improves type safety. v2: fix a few pieces of interface I missed that caused some piglit regressions. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-06 15:58:37 +10:00
Dave Airlie	ff2e569153	i965: don't use NumLayers for 3D textures. For 3D textures we shouldn't be using NumLayers, we need to get it from the depth. This fixes: GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 13:07:07 +10:00
Dave Airlie	1f66a4b689	glsl: for anonymous struct matching use without_array() (v3) With tessellation shaders we can have cases where we have arrays of anon structs, so make sure we match using without_array(). Fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in v2: test lengths match as well (Ilia) v3: descend array lengths to check for matches as well (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 12:54:41 +10:00
Dave Airlie	6702c15810	glsl/ast: don't crash when func_name is NULL This fixes a crash in GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types If we can't find the func_name in one of these paths, we have emitted an earlier error so just return here. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 12:54:30 +10:00
Dave Airlie	4336196b7f	glsl: handle ast_aggregate in has_sequence_subexpression. (v2) GL43-CTS.compute_shader.work-group-size does uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 }; The initializer triggers the GLSL 4.30/GLES3 tests for constant sequence subexpressions, so it doesn't happen unless you are using those, so just return false as this path is now reachable. v2: update commit msg with diagnosis Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 12:54:19 +10:00
Kenneth Graunke	f657a59d98	mesa: Try to unbreak the MSVC build. PATH_MAX is apparently not a thing on Windows. Borrow the hack from pipe_loader.c to try and make this work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-05 16:32:08 -07:00
Kenneth Graunke	c417c0c9c3	mesa: Add MESA_SHADER_CAPTURE_PATH for writing .shader_test files. This writes linked shader programs to .shader_test files to $MESA_SHADER_CAPTURE_PATH in the format used by shader-db (http://cgit.freedesktop.org/mesa/shader-db). It supports both GLSL shaders and ARB programs. All stages that are linked together are written in a single .shader_test file. This eliminates the need for shader-db's split-to-files.py, as Mesa produces the desired format directly. It's much more reliable than parsing stdout/stderr, as those may contain extraneous messages, or simply be closed by the application and unavailable. We have many similar features already, but this is a bit different: - MESA_GLSL=dump writes to stdout, not files. - MESA_GLSL=log writes each stage to separate files (rather than all linked shaders in one file), at draw time (not link time), with uniform data and state flag info. - Tapani's shader replacement mechanism (MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH) also uses separate files per shader stage, but allows reading in files to replace an app's shader code. v2: Dump ARB programs too, not just GLSL. v3: Don't dump bogus 0.shader_test file. v4: Add "GL_ARB_separate_shader_objects" to the [require] block. v5: Print "GLSL 4.00" instead of "GLSL 4.0" in the [require] block. v6: Don't hardcode /tmp/mesa. v7: Fix memoization of getenv(). v8: Also print "SSO ENABLED" (suggested by Timothy). v9: Also handle ES shaders (suggested by Ilia). v10: Guard against MESA_SHADER_CAPTURE_PATH being too long; add _mesa_warning calls on error handling (suggested by Ben). v11: Fix crash when variable is unset introduced in v10. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-05 13:48:57 -07:00
Ilia Mirkin	092ec3920f	nv50,nvc0: fix BGR10_A2UI vertex format This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-05 15:13:46 -04:00
Samuel Pitoiset	be365f34f0	nvc0: do not clear surfaces bins in the validate function We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-05 19:02:59 +02:00
Samuel Pitoiset	43d3ecfb33	nvc0: re-validate images after launching a grid on Fermi Images invalidation is a bit weird on Fermi and there is already a hack which forces invalidating all images when launching a compute shader to help in fixing 3D<->CP interaction. However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding. This is not really good for performance purposes but this might be improved later. This fixes the following piglits: - spec/arb_compute_shader/execution/basic-uniform-access - spec/arb_compute_shader/execution/mutiple-texture-reading - spec/arb_compute_shader/execution/multiple-workgroups - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-05 18:48:02 +02:00
Marek Olšák	3b44864ab7	radeonsi: fix images with level > 0 This should fix spec@arb_shader_image_load_store@level. Broken by: Commit: `95c5bbae66` radeonsi: set some image descriptor fields at bind time Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-05 17:00:14 +02:00
Ilia Mirkin	fd6bbc2ee2	nvc0: reduce overhead from always marking images dirty We would revalidate images when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Ilia Mirkin	0f673db6f0	nvc0: reduce overhead from always marking buffers dirty We would revalidate buffers when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the SSBOs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Ilia Mirkin	e8ee161b16	nvc0: fix memory barrier flag handling Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Ilia Mirkin	29abbeecd8	nvc0: mark bound buffer range valid Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Dave Airlie	f018456901	anv/entrypoints: don't go using wayland/xcb unless they are configured The fix in: anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards breaks things if wayland headers aren't installed. Separate things out properly to avoid that problem. [airlied: fixed up to put in pre-existing sections]. Reported-by: Arjan van de Ven Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-05 07:03:12 +10:00
Marek Olšák	d5491a81ff	gallium/radeon: don't use the DMA ring for pipelined buffer uploads Submitting a DMA IB flushes the GFX IB and all GPU caches. Vedran Miletić said: "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling from 1200p)." Some anonymous dude said: R9 390 results: Tomb Raider (normal settings): 80 -> 88 FPS Talos Principle (custom settings): 23 -> 56 FPS Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Vedran Miletić <vedran@miletic.net> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	9c35ec2042	r600g: don't flush caches when binding shader resources Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	eff94af794	r600g: only do necessary cache flushes in cp_dma_copy_buffer The main impact is that {upload, draw, upload, draw, ..} doesn't flush framebuffer caches before every upload. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	9e62012c30	r600g: only do necessary cache flushes in cp_dma_clear_buffer The main impact is that fast color clear doesn't flush TC, CONST, DB. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	c92a3ae7e9	r600g: remove a CP DMA workaround that's not needed anymore Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	5ea5ed6050	r600g: fix CP DMA hazard with index buffer fetches (v3) v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel, otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	ade16e1f5d	r600g: properly sync CP with CP DMA on R6xx This will allow removing useless cache & IB flushes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	7746903d3a	r600g: write WAIT_UNTIL in the correct place This has been wrong all along. Fixing this will allow removing useless cache flushes. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	ee0c96c11e	gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	ada3d8f31e	gallium/u_suballoc: allow different alignment for each allocation Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Jason Ekstrand	441194edd9	anv/blit: Use CLAMP_TO_EDGE for scaled blits When upscaling you can end up interpolating between the edge pixel and one past the edge. Using CLAMP_TO_EDGE seems like the most reasonable thing to do in this case. This fixes two of the new Vulkan CTS tests in dEQP-VK.api.copy_and_blit.blit_image.* Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	9313a56816	anv/copy: Account for the anv_surface.offset when creating a blit2d_surf This was causing problems if the user tried to copy to/from the stencil portion of a combined depth/stencil image. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	526a8de22d	nir/spirv: Make a decoration switch complete Getting rid of the default case makes the compiler warn if we are missing cases. While we're here, we also add the one missing case. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	62c6e94bd6	nir/spirv: Make unhandled decorations and capabilities non-fatal glslang frequently throw bogus decorations into shaders. While we are free to assert-fail, it's a bit nicer to the application to just warn. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	ed14d21d04	nir/spirv: Add a way to print non-fatal warnings Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	2e46a5d155	nir/spirv: Add string lookup tables for a couple of SPIR-V enums Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	5a1e56f344	nir/spirv: Complete the list of capabilities Previously we supported a subset of capabilities and just left a default case for the others. It's time to stop being lazy and actually audit the capabilities. This should bring them up-to-date with reality. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	9fa958e95b	anv/pipeline: Add support for early depth stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	66bd2e1133	mesa: Get rid of _mesa_active_fragment_shader_has_side_effects It is no longer used. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	35bf4d9dc2	i965/ps_state: Use wm_prog_data.has_side_effects Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	3fb289f957	i965/fs Add a wm_prog_data bit for has_side_effects This is more accurate than calling _mesa_active_fragment_shader_has_side_effects because it looks at whether or not the SSBOs, images, or atomic buffers are actually written rather than just existing in the program. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	4d3b8318a7	nir/info: Get rid of uses_interp_var_at_offset We were using this briefly in the i965 driver to trigger recompiles but we haven't been using it since we switched to the NIR y-transform lowering pass. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	56a178922f	anv/pipeline: Silently pass tests if depth or stencil is missing Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	bc7f7e1953	anv/pipeline: Unify gen7/8 emit_ds_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	fdc3c5dd05	genxml/gen6,7,75: s/BackFace/Backface This is more consistent with gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	1f7b54ed29	nir/spirv: Handle the WorkgroupSize builtin decoration This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests in the latest dev version of the Vulkan CTS. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	b26cdd65e8	nir/spirv: Use breaks instead of returns in constant handling Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	a19ae36ce5	anv/pipeline: Refactor specialization constant handling a bit Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	45542f554c	nir/lower_indirect_derefs: Use the direct array deref for recursion This fixes about 100 of the new Vulkan CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	59f06ac389	anv/clear: Handle ClearImage on 3-D images Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Francisco Jerez	7244dc1e06	Revert "i965/fs: Allow scalar source regions on SNB math instructions." This reverts commit `c1107cec44`. Apparently the hardware spec text I quoted in the commit message was outright lying about scalar source math being supported on SNB, the hardware seems to load 32 contiguous bits of data for each channel regardless of the regioning mode. Fixes regressions in the following CTS tests (which we didn't catch early due to CTS being temporarily disabled in our CI system): es2-cts.gtf.gl.atan.atan_vec3_frag_xvary es2-cts.gtf.gl.cos.cos_vec2_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvary es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_float_frag_xvary es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_vec3_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346 Reported-by: Mark Janes <mark.a.janes@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2016-06-03 18:47:29 -07:00
Francisco Jerez	a2135c6fd9	i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N). The conditional mod of these instructions determines the semantics of the comparison itself (rather than being evaluated based on the result of the instruction as is usually the case for most other instructions that allow conditional mods), so it's in general not legal to propagate a conditional mod into a CMP instruction. This prevents cmod propagation from (mis)optimizing: cmp.z.f0 tmp, ... mov.z.f0 null, tmp into: cmp.z.f0 tmp, ... which gives the negation of the flag result of the original sequence. I originally noticed this while working on SIMD32 in the scalar back-end, but the same scenario is likely to be possible in vec4 programs so this commit ports the bugfix with the same name from the scalar back-end to the vec4 cmod propagation pass. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-03 18:38:51 -07:00
Emil Velikov	7a3a0d9212	anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS Otherwise we will fail to find the headers in some scenarios. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-06-04 00:52:00 +01:00
Emil Velikov	a1256c0ea7	nir: automake: add nir_search_helpers.h to the sources list(s) Fixes: `dfbae7d64f` ("nir/algebraic: support for power-of-two optimizations") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-04 00:18:40 +01:00
Rob Clark	1535519e51	freedreno/ir3: do idiv lowering after main opt loop Give algebraic-opt pass a chance to catch udiv by const power-of-two, before running lower-idiv pass. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-03 16:05:03 -04:00
Rob Clark	dfbae7d64f	nir/algebraic: support for power-of-two optimizations Some optimizations, like converting integer multiply/divide into left/ right shifts, have additional constraints on the search expression. Like requiring that a variable is a constant power of two. Support these cases by allowing a fxn name to be appended to the search var expression (ie. "a#32(is_power_of_two)"). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-03 16:05:03 -04:00
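The kind of rewrite this constraint enables, turning an unsigned divide by a constant power of two into a right shift, can be sketched in plain C. This is an illustrative sketch only: it does not use the NIR search-expression machinery, and `log2_pow2` and `udiv_by_const` are hypothetical helpers. Only the predicate name `is_power_of_two` is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A value is a power of two when exactly one bit is set. */
static bool is_power_of_two(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: position of the single set bit. */
static unsigned log2_pow2(uint32_t v)
{
    unsigned shift = 0;

    assert(is_power_of_two(v));
    while (v > 1) {
        v >>= 1;
        shift++;
    }
    return shift;
}

/* Hypothetical helper: divide by a known constant, using a shift only when
 * the power-of-two constraint holds and a real divide otherwise. */
static uint32_t udiv_by_const(uint32_t a, uint32_t d)
{
    assert(d != 0);
    if (is_power_of_two(d))
        return a >> log2_pow2(d);
    return a / d;
}
```

The shift form is only valid for unsigned operands, which is why the sketch uses `uint32_t` throughout.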
Nicolai Hähnle	a64c7cd2ba	radeonsi: mark buffer texture range valid for shader images When a shader image view into a buffer texture can be written to, the buffer's valid range must be updated, or subsequent transfers may incorrectly skip synchronization. This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels, reported by Michel Dänzer. Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-03 14:11:05 +02:00
Marek Olšák	8c361e84ad	Revert "egl: Check if API is supported when using eglBindAPI." This reverts commit `e8b38ca202`. It broke Glamor for Gallium at least.	2016-06-03 11:33:45 +02:00
Alejandro Piñeiro	9bdbb9c0e0	mesa/formatquery: expand NUM_SAMPLE_COUNTS OpenGL ES comment For ES 3.0, the spec states that NUM_SAMPLE_COUNTS will always be zero for some formats, but on ES 3.1 it can be nonzero. The current code correctly checks against exactly version 3.0, but the comment only mentions the 3.0 spec. It is clearer to mention both. v2: better wording on the comment (Ian Romanick) Acked-by: Eduardo Lima <elima@igalia.com> Acked-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-03 07:38:25 +02:00
Dave Airlie	d10ae20b96	mesa/get: return correct value for layer provoking vertex. This fixes: GL45-CTS.geometry_shader.layered_rendering.layered_rendering on Skylake. Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-03 12:33:34 +10:00
Plamena Manolova	0b67efaed2	egl: Account for default values of texture target and format When validating attributes during surface creation we should account for the default values of texture target and format (EGL_NO_TEXTURE) since the user is not obligated to explicitly set both via the attribute list passed to eglCreatePbufferSurface. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-06-02 16:07:31 -07:00
Samuel Pitoiset	28590eb949	nvc0: mark buffer texture range valid for shader images Loosely based on radeonsi (Thanks to Nicolai). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-06-03 00:12:23 +02:00
Mauro Rossi	278c2212ac	isl: add support for Android libmesa_isl static library isl library is needed to build i965, libmesa_isl static library is added to fix related Android building errors. Any attempt to build libmesa_genxml as phony package module failed to deliver gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9} Due to constraints in Android Build System, libmesa_genxml is built as static, at least one source is needed, so dummy.c is autogenerated for this scope, libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES, to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-02 22:31:44 +01:00
Mauro Rossi	4143245c23	android: libmesa_glsl: add a dependency on libmesa_nir static Fixes the following building error: target C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0, from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28: external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory compilation terminated. build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed make: * [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1 make: * Waiting for unfinished jobs.... Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-02 22:31:00 +01:00
Emil Velikov	af1a0ae8ce	isl: automake: don't include isl_format_layout.c in two lists. Including the file in both ISL_FILES and ISL_GENERATED_FILES makes the actual dependency list less obvious. v2: Drop unrelated vulkan hunk (Jason). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 22:26:04 +01:00
Emil Velikov	af2637aa32	automake: bring back the .PHONY git_sha1.h.tmp rule With earlier commit `3689ef32af` ("automake: rework the git_sha1.h rule, include in tarball") we/I erroneously removed the PHONY rule and the temporary file. The former is used to ensure that the header is regenerated on each make invocation, while the latter helps us avoid the unneeded rebuild(s) when the SHA1 hasn't changed. Reported-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-02 22:23:12 +01:00
Kenneth Graunke	f74a29188c	i965: Add _NEW_POINT to a couple of comments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-02 14:11:55 -07:00
Charmaine Lee	0cf0d7c02e	svga: allow copy box in svga_transfer_dma_band() Instead of only allowing the copy of a rectangle in svga_transfer_dma_band(), this patch allows it to copy a box, and hence allows copying a 3d texture in one transfer. Fixes black screen in running Heaven after commit `fb9fe35`. (Bug 1663282) Tested with Heaven, glretrace, piglit. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-02 15:03:41 -06:00
Rob Clark	94d8fbd217	freedreno: fix bad bitshift warnings Coverity doesn't realize idx will never be negative. Throw in some assert()s to help it out. (Hopefully assert() isn't getting compiled out for coverity build.. but there seems to be just one way to find out. We might have to change these to assume()) Fixes CID 1362442, 1362443 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 16:29:32 -04:00
Rob Clark	676c77a923	freedreno: assume builtin shaders do compile Maybe we should switch to ureg to build the builtin shaders. But at any rate, if they fail to compile it is because someone messed them up (or changed TGSI syntax?). CID 1362444 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 16:29:32 -04:00
Francisco Jerez	060c8d245d	i965/fs: Reindent emit_zip(). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 13:24:48 -07:00
Francisco Jerez	7aa76d66a1	i965/fs: Skip SIMD lowering destination zipping if possible. Skipping the temporary allocation and copy instructions is easy (just return dst), but the conditions used to find out whether the copy can be optimized out safely without breaking the program are rather complex: The destination must be exactly one component of at most the execution width of the lowered instruction, and all source regions of the instruction must be either fully disjoint from the destination or be aligned with it group by group. v2: Don't handle partial source-destination overlap for simplicity (Jason). No instruction count regressions with respect to v1 in either shader-db or the few FP64 shader_runner test-cases with partial overlap I've checked manually. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 13:24:48 -07:00
Anuj Phogat	75da9c9933	blorp: Fix 16x multisample scaled blits Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled (with added 16x sample support) now passes with this patch. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 13:21:26 -07:00
Anuj Phogat	59c19b7687	meta: Fix indentation in shader code Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2016-06-02 13:21:26 -07:00
Dave Airlie	af7bf610cf	mesa/copyimage: report INVALID_VALUE for missing cube face The spec says INVALID_VALUE for exceeding dimensions, which is really what is happening here. This fixes: GL45-CTS.copy_image.non_existent_mipmap Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-03 06:08:44 +10:00
Dave Airlie	c0856eacf1	mesa/copyimage: fix num samples check to handle renderbuffers. This test was only happening for textures, but there is nothing in the spec to say this, so test it for all cases. This fixes: GL45-CTS.copy_image.invalid_target Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-03 06:08:22 +10:00
Rob Clark	80c2886033	freedreno/a4xx: silence coverity warning CID 1362451 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	9b854ce53c	freedreno/a3xx+a4xx: fix potential null ptr deref Coverity spotted the a3xx case (not sure why not the a4xx). CID 1362452 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	27a97097e1	freedreno/ir3: fix coverity warning CID 1362453 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	374ad2e2bd	freedreno/ir3: use nir_shader_get_entrypoint() helper Should also fix coverity warning: CID 1362454 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	df64cd6814	freedreno/a4xx: fix incorrect enum type a4xx has its own enum, different from a2xx/a3xx. Spotted by coverity: CID 1362458, 1362459 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	1632b0eac0	freedreno: fix coverity negative array index warning Never can happen, since query would not have been created in the first place if pidx(query_type) returned negative. Let's let coverity realize this. CID 1362460 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	ba452d43e0	freedreno: fix dereference before null check ptr can actually never be null so just drop the check. CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking ptr suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	228b2b36f4	gallium/util: remove u_staging Unused, and fixes a couple of coverity warnings: CID `1362171`, `1362170` Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-06-02 15:44:07 -04:00
Rob Clark	18fb922faa	freedreno/a3xx: only update/emit bordercolor state when needed Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	11f0652404	freedreno/a4xx: only update/emit bordercolor state when needed I noticed in stk that it was contributing to a lot of overhead. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Matt Turner	0d81a684c1	i965: Add missing types to type_sz(). Coverity warns in multiple places about the potential for division by zero, caused by this function's default case. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-02 11:34:09 -07:00
Nanley Chery	c06cef7f9b	mesa/extensions: Fix ES1 extension reporting Commit `eda15abd84` unintentionally advertised these extensions in ES1 contexts. Undo this error. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-02 10:46:59 -07:00
Plamena Manolova	e8b38ca202	egl: Check if API is supported when using eglBindAPI. According to the EGL specifications, before binding an API we must first check whether it is supported. If not, eglBindAPI should return EGL_FALSE and generate an EGL_BAD_PARAMETER error. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-02 07:45:19 -07:00
Eric Engestrom	17f4c723eb	st/osmesa: remove double-write (overwriting) These two lines have been here since the file was created. I'm guessing the second one was just for testing during dev, so it's the one that's going away. CoverityID: 1296205 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-02 07:05:05 -06:00
Nayan Deshmukh	6c9a352d79	st/vdpau: check for null pointer in get/put bits. Check for null pointer before accessing arrays in get/put bits native/YCbCr/Indexed in VdpOutputSurface and VdpVideoSurface. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-02 09:28:48 +02:00
Christian König	b3e75c3997	radeon/uvd: fix the H264 level for Tonga v2 We have supported 5.2 for a while now. v2: we even support 5.2 for H264, 5.1 is for HEVC. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-02 09:27:57 +02:00
Alejandro Piñeiro	b48c42cd1f	mesa/formatquery: add a comment to clarify INTERNALFORMAT_PREFERRED The comment clarifies that the driver is called only to try to get a preferred internalformat, and that it was already checked if the format is supported or not. Acked-by: Eduardo Lima <elima@igalia.com> Acked-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-02 08:54:17 +02:00
Alejandro Piñeiro	c1ceee6cc9	i965/formatquery: remove INTERNALFORMAT_PREFERRED implementation Right now the implementation only checks if the internalformat is supported or not. But that implementation is wrong, returning unsupported for some internalformats. Additionally, checking if the internalformat is supported or not is already done at mesa/main before calling the driver hook, so this new check is not needed. Acked-by: Eduardo Lima <elima@igalia.com> Acked-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-02 08:54:10 +02:00
Alejandro Piñeiro	58617bcebe	i965/eu: use simd8 when exec_size != EXECUTE_16 Among other things, fix a gpu hang when using INTEL_DEBUG=shader_time for any shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-06-02 08:08:10 +02:00
Jordan Justen	0a3acff5b5	i965: Remove old CS local ID handling The old method pushed data for each channel's uvec3 data of gl_LocalInvocationID. The new method pushes 1 dword of data that is a 'thread local ID' value. Based on that value, we can generate gl_LocalInvocationIndex and gl_LocalInvocationID with some calculations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	b1f22c6317	i965: Enable cross-thread constants and compact local IDs for hsw+ The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. One complication is that cross-thread constants are loaded into registers before per-thread constants. Previously, our local IDs were loaded before the uniform data and treated as 'payload' data, even though they were actually pushed into the registers like the other uniform data. Therefore, in this patch we simultaneously enable a newer layout where each thread now uses a single uniform slot for a unique local ID for the thread. This uniform is handled specially to make sure it is added last into the uniform push constant registers. This minimizes our usage of push constant registers, and maximizes our ability to use cross-thread constants for registers. To swap from the old to the new layout, we also need to flip some lowering pass switches to let our driver handle the lowering instead. We also no longer force thread_local_id_index to -1. v4: * Minimize size of patch that switches from the old local ID layout to the new layout (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	3ba9594f32	anv: Support new local ID generation & cross-thread constants The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	30685392e0	i965: Support new local ID push constant & cross-thread constants The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	d437798ace	i965: Add CS push constant info to brw_cs_prog_data We need information about push constants in a few places for the GL driver, and another couple places for the vulkan driver. When we add support for uploading both a common (cross-thread) set of push constants, combined with the previous per-thread push constant data, things are going to get even more complicated. To simplify things, we add push constant info into the cs prog_data struct. The cross-thread constant support is added as of Haswell. To support it we need to make sure all push constants with uniform values are added to earlier registers. The register that varies per thread and holds the thread invocation's unique local ID needs to be added last. For now we add the code that would calculate cross-thread constant information for hsw+, but we force it (cross_thread_supported) off until the other parts of the driver support it. v4: * Support older local ID push constant layout as well. (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	1b79e7ebbd	i965: Store number of threads in brw_cs_prog_data Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	3ef0957dac	i965: Add nir based intrinsic lowering and thread ID uniform We add a lowering pass for nir intrinsics. This pass can replace nir intrinsics with driver specific nir lower code. We lower the gl_LocalInvocationIndex intrinsic based on a uniform which is loaded with a thread specific ID. We also lower the gl_LocalInvocationID based on gl_LocalInvocationIndex. v2: * Create variable during lowering pass. (Ken) v3: * Don't create a variable, but instead just insert an intrinsic call to load a uniform from the allocated location. (Jason) v4: * Don't run this pass if thread_local_id_index < 0 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	04fc72501a	i965: Put CS local thread ID uniform in last push register This thread ID uniform will be used to compute the gl_LocalInvocationIndex and gl_LocalInvocationID values. It is important for this uniform to be added in the last push constant register. fs_visitor::assign_constant_locations is updated to make sure this happens. The reason this is important is that the cross-thread push constant registers are loaded first, and the per-thread push constant registers are loaded after that. (Broadwell adds another push constant upload mechanism which reverses this order, but we are ignoring this for now.) v2: * Add variable in intrinsics lowering pass * Make sure the ID is pushed last in assign_constant_locations, and that we save a spot for the ID in the push constants v3: * Simplify code based on Jason's suggestions. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	fa279dfbf0	i965: Add uniform for a CS thread local base ID v4: * Force thread_local_id_index to -1 for now, and have fs_visitor::setup_cs_payload look at thread_local_id_index. This enables us to more easily cut over from the old local ID layout to the new layout, as suggested by Jason. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	8f48d23e0f	i965: Add nir channel_num system value v2: * simd16/32 fixes (curro) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	6f316c9d86	nir: Make lowering gl_LocalInvocationIndex optional Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	7b9def3583	glsl: Add glsl LowerCsDerivedVariables option v2: * Move lower flag to context constants. (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jason Ekstrand	1205999c22	i965/fs: Copy the offset when lowering logical pull constant sends This fixes 64 Vulkan CTS tests per gen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 16:00:44 -07:00
Dave Airlie	8d4f4adfbd	glsl/distance: make sure we use clip dist varying slot for lowered var. When lowering, we always want to use the clip dist varying. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-02 07:09:21 +10:00
Nicolai Hähnle	c7877b9dab	winsys/amdgpu: decay max_ib_size over time So that memory use will eventually decrease again after a temporary peak. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	6aff6377b1	winsys/amdgpu: implement IB chaining on the gfx ring As a consequence, CE IB size never triggers a flush anymore. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	45be461f55	winsys/amdgpu: consolidate IB size management in amdgpu_ib_finalize Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	89ba076de4	radeon/winsys: introduce radeon_winsys_cs_chunk We will chain multiple chunks together and will keep pointers to the older chunks to support IB dumping. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	a7c26bfc0c	radeonsi/sid: add packet definitions for IB chaining While we're at it, add packet printing in si_debug. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	83a01cb498	winsys/amdgpu: start with smaller IBs, growing as necessary This avoids allocating giant IBs from the outset, especially for CE and DMA. Since we now limit max_dw only by the size that the buffer happens to be (which, due to the buffer cache, can be even larger than the rounded-up size we request), the new function amdgpu_ib_max_submit_dwords controls when we submit an IB. With this change, we effectively never flush prematurely due to the CE IB, after an initial warm-up phase. v2: - clean up buffer_size calculation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	f80c6abb9e	winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functions The latter function allows getting the containing amdgpu_cs from any IB (including non-main ones). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	9e5ed559ba	winsys/amdgpu: extract IB big buffer allocation for re-use Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	9db851b5ee	winsys/amdgpu: add IB buffer in amdgpu_get_new_ib Adding the buffer when we start using it for the IB makes the logic for chaining a bit simpler. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	d6211a61b0	gallium/radeon: use cs_check_space throughout Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Nicolai Hähnle	46ad3561be	radeon/winsys: add cs_check_space Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Nicolai Hähnle	92d5d97b10	winsys/amdgpu: simplify interface of amdgpu_get_new_ib We'll want to have an amdgpu_cs pointer for future changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Nicolai Hähnle	8396ab4241	winsys/amdgpu: add amdgpu_cs_has_user_fence v2: style change Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Kenneth Graunke	25e1b8d366	i965: Fix isoline reads in scalar TES. Isolines aren't reversed. commit `5b2d8c2273` fixed this for the vec4 TES backend, but not the scalar one. Found while debugging GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_tessLevel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-01 13:46:09 -07:00
Nicolai Hähnle	ed0e9862c5	st/mesa: implement PBO downloads for ReadPixels v2: require PIPE_CAP_SAMPLER_VIEW_TARGET; technically only needed for some of the texture targets, but all hardware that has shader images should also have this cap. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:51 +02:00
Nicolai Hähnle	f3b62d4c74	st/mesa: hook up a no-op try_pbo_readpixels For better bisectability given that the order of some of the fallback tests in the blit path are rearranged. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:48 +02:00
Nicolai Hähnle	1cb4be94ae	st/mesa: add layer_offset to PBO fragment shader This will be used to select a slice of a 3D texture. v2: fix a comment (Marek) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:43 +02:00
Nicolai Hähnle	2bf6dfac8a	st/mesa: create PBO download fragment shaders Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:40 +02:00
Nicolai Hähnle	852d3fcd3b	st/mesa: add PBO download enable bit and fragment shaders For downloads, the fragment shader must know the source texture target, hence we may cache multiple fragment shaders. v2: break long line (Marek) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:34 +02:00
Nicolai Hähnle	581c001532	st/mesa: move shareable parts of PBO upload state and draw to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:31 +02:00
Nicolai Hähnle	e16800226e	st/mesa: move PBO buffer address calculation to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:28 +02:00
Nicolai Hähnle	21e069f7d4	st/mesa: move PBO upload fs creation to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:26 +02:00
Nicolai Hähnle	979688a027	st/mesa: rename pbo_upload to pbo At the same time, rename members that are upload-specific to say so. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:23 +02:00
Nicolai Hähnle	be82065fbe	st/mesa: move PBO vertex and geometry shader creation to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:20 +02:00
Nicolai Hähnle	4ecc32b0e1	st/mesa: begin moving PBO functions into their own file Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:18 +02:00
Nicolai Hähnle	d9893feb2c	gallium/cso: allow saving the first fragment shader image slot Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:15 +02:00
Nicolai Hähnle	fc0352ff9c	gallium/u_inlines: allow NULL src in util_copy_image_view Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:12 +02:00
Nicolai Hähnle	57f576f1fb	gallium: add PIPE_BARRIER_ALL define Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:36:48 +02:00
Ian Romanick	a428c955ce	glsl: Use Geom.VerticesOut == -1 to specify unset Because apparently layout(max_vertices=0) is a thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 11:11:39 -07:00
Ian Romanick	b27dfa5403	i965: If control_data_header_size_bits is zero, don't do EndPrimitive This can occur when max_vertices=0 is explicitly specified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 11:11:39 -07:00
Ian Romanick	049bb94d2e	mesa: Fix bogus strncmp The string "[0]\0" is the same as "[0]" as far as the C string datatype is concerned. That string has length 3. strncmp(s, length_3_string, 4) is the same as strcmp(s, length_3_string), so make it be strcmp. v2: Not the same as strncmp(..., 3). Noticed by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 11:11:25 -07:00
Marek Olšák	12740efd29	radeonsi: set correct stencil tile mode for texturing Sadly, this doesn't affect SI and VI in any way. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-01 17:35:30 +02:00
Marek Olšák	ea68215c54	winsys/amdgpu: set flags correctly when allocating depth-stencil buffers This mimics Vulkan. It also documents how to fix stencil texturing. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-01 17:35:30 +02:00
Marek Olšák	532a5af47f	gallium/radeon: lower memory usage during texture transfers This improves throughput by keeping TTM overhead down. Some piglit tests such as texelFetch and streaming-texture-leak will use less memory now. v2: use gart_size / 4 as the threshold Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-01 17:35:30 +02:00
Marek Olšák	614e3c6272	gallium/radeon: invalidate busy linear textures for whole-texture uploads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	fc1479a954	gallium/radeon: degrade tiled textures mapped often to linear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	9927c8138a	gallium/radeon: clean up and better comment use_staging_texture Next commits will add other things around this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	b033584299	radeonsi: set some colorbuffer register fields at emit time to allow reallocating the texture storage with different parameters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	30b2b860b0	radeonsi: implement global resetting of texture descriptors it will be used by texture reallocation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	28de7aec0c	radeonsi: move code for setting one shader image into separate function v2: fix set_shader_images(..., NULL). Found by Christoph Haag. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	95c5bbae66	radeonsi: set some image descriptor fields at bind time mainly the fields that can change by reallocating a texture and changing the tile mode Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	ef765d0789	gallium/radeon: strengthen some checking for DMA preparation Just for consistency. This doesn't fix anything, because DCC is not supported with non-mipmapped textures. v1.1: fix the comment about DCC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	9d881cc0ac	gallium/util: add util_texrange_covers_whole_level from radeon Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Ilia Mirkin	ca135a2612	nir: allow sat on all float destination types With the introduction of fp64 and fp16 to nir, there are now a bunch of float types running around. An F1 2015 shader ends up with an i2f.sat operation, which has a nir_type_float32 destination. Allow sat on all the float destination types. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 10:44:40 -04:00
Alex Deucher	bd85e4a041	radeonsi: fix the raster config setup for 1 RB iceland chips I didn't realize there were 1 and 2 RB variants when this code was originally added. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>	2016-06-01 09:59:57 -04:00
Dave Airlie	6400144041	mesa/sampler: fix error codes for sampler parameters. The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it, however version 8 of it fixed this, and the GL specs also have the fixed value in them. Fixes: GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 17:01:19 +10:00
Dave Airlie	0ebf4257a3	glsl: define some GLES3 constants in GLSL 4.1 The GLSL 4.1 spec adds: gl_MaxVertexUniformVectors gl_MaxFragmentUniformVectors gl_MaxVaryingVectors This fixes: GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 17:01:13 +10:00
Topi Pohjolainen	6ca118d2f4	i965: Add norbc debug option This INTEL_DEBUG option disables lossless compression (also known as render buffer compression). v2: (Matt) Use likely(!lossless_compression_disabled) instead of !likely(lossless_compression_disabled) (Grazvydas) Update docs/envvars.html Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-01 09:16:36 +03:00
Topi Pohjolainen	30e9e6bd07	i965/gen9: Configure rbc buffers as plain for non-rbc tex views Fixes rendering in Shadow of Mordor with rbc. Application writes RGBA_UNORM texture filling it with values the application wants to later on treat as SRGB_ALPHA. Intel driver enables lossless compression for the buffer by the time of writing. However, the driver fails to make sure the buffer can be sampled as something else later on and unfortunately there is restriction in the hardware for using lossless compression for srgb formats which looks to extend itself to the sampling engine also. Requesting srgb to linear conversion on top of compressed buffer results the color values to be pretty much garbage. Fortunately none of tracked benchmarks showed a regression with this. v2 (Matt): Add missing space Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-01 09:16:36 +03:00
Kenneth Graunke	a3dc99f3d4	i965: Fix the passthrough TCS for isolines. We weren't setting up several of the uniform values for the patch header, so we'd crash when uploading push constants. We at least need to initialize them to zero. We also had the isoline parameters reversed, so it would also render incorrectly (if it didn't crash). Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in GL44-CTS.tessellation_shader.single.max_patch_vertices. (*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2016-05-31 23:09:13 -07:00
Dave Airlie	ebb81cd683	i965/xfb: skip components in correct buffer. The driver was adding the skip components but always for buffer 0. This fixes: GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 15:53:00 +10:00
Dave Airlie	1fe7bbb911	glsl/linker: fix multiple streams transform feedback. `e2791b38b4` mesa/program_interface_query: fix transform feedback varyings. caused a regression in GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams on radeonsi. The problem was it was using the skip components varying to set the stream id, when it should wait until a varying was written, this just adds the varying checks in the right place. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 13:30:41 +10:00
Dave Airlie	e891f7cf55	mesa/bufferobj: use mapping range in BufferSubData. According to GL4.5 spec: An INVALID_OPERATION error is generated if any part of the specified buffer range is mapped with MapBufferRange or MapBuffer (see section 6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the MapBufferRange access flags. So we should use the if range is mapped path. This fixes: GL45-CTS.buffer_storage.map_persistent_buffer_sub_data Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 13:30:40 +10:00
Ilia Mirkin	18d11c9989	nv50/ir: fix error finding free element in bitset in some situations This really only hits for bitsets with a size of a multiple of 32. We can end up with pos = -1 as a result of the ffs, which we in turn decide is a valid position (since we fall through the loop and i == 1, we end up adding 32 to it, so end up returning 31 again). Up until recently this was largely unreachable, as the register file sizes were all 63 or 255. However with the advent of compute shaders which can restrict the number of registers, this can now happen. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-05-31 23:25:51 -04:00
Ilia Mirkin	d873608bcf	nv50/ir: print relevant file's bitset when showing RA info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-31 23:25:50 -04:00
Timothy Arceri	98d40b4d11	Revert "glsl: fix xfb_offset unsized array validation" This reverts commit `aac90ba292`. The commit caused a regression in: piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom Also the CTS test it was meant to fix seems like it may be bogus. Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 10:33:57 +10:00
Francisco Jerez	c1107cec44	i965/fs: Allow scalar source regions on SNB math instructions. I haven't found any evidence that this isn't supported by the hardware, in fact according to the SNB hardware spec: "The supported regioning modes for math instructions are align16, align1 with the following restrictions: - Scalar source is supported. [...] - Source and destination offset must be the same, except the case of scalar source." Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-31 15:57:41 -07:00
Francisco Jerez	06d8765bc0	i965/fs: Fix constant combining for instructions that cannot accept source mods. This is the case for SNB math instructions so we need to be careful and insert the literal value of the immediate into the table (rather than its absolute value) if the instruction is unable to invert the sign of the constant on the fly. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:41 -07:00
Francisco Jerez	303ec22ed6	i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:41 -07:00
Francisco Jerez	4fe4f6e8a7	i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes. Which requires using a bitset instead of a boolean flag to keep track of the GRFs we've seen a generating instruction for already. The search loop continues until all instructions initializing the value of the source VGRF have been found, or it is determined that coalescing is not possible. Fixes a few piglit test cases on Gen4-6 which were regressed by `6956015aa5` due to the different (yet perfectly valid) ordering in which copy instructions are emitted now by the simd lowering pass, which had the side effect of causing this optimization pass to start corrupting the program in cases where a VGRF-to-MRF copy instruction would be eliminated but only the last instruction writing to the source VGRF region would be rewritten to point to the target MRF. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:41 -07:00
Francisco Jerez	1898673f58	i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation. This will be required to correctly transform the destination of 8-wide instructions that write a single GRF of a VGRF to MRF copy marked COMPR4. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:40 -07:00
Francisco Jerez	485fbaff03	i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops. This will allow compute_to_mrf to handle cases where the source of the VGRF-to-MRF copy is initialized by more than one instruction. In such cases we cannot rewrite the destination of any of the generating instructions until it's known whether the whole VGRF source region can be coalesced into the destination MRF, which will imply continuing the search until all generating instructions have been found or it has been determined that the VGRF and MRF registers cannot be coalesced. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:40 -07:00
Francisco Jerez	4b0ec9f475	i965/fs: Fix compute-to-mrf VGRF region coverage condition. Compute-to-mrf was checking whether the destination of scan_inst is more than one component (making assumptions about the instruction data type) in order to find out whether the result is being fully copied into the MRF destination, which is rather inaccurate in cases where a single-component instruction is only partially contained in the source region, or when the execution size of the copy and scan_inst instructions differ. Instead check whether the destination region of the instruction is really contained within the bounds of the source region of the copy. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:40 -07:00
Francisco Jerez	bb61e24787	i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap(). Compute-to-mrf was being rather heavy-handed about checking whether instruction source or destination regions interfere with the copy instruction, which could conceivably lead to program miscompilation. Fix it by using regions_overlap() instead of the open-coded and dubiously correct overlap checks. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:56:54 -07:00
Francisco Jerez	88f380a2dd	i965/fs: Teach regions_overlap() about COMPR4 MRF regions. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:22:04 -07:00
Dylan Baker	604010a7ed	Don't use python 3 There are now no files that require python 3, so for now just remove the python 3 dependency and use python 2. I think the right plan is to just get all of the python ready for python 3, and then use whatever python is available. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	ab31817fed	genxml: change shebang to python 2 Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	12c1a01c72	genxml: use the isalpha method rather than str.isalpha. This fixes gen_pack_header to work on python 2, where name[0] is unicode not str. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	a45a25418b	genxml: require future imports for python2 compatibility. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	e5681e4d70	genxml: mark re strings as raw This is a correctness issue. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	de2e9da2e9	genxml: Make classes descendants of object This is the default in python3, but in python2 you get old style classes. No one likes old-style classes. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	9f50e3572c	genxml: mark gen_pack_header.py as encoded in utf-8 There is unicode in this file, and I'm actually surprised that the python interpreter hasn't gotten grumpy. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Bas Nieuwenhuizen	35818129a6	radeonsi: Decompress DCC textures in a render feedback loop. By using a counter to quickly reject textures that are not bound to a framebuffer, the performance impact when binding sampler_views/images is not too large. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 21:43:04 +02:00
Bas Nieuwenhuizen	cbe3421f05	radeonsi: Add counter to check if a texture is bound to a framebuffer. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 21:43:00 +02:00
Rhys Kidd	8cb74dd4e6	vc4: Fix compiler warnings in fail_instr path of QIR validate pass Introduced in `8e2d0843c0`. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-31 10:56:02 -07:00
Emil Velikov	b8e1f59d62	anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards The generated sources should follow the example set by the vulkan headers and our non-generated code. Namely: the code for all supported platforms should be available, each one guarded by its respective VK_USE_PLATFORM_*_KHR macro. v2: Reword commit message. Cc: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)	2016-05-31 18:41:28 +01:00
Brian Paul	6bea33008e	svga: change enum pipe_resource_usage back to unsigned This parameter is actually a bitmask of PIPE_TRANSFER_x flags. Change it back to a simple unsigned type. IIRC, some compilers complain about masks of enum values. Also, this make the function signature match u_resource_vtbl::transfer_map() again. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-31 10:20:36 -06:00
Marek Olšák	7ca55d2da8	radeonsi: fix CP DMA hazard with index buffer fetches Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-31 16:59:32 +02:00
Marek Olšák	d427110882	r600g: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. u_blitter does the right thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:55 +02:00
Marek Olšák	d5882bb0df	radeonsi: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. u_blitter does the right thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:54 +02:00
Marek Olšák	921ab0028e	gallium/u_blitter: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:53 +02:00
Marek Olšák	8a10192b4b	mesa: fix crash in driver_RenderTexture_is_safe This just fixed the crash with the apitrace in bug report. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:43:34 +02:00
Marek Olšák	fc4896e686	radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0 It's not needed since it was fixed in the kernel. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-31 16:41:22 +02:00
Jakob Sinclair	877c00c653	gallium/radeon: fixed division by zero Coverity is getting a false positive that a division by zero can occur here. This change will silence the Coverity warnings as a division by zero cannot occur in this case. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 12:51:20 +02:00
Eric Engestrom	35fd5282ea	st/glsl_to_tgsi: prevent infinite loop `unsigned j` would never fail `j >= 0`, leading to an infinite loop as `j--` wraps around. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 11:46:30 +02:00
Dave Airlie	f87352d769	glsl/images: bounds check image unit assignment The CTS test: GL45-CTS.multi_bind.dispatch_bind_image_textures binds 192 image uniforms, we reject this later, but not until after we trash the contents of the struct gl_shader. Error now reads: Too many compute shader image uniforms (192 > 16) instead of Too many compute shader image uniforms (2745344416 > 16) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-31 10:41:44 +10:00
Ilia Mirkin	4b1a167a2b	nvc0/ir: fix spilling predicates to registers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-05-30 18:15:14 -04:00
Ilia Mirkin	1f895caba0	nvc0/ir: limit max number of regs based on availability in SM This effectively limits registers to 32 and 64 for fermi and kepler when 1024 threads are used, but allows the full amount to be used with smaller thread sizes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-30 18:15:10 -04:00
Ilia Mirkin	27a51ff9b4	nv50/ir: record number of threads in a compute shader Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-30 18:14:55 -04:00
Pierre Moreau	ae70879530	nv50/ir: Add missing handling of U64/S64 in inlines Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 16:12:12 -04:00
Emil Velikov	9074470d7b	docs: rename release notes to 12.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `7ad2cb6f08`)	2016-05-30 20:33:30 +01:00
Ilia Mirkin	68d135011b	docs: move nvc0 out of individual lines of GL 4.2, 4.3, ES 3.1 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 15:18:32 -04:00
Emil Velikov	888cf6eea2	docs: add 12.1.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 20:03:19 +01:00
Marek Olšák	4291229488	docs/GL3: mark radeonsi as all done up to GL 4.3 and GLES 3.1	2016-05-30 20:48:51 +02:00
Emil Velikov	922b471777	nir: add the SConscript.nir to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 19:19:01 +01:00
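
Commit 35fd5282ea in the list above describes a loop in which `unsigned j` can never fail `j >= 0`, so `j--` wraps around. The sketch below illustrates that pitfall and one common way to count down safely with an unsigned counter. `find_last` is a hypothetical helper written for this illustration; it is not the code from st/glsl_to_tgsi.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy pattern, shown only as a comment: with an unsigned counter the
 * condition `j >= 0` is always true, so `j--` wraps to UINT_MAX instead
 * of ending the loop.
 *
 *    for (unsigned j = count - 1; j >= 0; j--) { ... }
 */

/* Hypothetical helper: return the index of the last element equal to
 * `value`, or -1 if there is none. */
static int
find_last(const int *values, unsigned count, int value)
{
   assert(values != NULL || count == 0);

   /* Test before decrementing: the body sees j = count-1 .. 0, and the
    * loop exits when j is 0 at the test, so the wrapped value is never
    * used as an index. */
   for (unsigned j = count; j-- > 0; ) {
      if (values[j] == value)
         return (int)j;
   }
   return -1;
}
```

With `count == 0` the loop body never runs, which the original `count - 1` form would also get wrong for unsigned counters.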

2972 changed files with 370187 additions and 193340 deletions
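
Commit 18d11c9989 in the list above notes that `ffs` can produce `pos = -1` for a full 32-bit word, and that this value was then treated as a valid position. The following is a minimal sketch of a free-bit search that handles that case, assuming 32-bit words and the POSIX `ffs()` function. `bitset_find_free` is a hypothetical name used for illustration and is not the nv50/ir implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h>   /* ffs() (POSIX) */

/* Hypothetical helper: return the index of the first clear bit in a
 * bitset of `size` bits stored in 32-bit words, or -1 when every bit
 * is set. */
static int
bitset_find_free(const uint32_t *words, unsigned size)
{
   assert(words != NULL || size == 0);

   const unsigned nwords = (size + 31) / 32;

   for (unsigned i = 0; i < nwords; i++) {
      /* ffs() returns 0 when no bit is set, so pos is -1 for a full word. */
      int pos = ffs((int)~words[i]) - 1;
      if (pos < 0)
         continue;   /* full word: keep searching instead of using -1 */

      unsigned bit = i * 32 + (unsigned)pos;
      /* Bits past the logical end of the set do not count as free. */
      return bit < size ? (int)bit : -1;
   }
   return -1;
}
```

The explicit `pos < 0` check is the point of the sketch: a set whose size is a multiple of 32 can have a completely full last word, and that case must report "no free element" instead of falling through to a computed index.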

									
										9

.dir-locals.el
									
				@@ -1,4 +1,5 @@

				((prog-mode

				((nil . ((show-trailing-whitespace . t)))

				 (prog-mode

				  (indent-tabs-mode . nil)

				  (tab-width . 8)

				  (c-basic-offset . 3)

				@@ -8,6 +9,10 @@

					    (c-set-offset 'case-label '0)

					    (c-set-offset 'innamespace '0)

					    (c-set-offset 'inline-open '0)))

				  )

				  (whitespace-style face indentation)

				  (whitespace-line-column . 79)

				  (eval ignore-errors

				        (require 'whitespace)

				        (whitespace-mode 1)))

				 (makefile-mode (indent-tabs-mode . t))

				 )

									
										35

.editorconfig
									
										Normal file
									
				@@ -0,0 +1,35 @@

				# To use this config on your editor, follow the instructions at:

				# http://editorconfig.org

				root = true

				[*]

				charset = utf-8

				insert_final_newline = true

				tab_width = 8

				[*.{c,h,cpp,hpp,cc,hh}]

				indent_style = space

				indent_size = 3

				[{Makefile*,*.mk}]

				indent_style = tab

				[{*.py,SCons*}]

				indent_style = space

				indent_size = 4

				[*.pl]

				indent_style = space

				indent_size = 4

				[*.m4]

				indent_style = space

				indent_size = 2

				[*.yml]

				indent_style = space

				indent_size = 2

				[*.patch]

				trim_trailing_whitespace = false

1

.gitignore vendored

@@ -49,3 +49,4 @@ Makefile.in
 .install-mesa-links
 .install-gallium-links
 /src/git_sha1.h
 TAGS

12

.mailmap

@@ -88,9 +88,11 @@ Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <s3734770@mai
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <carli@carli-laptop.(none)>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <Carl-Philip.Haensch@mailbox.tu-dresden.de>
 Chad Versace <chad.versace@intel.com> <chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <Chad Versace chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <chad.versace@linux.intel.com>
 Chad Versace <chadversary@chromium.org> <chad@kiwitree.net>
 Chad Versace <chadversary@chromium.org> <chad@chad-versace.us>
 Chad Versace <chadversary@chromium.org> <Chad Versace chad@chad-versace.us>
 Chad Versace <chadversary@chromium.org> <chad.versace@intel.com>
 Chad Versace <chadversary@chromium.org> <chad.versace@linux.intel.com>
 Chia-I Wu <olvaffe@gmail.com> <olv@lunarg.com>
 Chia-I Wu <olvaffe@gmail.com> Chia-Wu <olvaffe@gmail.com>
@@ -138,6 +140,8 @@ Dmitry Cherkassov <dcherkassov@gmail.com> Dmitry Cherkasov <dcherkassov@gmail.co
 Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
 Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
 Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
@@ -274,7 +278,7 @@ Marc Dietrich <marvin24@gmx.de> marvin24 <marvin24@gmx.de>
 Marcin Ślusarz <marcin.slusarz@gmail.com> Marcin Slusarz <marcin.slusarz@gmail.com>
 Marek Olšák <marek.olsak@amd.com> <maraeo@gmail.com>
 Marek Olšák <maraeo@gmail.com> <marek.olsak@amd.com>
 Mario Kleiner <mario.kleiner.de@gmail.com> kleinerm <mario.kleiner@tuebingen.mpg.de>
 Mario Kleiner <mario.kleiner.de@gmail.com> <mario.kleiner@tuebingen.mpg.de>

									
										364

.travis.yml
									
				@@ -1,22 +1,11 @@

				language: c

				sudo: false

				dist: trusty

				cache:

				  directories:

				    - $HOME/.ccache

				addons:

				  apt:

				    packages:

				      - libdrm-dev

				      - libudev-dev

				      - x11proto-xf86vidmode-dev

				      - libexpat1-dev

				      - libxcb-dri2-0-dev

				      - libx11-xcb-dev

				      - llvm-3.4-dev

				      - scons

				  apt: true

				  ccache: true

				env:

				  global:

				@@ -25,22 +14,277 @@ env:

				    - XORGMACROS_VERSION=util-macros-1.19.0

				    - GLPROTO_VERSION=glproto-1.4.17

				    - DRI2PROTO_VERSION=dri2proto-2.8

				    - DRI3PROTO_VERSION=dri3proto-1.0

				    - PRESENTPROTO_VERSION=presentproto-1.0

				    - LIBPCIACCESS_VERSION=libpciaccess-0.13.4

				    - LIBDRM_VERSION=libdrm-2.4.65

				    - LIBDRM_VERSION=libdrm-2.4.74

				    - XCBPROTO_VERSION=xcb-proto-1.11

				    - LIBXCB_VERSION=libxcb-1.11

				    - LIBXSHMFENCE_VERSION=libxshmfence-1.2

				    - LIBTXC_DXTN_VERSION=libtxc_dxtn-1.0.1

				    - LIBVDPAU_VERSION=libvdpau-1.1

				    - LIBVA_VERSION=libva-1.6.2

				    - LIBWAYLAND_VERSION=wayland-1.11.1

				    - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig

				  matrix:

				    - BUILD=make

				    - BUILD=scons

				    - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"

				matrix:

				  include:

				    - env:

				        - LABEL="make loaders/classic DRI"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="make check"

				        - DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"

				        - DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS=""

				        - VULKAN_DRIVERS=""

				      addons:

				        apt:

				          packages:

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				    - env:

				        # NOTE: Building SWR is 2x (yes two) times slower than all the other

				        # gallium drivers combined.

				        # Start this early so that it doesn't hinder the run time.

				        - LABEL="make Gallium Drivers SWR"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=3.9

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - OVERRIDE_CC="gcc-5"

				        - OVERRIDE_CXX="g++-5"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="swr"

				        - VULKAN_DRIVERS=""

				      addons:

				        apt:

				          sources:

				            - ubuntu-toolchain-r-test

				            - llvm-toolchain-trusty-3.9

				          packages:

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # From sources above

				            - g++-5

				            - llvm-3.9-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="make Gallium Drivers Other"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=3.9

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"

				        - VULKAN_DRIVERS=""

				      addons:

				        apt:

				          sources:

				            - llvm-toolchain-trusty-3.9

				          packages:

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # From sources above

				            - llvm-3.9-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        # NOTE: Analogous to SWR above, building Clover is quite slow.

				        - LABEL="make Gallium ST Clover"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=3.6

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - OVERRIDE_CC=gcc-4.7

				        - OVERRIDE_CXX=g++-4.7

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"

				        # i915 most likely doesn't work with OpenCL.

				        # Regardless - we're doing a quick build test here.

				        - GALLIUM_DRIVERS="i915"

				        - VULKAN_DRIVERS=""

				      addons:

				        apt:

				          sources:

				            - llvm-toolchain-trusty-3.6

				          packages:

				            - libclc-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - g++-4.7

				            # From sources above

				            - llvm-3.6-dev

				            - clang-3.6

				            - libclang-3.6-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="make Gallium ST Other"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx --enable-gallium-osmesa"

				        # We need swrast for osmesa and nine.

				        # i915 most likely doesn't work with most ST.

				        # Regardless - we're doing a quick build test here.

				        - GALLIUM_DRIVERS="i915,swrast"

				        - VULKAN_DRIVERS=""

				      addons:

				        apt:

				          packages:

				            # Nine requires gcc 4.6... which is the one we have right ?

				            - libxvmc-dev

				            # Build locally, for now.

				            #- libvdpau-dev

				            #- libva-dev

				            - libomxil-bellagio-dev

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="make Vulkan"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"

				        - LLVM_VERSION=3.9

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        # XXX: we want to test the WSI, but those are enabled via the EGL toggles

				        # XXX: Platform X11 dependencies are checked when --enable-glx is set

				        - DRI_LOADERS="--enable-glx --disable-gbm --enable-egl --with-platforms=x11,wayland"

				        - DRI_DRIVERS=""

				        # XXX: enable DRI for EGL above

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS=""

				        - VULKAN_DRIVERS="intel,radeon"

				      addons:

				        apt:

				          sources:

				            - llvm-toolchain-trusty-3.9

				          packages:

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # From sources above

				            - llvm-3.9-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="scons"

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        # Explicitly disable.

				        - SCONS_TARGET="llvm=0"

				        # Keep it symmetrical to the make build.

				        - SCONS_CHECK_COMMAND="scons llvm=0 check"

				      addons:

				        apt:

				          packages:

				            - scons

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="scons LLVM"

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        - SCONS_TARGET="llvm=1"

				        # Keep it symmetrical to the make build.

				        - SCONS_CHECK_COMMAND="scons llvm=1 check"

				        - LLVM_VERSION=3.3

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				      addons:

				        apt:

				          packages:

				            - scons

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-3.3-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				    - env:

				        - LABEL="scons SWR"

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        - SCONS_TARGET="swr=1"

				        - LLVM_VERSION=3.9

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        # Keep it symmetrical to the make build. There's no actual SWR, yet.

				        - SCONS_CHECK_COMMAND="true"

				        - OVERRIDE_CC="gcc-5"

				        - OVERRIDE_CXX="g++-5"

				      addons:

				        apt:

				          sources:

				            - ubuntu-toolchain-r-test

				            - llvm-toolchain-trusty-3.9

				          packages:

				            - scons

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # From sources above

				            - g++-5

				            - llvm-3.9-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				install:

				  - export PATH="/usr/lib/ccache:$PATH"

				  - pip install --user mako

				  # Since libdrm gets updated in configure.ac regularly, try to pick up the

				  # latest version from there.

				  - for line in `grep "^LIBDRM.*_REQUIRED=" configure.ac`; do

				      old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;

				      new_ver=`echo $line | sed 's/.*REQUIRED=//'`;

				      if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then

				        export LIBDRM_VERSION="libdrm-$new_ver";

				      fi;

				    done

				  # Install dependencies where we require specific versions (or where

				  # disallowed by Travis CI's package whitelisting).

				@@ -56,14 +300,6 @@ install:

				  - tar -jxvf $DRI2PROTO_VERSION.tar.bz2

				  - (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$DRI3PROTO_VERSION.tar.bz2

				  - tar -jxvf $DRI3PROTO_VERSION.tar.bz2

				  - (cd $DRI3PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$PRESENTPROTO_VERSION.tar.bz2

				  - tar -jxvf $PRESENTPROTO_VERSION.tar.bz2

				  - (cd $PRESENTPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				  - tar -jxvf $XCBPROTO_VERSION.tar.bz2

				  - (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				@@ -78,24 +314,70 @@ install:

				  - wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				  - tar -jxvf $LIBDRM_VERSION.tar.bz2

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)

				  - wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				  - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				  - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				# Disabled LLVM (and therefore r300 and r600) because the build fails

				# with "undefined reference to `clock_gettime'" and "undefined

				# reference to `setupterm'" in llvmpipe.

				  # libtxc-dxtn uses the patented S3 Texture Compression

				  # algorithm. Therefore, we don't want to use this library but it is

				  # still possible through setting the USE_TXC_DXTN variable to yes in

				  # the travis web UI.

				  #

				  # According to Wikipedia, the patent expires on October 2, 2017:

				  # https://en.wikipedia.org/wiki/S3_Texture_Compression#Patent

				  - if test "x$USE_TXC_DXTN" = xyes; then

				      wget https://people.freedesktop.org/~cbrill/libtxc_dxtn/$LIBTXC_DXTN_VERSION.tar.bz2;

				      tar -jxvf $LIBTXC_DXTN_VERSION.tar.bz2;

				      (cd $LIBTXC_DXTN_VERSION && ./configure --prefix=$HOME/prefix && make install);

				    fi

				  - wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2

				  - tar -jxvf $LIBVDPAU_VERSION.tar.bz2

				  - (cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2

				  - tar -jxvf $LIBVA_VERSION.tar.bz2

				  - (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)

				  - wget http://wayland.freedesktop.org/releases/$LIBWAYLAND_VERSION.tar.xz

				  - tar -axvf $LIBWAYLAND_VERSION.tar.xz

				  - (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)

				  # Generate the header since one is missing on the Travis instance

				  - mkdir -p linux

				  - printf "%s\n" \

				           "#ifndef _LINUX_MEMFD_H" \

				           "#define _LINUX_MEMFD_H" \

				           "" \

				           "#define __NR_memfd_create 319" \

				           "#define SYS_memfd_create __NR_memfd_create" \

				           "" \

				           "#define MFD_CLOEXEC             0x0001U" \

				           "#define MFD_ALLOW_SEALING       0x0002U" \

				           "" \

				           "#endif /* _LINUX_MEMFD_H */" > linux/memfd.h

				script:

				  - if test "x$BUILD" = xmake; then

				      test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";

				      test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";

				      export CC="$CC -isystem`pwd`";

				      ./autogen.sh --enable-debug

				        --disable-gallium-llvm

				        --with-egl-platforms=x11,drm

				        --with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau

				        --with-gallium-drivers=svga,swrast,vc4,virgl

				        ;

				      make && make check;

				    elif test x$BUILD = xscons; then

				      scons;

				        $DRI_LOADERS

				        --with-dri-drivers=$DRI_DRIVERS

				        $GALLIUM_ST

				        --with-gallium-drivers=$GALLIUM_DRIVERS

				        --with-vulkan-drivers=$VULKAN_DRIVERS

				        --disable-llvm-shared-libs

				        &&

				      make && eval $MAKE_CHECK_COMMAND;

				    fi

				  - if test "x$BUILD" = xscons; then

				      test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";

				      test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";

				      scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;

				    fi
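
Note on the libdrm step in the install section above: it picks the newest `LIBDRM*_REQUIRED` version from configure.ac by testing whether a two-line list is already in version order. A minimal sketch of that comparison in isolation, assuming a `sort` that supports the `-V` (version sort) extension as GNU coreutils does. `version_le` and the sample version numbers are hypothetical and not part of the Mesa tree.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 <= version $2.
# Assumes sort supports -V (a GNU extension); -C checks ordering silently.
version_le() {
    printf '%s\n%s\n' "$1" "$2" | sort -V -C
}

# Made-up version numbers standing in for the values grepped from configure.ac.
current="2.4.74"
for candidate in 2.4.66 2.4.80 2.4.9; do
    # Keep the candidate only if it is at least as new as what we have.
    if version_le "$current" "$candidate"; then
        current="$candidate"
    fi
done
echo "libdrm-$current"
```

Version sort matters here because a plain lexical comparison would rank 2.4.9 after 2.4.80.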

									
										61

Android.common.mk
									
												
				@@ -30,16 +30,18 @@ LOCAL_C_INCLUDES += \

					$(MESA_TOP)/include

				MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)

				LOCAL_CFLAGS += \

					-Wno-unused-parameter \

					-Wno-date-time \

					-Wno-pointer-arith \

					-Wno-missing-field-initializers \

					-Wno-initializer-overrides \

					-Wno-mismatched-tags \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \

					-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"

				LOCAL_CFLAGS += \

					-D__STDC_LIMIT_MACROS \

					-DENABLE_SHADER_CACHE \

					-DHAVE___BUILTIN_EXPECT \

					-DHAVE___BUILTIN_FFS \

					-DHAVE___BUILTIN_FFSLL \

				@@ -47,6 +49,7 @@ LOCAL_CFLAGS += \

					-DHAVE_FUNC_ATTRIBUTE_UNUSED \

					-DHAVE_FUNC_ATTRIBUTE_FORMAT \

					-DHAVE_FUNC_ATTRIBUTE_PACKED \

					-DHAVE_FUNC_ATTRIBUTE_ALIAS \

					-DHAVE___BUILTIN_CTZ \

					-DHAVE___BUILTIN_POPCOUNT \

					-DHAVE___BUILTIN_POPCOUNTLL \

				@@ -55,9 +58,17 @@ LOCAL_CFLAGS += \

					-DHAVE___BUILTIN_UNREACHABLE \

					-DHAVE_PTHREAD=1 \

					-DHAVE_DLOPEN \

					-DHAVE_DL_ITERATE_PHDR \

					-fvisibility=hidden \

					-Wno-sign-compare

				LOCAL_CPPFLAGS += \

					-D__STDC_CONSTANT_MACROS \

					-D__STDC_FORMAT_MACROS \

					-D__STDC_LIMIT_MACROS \

					-Wno-error=non-virtual-dtor \

					-Wno-non-virtual-dtor

				# mesa requires at least c99 compiler

				LOCAL_CONLYFLAGS += \

					-std=c99

				@@ -65,30 +76,36 @@ LOCAL_CONLYFLAGS += \

				ifeq ($(strip $(MESA_ENABLE_ASM)),true)

				ifeq ($(TARGET_ARCH),x86)

				LOCAL_CFLAGS += \

					-DUSE_X86_ASM \

					-DUSE_X86_ASM

				endif

				endif

				ifeq ($(MESA_ENABLE_LLVM),true)

				LOCAL_CFLAGS += \

					-DHAVE_LLVM=0x0305 -DMESA_LLVM_VERSION_PATCH=2 \

					-D__STDC_CONSTANT_MACROS \

					-D__STDC_FORMAT_MACROS \

					-D__STDC_LIMIT_MACROS

				  ifeq ($(MESA_ANDROID_MAJOR_VERSION),5)

				    LOCAL_CFLAGS += -DHAVE_LLVM=0x0305 -DMESA_LLVM_VERSION_PATCH=2

				    ELF_INCLUDES := external/elfutils/0.153/libelf

				  endif

				  ifeq ($(MESA_ANDROID_MAJOR_VERSION),6)

				    LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0

				    ELF_INCLUDES := external/elfutils/src/libelf

				  endif

				  ifeq ($(MESA_ANDROID_MAJOR_VERSION),7)

				    LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0

				    ELF_INCLUDES := external/elfutils/libelf

				  endif

				endif

				LOCAL_CPPFLAGS += \

					$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \

					-Wno-error=non-virtual-dtor \

					-Wno-non-virtual-dtor

				ifeq ($(MESA_LOLLIPOP_BUILD),true)

				  LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				  LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/system/lib64/$(MESA_DRI_MODULE_REL_PATH)\"

				else

				  LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				ifneq ($(LOCAL_IS_HOST_MODULE),true)

				# add libdrm if there are hardware drivers

				ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)

				LOCAL_CFLAGS += -DHAVE_LIBDRM

				LOCAL_SHARED_LIBRARIES += libdrm

				endif

				endif

				LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/system/lib64/$(MESA_DRI_MODULE_REL_PATH)\"

				# uncomment to keep the debug symbols

				#LOCAL_STRIP_MODULE := false

				@@ -99,3 +116,7 @@ endif

				# Quiet down the build system and remove any .h files from the sources

				LOCAL_SRC_FILES := $(patsubst %.h, , $(LOCAL_SRC_FILES))

				ifneq ($(LOCAL_IS_HOST_MODULE),true)

				LOCAL_SHARED_LIBRARIES += libz

				endif

									
										20

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 virgl vmwgfx

				#   gallium drivers: swrast freedreno i915g nouveau r300g r600g radeonsi vc4 virgl vmwgfx

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -32,15 +32,6 @@

				MESA_TOP := $(call my-dir)

				MESA_ANDROID_MAJOR_VERSION := $(word 1, $(subst ., , $(PLATFORM_VERSION)))

				MESA_ANDROID_MINOR_VERSION := $(word 2, $(subst ., , $(PLATFORM_VERSION)))

				MESA_ANDROID_VERSION := $(MESA_ANDROID_MAJOR_VERSION).$(MESA_ANDROID_MINOR_VERSION)

				ifeq ($(filter 1 2 3 4,$(MESA_ANDROID_MAJOR_VERSION)),)

				MESA_LOLLIPOP_BUILD := true

				else

				define local-generated-sources-dir

				$(call local-intermediates-dir)

				endef

				endif

				MESA_DRI_MODULE_REL_PATH := dri

				MESA_DRI_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/$(MESA_DRI_MODULE_REL_PATH)

				@@ -50,7 +41,7 @@ MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk

				MESA_PYTHON2 := python

				classic_drivers := i915 i965

				gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4 virgl

				gallium_drivers := swrast freedreno i915g nouveau r300g r600g radeonsi vmwgfx vc4 virgl

				MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

				@@ -95,9 +86,10 @@ SUBDIRS := \

					src/mesa \

					src/util \

					src/egl \

					src/intel/genxml \

					src/intel/isl \

					src/mesa/drivers/dri

					src/amd \

					src/intel \

					src/mesa/drivers/dri \

					src/vulkan

				INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))

									
										11

Makefile.am
									
												
				@@ -27,7 +27,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-egl \

					--enable-gallium-tests \

					--enable-gallium-osmesa \

					--enable-gallium-llvm \

					--enable-llvm \

					--enable-gbm \

					--enable-gles1 \

					--enable-gles2 \

				@@ -40,11 +40,11 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-vdpau \

					--enable-xa \

					--enable-xvmc \

					--disable-llvm-shared-libs \

					--with-egl-platforms=x11,wayland,drm,surfaceless \

					--enable-llvm-shared-libs \

					--with-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \

					--with-vulkan-drivers=intel

					--with-gallium-drivers=i915,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \

					--with-vulkan-drivers=intel,radeon

				ACLOCAL_AMFLAGS = -I m4

				@@ -62,6 +62,7 @@ noinst_HEADERS = \

					include/c99_math.h \

					include/c11 \

					include/D3D9 \

					include/GL/wglext.h \

					include/HaikuGL \

					include/no_extern_c.h \

					include/pci_ids

16

REVIEWERS


@@ -58,6 +58,7 @@ F:	src/compiler/nir/
 DOCUMENTATION
 R: Emil Velikov <emil.l.velikov@gmail.com>
 R: Eric Engestrom <eric@engestrom.ch>
 F: docs/
 F: doxygen/
@@ -69,6 +70,10 @@ DRI LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/loader/
 EGL
 R: Eric Engestrom <eric@engestrom.ch>
 F: src/egl/
 GALLIUM LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/gallium/auxiliary/pipe-loader/
@@ -80,6 +85,7 @@ F: src/gallium/targets/
 AUTOCONF BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: autogen.sh
 F: configure.ac
 F: */Automake.inc
 F: */Makefile.*am
@@ -92,10 +98,16 @@ F: */Makefile.sources
 ANDROID BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 R: Rob Herring <robh@kernel.org>
 F: CleanSpec.mk
 F: */Android.*mk
 F: */Makefile.sources
 ANDROID EGL SUPPORT
 R: Rob Herring <robh@kernel.org>
 R: Tomasz Figa <tfiga@chromium.org>
 F: src/egl/drivers/dri2/platform_android.c
 WAYLAND EGL SUPPORT
 R: Daniel Stone <daniels@collabora.com>
 F: src/egl/wayland/*
@@ -104,3 +116,7 @@ F: src/egl/drivers/dri2/platform_wayland.c
 FREEDRENO
 R:	Rob Clark <robclark@freedesktop.org>
 F:	src/gallium/drivers/freedreno/
 GLX
 R: Adam Jackson <ajax@redhat.com>
 F: src/glx/

2

VERSION


@@ -1 +1 @@
 .0.2
 .2.0-devel

									
										12

appveyor.yml
									
												
				@@ -34,13 +34,13 @@ branches:

				clone_depth: 100

				cache:

				- win_flex_bison-2.4.5.zip

				- win_flex_bison-2.5.9.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				os: Visual Studio 2013

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				install:

				@@ -48,14 +48,16 @@ install:

				- python --version

				- python -m pip --version

				# Install Mako

				- python -m pip install --egg Mako

				- python -m pip install Mako==1.0.6

				# Install pywin32 extensions, needed by SCons

				- python -m pip install pypiwin32

				# Install python wheels, necessary to install SCons via pip

				- python -m pip install wheel

				# Install SCons

				- python -m pip install --egg scons==2.4.1

				- python -m pip install scons==2.5.1

				- scons --version

				# Install flex/bison

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

5

bin/.cherry-ignore


@@ -1,5 +0,0 @@
 # The offending commit that this patch (part) reverts isn't in 12.0
 be32a2132785fbc119f17e62070e007ee7d17af7 i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable
 # The patch depends on the batch_cache work at least.
 f00f749fda4c1beca38f362c7f86bdc6e32785 a4xx: make sure to actually clamp depth as requested

									
										3

bin/.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,3 @@

				[*.sh]

				indent_style = space

				indent_size = 2

									
										38

bin/bugzilla_mesa.sh
									
												
				@@ -1,4 +1,4 @@

				#!/bin/bash

				#!/bin/sh

				# This script is used to generate the list of fixed bugs that

				# appears in the release notes files, with HTML formatting.

				@@ -11,8 +11,6 @@

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes

				# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | wc -l

				# regex pattern: trim before bug number

				@@ -21,29 +19,17 @@ trim_before='s/.*show_bug.cgi?id=\([0-9]*\).*/\1/'

				# regex pattern: reconstruct the url

				use_after='s,^,https://bugs.freedesktop.org/show_bug.cgi?id=,'

				echo "<ul>"

				echo ""

				# extract fdo urls from commit log

				urls=$(git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after)

				# if DRYRUN is set to "yes", simply print the URLs and don't fetch the

				# details from fdo bugzilla.

				#DRYRUN=yes

				if [ "x$DRYRUN" = xyes ]; then

					for i in $urls

					do

						echo $i

					done

				else

					echo "<ul>"

				git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\

				while read url

				do

					id=$(echo $url | cut -d'=' -f2)

					summary=$(wget --quiet -O - $url | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')

					echo "<li><a href=\"$url\">Bug $id</a> - $summary</li>"

					echo ""

				done

					for i in $urls

					do

						id=$(echo $i | cut -d'=' -f2)

						summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')

						echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"

						echo ""

					done

					echo "</ul>"

				fi

				echo "</ul>"
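
Note on the URL extraction above: both the old and new versions of the script run the commit log through the same two `sed` expressions. A minimal sketch of that pipeline on canned input, so it can be tried without a repository or network access. The function name `extract_bug_urls` and the sample log lines are made up for illustration.

```shell
#!/bin/sh
# Same two sed expressions as the script; the function name is hypothetical.
trim_before='s/.*show_bug.cgi?id=\([0-9]*\).*/\1/'
use_after='s,^,https://bugs.freedesktop.org/show_bug.cgi?id=,'

# Reads commit-log text on stdin and prints one URL per unique bug number,
# in ascending numeric order.
extract_bug_urls() {
    grep 'bugs.freedesktop.org/show_bug' |
        sed -e "$trim_before" |
        sort -n -u |
        sed -e "$use_after"
}

# Made-up log text standing in for the output of `git log`.
extract_bug_urls <<'EOF'
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100303
    Unrelated line without a bug reference
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99999
    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=100303 again
EOF
```

Quoting `"$trim_before"` is a defensive choice in this sketch. The script passes the variables unquoted, which works only because the expressions contain no whitespace.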

									
										30

bin/get-extra-pick-list.sh
									
												
				@@ -10,26 +10,36 @@

				# $ bin/get-extra-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				# XXX: there should be a better way for this

				latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint

				latest_branchpoint=`git merge-base origin/master HEAD`

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\

					cut -c -8 |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//'  > already_picked

				# For each cherry-picked commit...

				cat already_picked | cut -c -8 |\

				while read sha

				do

					# Check if the original commit is referenced in master

					# ... check if it's referenced (fixed by another) patch

					git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\

						cut -c -8 |\

					while read candidate

					do

						# Check if the potential fix, hasn't landed in branch yet.

						found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`

						if test $found = 0

						then

							echo Commit $candidate might need to be picked, as it references $sha

						# And flag up if it hasn't landed in branch yet.

						if grep -q ^$candidate already_picked ; then

							continue

						fi

						# Or if it isn't in the ignore list.

						if [ -f bin/.cherry-ignore ] ; then

							if grep -q ^$candidate bin/.cherry-ignore ; then

								continue

							fi

						fi

						printf "Commit \"%s\" references %s\n" \

						       "`git log -n1 --pretty=oneline $candidate`" \

						       "$sha"

					done

				done

				rm -f already_picked

									
										71

bin/get-fixes-pick-list.sh
									
										Executable file
									
												
				@@ -0,0 +1,71 @@

				#!/bin/sh

				# Script for generating a list of candidates [referenced by a Fixes tag] for

				# cherry-picking to a stable branch

				#

				# Usage examples:

				#

				# $ bin/get-fixes-pick-list.sh

				# $ bin/get-fixes-pick-list.sh > picklist

				# $ bin/get-fixes-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				latest_branchpoint=`git merge-base origin/master HEAD`

				# List all the commits between day 1 and the branch point...

				git log --reverse --pretty=%H $latest_branchpoint > already_landed

				# ... and the ones cherry-picked.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//'  > already_picked

				# Grep for commits with Fixes tag

				git log --reverse --pretty=%H -i --grep="fixes:" $latest_branchpoint..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list ...

					if [ -f bin/.cherry-ignore ] ; then

						if grep -q ^$sha bin/.cherry-ignore ; then

							continue

						fi

					fi

					# For each one try to extract the tag

					fixes_count=`git show $sha | grep -i "fixes:" | wc -l`

					if [ "x$fixes_count" != x1 ] ; then

						printf "WARNING: Commit \"%s\" has more than one Fixes tag\n" \

						       "`git log -n1 --pretty=oneline $sha`"

					fi

					fixes=`git show $sha | grep -i "fixes:" | head -n 1`

					# The following sed/cut combination is borrowed from GregKH

					id=`echo ${fixes} | sed -e 's/^[ \t]*//' | cut -f 2 -d ':' | sed -e 's/^[ \t]*//' | cut -f 1 -d ' '`

					# Bail out if we cannot find suitable id.

					# Any specific validation the $id is valid and not some junk, is

					# implied with the follow up code

					if [ "x$id" = x ] ; then

						continue

					fi

					# Check if the offending commit is in branch.

					# Be that cherry-picked ...

					# ... or landed before the branchpoint.

					if grep -q ^$id already_picked ||

					   grep -q ^$id already_landed ; then

						# Finally nominate the fix if it hasn't landed yet.

						if grep -q ^$sha already_picked ; then

							continue

						fi

						printf "Commit \"%s\" fixes %s\n" \

						       "`git log -n1 --pretty=oneline $sha`" \

						       "$id"

					fi

				done

				rm -f already_picked

				rm -f already_landed
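
Note on the Fixes-tag parsing above: the sed/cut chain reduces a `Fixes:` line to the bare commit id. A minimal sketch of that step on its own, assuming the tag has the usual `Fixes: <sha> ("subject")` shape. `extract_fixes_id` is a hypothetical name, and `[[:space:]]` is used here in place of the script's `[ \t]`, since not every sed treats `\t` inside a bracket expression as a tab.

```shell
#!/bin/sh
# Hypothetical helper: print the commit id named by a single "Fixes:" line.
extract_fixes_id() {
    printf '%s\n' "$1" |
        sed -e 's/^[[:space:]]*//' |   # drop the indentation git show adds
        cut -f 2 -d ':' |              # keep what follows "Fixes:"
        sed -e 's/^[[:space:]]*//' |   # drop the blank after the colon
        cut -f 1 -d ' '                # keep the first word, the id
}

# Sample tag line; the id is one of the commits listed at the top of this page.
extract_fixes_id '    Fixes: 9c2b74ba2a ("egl: Emit error when EGLSurface is lost")'
```

A line with nothing after the colon yields an empty id, which is the case the script skips with `continue`.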

									
										7

bin/get-pick-list.sh
									
												
				@@ -8,13 +8,16 @@

				# $ bin/get-pick-list.sh > picklist

				# $ bin/get-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				latest_branchpoint=`git merge-base origin/master HEAD`

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\

				git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable' $latest_branchpoint..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.

									
										42

bin/get-typod-pick-list.sh
									
										Executable file
									
												
				@@ -0,0 +1,42 @@

				#!/bin/sh

				# Script for generating a list of candidates which have typos in the nomination line

				#

				# Usage examples:

				#

				# $ bin/get-typod-pick-list.sh

				# $ bin/get-typod-pick-list.sh > picklist

				# $ bin/get-typod-pick-list.sh | tee picklist

				# NB:

				# This script intentionally _never_ checks for specific version tag

				# Should we consider folding it with the original get-pick-list.sh

				# Use the last branchpoint as our limit for the search

				latest_branchpoint=`git merge-base origin/master HEAD`

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' $latest_branchpoint..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.

					if [ -f bin/.cherry-ignore ] ; then

						if grep -q ^$sha bin/.cherry-ignore ; then

							continue

						fi

					fi

					# Check to see if it has already been picked over.

					if grep -q ^$sha already_picked ; then

						continue

					fi

					git log -n1 --pretty=oneline $sha | cat

				done

				rm -f already_picked

0

bin/perf-annotate-jit → bin/perf-annotate-jit.py


									
										4

bin/shortlog_mesa.sh
									
												
				@@ -1,4 +1,4 @@

				#!/bin/bash

				#!/bin/sh

				# This script is used to generate the list of changes that

				# appears in the release notes files, with HTML formatting.

				@@ -10,7 +10,7 @@

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes

				typeset -i in_log=0

				in_log=0

				git shortlog $* | while read l

				do

									
										5

common.py
									
												
				@@ -59,7 +59,7 @@ if target_platform == 'windows' and host_platform != 'windows':

				# find default_llvm value

				if 'LLVM' in os.environ:

				if 'LLVM' in os.environ or 'LLVM_CONFIG' in os.environ:

				    default_llvm = 'yes'

				else:

				    default_llvm = 'no'

				@@ -86,7 +86,7 @@ def AddOptions(opts):

				        from SCons.Options.EnumOption import EnumOption

				    opts.Add(EnumOption('build', 'build type', 'debug',

				                        allowed_values=('debug', 'checked', 'profile',

				                                        'release')))

				                                        'release', 'opt')))

				    opts.Add(BoolOption('verbose', 'verbose output', 'no'))

				    opts.Add(EnumOption('machine', 'use machine-specific assembly code',

				                        default_machine,

				@@ -110,5 +110,6 @@ def AddOptions(opts):

				    opts.Add(BoolOption('texture_float',

				                        'enable floating-point textures and renderbuffers',

				                        'no'))

				    opts.Add(BoolOption('swr', 'Build OpenSWR', 'no'))

				    if host_platform == 'windows':

				        opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')

1450

configure.ac


File diff suppressed because it is too large.

2

docs/README.WIN32


@@ -39,7 +39,7 @@ steps that work as of this writing.
   get pywin32-218.4.win-amd64-py2.7.exe
 - install git
 - download mesa from git
   see http://www.mesa3d.org/repository.html
   see https://www.mesa3d.org/repository.html
 - run scons
 General

									
										2

docs/application-issues.html
									
												
				@@ -33,7 +33,7 @@ without a depth buffer.

				<p>

				Mesa 9.1.2 and later (will) support a DRI configuration option to work around

				this issue.

				Using the <a href="http://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,

				Using the <a href="https://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,

				set the "Create all visuals with a depth buffer" option before running Topogun.

				Then, all GLX visuals will be created with a depth buffer.

				</p>

									
										33

docs/autoconf.html
									
												
				@@ -55,7 +55,7 @@ to your preference, type:

				</pre>

				<p>

				This will produce libGL.so and several other libraries depending on the

				This will produce libGL.so and/or several other libraries depending on the

				options you have chosen. Later, if you want to rebuild for a different

				configuration run <code>make realclean</code> before rebuilding.

				</p>

				@@ -118,7 +118,7 @@ directories. For example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.</p>

				<dt><code>PKG_CONFIG_PATH</code></dt>

				<dd><p>The

				<code>pkg-config</code> utility is a hard requirement for cofiguring and

				<code>pkg-config</code> utility is a hard requirement for configuring and

				building mesa. It is used to search for external libraries

				on the system. This environment variable is used to control the search

				path for <code>pkg-config</code>. For instance, setting

				@@ -133,9 +133,11 @@ There are also a few general options for altering the Mesa build:

				</p>

				<dl>

				<dt><code>--enable-debug</code></dt>

				<dd><p>This option will enable compiler

				options and macros to aid in debugging the Mesa libraries.</p>

				</dd>

				<dd><p>This option will set the compiler debug/optimisation levels (if the user

				hasn't already set them via the CFLAGS/CXXFLAGS) and macros to aid in

				debugging the Mesa libraries.</p>

				<p>Note that enabling this option can lead to noticeable loss of performance.</p>

				<dt><code>--disable-asm</code></dt>

				<dd><p>There are assembly routines

				@@ -174,27 +176,22 @@ architecture, the following should be sufficient to configure multilib Mesa</p>

				</dl>

				<h2 id="driver">2. Driver Options</h2>

				<h2 id="driver">2. GL Driver Options</h2>

				<p>

				There are several different driver modes that Mesa can use. These are

				described in more detail in the <a href="install.html">basic

				installation instructions</a>. The Mesa driver is controlled through the

				configure options <code>--enable-xlib-glx</code>, <code>--enable-osmesa</code>,

				and <code>--enable-dri</code>.

				configure options <code>--enable-glx</code> and <code>--enable-osmesa</code>

				</p>

				<h3 id="xlib">Xlib</h3><p>

				It uses Xlib as a software renderer to do all rendering. It corresponds

				to the option <code>--enable-xlib-glx</code>. The libX11 and libXext

				libraries, as well as the X11 development headers, will be needed to

				support the Xlib driver.

				to the option <code>--enable-glx=xlib</code> or <code>--enable-glx=gallium-xlib</code>.

				<h3 id="dri">DRI</h3><p>This mode uses the DRI hardware drivers for

				accelerated OpenGL rendering. Enable the DRI drivers with the option

				<code>--enable-dri</code>. See the <a href="install.html">basic

				installation instructions</a> for details on prerequisites for the DRI

				drivers.

				accelerated OpenGL rendering. To enable use <code>--enable-glx=dri

				--enable-dri</code>.

				<!-- DRI specific options -->

				<dl>

				@@ -252,10 +249,8 @@ will create the libOSMesa16 library with a 16-bit color channel.

				<h2 id="library">3. Library Options</h2>

				<p>

				The configure script provides more fine grained control over the GL

				libraries that will be built. More details on the specific GL libraries

				can be found in the <a href="install.html">basic installation

				instructions</a>.

				The configure script provides more fine grained control over the libraries

				that will be built.

				</div>

				</body>

									
										2

docs/bugs.html
									
												
				@@ -18,7 +18,7 @@

				<p>

				The Mesa bug database is hosted on

				<a href="http://freedesktop.org">freedesktop.org</a>.

				<a href="https://freedesktop.org">freedesktop.org</a>.

				The old bug database on SourceForge is no longer used.

				</p>

									
										142

docs/codingstyle.html
									
										Normal file
									
												
				@@ -0,0 +1,142 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Coding Style</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Coding Style</h1>

				<p>

				Mesa is over 20 years old and the coding style has evolved over time.

				Some old parts use a style that's a bit out of date.

				Different sections of mesa can use different coding styles as set in the local

				EditorConfig (.editorconfig) and/or Emacs (.dir-locals.el) file.

				Alternatively the following is applicable.

				If the guidelines below don't cover something, try following the format of

				existing, neighboring code.

				</p>

				<p>

				Basic formatting guidelines

				</p>

				<ul>

				<li>3-space indentation, no tabs.

				<li>Limit lines to 78 or fewer characters.  The idea is to prevent line

				wrapping in 80-column editors and terminals.  There are exceptions, such

				as if you're defining a large, static table of information.

				<li>Opening braces go on the same line as the if/for/while statement.

				For example:

				<pre>

				   if (condition) {

				      foo;

				   } else {

				      bar;

				   }

				</pre>

				<li>Put a space before/after operators.  For example, <tt>a = b + c;</tt>

				and not <tt>a=b+c;</tt>

				<li>This GNU indent command generally does the right thing for formatting:

				<pre>

				   indent -br -i3 -npcs --no-tabs infile.c -o outfile.c

				</pre>

				<li>Use comments wherever you think it would be helpful for other developers.

				Several specific cases and style examples follow.  Note that we roughly

				follow <a href="https://www.stack.nl/~dimitri/doxygen/">Doxygen</a> conventions.

				<br>

				<br>

				Single-line comments:

				<pre>

				   /* null-out pointer to prevent dangling reference below */

				   bufferObj = NULL;

				</pre>

				Or,

				<pre>

				   bufferObj = NULL;  /* prevent dangling reference below */

				</pre>

				Multi-line comment:

				<pre>

				   /* If this is a new buffer object id, or one which was generated but

				    * never used before, allocate a buffer object now.

				    */

				</pre>

				We try to quote the OpenGL specification where prudent:

				<pre>

				   /* Page 38 of the PDF of the OpenGL ES 3.0 spec says:

				    *

				    *     "An INVALID_OPERATION error is generated for any of the following

				    *     conditions:

				    *

				    *     * <length> is zero."

				    *

				    * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec

				    * (30.10.2014) also says this, so it's no longer allowed for desktop GL,

				    * either.

				    */

				</pre>

				Function comment example:

				<pre>

				   /**

				    * Create and initialize a new buffer object.  Called via the

				    * ctx->Driver.CreateObject() driver callback function.

				    * \param  name  integer name of the object

				    * \param  type  one of GL_FOO, GL_BAR, etc.

				    * \return  pointer to new object or NULL if error

				    */

				   struct gl_object *

				   _mesa_create_object(GLuint name, GLenum type)

				   {

				      /* function body */

				   }

				</pre>

				<li>Put the function return type and qualifiers on one line and the function

				name and parameters on the next, as seen above.  This makes it easy to use

				<code>grep ^function_name dir/*</code> to find function definitions.  Also,

				the opening brace goes on the next line by itself (see above.)

				<li>Function names follow various conventions depending on the type of function:

				<pre>

				   glFooBar()       - a public GL entry point (in glapi_dispatch.c)

				   _mesa_FooBar()   - the internal immediate mode function

				   save_FooBar()    - retained mode (display list) function in dlist.c

				   foo_bar()        - a static (private) function

				   _mesa_foo_bar()  - an internal non-static Mesa function

				</pre>

				<li>Constants, macros and enum names are ALL_UPPERCASE, with _ between

				words.

				<li>Mesa usually uses camel case for local variables (Ex: "localVarname")

				while gallium typically uses underscores (Ex: "local_var_name").

				<li>Global variables are almost never used because Mesa should be thread-safe.

				<li>Booleans.  Places that are not directly visible to the GL API

				should prefer the use of <tt>bool</tt>, <tt>true</tt>, and

				<tt>false</tt> over <tt>GLboolean</tt>, <tt>GL_TRUE</tt>, and

				<tt>GL_FALSE</tt>.  In C code, this may mean that

				<tt>#include &lt;stdbool.h&gt;</tt> needs to be added.  The

				<tt>try_emit_</tt>* methods in src/mesa/program/ir_to_mesa.cpp and

				src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.

				</ul>
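				<p>
				The guidelines above can be pulled together in one short sketch. This is only an
				illustration: <tt>MAX_OBJECT_NAME</tt>, <tt>is_valid_name()</tt> and
				<tt>count_valid_names()</tt> are hypothetical names invented for this example and
				do not exist in Mesa. It shows 3-space indentation, brace placement, the return
				type on its own line, Doxygen-style comments, an ALL_UPPERCASE macro, a camel
				case local variable and <tt>bool</tt> in code that is not visible to the GL API.
				</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit, used only for this illustration. */
#define MAX_OBJECT_NAME 1024

/**
 * Check whether an object name is usable.
 * \param  name  integer name of the object
 * \return  true if the name is non-zero and below MAX_OBJECT_NAME
 */
static bool
is_valid_name(unsigned name)
{
   return name != 0 && name < MAX_OBJECT_NAME;
}

/**
 * Count the valid names in an array.
 * \param  names  array of object names, may be NULL if count is zero
 * \param  count  number of entries in names
 * \return  number of entries accepted by is_valid_name()
 */
size_t
count_valid_names(const unsigned *names, size_t count)
{
   size_t numValid = 0;

   /* a NULL array is only acceptable when there is nothing to read */
   assert(names != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (is_valid_name(names[i])) {
         numValid++;
      }
   }

   return numValid;
}
```

				<p>
				Because the function name starts its own line, a search such as
				<code>grep ^count_valid_names dir/*</code> finds the definition rather than the
				call sites.
				</p>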

				</p>

				</div>

				</body>

				</html>

									
										19

docs/contents.html
									
												
				@@ -53,7 +53,7 @@

				<li><a href="lists.html" target="_parent">Mailing Lists</a>

				<li><a href="bugs.html" target="_parent">Bug Database</a>

				<li><a href="webmaster.html" target="_parent">Webmaster</a>

				<li><a href="http://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>

				<li><a href="https://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>

				</ul>

				<b>User Topics</b>

				@@ -66,7 +66,7 @@

				<li><a href="debugging.html" target="_parent">Debugging Tips</a>

				<li><a href="perf.html" target="_parent">Performance Tips</a>

				<li><a href="extensions.html" target="_parent">Mesa Extensions</a>

				<li><a href="mangling.html" target="_parent">Function Name Mangling</a>

				<li><a href="mangling.html" target="_parent">GL Function Name Mangling</a>

				<li><a href="llvmpipe.html" target="_parent">Gallium llvmpipe driver</a>

				<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D guest driver</a>

				<li><a href="postprocess.html" target="_parent">Gallium post-processing</a>

				@@ -81,23 +81,26 @@

				<li><a href="utilities.html" target="_parent">Utilities</a>

				<li><a href="helpwanted.html" target="_parent">Help Wanted</a>

				<li><a href="devinfo.html" target="_parent">Development Notes</a>

				<li><a href="codingstyle.html" target="_parent">Coding Style</a>

				<li><a href="submittingpatches.html" target="_parent">Submitting patches</a>

				<li><a href="releasing.html" target="_parent">Releasing process</a>

				<li><a href="release-calendar.html" target="_parent">Release calendar</a>

				<li><a href="sourcedocs.html" target="_parent">Source Documentation</a>

				<li><a href="dispatch.html" target="_parent">GL Dispatch</a>

				</ul>

				<b>Links</b>

				<ul>

				<li><a href="http://www.opengl.org" target="_parent">OpenGL website</a>

				<li><a href="http://dri.freedesktop.org" target="_parent">DRI website</a>

				<li><a href="http://www.freedesktop.org" target="_parent">freedesktop.org</a>

				<li><a href="http://planet.freedesktop.org" target="_parent">Developer blogs</a>

				<li><a href="https://www.opengl.org" target="_parent">OpenGL website</a>

				<li><a href="https://dri.freedesktop.org" target="_parent">DRI website</a>

				<li><a href="https://www.freedesktop.org" target="_parent">freedesktop.org</a>

				<li><a href="https://planet.freedesktop.org" target="_parent">Developer blogs</a>

				</ul>

				<b>Hosted by:</b>

				<br>

				<blockquote>

				<a href="http://sourceforge.net"

				target="_parent">sourceforge.net</a>

				<a href="https://freedesktop.org" target="_parent">freedesktop.org</a>

				</blockquote>

				</body>

									
										6

docs/developers.html
									
												
				@@ -20,7 +20,7 @@

				Both professional and volunteer developers contribute to Mesa.

				</p>

				<p>

				<a href="http://www.vmware.com/">VMware</a>

				<a href="https://www.vmware.com/">VMware</a>

				employs several of the main Mesa developers including Brian Paul

				and Keith Whitwell.

				</p>

				@@ -38,13 +38,13 @@ including:

				<p>

				Other companies including

				<a href="http://www.intellinuxgraphics.org/index.html">Intel</a>

				<a href="https://01.org/linuxgraphics">Intel</a>

				and RedHat also actively contribute to the project.

				Intel has recently contributed the new GLSL compiler in Mesa 7.9.

				</p>

				<p>

				<a href="http://www.lunarg.com/">LunarG</a> can be contacted

				<a href="https://www.lunarg.com/">LunarG</a> can be contacted

				for custom Mesa / 3D graphics development.

				</p>

									
										647

docs/devinfo.html
									
												
				@@ -18,646 +18,9 @@

				<ul>

				<li><a href="#style">Coding Style</a>

				<li><a href="#submitting">Submitting Patches</a>

				<li><a href="#release">Making a New Mesa Release</a>

				<li><a href="#extensions">Adding Extensions</a>

				</ul>

				<h2 id="style">Coding Style</h2>

				<p>

				Mesa is over 20 years old and the coding style has evolved over time.

				Some old parts use a style that's a bit out of date.

				If the guidelines below don't cover something, try following the format of

				existing, neighboring code.

				</p>

				<p>

				Basic formatting guidelines

				</p>

				<ul>

				<li>3-space indentation, no tabs.

				<li>Limit lines to 78 or fewer characters.  The idea is to prevent line

				wrapping in 80-column editors and terminals.  There are exceptions, such

				as if you're defining a large, static table of information.

				<li>Opening braces go on the same line as the if/for/while statement.

				For example:

				<pre>

				   if (condition) {

				      foo;

				   } else {

				      bar;

				   }

				</pre>

				<li>Put a space before/after operators.  For example, <tt>a = b + c;</tt>

				and not <tt>a=b+c;</tt>

				<li>This GNU indent command generally does the right thing for formatting:

				<pre>

				   indent -br -i3 -npcs --no-tabs infile.c -o outfile.c

				</pre>

				<li>Use comments wherever you think it would be helpful for other developers.

				Several specific cases and style examples follow.  Note that we roughly

				follow <a href="http://www.stack.nl/~dimitri/doxygen/">Doxygen</a> conventions.

				<br>

				<br>

				Single-line comments:

				<pre>

				   /* null-out pointer to prevent dangling reference below */

				   bufferObj = NULL;

				</pre>

				Or,

				<pre>

				   bufferObj = NULL;  /* prevent dangling reference below */

				</pre>

				Multi-line comment:

				<pre>

				   /* If this is a new buffer object id, or one which was generated but

				    * never used before, allocate a buffer object now.

				    */

				</pre>

				We try to quote the OpenGL specification where prudent:

				<pre>

				   /* Page 38 of the PDF of the OpenGL ES 3.0 spec says:

				    *

				    *     "An INVALID_OPERATION error is generated for any of the following

				    *     conditions:

				    *

				    *     * <length> is zero."

				    *

				    * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec

				    * (30.10.2014) also says this, so it's no longer allowed for desktop GL,

				    * either.

				    */

				</pre>

				Function comment example:

				<pre>

				   /**

				    * Create and initialize a new buffer object.  Called via the

				    * ctx->Driver.CreateObject() driver callback function.

				    * \param  name  integer name of the object

				    * \param  type  one of GL_FOO, GL_BAR, etc.

				    * \return  pointer to new object or NULL if error

				    */

				   struct gl_object *

				   _mesa_create_object(GLuint name, GLenum type)

				   {

				      /* function body */

				   }

				</pre>

				<li>Put the function return type and qualifiers on one line and the function

				name and parameters on the next, as seen above.  This makes it easy to use

				<code>grep ^function_name dir/*</code> to find function definitions.  Also,

				the opening brace goes on the next line by itself (see above.)

				<li>Function names follow various conventions depending on the type of function:

				<pre>

				   glFooBar()       - a public GL entry point (in glapi_dispatch.c)

				   _mesa_FooBar()   - the internal immediate mode function

				   save_FooBar()    - retained mode (display list) function in dlist.c

				   foo_bar()        - a static (private) function

				   _mesa_foo_bar()  - an internal non-static Mesa function

				</pre>

				<li>Constants, macros and enumerant names are ALL_UPPERCASE, with _ between

				words.

				<li>Mesa usually uses camel case for local variables (Ex: "localVarname")

				while gallium typically uses underscores (Ex: "local_var_name").

				<li>Global variables are almost never used because Mesa should be thread-safe.

				<li>Booleans.  Places that are not directly visible to the GL API

				should prefer the use of <tt>bool</tt>, <tt>true</tt>, and

				<tt>false</tt> over <tt>GLboolean</tt>, <tt>GL_TRUE</tt>, and

				<tt>GL_FALSE</tt>.  In C code, this may mean that

				<tt>#include &lt;stdbool.h&gt;</tt> needs to be added.  The

				<tt>try_emit_</tt>* methods in src/mesa/program/ir_to_mesa.cpp and

				src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.

				</ul>

				<h2 id="submitting">Submitting patches</h2>

				<p>

				The basic guidelines for submitting patches are:

				</p>

				<ul>

				<li>Patches should be sufficiently tested before submitting.

				<li>Code patches should follow Mesa coding conventions.

				<li>Whenever possible, patches should only affect individual Mesa/Gallium

				components.

				<li>Patches should never introduce build breaks and should be bisectable (see

				<code>git bisect</code>.)

				<li>Patches should be properly formatted (see below).

				<li>Patches should be submitted to mesa-dev for review using

				<code>git send-email</code>.

				<li>Patches should not mix code changes with code formatting changes (except,

				perhaps, in very trivial cases.)

				</ul>

				<h3>Patch formatting</h3>

				<p>

				The basic rules for patch formatting are:

				</p>

				<ul>

				<li>Lines should be limited to 75 characters or less so that git logs

				displayed in 80-column terminals avoid line wrapping.  Note that git

				log uses 4 spaces of indentation (4 + 75 &lt; 80).

				<li>The first line should be a short, concise summary of the change prefixed

				with a module name.  Examples:

				<pre>

				    mesa: Add support for querying GL_VERTEX_ATTRIB_ARRAY_LONG

				    gallium: add PIPE_CAP_DEVICE_RESET_STATUS_QUERY

				    i965: Fix missing type in local variable declaration.

				</pre>

				<li>Subsequent patch comments should describe the change in more detail,

				if needed.  For example:

				<pre>

				    i965: Remove end-of-thread SEND alignment code.

				    This was present in Eric's initial implementation of the compaction code

				    for Sandybridge (commit 077d01b6). There is no documentation saying this

				    is necessary, and removing it causes no regressions in piglit on any

				    platform.

				</pre>

				<li>A "Signed-off-by:" line is not required, but not discouraged either.

				<li>If a patch addresses a bugzilla issue, that should be noted in the

				patch comment.  For example:

				<pre>

				   Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89689

				</pre>

				<li>If there have been several revisions to a patch during the review

				process, they should be noted such as in this example:

				<pre>

				    st/mesa: add ARB_texture_stencil8 support (v4)

				    if we support stencil texturing, enable texture_stencil8

				    there is no requirement to support native S8 for this,

				    the texture can be converted to x24s8 fine.

				    v2: fold fixes from Marek in:

				       a) put S8 last in the list

				       b) fix renderable to always test for d/s renderable

				        fixup the texture case to use a stencil only format

				        for picking the format for the texture view.

				    v3: hit fallback for getteximage

				    v4: put s8 back in front, it shouldn't get picked now (Ilia)

				</pre>

				<li>If someone tested your patch, document it with a line like this:

				<pre>

				    Tested-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				<li>If the patch was reviewed (usually the case) or acked by someone,

				that should be documented with:

				<pre>

				    Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;

				    Acked-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				</ul>
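				<p>
				The 75-character rule above can be checked mechanically. The following is a
				minimal sketch, not a tool that ships with Mesa: <tt>MAX_COMMIT_LINE_LENGTH</tt>
				and <tt>commit_message_fits()</tt> are hypothetical names, and the check counts
				bytes, so it is only exact for ASCII commit messages.
				</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* 75 columns of text + 4 columns of git log indentation < 80 columns */
#define MAX_COMMIT_LINE_LENGTH 75

/**
 * Check that no line of a commit message exceeds the length limit.
 * \param  msg  NUL-terminated commit message, lines separated by newlines
 * \return  true if every line has MAX_COMMIT_LINE_LENGTH or fewer bytes
 */
bool
commit_message_fits(const char *msg)
{
   while (*msg != '\0') {
      /* length of the current line, not counting the newline */
      size_t lineLen = strcspn(msg, "\n");

      if (lineLen > MAX_COMMIT_LINE_LENGTH) {
         return false;
      }

      msg += lineLen;
      if (*msg == '\n') {
         msg++;
      }
   }

   return true;
}
```

				<p>
				A check like this could be run over a commit message before sending the patch
				for review; it says nothing about the other formatting rules, such as the module
				prefix on the first line.
				</p>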

				<h3>Testing Patches</h3>

				<p>

				It should go without saying that patches must be tested.  In general,

				do whatever testing is prudent.

				</p>

				<p>

				You should always run the Mesa test suite before submitting patches.

				The test suite can be run using the 'make check' command. All tests

				must pass before patches will be accepted; this may mean you have

				to update the tests themselves.

				</p>

				<p>

				Whenever possible and applicable, test the patch with

				<a href="http://piglit.freedesktop.org">Piglit</a> to

				check for regressions.

				</p>

				<h3>Mailing Patches</h3>

				<p>

				Patches should be sent to the Mesa mailing list for review.

				When submitting a patch make sure to use git send-email rather than attaching

				patches to emails. Sending patches as attachments prevents people from being

				able to provide in-line review comments.

				</p>

				<p>

				When submitting follow-up patches you can use --in-reply-to to make v2, v3,

				etc patches show up as replies to the originals. This usually works well

				when you're sending out updates to individual patches (as opposed to

				re-sending the whole series). Using --in-reply-to makes

				it harder for reviewers to accidentally review old patches.

				</p>

				<p>

				When submitting follow-up patches you should also login to

				<a href="https://patchwork.freedesktop.org">patchwork</a> and change the

				state of your old patches to Superseded.

				</p>

				<h3>Reviewing Patches</h3>

				<p>

				When you've reviewed a patch on the mailing list, please be unambiguous

				about your review.  That is, state either

				<pre>

				    Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				or

				<pre>

				    Acked-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				Rather than saying just "LGTM" or "Seems OK".

				</p>

				<p>

				If small changes are suggested, it's OK to say something like:

				<pre>

				   With the above fixes, Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				which tells the patch author that the patch can be committed, as long

				as the issues are resolved first.

				</p>

				<h3>Marking a commit as a candidate for a stable branch</h3>

				<p>

				If you want a commit to be applied to a stable branch,

				you should add an appropriate note to the commit message.

				</p>

				<p>

				Here are some examples of such a note:

				</p>

				<ul>

				  <li>CC: &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				  <li>CC: "9.2 10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				  <li>CC: "10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				</ul>

				Simply adding the CC to the mesa-stable list address is adequate to nominate

				the commit for the most-recently-created stable branch. It is only necessary

				to specify a specific branch name, (such as "9.2 10.0" or "10.0" in the

				examples above), if you want to nominate the commit for an older stable

				branch. And, as in these examples, you can nominate the commit for the older

				branch in addition to the more recent branch, or nominate the commit

				exclusively for the older branch.

				This "CC" syntax for patch nomination will cause patches to automatically be

				copied to the mesa-stable@ mailing list when you use "git send-email" to send

				patches to the mesa-dev@ mailing list. Also, if you realize that a commit

				should be nominated for the stable branch after it has already been committed,

				you can send a note directly to mesa-stable@lists.freedesktop.org, where

				the Mesa stable-branch maintainers will receive it. Be sure to mention the

				commit ID of the commit of interest (as it appears in the mesa master branch).

				The latest set of patches that have been nominated, accepted, or rejected for

				the upcoming stable release can always be seen on the

				<a href="http://cworth.org/~cworth/mesa-stable-queue/">Mesa Stable Queue</a>

				page.

				<h3>Criteria for accepting patches to the stable branch</h3>

				Mesa has a designated release manager for each stable branch, and the release

				manager is the only developer that should be pushing changes to these

				branches. Everyone else should simply nominate patches using the mechanism

				described above.

				The stable-release manager will work with the list of nominated patches, and

				for each patch that meets the criteria below will cherry-pick the patch with:

				<code>git cherry-pick -x &lt;commit&gt;</code>. The <code>-x</code> option is

				important so that the picked patch references the commit ID of the original

				patch.

				The stable-release manager may at times need to force-push changes to the

				stable branches (for example, to drop a previously-picked patch that was later

				identified as causing a regression). These force-pushes may cause changes to

				be lost from the stable branch if developers push things directly. Consider

				yourself warned.

				The stable-release manager is also given broad discretion in rejecting patches

				that have been nominated for the stable branch. The most basic rule is that

				the stable branch is for bug fixes only, (no new features, no

				regressions). Here is a non-exhaustive list of some reasons that a patch may

				be rejected:

				<ul>

				  <li>Patch introduces a regression. Any reported build breakage or other

				  regression caused by a particular patch, (game no longer works, piglit test

				  changes from PASS to FAIL), is justification for rejecting a patch.</li>

				  <li>Patch is too large, (say, larger than 100 lines)</li>

				  <li>Patch is not a fix. For example, a commit that moves code around with no

				  functional change should be rejected.</li>

				  <li>Patch fix is not clearly described. For example, a commit message

				  of only a single line, no description of the bug, no mention of bugzilla,

				  etc.</li>

				  <li>Patch has not obviously been reviewed. For example, the commit message

				  has no Reviewed-by, Signed-off-by, nor Tested-by tags from anyone but the

				  author.</li>

				  <li>Patch has not already been merged to the master branch. As a rule, bug

				  fixes should never be applied first to a stable branch. Patches should land

				  first on the master branch and then be cherry-picked to a stable

				  branch. (This is to avoid future releases causing regressions if the patch

				  is not also applied to master.) The only things that might look like

				  exceptions would be backports of patches from master that happen to look

				  significantly different.</li>

				  <li>Patch depends on too many other patches. Ideally, all stable-branch

				  patches should be self-contained. It sometimes occurs that a single, logical

				  bug-fix occurs as two separate patches on master, (such as an original

				  patch, then a subsequent fix-up to that patch). In such a case, these two

				  patches should be squashed into a single, self-contained patch for the

				  stable branch. (Of course, if the squashing makes the patch too large, then

				  that could be a reason to reject the patch.)</li>

				  <li>Patch includes new feature development, not bug fixes. New OpenGL

				  features, extensions, etc. should be applied to Mesa master and included in

				  the next major release. Stable releases are intended only for bug fixes.

				  Note: As an exception to this rule, the stable-release manager may accept

				  hardware-enabling "features". For example, backports of new code to support

				  a newly-developed hardware product can be accepted if they can be reasonably

				  determined to not have effects on other hardware.</li>

				  <li>Patch is a performance optimization. As a rule, performance patches are

				  not candidates for the stable branch. The only exception might be a case

				  where an application's performance was recently severely impacted so as to

				  become unusable. The fix for this performance regression could then be

				  considered for a stable branch. The optimization must also be

				  non-controversial and the patches still need to meet the other criteria of

				  being simple and self-contained.</li>

				  <li>Patch introduces a new failure mode (such as an assert). While the new

				  assert might technically be correct, for example to make Mesa more

				  conformant, this is not the kind of "bug fix" we want in a stable

				  release. The potential problem here is that an OpenGL program that was

				  previously working, (even if technically non-compliant with the

				  specification), could stop working after this patch. So that would be a

				  regression that is unacceptable for the stable branch.</li>

				</ul>

				<h2 id="release">Making a New Mesa Release</h2>

				<p>

				These are the instructions for making a new Mesa release.

				</p>

				<h3>Get latest source files</h3>

				<p>

				Use git to get the latest Mesa files from the git repository, from whatever

				branch is relevant. This document uses the convention X.Y.Z for the release

				being created, which should be created from a branch named X.Y.

				</p>

				<h3>Perform basic testing</h3>

				<p>

				The release manager should, at the very least, test the code by compiling it,

				installing it, and running the latest piglit to ensure that no piglit tests

				have regressed since the previous release.

				</p>

				<p>

				The release manager should do this testing with at least one hardware driver,

				(say, whatever is contained in the local development machine), as well as on

				both Gallium and non-Gallium software drivers. The software testing can be

				performed by running piglit with the following environment-variable set:

				</p>

				<pre>

				LIBGL_ALWAYS_SOFTWARE=1

				</pre>

				And Gallium vs. non-Gallium software drivers can be obtained by using the

				following configure flags on separate builds:

				<pre>

				--with-dri-drivers=swrast

				--with-gallium-drivers=swrast

				</pre>

				<p>

				Note: If both options are given in one build, both swrast_dri.so drivers will

				be compiled, but only one will be installed. The following command can be used

				to ensure the correct driver is being tested:

				</p>

				<pre>

				LIBGL_ALWAYS_SOFTWARE=1 glxinfo | grep "renderer string"

				</pre>

				If any regressions are found in this testing with piglit, stop here, and do

				not perform a release until regressions are fixed.

				<h3>Update version in file VERSION</h3>

				<p>

				Increment the version contained in the file VERSION at Mesa's top-level, then

				commit this change.

				</p>

				<h3>Create release notes for the new release</h3>

				<p>

				Create a new file docs/relnotes/X.Y.Z.html, (follow the style of the previous

				release notes). Note that the sha256sums section of the release notes should

				be empty at this point.

				</p>

				<p>

				Two scripts are available to help generate portions of the release notes:

				<pre>

					./bin/bugzilla_mesa.sh

					./bin/shortlog_mesa.sh

				</pre>

				<p>

				The first script identifies commits that reference bugzilla bugs and obtains

				the descriptions of those bugs from bugzilla. The second script generates a

				log of all commits. In both cases, HTML-formatted lists are printed to stdout

				to be included in the release notes.

				</p>

				<p>

				Commit these changes

				</p>

				<h3>Make the release archives, signatures, and the release tag</h3>

				<p>

				From inside the Mesa directory:

				<pre>

					./autogen.sh

					make -j1 tarballs

				</pre>

				<p>

				After the tarballs are created, the sha256 checksums for the files will

				be computed and printed. These will be used in a step below.

				</p>

				<p>

				It's important at this point to also verify that the constructed tar file

				actually builds:

				</p>

				<pre>

					tar xjf MesaLib-X.Y.Z.tar.bz2

					cd Mesa-X.Y.Z

					./configure --enable-gallium-llvm

					make -j6

					make install

				</pre>

				<p>

				Some touch testing should also be performed at this point, (run glxgears or

				more involved OpenGL programs against the installed Mesa).

				</p>

				<p>

				Create detached GPG signatures for each of the archive files created above:

				</p>

				<pre>

					gpg --sign --detach MesaLib-X.Y.Z.tar.gz

					gpg --sign --detach MesaLib-X.Y.Z.tar.bz2

					gpg --sign --detach MesaLib-X.Y.Z.zip

				</pre>

				<p>

				Tag the commit used for the build:

				</p>

				<pre>

					git tag -s mesa-X.Y.Z -m "Mesa X.Y.Z release"

				</pre>

				<p>

				Note: It would be nice to investigate and fix the issue that causes the

				tarballs target to fail with multiple build processes, such as with "-j4". It

				would also be nice to incorporate all of the above commands into a single

				makefile target. And instead of a custom "tarballs" target, we should

				incorporate things into the standard "make dist" and "make distcheck" targets.

				</p>

				<h3>Add the sha256sums to the release notes</h3>

				<p>

				Edit docs/relnotes/X.Y.Z.html to add the sha256sums printed as part of "make

				tarballs" in the previous step. Commit this change.

				</p>

				<h3>Push all commits and the tag created above</h3>

				<p>

				This is the first step that cannot easily be undone. The release is going

				forward from this point:

				</p>

				<pre>

					git push origin X.Y --tags

				</pre>

				<h3>Install the release files and signatures on the distribution server</h3>

				<p>

				The following commands can be used to copy the release archive files and

				signatures to the freedesktop.org server:

				</p>

				<pre>

					scp MesaLib-X.Y.Z* people.freedesktop.org:

					ssh people.freedesktop.org

					cd /srv/ftp.freedesktop.org/pub/mesa

					mkdir X.Y.Z

					cd X.Y.Z

					mv ~/MesaLib-X.Y.Z* .

				</pre>

				<h3>Back on mesa master, add the new release notes into the tree</h3>

				<p>

				Something like the following steps will do the trick:

				</p>

				<pre>

					cp docs/relnotes/X.Y.Z.html /tmp

				        git checkout master

				        cp /tmp/X.Y.Z.html docs/relnotes

				        git add docs/relnotes/X.Y.Z.html

				</pre>

				<p>

				Also, edit docs/relnotes.html to add a link to the new release notes, and edit

				docs/index.html to add a news entry. Then commit and push:

				</p>

				<pre>

					git commit -a -m "docs: Import X.Y.Z release notes, add news item."

				        git push origin

				</pre>

				<h3>Update the mesa3d.org website</h3>

				<p>

				NOTE: The recent release managers have not been performing this step

				themselves, but leaving this to Brian Paul (who has access to the

				sourceforge.net hosting for mesa3d.org). Brian is more than willing to grant

				the permission necessary to future release managers to do this step on their

				own.

				</p>

				<p>

				Update the web site by copying the docs/ directory's files to 

				/home/users/b/br/brianp/mesa-www/htdocs/ with:

				<br>

				<code>

				sftp USERNAME,mesa3d@web.sourceforge.net

				</code>

				</p>

				<h3>Announce the release</h3>

				<p>

				Make an announcement on the mailing lists:

				<em>mesa-dev@lists.freedesktop.org</em>,

				and

				<em>mesa-announce@lists.freedesktop.org</em>.

				Follow the template of previously-sent release announcements. The following

				command can be used to generate the log of changes to be included in the

				release announcement:

				<pre>

					git shortlog mesa-X.Y.Z-1..mesa-X.Y.Z

				</pre>

				</p>

				<h2 id="extensions">Adding Extensions</h2>

				<p>

				@@ -684,9 +47,11 @@ To add a new GL extension to Mesa you have to do at least the following.

				</li>

				<li>

				   Add a new entry to the <code>gl_extensions</code> struct in mtypes.h

				   if the extension requires driver capabilities not already exposed by

				   another extension.

				</li>

				<li>

				   Update the <code>extensions.c</code> file.

				   Add a new entry to the src/mesa/main/extensions_table.h file.

				</li>

				<li>

				   From this point, the best way to proceed is to find another extension,

				@@ -697,12 +62,18 @@ To add a new GL extension to Mesa you have to do at least the following.

				   If the new extension adds new GL state, the functions in get.c, enable.c

				   and attrib.c will most likely require new code.

				</li>

				<li>

				   To determine if the new extension is active in the current context,

				   use the auto-generated _mesa_has_##name_str() function defined in

				   src/mesa/main/extensions.h.

				</li>

				<li>

				   The dispatch tests check_table.cpp and dispatch_sanity.cpp

				   should be updated with details about the new extensions functions. These

				   tests are run using 'make check'

				</li>

				</ul>

				</p>

									
docs/download.html
												
				@@ -23,44 +23,37 @@ or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>

				(HTTP).

				</p>

				<p>

				Starting with the first release of 2017, Mesa's version scheme is

				year-based. Filenames are in the form <tt>mesa-Y.N.P.tar.gz</tt>, where

				<tt>Y</tt> is the year (two digits), <tt>N</tt> is an incremental number

				(starting at 0) and <tt>P</tt> is the patch number (0 for the first

				release, 1 for the first patch after that).

				</p>

				<p>

				When a new release is coming, release candidates (betas) may be found

				<a href="ftp://ftp.freedesktop.org/pub/mesa/beta/">here</a>.

				in the same directory, and are recognisable by the

				<tt>mesa-Y.N.P-<b>rc</b>X.tar.gz</tt> filename.

				</p>
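<p>
As an illustration of the year-based naming scheme described above, the sketch below splits a tarball name into its parts. It is only an example under stated assumptions: the function name <code>parse_release</code> and the returned dictionary keys are hypothetical, and the pattern accepts only the <tt>.tar.gz</tt> and <tt>.tar.xz</tt> suffixes.
</p>

```python
import re

# mesa-Y.N.P[-rcX].tar.gz or .tar.xz, with Y being a two-digit year.
_NAME = re.compile(r"^mesa-(\d{2})\.(\d+)\.(\d+)(?:-rc(\d+))?\.tar\.(?:gz|xz)$")


def parse_release(filename):
    """Split a release file name into year, number, patch and rc fields.

    Returns None when the name does not follow the year-based scheme.
    """
    match = _NAME.match(filename)
    if match is None:
        return None
    year, number, patch, rc = match.groups()
    return {
        "year": int(year),
        "number": int(number),
        "patch": int(patch),
        # rc is None for a final release, otherwise the candidate number.
        "rc": int(rc) if rc is not None else None,
    }
```

<p>
Older names such as <tt>MesaLib-x.y.z.tar.gz</tt> do not match the pattern, so the function returns <code>None</code> for them.
</p>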

				<h1>Unpacking</h1>

				<p>

				Mesa releases are available in three formats: .tar.bz2, .tar.gz, and .zip

				Mesa releases are available in two formats: <tt>.tar.xz</tt> and <tt>.tar.gz</tt>.

				</p>

				<p>

				To unpack .tar.gz files:

				</p>

				To unpack the tarball:

				<pre>

					tar zxf MesaLib-x.y.z.tar.gz

					tar xf mesa-Y.N.P.tar.xz

				</pre>

				or

				<pre>

					gzcat MesaLib-x.y.z.tar.gz | tar xf -

					tar xf mesa-Y.N.P.tar.gz

				</pre>

				or

				<pre>

					gunzip MesaLib-x.y.z.tar.gz ; tar xf MesaLib-x.y.z.tar

				</pre>

				<p>

				To unpack .tar.bz2 files:

				</p>

				<pre>

					bunzip2 -c MesaLib-x.y.z.tar.bz2 | tar xf -

				</pre>

				<p>

				To unpack .zip files:

				</p>

				<pre>

					unzip MesaLib-x.y.z.zip

				</pre>

				<h1>Contents</h1>

				@@ -69,8 +62,8 @@ To unpack .zip files:

				After unpacking you'll have these files and directories (among others):

				</p>

				<pre>

				Makefile	- top-level Makefile for most systems

				configs/	- makefile parameter files for various systems

				autogen.sh	- Autoconf script for *nix systems

				scons/		- SCons script for Windows builds

				include/	- GL header (include) files

				bin/		- shell scripts for making shared libraries, etc

				docs/		- documentation

				@@ -109,9 +102,9 @@ In the past, GLUT, GLU and the Mesa demos were released in conjunction with

				Mesa releases.  But since GLUT, GLU and the demos change infrequently, they

				were split off into their own git repositories:

				<a href="http://cgit.freedesktop.org/mesa/glut/">GLUT</a>,

				<a href="http://cgit.freedesktop.org/mesa/glu/">GLU</a> and

				<a href="http://cgit.freedesktop.org/mesa/demos/">Demos</a>,

				<a href="https://cgit.freedesktop.org/mesa/glut/">GLUT</a>,

				<a href="https://cgit.freedesktop.org/mesa/glu/">GLU</a> and

				<a href="https://cgit.freedesktop.org/mesa/demos/">Demos</a>,

				</p>

				</div>

									
docs/egl.html
												
				@@ -18,8 +18,8 @@

				<p>The current version of EGL in Mesa implements EGL 1.4.  More information

				about EGL can be found at

				<a href="http://www.khronos.org/egl/">

				http://www.khronos.org/egl/</a>.</p>

				<a href="https://www.khronos.org/egl/">

				https://www.khronos.org/egl/</a>.</p>

				<p>Mesa's implementation of EGL uses a driver architecture.  The main

				library (<code>libEGL</code>) is window system neutral.  It provides the EGL

				@@ -44,7 +44,7 @@ the driver for your hardware.  For example</p>

				<p>The main library and OpenGL is enabled by default.  The first two options

				above enables <a href="opengles.html">OpenGL ES 1.x and 2.x</a>.  The last two

				options enables the listed classic and and Gallium drivers respectively.</p>

				options enables the listed classic and Gallium drivers respectively.</p>

				</li>

				@@ -83,9 +83,9 @@ drivers will be installed to <code>${libdir}/egl</code>.</p>

				<p>List the platforms (window systems) to support.  Its argument is a comma

				separated string such as <code>--with-egl-platforms=x11,drm</code>.  It decides

				the platforms a driver may support.  The first listed platform is also used by

				the main library to decide the native platform: the platform the EGL native

				the main library to decide the native platform: this defines EGL native

				types such as <code>EGLNativeDisplayType</code> or

				<code>EGLNativeWindowType</code> defined for.</p>

				<code>EGLNativeWindowType</code>.</p>

				<p>The available platforms are <code>x11</code>, <code>drm</code>,

				<code>wayland</code>, <code>surfaceless</code>, <code>android</code>,

									
docs/envvars.html
												
				@@ -46,12 +46,26 @@ sometimes be useful for debugging end-user issues.

				<li>MESA_NO_MMX - if set, disables Intel MMX optimizations

				<li>MESA_NO_3DNOW - if set, disables AMD 3DNow! optimizations

				<li>MESA_NO_SSE - if set, disables Intel SSE optimizations

				<li>MESA_NO_ERROR - if set, error checking is disabled as per KHR_no_error.

				   This will result in undefined behaviour for invalid use of the api, but

				   can reduce CPU use for apps that are known to be error free.</li>

				<li>MESA_DEBUG - if set, error messages are printed to stderr.  For example,

				   if the application generates a GL_INVALID_ENUM error, a corresponding error

				   message indicating where the error occurred, and possibly why, will be

				   printed to stderr.<br>

				   If the value of MESA_DEBUG is 'FP' floating point arithmetic errors will

				   generate exceptions.

				   For release builds, MESA_DEBUG defaults to off (no debug output).

				   MESA_DEBUG accepts the following comma-separated list of named

				   flags, which add extra behaviour beyond just setting MESA_DEBUG=1:

				   <ul>

				     <li>silent - turn off debug messages. Only useful for debug builds.</li>

				     <li>flush - flush after each drawing command</li>

				     <li>incomplete_tex - extra debug messages when a texture is incomplete</li>

				     <li>incomplete_fbo - extra debug messages when a fbo is incomplete</li>

				     <li>context - create a debug context (see GLX_CONTEXT_DEBUG_BIT_ARB) and

				         print error and performance messages to stderr (or MESA_LOG_FILE).</li>

				   </ul>

				<li>MESA_LOG_FILE - specifies a file name for logging all errors, warnings,

				etc., rather than stderr

				<li>MESA_TEX_PROG - if set, implement conventional texture env modes with

				@@ -103,6 +117,20 @@ glGetString(GL_VERSION) for OpenGL ES.

				glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as

				"130".  Mesa will not really implement all the features of the given language version

				if it's higher than what's normally reported. (for developers only)

				<li>MESA_GLSL_CACHE_DISABLE - if set, disables the GLSL shader cache

				<li>MESA_GLSL_CACHE_MAX_SIZE - if set, determines the maximum size of

				the on-disk cache of compiled GLSL programs. Should be set to a number

				optionally followed by 'K', 'M', or 'G' to specify a size in

				kilobytes, megabytes, or gigabytes. By default, gigabytes will be

				assumed. And if unset, a maximum size of 1GB will be used. Note: A separate

				cache might be created for each architecture that Mesa is installed for on

				your system. For example under the default settings you may end up with a 1GB

				cache for x86_64 and another 1GB cache for i386.

				<li>MESA_GLSL_CACHE_DIR - if set, determines the directory to be used

				for the on-disk cache of compiled GLSL programs. If this variable is

				not set, then the cache will be stored in $XDG_CACHE_HOME/mesa (if

				that variable is set), or else within .cache/mesa within the user's

				home directory.

				<li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>

				<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.

				</ul>
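<p>
The rules given above for MESA_GLSL_CACHE_MAX_SIZE and MESA_GLSL_CACHE_DIR can be sketched as follows. This is an illustration of the documented behaviour, not Mesa's actual code: the function names are hypothetical, and treating K, M and G as powers of 1024 is an assumption, since the text only says kilobytes, megabytes and gigabytes.
</p>

```python
import os

# Assumption: binary units. The documentation does not say 1000 vs 1024.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def cache_max_size(env=os.environ):
    """Return the cache size limit in bytes per MESA_GLSL_CACHE_MAX_SIZE."""
    value = env.get("MESA_GLSL_CACHE_MAX_SIZE", "").strip()
    if not value:
        return _UNITS["G"]  # unset: a maximum size of 1GB is used
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value) * _UNITS["G"]  # no suffix: gigabytes are assumed


def cache_dir(env=os.environ):
    """Return the cache directory following the documented fallback order."""
    explicit = env.get("MESA_GLSL_CACHE_DIR")
    if explicit:
        return explicit
    xdg = env.get("XDG_CACHE_HOME")
    if xdg:
        return os.path.join(xdg, "mesa")
    return os.path.join(os.path.expanduser("~"), ".cache", "mesa")
```

<p>
Both helpers take the environment as a parameter so they can be tried with a plain dictionary; a malformed size such as "K" alone raises <code>ValueError</code> in this sketch.
</p>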

				@@ -135,39 +163,50 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				   This is useful for debugging hangs, etc.</li>

				<li>INTEL_DEBUG - a comma-separated list of named flags, which do various things:

				<ul>

				   <li>tex - emit messages about textures.</li>

				   <li>state - emit messages about state flag tracking</li>

				   <li>blit - emit messages about blit operations</li>

				   <li>miptree - emit messages about miptrees</li>

				   <li>perf - emit messages about performance issues</li>

				   <li>perfmon - emit messages about AMD_performance_monitor</li>

				   <li>ann - annotate IR in assembly dumps</li>

				   <li>aub - dump batches into an AUB trace for use with simulation tools</li>

				   <li>bat - emit batch information</li>

				   <li>pix - emit messages about pixel operations</li>

				   <li>blit - emit messages about blit operations</li>

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>buf - emit messages about buffer objects</li>

				   <li>reg - emit messages about regions</li>

				   <li>clip - emit messages about the clip unit (for old gens, includes the CLIP program)</li>

				   <li>color - use color in output</li>

				   <li>cs - dump shader assembly for compute shaders</li>

				   <li>do32 - generate compute shader SIMD32 programs even if workgroup size doesn't exceed the SIMD16 limit</li>

				   <li>dri - emit messages about the DRI interface</li>

				   <li>fbo - emit messages about framebuffers</li>

				   <li>fs - dump shader assembly for fragment shaders</li>

				   <li>gs - dump shader assembly for geometry shaders</li>

				   <li>sync - emit messages about synchronization</li>

				   <li>prim - emit messages about drawing primitives</li>

				   <li>vert - emit messages about vertex assembly</li>

				   <li>dri - emit messages about the DRI interface</li>

				   <li>sf - emit messages about the strips &amp; fans unit (for old gens, includes the SF program)</li>

				   <li>stats - enable statistics counters. you probably actually want perfmon or intel_gpu_top instead.</li>

				   <li>urb - emit messages about URB setup</li>

				   <li>vs - dump shader assembly for vertex shaders</li>

				   <li>clip - emit messages about the clip unit (for old gens, includes the CLIP program)</li>

				   <li>aub - dump batches into an AUB trace for use with simulation tools</li>

				   <li>shader_time - record how much GPU time is spent in each shader</li>

				   <li>hex - print instruction hex dump with the disassembly</li>

				   <li>l3 - emit messages about the new L3 state during transitions</li>

				   <li>miptree - emit messages about miptrees</li>

				   <li>no8 - don't generate SIMD8 fragment shader</li>

				   <li>no16 - suppress generation of 16-wide fragment shaders. useful for debugging broken shaders</li>

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nocompact - disable instruction compaction</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				   <li>norbc - disable single sampled render buffer compression</li>

				   <li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>

				   <li>vec4 - force vec4 mode in vertex shader</li>

				   <li>perf - emit messages about performance issues</li>

				   <li>perfmon - emit messages about AMD_performance_monitor</li>

				   <li>pix - emit messages about pixel operations</li>

				   <li>prim - emit messages about drawing primitives</li>

				   <li>sf - emit messages about the strips &amp; fans unit (for old gens, includes the SF program)</li>

				   <li>shader_time - record how much GPU time is spent in each shader</li>

				   <li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>

				   <li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>

				   <li>norbc - disable single sampled render buffer compression</li>

				   <li>state - emit messages about state flag tracking</li>

				   <li>stats - enable statistics counters. you probably actually want perfmon or intel_gpu_top instead.</li>

				   <li>sync - after sending each batch, emit a message and wait for that batch to finish rendering</li>

				   <li>tcs - dump shader assembly for tessellation control shaders</li>

				   <li>tes - dump shader assembly for tessellation evaluation shaders</li>

				   <li>tex - emit messages about textures.</li>

				   <li>urb - emit messages about URB setup</li>

				   <li>vec4 - force vec4 mode in vertex shader</li>

				   <li>vert - emit messages about vertex assembly</li>

				   <li>vs - dump shader assembly for vertex shaders</li>

				</ul>

				<li>INTEL_PRECISE_TRIG - if set to 1, true or yes, then the driver prefers

				   accuracy over performance in trig functions.</li>

				</ul>
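<p>
INTEL_DEBUG, like MESA_DEBUG, takes a comma-separated list of named flags. The sketch below shows one way a wrapper script might read such a list and build an environment for a child process. The helper names are hypothetical, and ignoring empty entries and surrounding whitespace is an assumption of this example rather than documented driver behaviour.
</p>

```python
import os


def debug_flags(name, env=os.environ):
    """Return the set of flags named in a comma-separated variable."""
    raw = env.get(name, "")
    return {flag.strip() for flag in raw.split(",") if flag.strip()}


def with_debug_flags(name, flags, env=os.environ):
    """Return a copy of `env` with `name` set to the given flags."""
    new_env = dict(env)
    # Sort so the resulting string is stable regardless of input order.
    new_env[name] = ",".join(sorted(flags))
    return new_env
```

<p>
The dictionary returned by <code>with_debug_flags</code> can be passed as the <code>env</code> argument of <code>subprocess.run</code> to start an application with, for example, the perf and fs flags enabled.
</p>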

				@@ -198,8 +237,12 @@ Mesa EGL supports different sets of environment variables.  See the

				<li>GALLIUM_HUD_TOGGLE_SIGNAL - toggle visibility via user specified signal.

				    Especially useful to toggle hud at specific points of application and

				    disable for unencumbered viewing the rest of the time. For example, set

				    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_SIGNAL_TOGGLE to 10 (SIGUSR1).

				    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).

				    Use kill -10 <pid> to toggle the hud as desired.

				<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed

				    hud values into files.

				<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=1 for

				    choosing one of the software renderers "softpipe", "llvmpipe" or "swr".

				<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.

				    rather than stderr.

				<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment

				@@ -216,6 +259,21 @@ Setting to "tgsi", for example, will print all the TGSI shaders.

				See src/mesa/state_tracker/st_debug.c for other options.

				</ul>

				<h3>Clover state tracker environment variables</h3>

				<ul>

				<li>CLOVER_EXTRA_BUILD_OPTIONS - allows specifying additional compiler and linker

				    options. Specified options are appended after the options set by the OpenCL

				    program in clBuildProgram.

				<li>CLOVER_EXTRA_COMPILE_OPTIONS - allows specifying additional compiler

				    options. Specified options are appended after the options set by the OpenCL

				    program in clCompileProgram.

				<li>CLOVER_EXTRA_LINK_OPTIONS - allows specifying additional linker

				    options. Specified options are appended after the options set by the OpenCL

				    program in clLinkProgram.

				</ul>

				<h3>Softpipe driver environment variables</h3>

				<ul>

				<li>SOFTPIPE_DUMP_FS - if set, the softpipe driver will print fragment shaders

									
docs/faq.html
												
				@@ -41,7 +41,7 @@ Last updated: 9 October 2012

				<p>

				Mesa is an open-source implementation of the OpenGL specification.

				OpenGL is a programming library for writing interactive 3D applications.

				See the <a href="http://www.opengl.org/">OpenGL website</a> for more

				See the <a href="https://www.opengl.org/">OpenGL website</a> for more

				information.

				</p>

				<p>

				@@ -55,13 +55,13 @@ Yes.  Specifically, Mesa serves as the OpenGL core for the open-source DRI

				drivers for X.org.

				</p>

				<ul>

				  <li>See the <a href="http://dri.freedesktop.org/">DRI website</a>

				  <li>See the <a href="https://dri.freedesktop.org/">DRI website</a>

				  for more information.</li>

				  <li>See <a href="http://intellinuxgraphics.org">intellinuxgraphics.org</a>

				  <li>See <a href="https://01.org/linuxgraphics">01.org</a>

				  for more information about Intel drivers.</li>

				  <li>See <a href="http://nouveau.freedesktop.org">nouveau.freedesktop.org</a>

				  <li>See <a href="https://nouveau.freedesktop.org">nouveau.freedesktop.org</a>

				  for more information about Nouveau drivers.</li>

				  <li>See <a href="http://www.x.org/wiki/RadeonFeature">www.x.org/wiki/RadeonFeature</a>

				  <li>See <a href="https://www.x.org/wiki/RadeonFeature">www.x.org/wiki/RadeonFeature</a>

				  for more information about Radeon drivers.</li>

				</ul>

				@@ -144,7 +144,7 @@ Mesa is much more up to date with modern features and extensions.

				</p>

				<p>

				<a href="http://sourceforge.net/projects/ogl-es/">Vincent</a> is

				<a href="https://sourceforge.net/projects/ogl-es/">Vincent</a> is

				an open-source implementation of OpenGL ES for mobile devices.

				<p>

				@@ -157,7 +157,7 @@ is a subset of OpenGL.

				</p>

				<p>

				<a href="http://sourceforge.net/projects/softgl/">SoftGL</a>

				<a href="https://sourceforge.net/projects/softgl/">SoftGL</a>

				is an OpenGL subset for mobile devices.

				</p>

				@@ -213,7 +213,7 @@ If you don't already have GLUT installed, you should grab

				<h2>2.4 Where is the GLw library?</h2>

				<p>

				GLw (OpenGL widget library) is now available from a separate <a href="http://cgit.freedesktop.org/mesa/glw/">git repository</a>.  Unless you're using very old Xt/Motif applications with OpenGL, you shouldn't need it.

				GLw (OpenGL widget library) is now available from a separate <a href="https://cgit.freedesktop.org/mesa/glw/">git repository</a>.  Unless you're using very old Xt/Motif applications with OpenGL, you shouldn't need it.

				</p>

				@@ -276,7 +276,7 @@ If you're using a hardware accelerated driver you want <code>direct rendering: Y

				</p>

				<p>

				If your DRI-based driver isn't working, go to the

				<a href="http://dri.freedesktop.org/">DRI website</a> for trouble-shooting information.

				<a href="https://dri.freedesktop.org/">DRI website</a> for trouble-shooting information.

				</p>

				@@ -284,7 +284,7 @@ If your DRI-based driver isn't working, go to the

				<p>

				Make sure the ratio of the far to near clipping planes isn't too great.

				Look

				<a href="http://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040">here</a>

				<a href="https://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040">here</a>

				for details.

				</p>

				<p>

				@@ -339,7 +339,7 @@ First, join the <a href="lists.html">mesa-dev mailing list</a>.

				That's where Mesa development is discussed.

				</p>

				<p>

				The <a href="http://www.opengl.org/documentation">

				The <a href="https://www.opengl.org/documentation">

				OpenGL Specification</a> is the bible for OpenGL implementation work.

				You should read it.

				</p>

				@@ -383,7 +383,7 @@ implement the extension (specifically the compression/decompression

				algorithms).

				</p>

				<p>

				In the mean time, a 3rd party <a href="http://dri.freedesktop.org/wiki/S3TC">

				In the mean time, a 3rd party <a href="https://dri.freedesktop.org/wiki/S3TC">

				plug-in library</a> is available.

				</p>

docs/GL3.txt → docs/features.txt

@@ -33,7 +33,7 @@ are exposed in the 3.0 context as extensions.
 Feature                                                 Status
 ------------------------------------------------------- ------------------------
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   glBindFragDataLocation, glGetFragDataLocation         DONE
   GL_NV_conditional_render (Conditional rendering)      DONE ()
@@ -60,12 +60,12 @@ GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (llvmpipe (*), softpipe (*), swr (*))
   Multisample anti-aliasing                             DONE (freedreno (*), llvmpipe (*), softpipe (*), swr (*))
 (*) llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 (*) freedreno, llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   Forward compatible context support/deprecations       DONE ()
   GL_ARB_draw_instanced (Instanced drawing)             DONE ()
@@ -78,40 +78,40 @@ GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_EXT_texture_snorm (Signed normalized textures)     DONE ()
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   Core/compatibility profiles                           DONE
   Geometry shaders                                      DONE ()
   GL_ARB_vertex_array_bgra (BGRA vertex order)          DONE (swr)
   GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (swr)
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (swr)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (swr)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (swr)
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (swr)
   GL_ARB_sync (Fence objects)                           DONE (swr)
   GL_ARB_vertex_array_bgra (BGRA vertex order)          DONE (freedreno)
   GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (freedreno)
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (freedreno)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (freedreno)
   GL_ARB_texture_multisample (Multisample textures)     DONE ()
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (freedreno)
   GL_ARB_sync (Fence objects)                           DONE (freedreno)
   GLX_ARB_create_context_profile                        DONE
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
   GL_ARB_blend_func_extended                            DONE (swr)
   GL_ARB_blend_func_extended                            DONE (freedreno/a3xx, swr)
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
   GL_ARB_occlusion_query2                               DONE (swr)
   GL_ARB_occlusion_query2                               DONE (freedreno, swr)
   GL_ARB_sampler_objects                                DONE (all drivers)
   GL_ARB_shader_bit_encoding                            DONE (swr)
   GL_ARB_texture_rgb10_a2ui                             DONE (swr)
   GL_ARB_texture_swizzle                                DONE (swr)
   GL_ARB_shader_bit_encoding                            DONE (freedreno, swr)
   GL_ARB_texture_rgb10_a2ui                             DONE (freedreno, swr)
   GL_ARB_texture_swizzle                                DONE (freedreno, swr)
   GL_ARB_timer_query                                    DONE (swr)
   GL_ARB_instanced_arrays                               DONE (swr)
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (swr)
   GL_ARB_instanced_arrays                               DONE (freedreno, swr)
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (freedreno, swr)
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
 GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
   GL_ARB_draw_buffers_blend                             DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965)
   GL_ARB_draw_buffers_blend                             DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965/gen7+)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
@@ -124,154 +124,215 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE ()
   - New overload resolution rules                       DONE
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965, nv50)
   GL_ARB_shader_subroutine                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965/gen6+, nv50)
   GL_ARB_shader_subroutine                              DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965/gen7+)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965/gen6+, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965/gen6+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (i965, nv50, softpipe)
   GL_ARB_transform_feedback2                            DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback2                            DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965/gen7+, llvmpipe, softpipe, swr)
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
 GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
   GL_ARB_ES2_compatibility                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 binary formats)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_shader_precision                               DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_shader_precision                               DONE (i965/gen7+, all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20 -- all DONE: radeonsi
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, radeonsi
   GL_ARB_texture_compression_bptc                       DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_texture_compression_bptc                       DONE (i965, r600)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (i965, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_internalformat_query                           DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_internalformat_query                           DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
 GL 4.3, GLSL 4.30:
 GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_compute_shader                                 DONE (i965, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, r600, softpipe, llvmpipe)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, r600, softpipe)
   GL_ARB_internalformat_query2                          DONE (all drivers)
   GL_ARB_invalidate_subdata                             DONE (all drivers)
   GL_ARB_multi_draw_indirect                            DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_multi_draw_indirect                            DONE (i965, r600, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965)
   GL_ARB_shader_image_size                              DONE (i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, i965, r600, llvmpipe)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_view                                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
 GL 4.4, GLSL 4.40:
 GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                  DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                               in progress (Timothy)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, r600, llvmpipe, swr)
   GL_ARB_clear_texture                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_enhanced_layouts                               DONE (i965, nv50, llvmpipe, softpipe)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
   - specified vec4-slot component numbers               in progress
   - specified vec4-slot component numbers               DONE (i965, nv50, llvmpipe, softpipe)
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, nvc0)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
 GL 4.5, GLSL 4.50:
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
   GL_ARB_ES3_1_compatibility                            DONE (nvc0, radeonsi)
   GL_ARB_clip_control                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, nvc0, llvmpipe, softpipe)
   GL_ARB_derivative_control                             DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+)
   GL_ARB_clip_control                                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600)
   GL_ARB_texture_barrier                                DONE (i965, nv50, r600)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_compute_shader                                 DONE (i965/gen7+, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965/gen7+, r600, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965/gen7+, r600, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965/gen7+, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_stencil_texturing                              DONE (nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965/gen7+, nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions             DONE (i965, nvc0, r600, radeonsi)
   GS5 Enhanced textureGather                            DONE (i965/gen7+, r600)
   GS5 Packing/bitfield/conversion functions             DONE (i965/gen6+, r600)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
       glMemoryBarrierByRegion                           DONE
       glGetTexLevelParameter[fi]v - needs updates       DONE
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support                       DONE (i965, nvc0, r600, radeonsi)
       gl_HelperInvocation support                       DONE (i965, r600)
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
 GLES3.2, GLSL ES 3.2
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        not started
   GL_KHR_blend_equation_advanced                        DONE (i965, nvc0)
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965)
   GL_KHR_robustness                                     DONE (i965, nvc0, radeonsi)
   GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
   GL_OES_copy_image                                     DONE (i965)
   GL_OES_copy_image                                     DONE (all drivers)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                started (idr)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         not started
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0, radeonsi)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (i965/gen8+, nvc0, radeonsi)
   GL_OES_shader_io_blocks                               DONE (All drivers that support GLES 3.1)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_tessellation_shader                            started (Ken)
   GL_OES_tessellation_shader                            DONE (all drivers that support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
   GL_OES_texture_cube_map_array                         not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 More info about these features and the work involved can be found at
 http://dri.freedesktop.org/wiki/MissingFunctionality
 Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version:
   GL_ARB_bindless_texture                               started (airlied)
   GL_ARB_cl_event                                       not started
   GL_ARB_compute_variable_group_size                    DONE (nvc0, radeonsi)
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+)
   GL_ARB_fragment_shader_interlock                      not started
   GL_ARB_gl_spirv                                       not started
   GL_ARB_gpu_shader_int64                               DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
   GL_ARB_indirect_parameters                            DONE (nvc0, radeonsi)
   GL_ARB_parallel_shader_compile                        not started, but Chia-I Wu did some related work in 2014
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, radeonsi, softpipe, swr)
   GL_ARB_post_depth_coverage                            DONE (i965)
   GL_ARB_robustness_isolation                           not started
   GL_ARB_sample_locations                               not started
   GL_ARB_seamless_cubemap_per_texture                   DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
   GL_ARB_shader_atomic_counter_ops                      DONE (i965/gen7+, nvc0, radeonsi, softpipe)
   GL_ARB_shader_ballot                                  DONE (nvc0, radeonsi)
   GL_ARB_shader_clock                                   DONE (i965/gen7+, nv50, nvc0, radeonsi)
   GL_ARB_shader_draw_parameters                         DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_group_vote                              DONE (nvc0, radeonsi)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr)
   GL_ARB_shader_viewport_layer_array                    DONE (i965/gen6+, nvc0, radeonsi)
   GL_ARB_sparse_buffer                                  DONE (radeonsi/CIK+)
   GL_ARB_sparse_texture                                 not started
   GL_ARB_sparse_texture2                                not started
   GL_ARB_sparse_texture_clamp                           not started
   GL_ARB_texture_filter_minmax                          not started
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+)
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_no_error                                       started (Timothy Arceri)
   GL_KHR_texture_compression_astc_hdr                   DONE (core only)
   GL_KHR_texture_compression_astc_sliced_3d             not started
   GL_OES_depth_texture_cube_map                         DONE (all drivers that support GLSL 1.30+)
   GL_OES_EGL_image                                      DONE (all drivers)
   GL_OES_EGL_image_external_essl3                       not started
   GL_OES_required_internalformat                        not started - GLES2 extension based on OpenGL ES 3.0 feature
   GL_OES_surfaceless_context                            DONE (all drivers)
   GL_OES_texture_compression_astc                       DONE (core only)
   GL_OES_texture_float                                  DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_float_linear                           DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float                             DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear                      DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_view                                   not started - based on GL_ARB_texture_view
   GL_OES_viewport_array                                 DONE (i965, nvc0, radeonsi)
   GLX_ARB_context_flush_control                         not started
   GLX_ARB_robustness_application_isolation              not started
   GLX_ARB_robustness_share_group_isolation              not started
 The following extensions are not part of any OpenGL or OpenGL ES version, and
 we DO NOT WANT implementations of these extensions for Mesa.
   GL_ARB_geometry_shader4                               Superseded by GL 3.2 geometry shaders
   GL_ARB_matrix_palette                                 Superseded by GL_ARB_vertex_program
   GL_ARB_shading_language_include                       Not interesting
   GL_ARB_shadow_ambient                                 Superseded by GL_ARB_fragment_program
   GL_ARB_vertex_blend                                   Superseded by GL_ARB_vertex_program
 A graphical representation of this information can be found at
 https://mesamatrix.net/

									
										20

docs/helpwanted.html
									
				@@ -24,7 +24,7 @@ Here are some specific ideas and areas where help would be appreciated:

				<ol>

				<li>

				<b>Driver patching and testing.</b>

				Patches are often posted to the <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev mailing list</a>, but aren't

				Patches are often posted to the <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev mailing list</a>, but aren't

				immediately checked into git because not enough people are testing them.

				Just applying patches, testing and reporting back is helpful.

				<li>

				@@ -39,7 +39,7 @@ issues in the code.

				Fixing MSVC builds.

				<li>

				<b>Contribute more tests to

				<a href="http://piglit.freedesktop.org/">Piglit</a>.</b>

				<a href="https://piglit.freedesktop.org/">Piglit</a>.</b>

				<li>

				<b>Automatic testing.

				</b>

				@@ -56,9 +56,9 @@ You can find some further To-do lists here:

				<b>Common To-Do lists:</b>

				</p>

				<ul>

				  <li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt">

				    <b>GL3.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>

				  <li><a href="http://dri.freedesktop.org/wiki/MissingFunctionality">

				  <li><a href="https://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt">

				    <b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>

				  <li><a href="https://dri.freedesktop.org/wiki/MissingFunctionality">

				    <b>MissingFunctionality</b></a> - Detailed information about missing OpenGL features.</li>

				</ul>

				@@ -66,15 +66,15 @@ You can find some further To-do lists here:

				<b>Driver specific To-Do lists:</b>

				</p>

				<ul>

				  <li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/docs/llvm-todo.txt">

				  <li><a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/docs/llvm-todo.txt">

				    <b>LLVMpipe</b></a> - Software driver using LLVM for runtime code generation.</li>

				  <li><a href="http://dri.freedesktop.org/wiki/RadeonsiToDo">

				  <li><a href="https://dri.freedesktop.org/wiki/RadeonsiToDo">

				    <b>radeonsi</b></a> - Driver for AMD Southern Island.</li>

				  <li><a href="http://dri.freedesktop.org/wiki/R600ToDo">

				  <li><a href="https://dri.freedesktop.org/wiki/R600ToDo">

				    <b>r600g</b></a> - Driver for ATI/AMD R600 - Northern Island.</li>

				  <li><a href="http://dri.freedesktop.org/wiki/R300ToDo">

				  <li><a href="https://dri.freedesktop.org/wiki/R300ToDo">

				    <b>r300g</b></a> - Driver for ATI R300 - R500.</li>

				  <li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/i915/TODO">

				  <li><a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/i915/TODO">

				    <b>i915g</b></a> - Driver for Intel i915/i945.</li>

				</ul>

									
										156

docs/index.html
									
				@@ -16,6 +16,138 @@

				<h1>News</h1>

				<h2>April 28, 2017</h2>

				<p>

				<a href="relnotes/17.0.5.html">Mesa 17.0.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 17, 2017</h2>

				<p>

				<a href="relnotes/17.0.4.html">Mesa 17.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 1, 2017</h2>

				<p>

				<a href="relnotes/17.0.3.html">Mesa 17.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 20, 2017</h2>

				<p>

				<a href="relnotes/13.0.6.html">Mesa 13.0.6</a> and

				<a href="relnotes/17.0.2.html">Mesa 17.0.2</a> are released.

				These are bug-fix releases from the 13.0 and 17.0 branches, respectively.

				<br>

				NOTE: It is anticipated that 13.0.6 will be the final release in the 13.0

				series. Users of 13.0 are encouraged to migrate to the 17.0 series in order

				to obtain future fixes.

				</p>

				<h2>March 4, 2017</h2>

				<p>

				<a href="relnotes/17.0.1.html">Mesa 17.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 20, 2017</h2>

				<p>

				<a href="relnotes/13.0.5.html">Mesa 13.0.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 13, 2017</h2>

				<p>

				<a href="relnotes/17.0.0.html">Mesa 17.0.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>February 1, 2017</h2>

				<p>

				<a href="relnotes/13.0.4.html">Mesa 13.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 23, 2017</h2>

				<p>

				<a href="relnotes/12.0.6.html">Mesa 12.0.6</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: This is an extra release for the 12.0 stable branch, as per developers'

				feedback. It is anticipated that 12.0.6 will be the final release in the 12.0

				series. Users of 12.0 are encouraged to migrate to the 13.0 series in order

				to obtain future fixes.

				</p>

				<h2>January 5, 2017</h2>

				<p>

				<a href="relnotes/13.0.3.html">Mesa 13.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 5, 2016</h2>

				<p>

				<a href="relnotes/12.0.5.html">Mesa 12.0.5</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 12.0.5 will be the final release in the 12.0

				series. Users of 12.0 are encouraged to migrate to the 13.0 series in order

				to obtain future fixes.

				</p>

				<h2>November 28, 2016</h2>

				<p>

				<a href="relnotes/13.0.2.html">Mesa 13.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 14, 2016</h2>

				<p>

				<a href="relnotes/13.0.1.html">Mesa 13.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 10, 2016</h2>

				<p>

				<a href="relnotes/12.0.4.html">Mesa 12.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 1, 2016</h2>

				<p>

				<a href="relnotes/13.0.0.html">Mesa 13.0.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>September 15, 2016</h2>

				<p>

				<a href="relnotes/12.0.3.html">Mesa 12.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 2, 2016</h2>

				<p>

				<a href="relnotes/12.0.2.html">Mesa 12.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 8, 2016</h2>

				<p>

				<a href="relnotes/12.0.1.html">Mesa 12.0.1</a> is released.

				This is a bug-fix release, resolving build issues in the r600 and

				radeonsi drivers.

				</p>

				<p>

				<a href="relnotes/12.0.0.html">Mesa 12.0.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>May 9, 2016</h2>

				<p>

				<a href="relnotes/11.1.4.html">Mesa 11.1.4</a> and

				@@ -85,7 +217,7 @@ This is a bug-fix release.

				</p>

				<p>

				Mesa demos 8.3.0 is also released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2015-December/000191.html">announcement</a> for more information about the release.

				See the <a href="https://lists.freedesktop.org/archives/mesa-announce/2015-December/000191.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/">ftp.freedesktop.org/pub/mesa/demos/8.3.0/</a>.

				</p>

				@@ -400,7 +532,7 @@ This is a bug-fix release.

				<p>

				Mesa demos 8.2.0 is released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2014-July/000100.html">announcement</a> for more information about the release.

				See the <a href="https://lists.freedesktop.org/archives/mesa-announce/2014-July/000100.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.2.0/">ftp.freedesktop.org/pub/mesa/demos/8.2.0/</a>.

				</p>

				@@ -579,7 +711,7 @@ This is a bug fix release.

				<p>

				Mesa demos 8.1.0 is released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-dev/2013-February/035180.html">announcement</a> for more information about the release.

				See the <a href="https://lists.freedesktop.org/archives/mesa-dev/2013-February/035180.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.1.0/">ftp.freedesktop.org/pub/mesa/demos/8.1.0/</a>.

				</p>

				@@ -1275,7 +1407,7 @@ and primarily just incorporates bug fixes.

				<h2>December 28, 2003</h2>

				<p>

				The Mesa CVS server has been moved to <a href="http://www.freedesktop.org">

				The Mesa CVS server has been moved to <a href="https://www.freedesktop.org">

				freedesktop.org</a> because of problems with SourceForge's anonymous

				CVS service.

				</p>

				@@ -1847,7 +1979,7 @@ Here's what's new:</p>

				</pre>

				<h2>March 23, 2000</h2>

				<p>I've just upload the Mesa 3.2 beta 1 files to SourceForge at <a href="http://sourceforge.net/project/showfiles.php?group_id=3">http://sourceforge.net/project/filelist.php?group_id=3</a></p>

				<p>I've just upload the Mesa 3.2 beta 1 files to SourceForge at <a href="https://sourceforge.net/project/showfiles.php?group_id=3">https://sourceforge.net/project/filelist.php?group_id=3</a></p>

				<p>3.2 (note even number) is a stabilization release of Mesa 3.1 meaning it's mainly

				just bug fixes.</p>

				<p>Here's what's changed:</p>

				@@ -1895,7 +2027,7 @@ After 3.2 is wrapped up I hope to release 3.3 beta 1 soon afterward.</p>

				<h2>December 17, 1999</h2>

				<p>A Slashdot interview with Brian about Mesa (questions submitted by Slashdot readers)

				can be found at <a href="http://slashdot.org/interviews/99/12/17/0927212.shtml">http://slashdot.org/interviews/99/12/17/0927212.shtml</a>.</p>

				can be found at <a href="https://slashdot.org/interviews/99/12/17/0927212.shtml">https://slashdot.org/interviews/99/12/17/0927212.shtml</a>.</p>

				<h2>December 14, 1999</h2>

				<p>Mesa 3.1 is released!</p>

				@@ -1929,7 +2061,7 @@ BOF meeting is now available.</p>

				<p>-Brian</p>

				<h2>August 14, 1999</h2>

				<p><a href="http://www.mesa3d.org">www.mesa3d.org</a> is having

				<p><a href="https://www.mesa3d.org">www.mesa3d.org</a> is having

				technical problems due to hardware failures at VA Linux systems. The Mac pages,

				ftp, and CVS services aren't fully restored yet. Please be patient.</p>

				<p>-Brian</p>

				@@ -1938,9 +2070,9 @@ ftp, and CVS services aren't fully restored yet. Please be patient.</p>

				<p>RPMS of the nVidia RIVA server can be found at <code>ftp://ftp.mesa3d.org/mesa/misc/nVidia/</code>.</p>

				<h2>June 2, 1999</h2>

				<p><a href="http://www.nvidia.com/">nVidia</a> has released some Linux binaries for

				<p><a href="https://www.nvidia.com/">nVidia</a> has released some Linux binaries for

				xfree86 3.3.3.1, along with the <b>full source</b>, which includes GLX acceleration

				based on Mesa 3.0. They can be downloaded from <code>http://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</code>.</p>

				based on Mesa 3.0. They can be downloaded from <code>https://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</code>.</p>

				<h2>May 24, 1999</h2>

				<p>Beta 2 of Mesa 3.1 has been make available at <code>ftp://ftp.mesa3d.org/mesa/beta/</code>.

				@@ -1988,11 +2120,11 @@ grateful.

				<p>The new webpages are now online. Enjoy, and let me know if you find any errors.

				<h2>February 16, 1999</h2>

				<p><a href="http://www.sgi.com/">SGI</a> releases its

				<a href="http://www.sgi.com/software/opensource/glx/">GLX source code</a>.</p>

				<p><a href="https://www.sgi.com/">SGI</a> releases its

				<a href="https://www.sgi.com/software/opensource/glx/">GLX source code</a>.</p>

				<h2>January 22, 1999</h2>

				<p><a href="http://www.mesa3d.org">www.mesa3d.org</a> established</p>

				<p><a href="https://www.mesa3d.org">www.mesa3d.org</a> established</p>

				</div>

				</body>

									
										115

docs/install.html
									
												View File
												
				@@ -24,7 +24,7 @@

				  </ul>

				<li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a>

				<li><a href="#scons">Building with SCons (Windows/Linux)</a>

				<li><a href="#other">Building for other systems</a>

				<li><a href="#android">Building with AOSP (Android)</a>

				<li><a href="#libs">Library Information</a>

				<li><a href="#pkg-config">Building OpenGL programs with pkg-config</a>

				</ol>

				@@ -33,62 +33,85 @@

				<h1 id="prereq-general">1. Prerequisites for building</h1>

				<h2>1.1 General</h2>

				<p>

				Build system.

				</p>

				<ul>

				<li><a href="http://www.python.org/">Python</a> - Python is required.

				Version 2.6.4 or later should work.

				</li>

				<br>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.3.4 or later should work.

				</li>

				</br>

				<li>Autoconf is required when building on *nix platforms.

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				Windows and optional for Linux (it's an alternative to autoconf/automake.)

				</li>

				<li>Android Build system when building as native Android component. Autoconf

				is used when when building ARC.

				</li>

				</ul>

				<p>

				The following compilers are known to work, if you know of others or you're

				willing to maintain support for other compiler get in touch.

				</p>

				<ul>

				<li>GCC 4.2.0 or later (some parts of Mesa may require later versions)

				<li>clang - exact minimum requirement is currently unknown.

				<li>Microsoft Visual Studio 2013 Update 4 or later is required, for building on Windows.

				</ul>

				<p>

				Third party/extra tools.

				<br>

				<li>lex / yacc - for building the GLSL compiler.

				<br>

				<br>

				On Linux systems, flex and bison are used.

				Versions 2.5.35 and 2.4.1, respectively, (or later) should work.

				<br>

				<br>

				<strong>Note</strong>: These should not be required, when building from a release tarball. If

				you think you've spotted a bug let developers know by filing a

				<a href="bugs.html">bug report</a>.

				</p>

				<ul>

				<li><a href="https://www.python.org/">Python</a> - Python is required.

				Version 2.6.4 or later should work.

				</li>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.3.4 or later should work.

				</li>

				<li>lex / yacc - for building the Mesa IR and GLSL compiler.

				<div>

				On Linux systems, flex and bison versions 2.5.35 and 2.4.1, respectively,

				(or later) should work.

				On Windows with MinGW, install flex and bison with:

				<pre>mingw-get install msys-flex msys-bison</pre>

				For MSVC on Windows, install

				<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.

				</li>

				<br>

				<li>For building on Windows, Microsoft Visual Studio 2013 or later is required.

				</li>

				</div>

				</ul>

				<p><strong>Note</strong>: Some versions can be buggy (eg. flex 2.6.2) so do try others if things fail.</p>
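
The minimum versions listed above can be compared mechanically. The following is a minimal sketch, assuming plain dotted numeric version strings; `parse_version`, `meets_minimum` and `MINIMUMS` are hypothetical names for illustration and are not part of Mesa's build system.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.5.35' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found, minimum):
    """Return True when `found` is at least `minimum`."""
    found_parts = parse_version(found)
    minimum_parts = parse_version(minimum)
    # Pad with zeros so that '2.6' compares equal to '2.6.0'.
    width = max(len(found_parts), len(minimum_parts))
    found_parts += (0,) * (width - len(found_parts))
    minimum_parts += (0,) * (width - len(minimum_parts))
    return found_parts >= minimum_parts


# Minimums copied from the prerequisites listed above.
MINIMUMS = {"python": "2.6.4", "mako": "0.3.4", "flex": "2.5.35", "bison": "2.4.1"}
```

Comparing integer tuples rather than strings matters here: as strings, "2.5.4" would sort after "2.5.35". Passing this check is no guarantee either, since flex 2.6.2 satisfies the minimum but is called out above as buggy.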

				<h3 id="prereq-dri">1.2 For DRI and hardware acceleration</h3>

				<h3 id="prereq-dri">1.2 Requirements</h3>

				<p>

				The following are required for DRI-based hardware acceleration with Mesa:

				The requirements depend on the features selected at the configure stage.

				Check/install the respective -devel package as prompted by the configure error

				message.

				</p>

				<ul>

				<li><a href="http://xorg.freedesktop.org/releases/individual/proto/">

				dri2proto</a> version 2.6 or later

				<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a> latest version

				<li>Xorg server version 1.5 or later

				<li>Linux 2.6.28 or later

				</ul>

				<p>

				If you're using a fedora distro the following command should install all

				the needed dependencies:

				Here are some common ways to retrieve most/all of the dependencies based on

				the packaging tool used by your distro.

				</p>

				<pre>

				  sudo yum install flex bison imake libtool xorg-x11-proto-devel libdrm-devel \

				  gcc-c++ xorg-x11-server-devel libXi-devel libXmu-devel libXdamage-devel git \

				  expat-devel llvm-devel python-mako

				  zypper source-install --build-deps-only Mesa # openSUSE/SLED/SLES

				  yum-builddep mesa # yum Fedora, OpenSuse(?)

				  dnf builddep mesa # dnf Fedora

				  apt-get build-dep mesa # Debian and derivatives

				  ... # others

				</pre>

				<h1 id="autoconf">2. Building with autoconf (Linux/Unix/X11)</h1>

				<p>

				@@ -139,22 +162,30 @@ This will create:

				</ul>

				<p>

				Put them all in the same directory to test them.

				Additional information is available in <a href="README.WIN32">README.WIN32</a>.

				</p>

				<h1 id="other">4. Building for other systems</h1>

				<h1 id="android">4. Building with AOSP (Android)</h1>

				<p>

				Documentation for other environments (some may be very out of date):

				Currently one can build Mesa for Android as part of the AOSP project, yet

				your experience might vary.

				</p>

				<ul>

				<li><a href="README.VMS">README.VMS</a> - VMS

				<li><a href="README.CYGWIN">README.CYGWIN</a> - Cygwin

				<li><a href="README.WIN32">README.WIN32</a> - Win32

				</ul>

				<p>

				In order to achieve that one should update their local manifest to point to the

				upstream repo, set the appropriate BOARD_GPU_DRIVERS and build the

				libGLES_mesa library.

				</p>

				<p>

				FINISHME: Improve on the instructions add references to Rob H repos/Jenkins,

				Android-x86 and/or other resources.

				</p>

				<h1 id="libs">5. Library Information</h1>

									
										83

docs/intro.html
									
												
				@@ -17,22 +17,34 @@

				<h1>Introduction</h1>

				<p>

				Mesa is an open-source implementation of the

				<a href="http://www.opengl.org/">OpenGL</a> specification -

				The Mesa project began as an open-source implementation of the

				<a href="https://www.opengl.org/">OpenGL</a> specification -

				a system for rendering interactive 3D graphics.

				</p>

				<p>

				A variety of device drivers allows Mesa to be used in many different

				environments ranging from software emulation to complete hardware acceleration

				for modern GPUs.

				Over the years the project has grown to implement more graphics APIs,

				including

				<a href="https://www.khronos.org/opengles/">OpenGL ES</a> (versions 1, 2, 3),

				<a href="https://www.khronos.org/opencl/">OpenCL</a>,

				<a href="https://www.khronos.org/openmax/">OpenMAX</a>,

				<a href="https://en.wikipedia.org/wiki/VDPAU">VDPAU</a>,

				<a href="https://en.wikipedia.org/wiki/Video_Acceleration_API">VA API</a>,

				<a href="https://en.wikipedia.org/wiki/X-Video_Motion_Compensation">XvMC</a> and

				<a href="https://www.khronos.org/vulkan/">Vulkan</a>.

				</p>

				<p>

				Mesa ties into several other open-source projects: the 

				<a href="http://dri.freedesktop.org/">Direct Rendering 

				Infrastructure</a> and <a href="http://x.org">X.org</a> to 

				provide OpenGL support to users of X on Linux, FreeBSD and other operating 

				A variety of device drivers allows the Mesa libraries to be used in many

				different environments ranging from software emulation to complete hardware

				acceleration for modern GPUs.

				</p>

				<p>

				Mesa ties into several other open-source projects: the

				<a href="https://dri.freedesktop.org/">Direct Rendering

				Infrastructure</a> and <a href="https://x.org">X.org</a> to

				provide OpenGL support on Linux, FreeBSD and other operating

				systems.

				</p>

				@@ -85,7 +97,7 @@ the OpenGL API, so they didn't feel threatened by the project.

				1995-1996: I continue working on Mesa both during my spare time and during

				my work hours at the Space Science and Engineering Center at the University

				of Wisconsin in Madison.  My supervisor, Bill Hibbard, lets me do this because

				Mesa is now being using for the <a href="http://www.ssec.wisc.edu/%7Ebillh/vis.html">Vis5D</a> project.

				Mesa is now being using for the <a href="https://www.ssec.wisc.edu/%7Ebillh/vis.html">Vis5D</a> project.

				</p><p>

				October 1996: Mesa 2.0 is released.  It implements the OpenGL 1.1 specification.

				</p>

				@@ -142,7 +154,7 @@ and OpenGL Shading Language.

				<p>

				2008: Keith Whitwell and other Tungsten Graphics employees develop

				<a href="http://en.wikipedia.org/wiki/Gallium3D">Gallium</a>

				<a href="https://en.wikipedia.org/wiki/Gallium3D">Gallium</a>

				- a new GPU abstraction layer.  The latest Mesa drivers are based on

				Gallium and other APIs such as OpenVG are implemented on top of Gallium.

				</p>

				@@ -153,13 +165,22 @@ and version 1.30 of the OpenGL Shading Language.

				</p>

				<p>

				Ongoing: Mesa is the OpenGL implementation for several types of hardware

				made by Intel, AMD and NVIDIA, plus the VMware virtual GPU.

				July 2016: Mesa 12.0 is released, including OpenGL 4.3 support and initial

				support for Vulkan for Intel GPUs.  Plus, there's another gallium software

				driver ("swr") based on LLVM and developed by Intel.

				</p>

				<p>

				Ongoing: Mesa is the OpenGL implementation for devices designed by

				Intel, AMD, NVIDIA, Qualcomm, Broadcom, Vivante, plus the VMware and

				VirGL virtual GPUs.

				There's also several software-based renderers: swrast (the legacy

				Mesa rasterizer), softpipe (a gallium reference driver) and llvmpipe

				(LLVM/JIT-based high-speed rasterizer).

				Mesa rasterizer), softpipe (a gallium reference driver), llvmpipe

				(LLVM/JIT-based high-speed rasterizer) and swr (another LLVM-based driver).

				</p>

				<p>

				Work continues on the drivers and core Mesa to implement newer versions

				of the OpenGL specification.

				of the OpenGL, OpenGL ES and Vulkan specifications.

				</p>

				@@ -173,6 +194,30 @@ of the OpenGL specification is implemented.

				</p>

				<h2>Version 12.x features</h2>

				<p>

				Version 12.x of Mesa implements the OpenGL 4.3 API, but not all drivers

				support OpenGL 4.3.

				</p>

				<p>

				Initial support for Vulkan is also included.

				</p>

				<h2>Version 11.x features</h2>

				<p>

				Version 11.x of Mesa implements the OpenGL 4.1 API, but not all drivers

				support OpenGL 4.1.

				</p>

				<h2>Version 10.x features</h2>

				<p>

				Version 10.x of Mesa implements the OpenGL 3.3 API, but not all drivers

				support OpenGL 3.3.

				</p>

				<h2>Version 9.x features</h2>

				<p>

				Version 9.x of Mesa implements the OpenGL 3.1 API.

				@@ -182,6 +227,10 @@ community contributed features required for OpenGL 3.1.  The primary

				features added since the Mesa 8.0 release are

				GL_ARB_texture_buffer_object and GL_ARB_uniform_buffer_object.

				</p>

				<p>

				Version 9.0 of Mesa also included the first release of the Clover state

				tracker for OpenCL.

				</p>

				<h2>Version 8.x features</h2>

				@@ -234,7 +283,7 @@ GL_SRC2_ALPHA               GL_SOURCE2_ALPHA

				</pre>

				<p>

				See the

				<a href="http://www.opengl.org/documentation/spec.html">

				<a href="https://www.opengl.org/documentation/spec.html">

				OpenGL specification</a> for more details.

				</p>

									
										6

docs/license.html
									
												
				@@ -18,10 +18,10 @@

				<p>

				Mesa is a 3-D graphics library with an API which is very similar to

				that of <a href="http://www.opengl.org/">OpenGL</a>.*

				that of <a href="https://www.opengl.org/">OpenGL</a>.*

				To the extent that Mesa utilizes the OpenGL command syntax or state

				machine, it is being used with authorization from <a

				href="http://www.sgi.com/">Silicon Graphics,

				href="https://www.sgi.com/">Silicon Graphics,

				Inc.</a>(SGI). However, the author does not possess an OpenGL license

				from SGI, and makes no claim that Mesa is in any way a compatible

				replacement for OpenGL or associated with SGI. Those who want a

				@@ -36,7 +36,7 @@ library</em>. <br>

				</p>

				<p>

				* OpenGL is a trademark of <a href="http://www.sgi.com/"

				* OpenGL is a trademark of <a href="https://www.sgi.com/"

				>Silicon Graphics Incorporated</a>.

				</p>

									
										22

docs/lists.html
									
												
				@@ -21,23 +21,23 @@

				</p>

				<ul>

				<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-users">mesa-users</a>

				<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-users">mesa-users</a>

				- intended for end-users of Mesa and DRI drivers. Newbie questions are OK,

				but please try the general OpenGL resources and Mesa/DRI documentation first.</p>

				</li>

				<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>

				<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>

				- for Mesa, Gallium and DRI development

				discussion.  Not for beginners.</p>

				</li>

				<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-commit">mesa-commit</a>

				<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-commit">mesa-commit</a>

				- relays git check-in messages (for developers).

				In general, people should not post to this list.</p>

				</li>

				<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-announce">mesa-announce</a>

				<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-announce">mesa-announce</a>

				- announcements of new Mesa

				versions are sent to this list.  Very low traffic.</p>

				</li>

				<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/piglit">piglit</a>

				<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/piglit">piglit</a>

				- for Piglit (OpenGL driver testing framework) discussion.</p>

				</li>

				</ul>

				@@ -56,22 +56,22 @@ Follow the links above for list archives.

				<p>

				The old Mesa lists hosted at SourceForge are no longer in use.

				The archives are still available, however:

				<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-announce">mesa3d-announce</a>,

				<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-users">mesa3d-users</a>,

				<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-dev">mesa3d-dev</a>.

				<a href="https://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-announce">mesa3d-announce</a>,

				<a href="https://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-users">mesa3d-users</a>,

				<a href="https://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-dev">mesa3d-dev</a>.

				</p>

				<p>For mailing lists about Direct Rendering Modules (drm) in Linux/BSD 

				kernels, see the

				<a href="http://dri.freedesktop.org/wiki/MailingLists">DRI wiki</a>.

				<a href="https://dri.freedesktop.org/wiki/MailingLists">DRI wiki</a>.

				</p>

				<h1>IRC</h1>

				<p>join <a href="irc://chat.freenode.net#dri-devel">#dri-devel channel</a>

				on <a href="http://webchat.freenode.net/">irc.freenode.net</a>

				on <a href="https://webchat.freenode.net/">irc.freenode.net</a>

				</p>

				@@ -82,7 +82,7 @@ Here are some other OpenGL-related forums you might find useful:

				</p>

				<ul>

				<li><a href="http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi">OpenGL discussion forums</a>

				<li><a href="https://www.opengl.org/discussion_boards/">OpenGL discussion forums</a>

				at www.opengl.org</li>

				<li>Usenet newsgroups:

				<ul>

									
										28

docs/llvmpipe.html
									
												
				@@ -34,7 +34,7 @@ It's the fastest software rasterizer for Mesa.

				<li>

				   <p>An x86 or amd64 processor; 64-bit mode recommended.</p>

				   <p>

				   Support for SSE2 is strongly encouraged.  Support for SSSE3 and SSE4.1 will

				   Support for SSE2 is strongly encouraged.  Support for SSE3 and SSE4.1 will

				   yield the most efficient code.  The fewer features the CPU has the more

				   likely is that you run into underperforming, buggy, or incomplete code.

				   </p>

				@@ -165,8 +165,8 @@ any OpenGL drivers):

				  <li><p>load this registry settings:</p>

				  <pre>REGEDIT4

				; http://technet.microsoft.com/en-us/library/cc749368.aspx

				; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596

				; https://technet.microsoft.com/en-us/library/cc749368.aspx

				; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596

				[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]

				"DLL"="mesadrv.dll"

				"DriverVersion"=dword:00000001

				@@ -195,7 +195,7 @@ that no tail call optimizations are done by gcc.

				<h2>Linux perf integration</h2>

				<p>

				On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:

				On Linux, it is possible to have symbol resolution of JIT code with <a href="https://perf.wiki.kernel.org/">Linux perf</a>:

				</p>

				<pre>

				@@ -206,12 +206,12 @@ On Linux, it is possible to have symbol resolution of JIT code with <a href="htt

				<p>

				When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with

				symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,

				which can be used by the bin/perf-annotate-jit script to produce disassembly of

				which can be used by the bin/perf-annotate-jit.py script to produce disassembly of

				the generated code annotated with the samples.

				</p>

				<p>You can obtain a call graph via

				<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>

				<a href="https://github.com/jrfonseca/gprof2dot#linux-perf">Gprof2Dot</a>.</p>
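
The symbol table in a perf map file can be read back with a few lines of Python. This is a rough sketch that assumes the general Linux perf convention of one symbol per line, written as hexadecimal start address, hexadecimal size, and name. That layout is not stated in the text above, and `parse_perf_map` and `resolve` are hypothetical helper names.

```python
import bisect


def parse_perf_map(lines):
    """Parse perf map lines into a sorted list of (start, size, name)."""
    symbols = []
    for line in lines:
        fields = line.strip().split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        start, size, name = fields
        symbols.append((int(start, 16), int(size, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return the name of the symbol covering `address`, or None."""
    starts = [start for start, _, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None
    start, size, name = symbols[index]
    return name if address < start + size else None
```

In practice the lines would come from the /tmp/perf-XXXXX.map file that llvmpipe writes, for example `parse_perf_map(open(path))`, where `path` is whatever name was generated for the profiled process.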

				<h1>Unit testing</h1>

				@@ -253,7 +253,7 @@ for posterior analysis, e.g.:

				  We use LLVM-C bindings for now. They are not documented, but follow the C++

				  interfaces very closely, and appear to be complete enough for code

				  generation. See 

				  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">

				  <a href="https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">

				  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.

				</li>

				</ul>

				@@ -264,18 +264,18 @@ for posterior analysis, e.g.:

				  <li>

				    <p>Rasterization</p>

				    <ul>

				      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>

				      <li><a href="https://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>

				      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>

				      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>

				      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>

				      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>

				      <li><a href="https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>Texture sampling</p>

				    <ul>

				      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>

				      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>

				      <li><a href="https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>

				      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>

				      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>

				      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>

				@@ -294,21 +294,21 @@ for posterior analysis, e.g.:

				      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>

				      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>

				      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>

				      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a><li>

				      <li><a href="https://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a><li>

				    </ul>

				  </li>

				  <li>

				    <p>LLVM</p>

				    <ul>

				      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>

				      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>

				      <li><a href="https://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>General</p>

				    <ul>

				      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>

				      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>

				      <li><a href="https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>

				      <li><a href="https://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>

				    </ul>

				  </li>

				</ul>

									
										11

docs/mangling.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Function Name Mangling</title>

				  <title>GL Function Name Mangling</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				@@ -14,7 +14,7 @@

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Function Name Mangling</h1>

				<h1>GL Function Name Mangling</h1>

				<p>

				If you want to use both Mesa and another OpenGL library in the same

				@@ -25,12 +25,11 @@ This results in all the Mesa functions being prefixed with

				</p>

				<p>

				To do this, recompile Mesa with the compiler flag -DUSE_MGL_NAMESPACE.

				Add the flag to CFLAGS in the configuration file which you want to use.

				For example:

				This option is supported only with the autoconf build. To use it add

				--enable-mangling to your configure line.

				</p>

				<pre>

				CFLAGS += -DUSE_MGL_NAMESPACE

				<code>./configure --enable-mangling ...</code>

				</pre>

				</div>

									
										4

docs/opengles.html
									
												
				@@ -17,8 +17,8 @@

				<h1>OpenGL ES</h1>

				<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0.  More information about

				OpenGL ES can be found at <a href="http://www.khronos.org/opengles/">

				http://www.khronos.org/opengles/</a>.</p>

				OpenGL ES can be found at <a href="https://www.khronos.org/opengles/">

				https://www.khronos.org/opengles/</a>.</p>

				<p>OpenGL ES depends on a working EGL implementation.  Please refer to

				<a href="egl.html">Mesa EGL</a> for more information about EGL.</p>

4

docs/patents.txt


@@ -27,5 +27,5 @@ ARB_texture_float:
     enable this extension.
 [1] http://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
 [2] http://www.opengl.org/registry/specs/ARB/texture_float.txt
 [1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
 [2] https://www.opengl.org/registry/specs/ARB/texture_float.txt

									
										2

docs/postprocess.html
									
												
				@@ -45,7 +45,7 @@ Multiple filters can be used together.

				<li>pp_nored, pp_nogreen, pp_noblue - set to 1 to remove the corresponding color channel.

				These are basic filters for easy testing of the PP queue.

				<li>pp_jimenezmlaa, pp_jimenezmlaa_color -

				<a href="http://www.iryokufx.com/mlaa/" target=_blank>Jimenez's MLAA</a>

				<a href="https://www.iryokufx.com/mlaa/" target=_blank>Jimenez's MLAA</a>

				is a morphological antialiasing filter.

				The two versions use depth and color data, respectively.

				Which works better depends on the app - depth will not blur text, but it will

									
										10

docs/precompiled.html
									
												
				@@ -20,8 +20,14 @@

				In general, precompiled Mesa libraries are not available.

				</p>

				<p>

				However, some Linux distros (such as Ubuntu) seem to closely track

				Mesa and often have the latest Mesa release available as an update.

				Some Linux distributions closely follow the latest Mesa releases. On others one

				has to use unofficial channels.

				<br>

				There are some general directions:

				<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>

				<li>Fedora - Corp: erp and che</li>

				<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>

				<li>Gentoo/Archlinux - officially provided/supported</li>

				</p>

				</div>

									
										94

docs/release-calendar.html
									
										Normal file
									
												
				@@ -0,0 +1,94 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Releasing process</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Overview</h1>

				<p>

				Mesa provides feature/development and stable releases.

				</p>

				<p>

				The table below lists the date and release manager that is expected to do the

				specific release.

				<br>

				Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a>

				if you'd like to nominate a patch for the next stable release.

				</p>

				<h1 id="calendar">Calendar</h1>

				<table border="1">

				<tr>

				<th>Branch</th>

				<th>Expected date</th>

				<th>Release</th>

				<th>Release manager</th>

				<th>Notes</th>

				</tr>

				<tr>

				<td rowspan="3">17.0</td>

				<td>2017-04-28</td>

				<td>17.0.5</td>

				<td>Andres Gomez</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-05-12</td>

				<td>17.0.6</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-05-26</td>

				<td>17.0.7</td>

				<td>Emil Velikov</td>

				<td>Final planned release for the 17.0 series</td>

				</tr>

				<tr>

				<td rowspan="5">17.1</td>

				<td>2017-04-28</td>

				<td>17.1.0-rc3</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-05-05</td>

				<td>17.1.0-rc4</td>

				<td>Emil Velikov</td>

				<td>May be promoted to 17.1.0 final</td>

				</tr>

				<tr>

				<td>2017-05-19</td>

				<td>17.1.1</td>

				<td>Emil Velikov</td>

				<td></td>

				<tr>

				<td>2017-06-02</td>

				<td>17.1.2</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-06-16</td>

				<td>17.1.3</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				</table>

				</div>

				</body>

				</html>

									
										551

docs/releasing.html
									
										Normal file
									
												
				@@ -0,0 +1,551 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Releasing process</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Releasing process</h1>

				<ul>

				<li><a href="#overview">Overview</a>

				<li><a href="#schedule">Release schedule</a>

				<li><a href="#pickntest">Cherry-pick and test</a>

				<li><a href="#branch">Making a branchpoint</a>

				<li><a href="#prerelease">Pre-release announcement</a>

				<li><a href="#release">Making a new release</a>

				<li><a href="#announce">Announce the release</a>

				<li><a href="#website">Update the mesa3d.org website</a>

				<li><a href="#bugzilla">Update Bugzilla</a>

				</ul>

				<h1 id="overview">Overview</h1>

				<p>

				This document uses the convention X.Y.Z for the release number with X.Y being

				the stable branch name.

				<br>

				Mesa provides feature and bugfix releases. The former use zero as the patch version (Z),

				while the latter have a non-zero one.

				</p>

				<p>

				For example:

				</p>

				<pre>

					Mesa 10.1.0 - 10.1 branch, feature

					Mesa 10.1.4 - 10.1 branch, bugfix

					Mesa 12.0.0 - 12.0 branch, feature

					Mesa 12.0.2 - 12.0 branch, bugfix

				</pre>
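
The X.Y.Z convention above maps directly onto a small helper. This is only an illustrative sketch: `describe_release` is a hypothetical name, and it assumes a plain three-part numeric version.

```python
def describe_release(version):
    """Split 'X.Y.Z' into its stable branch name and release kind."""
    major, minor, patch = (int(part) for part in version.split("."))
    branch = "%d.%d" % (major, minor)
    # A zero patch version marks a feature release; anything else is a bugfix.
    kind = "feature" if patch == 0 else "bugfix"
    return branch, kind
```

Release candidate names such as 17.1.0-rc3 are deliberately not handled and raise a `ValueError`.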

				<h1 id="schedule">Release schedule</h1>

				<p>

				Releases should happen on Fridays. Delays can occur, although those should be kept

				to a minimum.

				<br>

				See our <a href="release-calendar.html" target="_parent">calendar</a> for the

				date and other details for individual releases.

				</p>

				<h2>Feature releases</h2>

				<ul>

				<li>Available approximately every three months.

				<li>Initial timeplan available 2-4 weeks before the planned branchpoint (rc1)

				on the mesa-announce@ mailing list.

				<li>A <a href="#prerelease">pre-release</a> announcement should be available

				approximately 24 hours before the final (non-rc) release.

				</ul>

				<h2>Stable releases</h2>

				<ul>

				<li>Normally available once every two weeks.

				<li>Only the latest branch has releases. See note below.

				<li>A <a href="#prerelease">pre-release</a> announcement should be available

				approximately 48 hours before the actual release.

				</ul>

				<p>

				Note: There is an overlap of one or two releases when changing branches. For example:

				<br>

				The final release from the 12.0 series, Mesa 12.0.5, will be out around the same

				time as (or shortly after) 13.0.1.

				</p>
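
The fortnightly Friday cadence for stable releases can be sketched as a date generator. This is a hedged illustration, not a tool Mesa ships: `stable_release_dates` is a hypothetical name, and the two-week interval is the "normally" case described above, so real schedules may deviate.

```python
import datetime

FRIDAY = 4  # datetime.date.weekday() counts Monday as 0


def stable_release_dates(first, count, interval_days=14):
    """Return `count` planned dates starting at `first`, one every two weeks."""
    if first.weekday() != FRIDAY:
        raise ValueError("releases are expected on Fridays")
    step = datetime.timedelta(days=interval_days)
    return [first + step * index for index in range(count)]
```

Starting from `datetime.date(2017, 4, 28)` with a count of 3, this yields 2017-04-28, 2017-05-12 and 2017-05-26, the dates planned for 17.0.5 through 17.0.7 in the release calendar.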

				<h1 id="pickntest">Cherry-picking and testing</h1>

				<p>

				Commits nominated for the active branch are picked based on the

				<a href="submittingpatches.html#criteria" target="_parent">criteria</a>

				described in that section.

				<p>

				The maintainer is responsible for testing various possible permutations of

				the autoconf and scons builds.

				</p>

				<h2>Cherry-picking and build/check testing</h2>

				<p>Done continuously up to the <a href="#prerelease">pre-release</a> announcement.</p>

				<p>

				As an exception, patches can be applied up to the last ~1h before the actual

				release. This is done <strong>only</strong> with explicit permission/request,

				and the patch <strong>must</strong> be very well contained. Thus it cannot

				affect more than one driver/subsystem.

				</p>

				<p>

				Currently Ilia Mirkin and AMD devs have requested a "permanent" exception.

				</p>

				<ul>

				<li>make distcheck, scons and scons check must pass

				<li>Testing with different versions of system components (LLVM and others) is also

				performed where possible.

				</ul>

				<p>

				Achieved by a combination of local ad-hoc scripts and AppVeyor plus Travis-CI,

				the latter as part of their GitHub integration.

				</p>

				<p>

				<strong>Note:</strong> If a patch in the current queue needs any additional

				fix(es), then they should be squashed together.

				<br>

				The commit messages and the <code>cherry picked from</code> tags must be preserved.

				</p>

				<p>

				This should be noted in the <a href="#prerelease">pre-announce</a> email.

				<pre>

				    git show b10859ec41d09c57663a258f43fe57c12332698e

				    commit b10859ec41d09c57663a258f43fe57c12332698e

				    Author: Jonas Pfeil &lt;pfeiljonas@gmx.de&gt;

				    Date:   Wed Mar 1 18:11:10 2017 +0100

				        ralloc: Make sure ralloc() allocations match malloc()'s alignment.

				        The header of ralloc needs to be aligned, because the compiler assumes

				        ...

				        (cherry picked from commit cd2b55e536dc806f9358f71db438dd9c246cdb14)

				        Squashed with commit:

				        ralloc: don't leave out the alignment factor

				        Experimentation shows that without alignment factor gcc and clang choose

				        ...

				        (cherry picked from commit ff494fe999510ea40e3ed5827e7818550b6de126)

				</pre>

				</p>
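
				<p>
				The squashed message shown above can be assembled mechanically. The following is
				a minimal sketch and not part of Mesa's tooling: <code>squash_message</code> is a
				hypothetical helper that concatenates two saved commit messages so that both
				<code>cherry picked from</code> tags are preserved.
				</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesa): build the message for a squashed
# stable-branch commit from two existing commit messages.
#   $1 = file holding the queued commit's message
#   $2 = file holding the follow-up fix's message
# Both messages are kept verbatim, including their "cherry picked from" tags.
squash_message() {
    cat "$1"
    printf '\nSquashed with commit:\n\n'
    cat "$2"
}
```

				<p>
				The two input files could be produced with <code>git log -1 --format=%B SHA</code>
				for each commit, and the combined text passed to <code>git commit --amend -F</code>.
				</p>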

				<h2>Regression/functionality testing</h2>

				<p>

				Less often (once or twice), shortly before the pre-release announcement.

				Ensure that testing is redone if Intel devs have requested an exception, as per above.

				</p>

				<ul>

				<li><em>no regressions should be observed for Piglit/dEQP/CTS/Vulkan on Intel platforms</em>

				<li><em>no regressions should be observed for Piglit using the swrast, softpipe

				and llvmpipe drivers</em>

				</ul>

				<p>

				Currently testing is performed courtesy of the Intel OTC team and their Jenkins CI setup. Check with the Intel team over IRC on how to get things set up.

				</p>

				<h1 id="branch">Making a branchpoint</h1>

				<p>

				A branchpoint is made such that new development can continue in parallel to

				stabilisation and bugfixing.

				</p>

				<p>

				Note: Before making a branch, ensure that basic build and <code>make check</code>

				testing is done and there are few to no issues.

				<br>

				Ideally all of those should be tackled already.

				</p>

				<p>

				Check whether the version number is going to remain as is; otherwise run

				<code>git mv docs/relnotes/{current,new}.html</code> as appropriate.

				</p>

				<p>

				To set up the branchpoint:

				</p>

				<pre>

					git checkout master # make sure we're in master first

					git tag -s X.Y-branchpoint -m "Mesa X.Y branchpoint"

					git checkout -b X.Y

					git checkout master

					$EDITOR VERSION # bump the version number

					git commit -as

					cp docs/relnotes/{X.Y,X.Y+1}.html # copy/create relnotes template

					git commit -as

					git push origin X.Y-branchpoint X.Y

				</pre>
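
				<p>
				The <code>X.Y+1</code> placeholder in the commands above stands for the next
				minor version. As a small sketch, assuming the branch version is a plain
				<code>X.Y</code> string and that only the minor number is bumped,
				<code>next_minor</code> (a hypothetical helper, not part of Mesa) computes it:
				</p>

```shell
#!/bin/sh
# Hypothetical helper: given a branch version "X.Y", print "X.(Y+1)".
# Assumption: the input has exactly one dot and a numeric minor part.
# It does not model a major-version rollover.
next_minor() {
    major=${1%%.*}   # text before the first dot
    minor=${1#*.}    # text after the first dot
    printf '%s.%s\n' "$major" "$((minor + 1))"
}
```

				<p>
				With <code>ver=X.Y</code> set in the shell, the copy step could then be written as
				<code>cp "docs/relnotes/$ver.html" "docs/relnotes/$(next_minor "$ver").html"</code>.
				</p>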

				<p>

				Now go to

				<a href="https://bugs.freedesktop.org/editversions.cgi?action=add&amp;product=Mesa" target="_parent">Bugzilla</a> and add the new Mesa version X.Y.

				</p>

				<p>

				Check that there are no distribution-breaking changes and revert them if needed.

				For example: files being overwritten on install. This happens extremely rarely;

				we have had only one case so far (see commit 2ced8eb136528914e1bf4e000dea06a9d53c7e04).

				</p>

				<p>

				Proceed to <a href="#release">release</a> -rc1.

				</p>

				<h1 id="prerelease">Pre-release announcement</h1>

				<p>

				It comes shortly after outstanding patches in the respective branch are pushed.

				Developers can check, in brief, the status of their patches. They,

				alongside very early testers, are strongly encouraged to test the branch and

				report any regressions.

				<br>

				It is followed by a brief period (normally 24 or 48 hours) before the actual

				release is made.

				</p>

				<h2>Terminology used</h2>

				<ul><li>Nominated</ul>

				<p>

				Patch that is nominated but yet to be merged into the patch queue/branch.

				</p>

				<ul><li>Queued</ul>

				<p>

				Patch is in the queue/branch and will feature in the next release,

				barring reported regressions or objections from developers.

				</p>

				<ul><li>Rejected</ul>

				<p>

				Patch does not fit the

				<a href="submittingpatches.html#criteria" target="_parent">criteria</a> and

				is followed by a brief explanation.

				<br>

				The release maintainer is human so if you believe you've spotted a mistake do

				let them know.

				</p>

				<h2>Format/template</h2>

				<pre>

				Subject: [ANNOUNCE] Mesa X.Y.Z release candidate

				To: mesa-announce@...

				Cc: mesa-dev@...

				Hello list,

				The candidate for the Mesa X.Y.Z is now available. Currently we have:

				 - NUMBER queued

				 - NUMBER nominated (outstanding)

				 - and NUMBER rejected patches

				BRIEF SUMMARY OF CHANGES

				Take a look at section "Mesa stable queue" for more information.

				Testing reports/general approval

				--------------------------------

				Any testing reports (or general approval of the state of the branch) will be

				greatly appreciated.

				The plan is to have X.Y.Z this DAY (DATE), around or shortly after TIME.

				If you have any questions or suggestions - be that about the current patch

				queue or otherwise, please go ahead.

				Trivial merge conflicts

				-----------------------

				List of commits where manual intervention was required.

				Keep the authors in the CC list.

				commit SHA

				Author: AUTHOR

				    COMMIT SUMMARY

				    CHERRY PICKED FROM

				For example:

				commit 990f395e007c3204639daa34efc3049f350ee819

				Author: Emil Velikov &lt;emil.velikov@collabora.com&gt;

				    anv: automake: cleanup the generated json file during make clean

				    (cherry picked from commit 8df581520a823564be0ab5af7dbb7d501b1c9670)

				Cheers,

				Emil

				Mesa stable queue

				-----------------

				Nominated (NUMBER)

				==================

				AUTHOR (NUMBER):

				      SHA     COMMIT SUMMARY

				For example:

				Dave Airlie (1):

				      2de85eb radv: fix texturesamples to handle single sample case

				Queued (NUMBER)

				===============

				AUTHOR (NUMBER):

				      COMMIT SUMMARY

				For example:

				Jonas Pfeil (1):

				      ralloc: Make sure ralloc() allocations match malloc()'s alignment.

				Squashed with

				      ralloc: don't leave out the alignment factor

				Rejected (NUMBER)

				=================

				Rejected (11)

				=============

				AUTHOR (NUMBER):

				      SHA     COMMIT SUMMARY

				Reason: ...

				</pre>
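
				<p>
				The opening of the template above can be filled in from a few numbers. This is a
				rough sketch only: <code>announce_header</code> is a hypothetical function, and the
				counts are supplied by hand rather than derived from the patch queue.
				</p>

```shell
#!/bin/sh
# Hypothetical helper: print the opening lines of the pre-release
# announcement, following the template in this document.
#   $1 = version (X.Y.Z)  $2 = queued  $3 = nominated  $4 = rejected
announce_header() {
    printf 'Subject: [ANNOUNCE] Mesa %s release candidate\n\n' "$1"
    printf 'Hello list,\n\n'
    printf 'The candidate for the Mesa %s is now available. Currently we have:\n' "$1"
    printf ' - %s queued\n' "$2"
    printf ' - %s nominated (outstanding)\n' "$3"
    printf ' - and %s rejected patches\n' "$4"
}
```

				<p>
				For instance, <code>announce_header X.Y.Z 20 3 11</code> prints the subject line
				followed by the three counts. The rest of the email still has to be written by hand.
				</p>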

				<h1 id="release">Making a new release</h1>

				<p>

				These are the instructions for making a new Mesa release.

				</p>

				<h3>Get latest source files</h3>

				<p>

				Ensure the latest code is available - both in your local master and the

				relevant branch.

				</p>

				<h3>Perform basic testing</h3>

				<p>

				Most of the testing should already be done during the

				<a href="#pickntest">cherry-pick</a> and

				<a href="#prerelease">pre-announce</a> stages.

				So we do a quick 'touch test':

				<ul>

				<li>make distcheck (you can omit this if you're not using --dist below)

				<li>scons (from release tarball)

				<li>the produced binaries work

				</ul>

				<p>

				Here is one solution that I've been using.

				</p>

				<pre>

					git clean -fXd; git clean -nxd

					read # quick cross check any outstanding files

					export __version=`cat VERSION`

					export __mesa_root=../

					export __build_root=./foo

					chmod 755 -fR $__build_root; rm -rf $__build_root

					mkdir -p $__build_root &amp;&amp; cd $__build_root

					$__mesa_root/autogen.sh &amp;&amp; make -j2 distcheck

					# Build check the tarballs (scons, linux)

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					scons

					cd .. &amp;&amp; rm -rf mesa-$__version

					# Build check the tarballs (scons, windows/mingw)

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					scons platform=windows toolchain=crossmingw

					cd .. &amp;&amp; rm -rf mesa-$__version

					# Test the automake binaries

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					./configure \

						--with-dri-drivers=i965,swrast \

						--with-gallium-drivers=swrast \

						--with-vulkan-drivers=intel \

						--enable-llvm-shared-libs \

						--enable-llvm \

						--enable-glx-tls \

						--enable-gbm \

						--enable-egl \

						--with-egl-platforms=x11,drm,wayland

					make -j2 &amp;&amp; DESTDIR=`pwd`/test make -j6 install

					__glxinfo_cmd='glxinfo 2>&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'

					__glxgears_cmd='glxgears 2>&amp;1 | grep -v "configuration file"'

					__es2info_cmd='es2_info 2>&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'

					__es2gears_cmd='es2gears_x11 2>&amp;1 | grep -v "configuration file"'

					export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/

					export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/

					export LIBGL_DEBUG=verbose

					eval $__glxinfo_cmd

					eval $__glxgears_cmd

					eval $__es2info_cmd

					eval $__es2gears_cmd

					export LIBGL_ALWAYS_SOFTWARE=1

					eval $__glxinfo_cmd

					eval $__glxgears_cmd

					eval $__es2info_cmd

					eval $__es2gears_cmd

					export LIBGL_ALWAYS_SOFTWARE=1

					export GALLIUM_DRIVER=softpipe

					eval $__glxinfo_cmd

					eval $__glxgears_cmd

					eval $__es2info_cmd

					eval $__es2gears_cmd

					# Smoke test DOTA2

					unset LD_LIBRARY_PATH

					unset LIBGL_DRIVERS_PATH

					unset LIBGL_DEBUG

					unset LIBGL_ALWAYS_SOFTWARE

					export VK_ICD_FILENAMES=`pwd`/src/intel/vulkan/dev_icd.json

					steam steam://rungameid/570  -vconsole -vulkan

				</pre>
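
				<p>
				The script above runs every step unconditionally, so a failure in the middle is easy
				to miss. One possible way to collect results, shown here as a sketch under that
				assumption, is a small wrapper. <code>run_checks</code> is a hypothetical name and is
				not part of the procedure described in this document.
				</p>

```shell
#!/bin/sh
# Hypothetical wrapper: run each command string in its own shell, print
# PASS/FAIL per command, and return non-zero if any command failed.
# Command output is discarded; only the exit status is inspected.
run_checks() {
    failed=0
    for cmd in "$@"; do
        if sh -c "$cmd" >/dev/null 2>&1; then
            printf 'PASS: %s\n' "$cmd"
        else
            printf 'FAIL: %s\n' "$cmd"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

				<p>
				It could be called as <code>run_checks "make -j2 distcheck" "scons"</code>. Interactive
				programs such as glxgears still need a manual look.
				</p>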

				<h3>Update version in file VERSION</h3>

				<p>

				Increment the version contained in the file VERSION at Mesa's top-level, then

				commit this change.

				</p>

				<h3>Create release notes for the new release</h3>

				<p>

				Create a new file docs/relnotes/X.Y.Z.html (follow the style of the previous

				release notes). Note that the sha256sums section of the release notes should

				be empty (TBD) at this point.

				</p>

				<p>

				Two scripts are available to help generate portions of the release notes:

				<pre>

					./bin/bugzilla_mesa.sh

					./bin/shortlog_mesa.sh

				</pre>

				<p>

				The first script identifies commits that reference bugzilla bugs and obtains

				the descriptions of those bugs from bugzilla. The second script generates a

				log of all commits. In both cases, HTML-formatted lists are printed to stdout

				to be included in the release notes.

				</p>
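
				<p>
				To illustrate the idea behind the first script, here is a minimal sketch. It is
				<strong>not</strong> the actual <code>bin/bugzilla_mesa.sh</code>: the function name
				<code>extract_bug_ids</code> is made up, and it only lists bug numbers without
				fetching their descriptions.
				</p>

```shell
#!/bin/sh
# Sketch only, not the real bin/bugzilla_mesa.sh.
# Reads commit messages on stdin and prints every referenced
# bugs.freedesktop.org bug number once, in ascending numeric order.
extract_bug_ids() {
    grep -o 'bugs\.freedesktop\.org/show_bug\.cgi?id=[0-9][0-9]*' |
        sed 's/.*id=//' |
        sort -n -u
}
```

				<p>
				A typical invocation would pipe a commit range into it, for example
				<code>git log PREVIOUS_TAG..HEAD | extract_bug_ids</code>, where
				<code>PREVIOUS_TAG</code> is a placeholder for the last release tag.
				</p>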

				<p>

				Commit these changes and push the branch.

				</p>

				<pre>

					git push origin HEAD

				</pre>

				<h3>Use the release.sh script from xorg <a href="https://cgit.freedesktop.org/xorg/util/modular/">util-modular</a></h3>

				<p>

				Start the release process.

				</p>

				<pre>

					../relative/path/to/release.sh . # append --dist if you've already done distcheck above

				</pre>

				<p>

				Pay close attention to the prompts as you might be required to enter your GPG

				and SSH passphrase(s) to sign and upload the files, respectively.

				</p>

				<h3>Add the sha256sums to the release notes</h3>

				<p>

				Edit docs/relnotes/X.Y.Z.html to add the sha256sums as available in the mesa-X.Y.Z.announce template. Commit this change.

				</p>
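
				<p>
				The sums in the announce template can be cross-checked locally before committing.
				This is a sketch assuming GNU <code>sha256sum</code> is installed and the tarballs
				use the <code>mesa-X.Y.Z.tar.gz</code> and <code>mesa-X.Y.Z.tar.xz</code> names seen in the
				release notes. <code>relnotes_checksums</code> is a hypothetical helper.
				</p>

```shell
#!/bin/sh
# Hypothetical helper: print SHA256 checksums for the release tarballs.
#   $1 = version (X.Y.Z)
#   $2 = directory holding the tarballs (default: current directory)
# Runs in a subshell so the caller's working directory is unchanged.
relnotes_checksums() {
    (
        cd "${2:-.}" || exit 1
        sha256sum "mesa-$1.tar.gz" "mesa-$1.tar.xz"
    )
}
```

				<p>
				The output lines have the same "checksum, two spaces, file name" layout used in the
				SHA256 checksums section of the release notes.
				</p>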

				<h3>Back on mesa master, add the new release notes into the tree</h3>

				<p>

				Something like the following steps will do the trick:

				</p>

				<pre>

					git cherry-pick -x X.Y~1

					git cherry-pick -x X.Y

				</pre>

				<p>

				Also, edit docs/relnotes.html to add a link to the new release notes, and edit

				docs/index.html to add a news entry. Then commit and push:

				</p>

				<pre>

					git commit -as -m "docs: add news item and link release notes for X.Y.Z"

					git push origin master X.Y

				</pre>

				<h1 id="announce">Announce the release</h1>

				<p>

				Use the template generated during the release process.

				</p>

				<h1 id="website">Update the mesa3d.org website</h1>

				<p>

				As the hosting was moved to freedesktop, git hooks are deployed to update the

				website. Manually check that it is updated 5-10 minutes after the final <code>git push</code>.

				</p>

				<h1 id="bugzilla">Update Bugzilla</h1>

				<p>

				Go through the bug reports listed in the docs/relnotes/X.Y.Z.html

				document.

				<br>

				If there's outstanding action, close the bug referencing the commit ID which

				addresses the bug and mention the Mesa version that has the fix.

				</p>

				<p>

				Note: the above is not applicable to all the reports, so use common sense.

				</p>

				</div>

				</body>

				</html>

									
										20

docs/relnotes.html
									
				@@ -21,6 +21,26 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/17.0.5.html">17.0.5 release notes</a>

				<li><a href="relnotes/17.0.4.html">17.0.4 release notes</a>

				<li><a href="relnotes/17.0.3.html">17.0.3 release notes</a>

				<li><a href="relnotes/17.0.2.html">17.0.2 release notes</a>

				<li><a href="relnotes/13.0.6.html">13.0.6 release notes</a>

				<li><a href="relnotes/17.0.1.html">17.0.1 release notes</a>

				<li><a href="relnotes/13.0.5.html">13.0.5 release notes</a>

				<li><a href="relnotes/17.0.0.html">17.0.0 release notes</a>

				<li><a href="relnotes/13.0.4.html">13.0.4 release notes</a>

				<li><a href="relnotes/12.0.6.html">12.0.6 release notes</a>

				<li><a href="relnotes/13.0.3.html">13.0.3 release notes</a>

				<li><a href="relnotes/12.0.5.html">12.0.5 release notes</a>

				<li><a href="relnotes/13.0.2.html">13.0.2 release notes</a>

				<li><a href="relnotes/13.0.1.html">13.0.1 release notes</a>

				<li><a href="relnotes/12.0.4.html">12.0.4 release notes</a>

				<li><a href="relnotes/13.0.0.html">13.0.0 release notes</a>

				<li><a href="relnotes/12.0.3.html">12.0.3 release notes</a>

				<li><a href="relnotes/12.0.2.html">12.0.2 release notes</a>

				<li><a href="relnotes/12.0.1.html">12.0.1 release notes</a>

				<li><a href="relnotes/12.0.0.html">12.0.0 release notes</a>

				<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>

				<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>

				<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>

									
										2

docs/relnotes/12.0.1.html
									
				@@ -16,8 +16,6 @@

				<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>

				<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>

				<p>

				Mesa 12.0.1 is a bug fix release which fixes bugs found since the 12.0.0 release.

				</p>

									
										3

docs/relnotes/12.0.2.html
									
				@@ -31,7 +31,8 @@ because compatibility contexts are not supported.

				<h2>SHA256 checksums</h2>

				<pre>

				TBD

				a08565ab1273751ebe2ffa928cbf785056594c803077c9719d0763da780f2918  mesa-12.0.2.tar.gz

				d957a5cc371dcd7ff2aa0d87492f263aece46f79352f4520039b58b1f32552cb  mesa-12.0.2.tar.xz

				</pre>

									
										71

docs/relnotes/12.0.3.html
									
										Normal file
									
				@@ -0,0 +1,71 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.3 Release Notes / September 15, 2016</h1>

				<p>

				Mesa 12.0.3 is a bug fix release which fixes bugs found since the 12.0.2 release.

				</p>

				<p>

				Mesa 12.0.3 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				79abcfab3de30dbd416d1582a3cf6b1be308466231488775f1b7bb43be353602 mesa-12.0.3.tar.gz

				1dc86dd9b51272eee1fad3df65e18cda2e556ef1bc0b6e07cd750b9757f493b1 mesa-12.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97781">Bug 97781</a> - [HSW, BYT, IVB] es2-cts.gtf.gl2extensiontests.depth_texture_cube_map.depth_texture_cube_map</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.2</li>

				  <li>Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"</li>

				  <li>Update version to 12.0.3</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>appveyor: Update winflexbison download URL.</li>

				</ul>

				</div>

				</body>

				</html>

									
										321

docs/relnotes/12.0.4.html
									
										Normal file
									
				@@ -0,0 +1,321 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.4 Release Notes / November 10, 2016</h1>

				<p>

				Mesa 12.0.4 is a bug fix release which fixes bugs found since the 12.0.3 release.

				</p>

				<p>

				Mesa 12.0.4 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				22026ce4f1c6a7908b0d10ff057decec0a5633afe7f38a0cef5c08d0689f02a6 mesa-12.0.4.tar.gz

				5d6003da867d3f54e5000b4acdfc37e6cce5b6a4459274fdad73e24bd2f0065e mesa-12.0.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>

				</ul>

				<h2>Changes</h2>

				<p>Axel Davy (4):</p>

				<ul>

				  <li>gallium/util: Really allow aliasing of dst for u_box_union_*</li>

				  <li>st/nine: Fix the calculation of the number of vs inputs</li>

				  <li>st/nine: Fix mistake in Volume9 UnlockBox</li>

				  <li>st/nine: Fix locking CubeTexture surfaces.</li>

				</ul>

				<p>Brendan King (1):</p>

				<ul>

				  <li>configure.ac: fix the name of the Wayland Scanner pc file</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()</li>

				</ul>

				<p>Chad Versace (3):</p>

				<ul>

				  <li>egl: Fix truncation error in _eglParseSyncAttribList64</li>

				  <li>i965/sync: Fix uninitalized usage and leak of mutex</li>

				  <li>egl: Don't advertise unsupported platform extensions</li>

				</ul>

				<p>Chuanbo Weng (1):</p>

				<ul>

				  <li>gbm: fix potential NULL deref of mapImage/unmapImage.</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>autoconf: Make header install distinct for various APIs (v2)</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>anv: initialise and increment send_sbc</li>

				  <li>anv/wsi: fix apps that acquire multiple images up front</li>

				  <li>Revert "st/vdpau: use linear layout for output surfaces"</li>

				</ul>

				<p>Emil Velikov (12):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.3</li>

				  <li>cherry-ignore: add non-applicable i965 commit</li>

				  <li>cherry-ignore: add vaapi encode fix</li>

				  <li>cherry-ignore: add EGL_KHR_debug fix</li>

				  <li>cherry-ignore: add update_renderbuffer_read_surfaces()</li>

				  <li>isl/gen6: correctly check msaa layout samples count</li>

				  <li>egl/x11: don't crash if dri2_dpy-&gt;conn is NULL</li>

				  <li>get-pick-list.sh: Require explicit "12.0" for nominating stable patches</li>

				  <li>automake: don't forget to pick wglext.h in the tarball</li>

				  <li>cherry-ignore: add N/A EGL revert</li>

				  <li>cherry-ignore: add ClientWaitSync fixes</li>

				  <li>Update version to 12.0.4</li>

				</ul>

				<p>Eric Anholt (5):</p>

				<ul>

				  <li>travis: Parse configure.ac to pick an updated LIBDRM_VERSION.</li>

				  <li>travis: Update to the Ubuntu Trusty image.</li>

				  <li>travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.</li>

				  <li>travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.</li>

				  <li>gallium: Fix install-gallium-links.mk on non-bash /bin/sh</li>

				</ul>

				<p>Hans de Goede (1):</p>

				<ul>

				  <li>pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept</li>

				</ul>

				<p>Ilia Mirkin (16):</p>

				<ul>

				  <li>nv30: set usage to staging so that the buffer is allocated in GART</li>

				  <li>a3xx: make sure to actually clamp depth as requested</li>

				  <li>a3xx: make use of software clipping when hw can't handle it</li>

				  <li>a3xx: use window scissor to simulate viewport xy clip</li>

				  <li>main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer</li>

				  <li>mesa/formatquery: limit ES target support, fix core context support</li>

				  <li>nir: fix definition of pack_uvec2_to_uint</li>

				  <li>gm107/ir: AL2P writes to a predicate register</li>

				  <li>st/mesa: fix is_scissor_enabled when X/Y are negative</li>

				  <li>nvc0/ir: fix overwriting of value backing non-constant gather offset</li>

				  <li>nv50/ir: copy over value's register id when resolving merge of a phi</li>

				  <li>nvc0/ir: fix textureGather with a single offset</li>

				  <li>gm107/ir: fix texturing with indirect samplers</li>

				  <li>gm107/ir: fix bit offset of tex lod setting for indirect texturing</li>

				  <li>nv50,nvc0: avoid reading out of bounds when getting bogus so info</li>

				  <li>nv50/ir: process texture offset sources as regular sources</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>radeonsi: Fix primitive restart when index changes</li>

				</ul>

				<p>Jason Ekstrand (9):</p>

				<ul>

				  <li>nir/spirv: Swap the argument order for AtomicCompareExchange</li>

				  <li>nir/spirv: Use the correct sources for CompareExchange on images</li>

				  <li>nir/spirv: Break variable decoration handling into a helper</li>

				  <li>nir/spirv: Refactor variable deocration handling</li>

				  <li>nir/spirv/cfg: Handle switches whose break block is a loop continue</li>

				  <li>nir/spirv/cfg: Detect switch_break after loop_break/continue</li>

				  <li>nir: Add a nop intrinsic</li>

				  <li>nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks</li>

				  <li>intel/blorp: Rework our usage of ralloc when compiling shaders</li>

				</ul>

				<p>Jonathan Gray (3):</p>

				<ul>

				  <li>genxml: add generated headers to EXTRA_DIST</li>

				  <li>mapi: automake: set VISIBILITY_CFLAGS for shared glapi</li>

				  <li>mesa: automake: include mesa_glinterop.h in distfile</li>

				</ul>

				<p>Julien Isorce (1):</p>

				<ul>

				  <li>st/va: also honors interlaced preference when providing a video format</li>

				</ul>

				<p>Kenneth Graunke (8):</p>

				<ul>

				  <li>nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().</li>

				  <li>mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.</li>

				  <li>i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.</li>

				  <li>i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.</li>

				  <li>i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.</li>

				  <li>i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.</li>

				  <li>i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.</li>

				  <li>i965: Fix gl_InvocationID in dual object GS where invocations == 1.</li>

				</ul>

				<p>Marek Olšák (12):</p>

				<ul>

				  <li>radeonsi: fix cubemaps viewed as 2D</li>

				  <li>radeonsi: take compute shader and dispatch indirect memory usage into account</li>

				  <li>radeonsi: fix FP64 UBO loads with indirect uniform block indexing</li>

				  <li>mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc</li>

				  <li>radeonsi: fix interpolateAt opcodes for .zw components</li>

				  <li>radeonsi: fix texture border colors for compute shaders</li>

				  <li>radeonsi: disable ReZ</li>

				  <li>gallium/radeon: make sure the address of separate CMASK is aligned properly</li>

				  <li>winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures</li>

				  <li>egl: use util/macros.h</li>

				  <li>egl: make interop ABI visible again</li>

				  <li>glx: make interop ABI visible again</li>

				</ul>

				<p>Mario Kleiner (1):</p>

				<ul>

				  <li>glx: Perform check for valid fbconfig against proper X-Screen.</li>

				</ul>

				<p>Martin Peres (2):</p>

				<ul>

				  <li>loader/dri3: add get_dri_screen() to the vtable</li>

				  <li>loader/dri3: import prime buffers in the currently-bound screen</li>

				</ul>

				<p>Matt Whitlock (5):</p>

				<ul>

				  <li>egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				  <li>gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>

				</ul>

				<p>Max Staudt (1):</p>

				<ul>

				  <li>r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>loader/dri3: Overhaul dri3_update_num_back</li>

				</ul>

				<p>Nicholas Bishop (2):</p>

				<ul>

				  <li>gbm: return appropriate error when queryImage() fails</li>

				  <li>st/dri: check pipe_screen-&gt;resource_get_handle() return value</li>

				</ul>

				<p>Nicolai Hähnle (10):</p>

				<ul>

				  <li>gallium/radeon: cleanup and fix branch emits</li>

				  <li>st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations</li>

				  <li>st/glsl_to_tgsi: simplify translate_tex_offset</li>

				  <li>st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets</li>

				  <li>st/mesa: fix vertex elements setup for doubles</li>

				  <li>radeonsi: fix indirect loads of 64 bit constants</li>

				  <li>st/glsl_to_tgsi: fix atomic counter addressing</li>

				  <li>st/glsl_to_tgsi: fix block copies of arrays of doubles</li>

				  <li>st/mesa: only set primitive_restart when the restart index is in range</li>

				  <li>radeonsi: fix 64-bit loads from LDS</li>

				</ul>

				<p>Samuel Pitoiset (4):</p>

				<ul>

				  <li>nvc0/ir: fix subops for IMAD</li>

				  <li>gk110/ir: fix wrong emission of OP_NOT</li>

				  <li>nvc0: use correct bufctx when invalidating CP textures</li>

				  <li>nvc0/ir: fix emission of IMAD with NEG modifiers</li>

				</ul>

				<p>Stencel, Joanna (1):</p>

				<ul>

				  <li>egl/wayland: add missing destroy_window callback</li>

				</ul>

				<p>Tapani Pälli (5):</p>

				<ul>

				  <li>egl: stop claiming support for pbuffer + msaa</li>

				  <li>egl/dri2: set max values for pbuffer width and height</li>

				  <li>egl: add check that eglCreateContext gets a valid config</li>

				  <li>mesa: fix error handling in DrawBuffers</li>

				  <li>egl: set preserved behavior for surface only if config supports it</li>

				</ul>

				<p>Tim Rowley (1):</p>

				<ul>

				  <li>configure.ac: add llvm inteljitevents component if enabled</li>

				</ul>

				<p>Vedran Miletić (1):</p>

				<ul>

				  <li>clover: Fix build against clang SVN &gt;= r273191</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>Revert "mesa_glinterop: remove inclusion of GLX header"</li>

				</ul>

				</div>

				</body>

				</html>

									
										138

docs/relnotes/12.0.5.html
									
										Normal file
									
				@@ -0,0 +1,138 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.5 Release Notes / December 5, 2016</h1>

				<p>

				Mesa 12.0.5 is a bug fix release which fixes bugs found since the 12.0.4 release.

				</p>

				<p>

				Mesa 12.0.5 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				44d08a27d98bfeacd864381189e434d98afbf451689d01f80380dc1d66450e5b  mesa-12.0.5.tar.gz

				2b0a972d8282860a11291c09c3ef01ac45171405951eb21a83c45ed2b4321924  mesa-12.0.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (2):</p>

				<ul>

				  <li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>

				  <li>glx/glvnd: Fix dispatch function names and indices</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add release notes for 12.0.4</li>

				  <li>docs: add sha256 checksums for 12.0.4</li>

				  <li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>

				  <li>Update version to 12.0.5</li>

				</ul>

				<p>Haixia Shi (1):</p>

				<ul>

				  <li>mesa: change state query return value for RGB565</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>i965/fs/generator: Don't use the address immediate for MOV_INDIRECT</li>

				  <li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>

				  <li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>intel: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>

				</ul>

				<p>Marek Olšák (13):</p>

				<ul>

				  <li>gallium/radeon: fix behavior of GLSL findLSB(0)</li>

				  <li>gallium/radeon: make sure HTILE address is aligned properly</li>

				  <li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>

				  <li>gallium/radeon: unify viewport emission code</li>

				  <li>gallium/radeon: set VPORT_ZMIN/MAX registers correctly</li>

				  <li>radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader</li>

				  <li>radeonsi: fix a crash in imageSize for cubemap arrays</li>

				  <li>radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it</li>

				  <li>gallium/radeon: add support for sharing textures with DCC between processes</li>

				  <li>radeonsi: always set all blend registers</li>

				  <li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>

				  <li>radeonsi: disable RB+ blend optimizations for dual source blending</li>

				  <li>radeonsi: silence runtime warnings with LLVM 3.9</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>anv: Replace "abi_versions" with correct "api_version".</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>mesa/fbobject: Update CubeMapFace when reusing textures</li>

				</ul>

				<p>Steinar H. Gunderson (1):</p>

				<ul>

				  <li>Fix races during _mesa_HashWalk().</li>

				</ul>

				<p>Tim Rowley (3):</p>

				<ul>

				  <li>swr: [rasterizer jitter] cleanup supporting different llvm versions</li>

				  <li>swr: [rasterizer jitter] fix llvm-3.7 compile</li>

				  <li>swr: [rasterizer] add support for llvm-3.9</li>

				</ul>

				</div>

				</body>

				</html>

									
										148

docs/relnotes/12.0.6.html
									
												
				@@ -0,0 +1,148 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.6 Release Notes / January 23, 2017</h1>

				<p>

				Mesa 12.0.6 is a bug fix release which fixes bugs found since the 12.0.5 release.

				</p>

				<p>

				Mesa 12.0.6 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				65339ba5d76a45225b8b56f9a1da9db15c569e1d163760faa2921da0a8461741  mesa-12.0.6.tar.gz

				7d6da9744c1022a4c2ab6ad01a206984d00443fb691568011d01b3dd97e36448  mesa-12.0.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>

				</ul>

				<h2>Changes</h2>

				<p>Chad Versace (3):</p>

				<ul>

				  <li>i965/mt: Disable aux surfaces after making miptree shareable</li>

				  <li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>

				  <li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.5</li>

				  <li>get-typod-pick-list.sh: add new script</li>

				  <li>automake: use shared llvm libs for make distcheck</li>

				  <li>egl/wayland: use the destroy_window_callback for swrast</li>

				  <li>Update version to 12.0.6</li>

				</ul>

				<p>Fredrik Höglund (1):</p>

				<ul>

				  <li>dri3: Fix MakeCurrent without a default framebuffer</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nouveau: take extra push space into account for pushbuf_space calls</li>

				</ul>

				<p>Jason Ekstrand (19):</p>

				<ul>

				  <li>spirv/nir: Fix some texture opcode asserts</li>

				  <li>spirv/nir: Add support for shadow samplers that return vec4</li>

				  <li>spirv/nir: Properly handle gather components</li>

				  <li>anv/pipeline: Set binding_table.gather_texture_start</li>

				  <li>nir: Add a helper for determining the type of a texture source</li>

				  <li>nir/lower_tex: Add some helpers for working with tex sources</li>

				  <li>nir/lower_tex: Add support for lowering coordinate offsets</li>

				  <li>i965/nir: Enable NIR lowering of txf and rect offsets</li>

				  <li>i965: Get rid of the do_lower_unnormalized_offsets pass</li>

				  <li>spirv/nir: Don't increment coord_components for array lod queries</li>

				  <li>anv/image: Assert that the image format is actually supported</li>

				  <li>spirv/nir: Move opcode selection higher up in handle_texture</li>

				  <li>spirv/nir: Refactor type handling in handle_texture</li>

				  <li>nir/spirv: Refactor coordinate handling in handle_texture</li>

				  <li>spirv/nir: Handle texture projectors</li>

				  <li>spirv/nir: Add support for ImageQuerySamples</li>

				  <li>anv/device: Return the right error for failed maps</li>

				  <li>anv/device: Implicitly unmap memory objects in FreeMemory</li>

				  <li>anv/descriptor_set: Write the state offset in the surface state free list.</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>

				  <li>i965: Properly flush in hsw_pause_transform_feedback().</li>

				</ul>

				<p>Marek Olšák (6):</p>

				<ul>

				  <li>cso: don't release sampler states that are bound</li>

				  <li>radeonsi: always restore sampler states when unbinding sampler views</li>

				  <li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>

				  <li>radeonsi: disable CE on SI + AMDGPU</li>

				  <li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>

				  <li>gallium/radeon: fix the draw-calls HUD query</li>

				</ul>

				<p>Matt Turner (3):</p>

				<ul>

				  <li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>

				  <li>i965/fs: Add unit tests for copy propagation pass.</li>

				  <li>i965/fs: Reject copy propagation into SEL if not min/max.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>radeonsi: enable WQM in PS prolog when needed</li>

				</ul>

				</div>

				</body>

				</html>

									
										311

docs/relnotes/13.0.0.html
									
												
				@@ -0,0 +1,311 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.0 Release Notes / November 1, 2016</h1>

				<p>

				Mesa 13.0.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 13.0.1.

				</p>

				<p>

				Mesa 13.0.0 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				4a54d7cdc1a94a8dae05a75ccff48356406d51b0d6a64cbdc641c266e3e008eb  mesa-13.0.0.tar.gz

				94edb4ebff82066a68be79d9c2627f15995e1fe10f67ab3fc63deb842027d727  mesa-13.0.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL ES 3.1 on i965/hsw</li>

				<li>OpenGL ES 3.2 on i965/gen9+ (Skylake and later)</li>

				<li>GL_ARB_ES3_1_compatibility on i965</li>

				<li>GL_ARB_ES3_2_compatibility on i965/gen8+</li>

				<li>GL_ARB_clear_texture on r600, radeonsi</li>

				<li>GL_ARB_compute_variable_group_size on nvc0, radeonsi</li>

				<li>GL_ARB_cull_distance on radeonsi</li>

				<li>GL_ARB_enhanced_layouts on i965, nv50, nvc0, radeonsi, llvmpipe, softpipe</li>

				<li>GL_ARB_indirect_parameters on radeonsi</li>

				<li>GL_ARB_query_buffer_object on radeonsi</li>

				<li>GL_ARB_shader_draw_parameters on radeonsi</li>

				<li>GL_ARB_shader_group_vote on nvc0</li>

				<li>GL_ARB_shader_viewport_layer_array on i965/gen6+</li>

				<li>GL_ARB_stencil_texturing on i965/hsw</li>

				<li>GL_ARB_texture_stencil8 on i965/hsw</li>

				<li>GL_EXT_window_rectangles on nv50, nvc0</li>

				<li>GL_KHR_blend_equation_advanced on i965</li>

				<li>GL_KHR_robustness on nvc0, radeonsi</li>

				<li>GL_KHR_texture_compression_astc_sliced_3d on i965</li>

				<li>GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe</li>

				<li>GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi</li>

				<li>GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi</li>

				<li>GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi</li>

				<li>GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi</li>

				<li>GL_OES_viewport_array on nvc0, radeonsi</li>

				<li>GL_ANDROID_extension_pack_es31a on i965/gen9+</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61907">Bug 61907</a> - Indirect rendering of multi-texture vertex arrays broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crashes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83036">Bug 83036</a> - [ILK]Piglit spec_ARB_copy_image_arb_copy_image-formats fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94561">Bug 94561</a> - [llvmpipe] PIPE_CAP_VIDEO_MEMORY reports negative value on 32 bits (with 16GB ram)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94627">Bug 94627</a> - Game Risen on wine black grass</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94681">Bug 94681</a> - dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95000">Bug 95000</a> - deqp: assert in dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives &quot;invalid floating point operation&quot; at startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95462">Bug 95462</a> - [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95529">Bug 95529</a> - [regression, bisected] Image corruption in Chrome</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96235">Bug 96235</a> - st_nir.h:34: error: redefinition of typedef ‘nir_shader’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96285">Bug 96285</a> - Mesa build broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96299">Bug 96299</a> - [vulkan] 64 regressions due to mesa d5f2f32</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96343">Bug 96343</a> - oom since st/mesa: implement PBO downloads for ReadPixels</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96346">Bug 96346</a> - [SNB,CTS] es2-cts.gtf.gl.atan regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96349">Bug 96349</a> - [CTS,SKL,BSW,BDW,KBL,BXT] es31-cts.arrays_of_arrays.interactionuniformbuffers3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96351">Bug 96351</a> - [CTS,SKL,KBL,BXT] es2-cts.gtf.gl2extensiontests.egl_image.egl_image</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regression due to latest gles 3.1)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96484">Bug 96484</a> - [vulkan] deqp-vk.glsl.builtin.precision.sin / cos regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96504">Bug 96504</a> - [vulkancts] compute tests crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] &quot;clover: Update OpenCL version string to match OpenGL&quot;: clover's build fails because of missing git_sha1.h</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96528">Bug 96528</a> - Location qualifier segfaults during shader compilation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96541">Bug 96541</a> - Tonga Unreal elemental bad rendering since radeonsi: Decompress DCC textures in a render feedback loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96565">Bug 96565</a> - Clive Barker's Jericho displays strange,vivid colors when motion blur enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96629">Bug 96629</a> - dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width &gt;= 1' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96674">Bug 96674</a> - [SNB, ILK] spec.ext_image_dma_buf_import.ext_image_dma_buf_import-sample_nv1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96729">Bug 96729</a> - Wrong shader compilation error message</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96765">Bug 96765</a> - BindFragDataLocationIndexed on array fragment shader output.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96782">Bug 96782</a> - [regression bisected] R600 fp64 and glsl-4.00 piglit failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96878">Bug 96878</a> - [Bisected: cc2d0e6][HSW] &quot;GPU HANG&quot; msg after autologin to gnome-session</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96949">Bug 96949</a> - [regression] Piglit numSamples assertion failures with 9a23a177b90</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96950">Bug 96950</a> - Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97019">Bug 97019</a> - [clover] build failure in llvm/codegen/native.cpp:129:52</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97032">Bug 97032</a> - [BDW,SKL] piglit.spec.arb_gpu_shader5.arb_gpu_shader5-interpolateatcentroid-flat</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97033">Bug 97033</a> - [BDW,SKL] piglit.spec.arb_gpu_shader_fp64.varying-packing.simple regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97083">Bug 97083</a> - [IVB,BYT] GPU hang on deqp-gles31.functional.separate.shader.random</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97140">Bug 97140</a> - dd_draw.c:949:11: error: implicit declaration of function 'fmemopen' is invalid in C99 [-Werror,-Wimplicit-function-declaration]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97267">Bug 97267</a> - [BDW] GL45-CTS.texture_cube_map_array.sampling asserts inside brw_fs.cpp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97278">Bug 97278</a> - [vulkancts,HSW] all vulkancts tests assert on HSW</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97285">Bug 97285</a> - Darkness in Dota 2 after Patch &quot;Make Gallium's BlitFramebuffer follow the GL 4.4 sRGB rules&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97286">Bug 97286</a> - `make check` fails uniform-initializer-test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97305">Bug 97305</a> - Gallium: TBOs and images set the offset in elements, not bytes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97309">Bug 97309</a> - piglit.spec.glsl-1_30.compiler.switch-statement.switch-case-duplicated.vert regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97322">Bug 97322</a> - GenerateMipmap creates wrong mipmap for sRGB texture</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97413">Bug 97413</a> - BioShock Infinite crashes on startup with Mesa Git version, R7 370</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97448">Bug 97448</a> - [HSW] deqp-vk.api_.copy_and_blit.image_to_image_stencil regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97477">Bug 97477</a> - i915g: gl_FragCoord is always (0.0, max_y)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97513">Bug 97513</a> - clover reports wrong device pointer size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97587">Bug 97587</a> - make check nir/tests/control_flow_tests regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97761">Bug 97761</a> - es2-cts.gtf.gl2extensiontests.egl_image_external.testsimpleunassociated crashes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97773">Bug 97773</a> - New Mesa master now results in warnings in glrender (and subsurfaces and simple-egl), black screen</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97790">Bug 97790</a> - Vulkan cts regressions due to 24be63066</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97804">Bug 97804</a> - Later precision statement isn't overriding earlier one</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97808">Bug 97808</a> - &quot;tgsi/scan: don't set interp flags for inputs only used by INTERP instructions&quot; causes glitches in wine with gallium nine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97894">Bug 97894</a> - Crash in u_transfer_unmap_vtbl when unmapping a buffer mapped in different context</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97952">Bug 97952</a> - /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97969">Bug 97969</a> - [radeonsi, bisected: fb827c0] Video decoding shows green artifacts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97976">Bug 97976</a> - VCE regression BO to small for addr since winsys/amdgpu: enable buffer allocation from slabs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98005">Bug 98005</a> - VCE dual instance encoding inconsistent since st/va: enable dual instances encode by sync surface</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98128">Bug 98128</a> - nir/tests/control_flow_tests.cpp:79:73: error: ‘nir_loop_first_cf_node’ was not declared in this scope</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98131">Bug 98131</a> - Compiler should reject lowp/mediump qualifiers on atomic_uints</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98133">Bug 98133</a> - GetSynciv should raise an error if bufSize &lt; 0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98135">Bug 98135</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings wants a different GL error code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98167">Bug 98167</a> - [vulkan, radv] missing libgcrypt and openssl devel results in linker error in libvulkan_common</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98172">Bug 98172</a> - Concurrent call to glClientWaitSync results in segfault in one of the waiters.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98244">Bug 98244</a> - dEQP: textureOffset(sampler2DArrayShadow, ...) should not exist.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98264">Bug 98264</a> - Build broken for i965 due to multiple definitions of intelFenceExtension</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98307">Bug 98307</a> - &quot;st/glsl_to_tgsi: explicitly track all input and output declaration&quot; broke flightgear colors on rs780</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98431">Bug 98431</a> - UnrealEngine v4 demos startup fails to blorp blit assert</li>

				</ul>

				<h2>Changes</h2>

				<p>Mesa no longer depends on libudev.</p>

				</div>

				</body>

				</html>

									
										188

docs/relnotes/13.0.1.html
									
												
				@@ -0,0 +1,188 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.1 Release Notes / November 14, 2016</h1>

				<p>

				Mesa 13.0.1 is a bug fix release which fixes bugs found since the 13.0.0 release.

				</p>

				<p>

				Mesa 13.0.1 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				7cbb91dead05cde279ee95f86e8321c8e1c8fc9deb88f12e0f587672a10d88c5  mesa-13.0.1.tar.gz

				71962fb2bf77d33b0ad4a565b490dbbeaf4619099c6d9722f04a73187957a731  mesa-13.0.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97715">Bug 97715</a> - [ILK,G45,G965] piglit.spec.arb_separate_shader_objects.misc api error checks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98012">Bug 98012</a> - [IVB] Segfault when running Dolphin twice with Vulkan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98512">Bug 98512</a> - radeon r600 vdpau: Invalid command stream: texture bo too small</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (2):</p>

				<ul>

				  <li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>

				  <li>glx/glvnd: Fix dispatch function names and indices</li>

				</ul>

				<p>Andreas Boll (1):</p>

				<ul>

				  <li>glx/windows: Add wgl.h to the sources list</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>

				</ul>

				<p>Chih-Wei Huang (1):</p>

				<ul>

				  <li>android: avoid using libdrm with host modules</li>

				</ul>

				<p>Darren Salt (1):</p>

				<ul>

				  <li>radv/pipeline: Don't dereference NULL dynamic state pointers</li>

				</ul>

				<p>Dave Airlie (8):</p>

				<ul>

				  <li>radv: expose xlib platform extension</li>

				  <li>radv: fix dual source blending</li>

				  <li>Revert "st/vdpau: use linear layout for output surfaces"</li>

				  <li>radv: emit correct last export when Z/stencil export is enabled</li>

				  <li>ac/nir: add support for discard_if intrinsic (v2)</li>

				  <li>nir: add conditional discard optimisation (v4)</li>

				  <li>radv: enable conditional discard optimisation on radv.</li>

				  <li>radv: fix GetFenceStatus for signaled fences</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 13.0.0</li>

				  <li>amd/addrlib: limit fastcall/regparm to GCC i386</li>

				  <li>anv: use correct .specVersion for extensions</li>

				  <li>radv: use correct .specVersion for extensions</li>

				  <li>radv: Suffix the radeon_icd file with the host CPU</li>

				  <li>Update version to 13.0.1</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: Use Newton-Raphson on the 1/W write to fix glmark2 terrain.</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>nir: Flip gl_SamplePosition in nir_lower_wpos_ytransform().</li>

				</ul>

				<p>Fredrik Höglund (1):</p>

				<ul>

				  <li>radv: add support for anisotropic filtering on VI+</li>

				</ul>

				<p>Jason Ekstrand (21):</p>

				<ul>

				  <li>anv/device: Return DEVICE_LOST if execbuf2 fails</li>

				  <li>vulkan/wsi/x11: Better handle wsi_x11_connection_create failure</li>

				  <li>vulkan/wsi/x11: Clean up connections in finish_wsi</li>

				  <li>anv: Better handle return codes from anv_physical_device_init</li>

				  <li>intel/blorp: Use wm_prog_data instead of hand-rolling our own</li>

				  <li>intel/blorp: Pass a brw_stage_prog_data to upload_shader</li>

				  <li>anv/pipeline: Put actual pointers in anv_shader_bin</li>

				  <li>anv/pipeline: Properly cache prog_data::param</li>

				  <li>intel/blorp: Emit all the binding tables</li>

				  <li>anv/device: Add an execbuf wrapper</li>

				  <li>anv: Add a cmd_buffer_execbuf helper</li>

				  <li>anv: Don't presume to know what address is in a surface relocation</li>

				  <li>anv: Add a new bo_pool_init helper</li>

				  <li>anv/allocator: Simplify anv_scratch_pool</li>

				  <li>anv: Initialize anv_bo::offset to -1</li>

				  <li>anv/batch_chain: Improve write_reloc</li>

				  <li>anv: Add an anv_execbuf helper struct</li>

				  <li>anv/batch: Move last_ss_pool_bo_offset to the command buffer</li>

				  <li>anv: Move relocation handling from EndCommandBuffer to QueueSubmit</li>

				  <li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>

				  <li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Update deref types when resizing implicitly sized arrays.</li>

				  <li>mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>

				</ul>

				<p>Kristian Høgsberg (1):</p>

				<ul>

				  <li>anv: Do relocations in userspace before execbuf ioctl</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>egl: use util/macros.h</li>

				  <li>egl: make interop ABI visible again</li>

				  <li>glx: make interop ABI visible again</li>

				  <li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>

				</ul>

				<p>Nicolai Hähnle (4):</p>

				<ul>

				  <li>radeonsi: fix BFE/BFI lowering for GLSL semantics</li>

				  <li>glsl: fix lowering of UBO references of named blocks</li>

				  <li>st/glsl_to_tgsi: fix dvec[34] loads from SSBO</li>

				  <li>st/mesa: fix the layer of VDPAU surface samplers</li>

				</ul>

				<p>Steven Toth (3):</p>

				<ul>

				  <li>gallium/hud: fix a problem where objects are free'd while in use.</li>

				  <li>gallium/hud: close a previously opened handle</li>

				  <li>gallium/hud: protect against an initialization race</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa/glsl: delete previously linked shaders earlier when linking</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/13.0.2.html
									
												
				@@ -0,0 +1,189 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.2 Release Notes / November 28, 2016</h1>

				<p>

				Mesa 13.0.2 is a bug fix release which fixes bugs found since the 13.0.1 release.

				</p>

				<p>

				Mesa 13.0.2 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				6014233a5db6032ab8de4881384871bbe029de684502707794ce7b3e6beec308  mesa-13.0.2.tar.gz

				a6ed622645f4ed61da418bf65adde5bcc4bb79023c36ba7d6b45b389da4416d5  mesa-13.0.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97321">Bug 97321</a> - Query INFO_LOG_LENGTH for empty info log should return 0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97420">Bug 97420</a> - &quot;#version 0&quot; crashes glsl_compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98632">Bug 98632</a> - Fix build on Hurd without PATH_MAX</li>

				</ul>

				<h2>Changes</h2>

				<p>Ben Widawsky (3):</p>

				<ul>

				  <li>i965: Add some APL and KBL SKU strings</li>

				  <li>i965: Reorder PCI ID list to match release order</li>

				  <li>i965/glk: Add basic Geminilake support</li>

				</ul>

				<p>Dave Airlie (14):</p>

				<ul>

				  <li>radv: fix texturesamples to handle single sample case</li>

				  <li>wsi: fix VK_INCOMPLETE for vkGetSwapchainImagesKHR</li>

				  <li>radv: don't crash on null swapchain destroy.</li>

				  <li>ac/nir/llvm: fix channel in texture gather lowering code.</li>

				  <li>radv: make sure to flush input attachments correctly.</li>

				  <li>radv: fix image view creation for depth and stencil only</li>

				  <li>radv: spir-v allows texture size query with and without lod.</li>

				  <li>vulkan/wsi/x11: handle timeouts properly in next image acquire (v1.1)</li>

				  <li>vulkan/wsi: store present mode in swapchain base class</li>

				  <li>vulkan/wsi/x11: add support for IMMEDIATE present mode</li>

				  <li>radv: fix texel fetch offset with 2d arrays.</li>

				  <li>radv/si: fix optimal micro tile selection</li>

				  <li>radv/ac/llvm: shadow samplers only return one value.</li>

				  <li>radv: fix 3D clears with baseMiplevel</li>

				</ul>

				<p>Eduardo Lima Mitev (2):</p>

				<ul>

				  <li>vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfaceFormatsKHR</li>

				  <li>vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfacePresentModesKHR</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 13.0.1</li>

				  <li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>

				  <li>anv: fix enumeration of properties</li>

				  <li>radv: honour the number of properties available</li>

				  <li>Update version to 13.0.2</li>

				</ul>

				<p>Eric Anholt (3):</p>

				<ul>

				  <li>vc4: Don't abort when a shader compile fails.</li>

				  <li>vc4: Clamp the shadow comparison value.</li>

				  <li>vc4: Fix register class handling of DDX/DDY arguments.</li>

				</ul>

				<p>Gwan-gyeong Mun (2):</p>

				<ul>

				  <li>util/disk_cache: close a previously opened handle in disk_cache_put (v2)</li>

				  <li>anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTEL</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>anv/format: handle unsupported formats properly</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>glcpp: Handle '#version 0' and other invalid values</li>

				  <li>glsl: Parse 0 as a preprocessor INTCONSTANT</li>

				</ul>

				<p>Jason Ekstrand (15):</p>

				<ul>

				  <li>anv/gen8: Stall when needed in Cmd(Set|Reset)Event</li>

				  <li>anv/wsi: Set the fence to signaled in AcquireNextImageKHR</li>

				  <li>anv: Rework fences</li>

				  <li>vulkan/wsi/wayland: Include pthread.h</li>

				  <li>vulkan/wsi/wayland: Clean up some error handling paths</li>

				  <li>vulkan/wsi: Report the correct min/maxImageCount</li>

				  <li>i965/gs: Allow primitive id to be a system value</li>

				  <li>anv: Handle null in all destructors</li>

				  <li>anv/fence: Handle ANV_FENCE_CREATE_SIGNALED_BIT</li>

				  <li>nir/spirv: Fix handling of gl_PrimitiveId</li>

				  <li>anv/blorp: Ignore clears for attachments first used as resolve destinations</li>

				  <li>anv: Implement a depth stall restriction on gen7</li>

				  <li>anv/cmd_buffer: Handle running out of binding tables in compute shaders</li>

				  <li>anv/cmd_buffer: Emit a CS stall before setting a CS pipeline</li>

				  <li>vulkan/wsi/x11: Implement FIFO mode.</li>

				</ul>

				<p>Jordan Justen (2):</p>

				<ul>

				  <li>isl: Fix height calculation in isl_msaa_interleaved_scale_px_to_sa</li>

				  <li>i965/hsw: Set integer mode in sampling state for stencil texturing</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>intel: Set min_ds_entries on Broxton.</li>

				  <li>i965: Fix compute shader crash.</li>

				  <li>mesa: Drop PATH_MAX usage.</li>

				  <li>i965: Fix GS push inputs with enhanced layouts.</li>

				</ul>

				<p>Kevin Strasser (1):</p>

				<ul>

				  <li>vulkan/wsi: Add a thread-safe queue implementation</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: fix multi level clears with VK_REMAINING_MIP_LEVELS</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>gbm: request correct version of the DRI2_FENCE extension</li>

				</ul>

				<p>Nicolai Hähnle (2):</p>

				<ul>

				  <li>radeonsi: store group_size_variable in struct si_compute</li>

				  <li>glsl/lower_output_reads: fix geometry shader output handling with conditional emit</li>

				</ul>

				<p>Steinar H. Gunderson (1):</p>

				<ul>

				  <li>Fix races during _mesa_HashWalk().</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>mesa: fix empty program log length</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/13.0.3.html
									
												
				@@ -0,0 +1,177 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.3 Release Notes / January 5, 2017</h1>

				<p>

				Mesa 13.0.3 is a bug fix release which fixes bugs found since the 13.0.2 release.

				</p>

				<p>

				Mesa 13.0.3 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				55b07d056f9b855ba9d7c8b2ddc7d3b220a61c6ab1bdc73cbfc2f607721094c2  mesa-13.0.3.tar.gz

				d9aa8be5c176d00d0cd503cb2f64a5a403ea471ec819c022581414860d7ba40e  mesa-13.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99038">Bug 99038</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.negative_api.create_pixmap_surface crashes</li>

				</ul>

				<h2>Changes</h2>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965/mt: Disable aux surfaces after making miptree shareable</li>

				  <li>egl: Fix crashes in eglCreate*Surface()</li>

				</ul>

				<p>Dave Airlie (4):</p>

				<ul>

				  <li>anv: set maxFragmentDualSrcAttachments to 1</li>

				  <li>radv: set maxFragmentDualSrcAttachments to 1</li>

				  <li>radv: fix another regression since shadow fixes.</li>

				  <li>radv: add missing license file to radv_meta_bufimage.</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 13.0.2</li>

				  <li>anv: don't double-close the same fd</li>

				  <li>anv: don't leak memory if anv_init_wsi() fails</li>

				  <li>radv: don't leak the fd if radv_physical_device_init() succeeds</li>

				  <li>Update version to 13.0.3</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: In a loop break/continue, jump if everyone has taken the path.</li>

				</ul>

				<p>Gwan-gyeong Mun (3):</p>

				<ul>

				  <li>anv: Add missing error-checking to anv_block_pool_init (v2)</li>

				  <li>anv: Update the teardown in reverse order of the anv_CreateDevice</li>

				  <li>vulkan/wsi: Fix resource leak in success path of wsi_queue_init()</li>

				</ul>

				<p>Haixia Shi (1):</p>

				<ul>

				  <li>compiler/glsl: fix precision problem of tanh</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>mesa: only verify that enabled arrays have backing buffers</li>

				</ul>

				<p>Jason Ekstrand (8):</p>

				<ul>

				  <li>anv/cmd_buffer: Re-emit MEDIA_CURBE_LOAD when CS push constants are dirty</li>

				  <li>anv/image: Rename hiz_surface to aux_surface</li>

				  <li>anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculation</li>

				  <li>genxml/gen9: Change the default of MI_SEMAPHORE_WAIT::RegisterPoleMode</li>

				  <li>anv/device: Return the right error for failed maps</li>

				  <li>anv/device: Implicitly unmap memory objects in FreeMemory</li>

				  <li>anv/descriptor_set: Write the state offset in the surface state free list.</li>

				  <li>spirv: Use a simpler and more correct implementation of tanh()</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Allocate at least some URB space even when max_vertices = 0.</li>

				</ul>

				<p>Marek Olšák (17):</p>

				<ul>

				  <li>radeonsi: always set all blend registers</li>

				  <li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>

				  <li>radeonsi: disable RB+ blend optimizations for dual source blending</li>

				  <li>radeonsi: consolidate max-work-group-size computation</li>

				  <li>radeonsi: apply a multi-wave workgroup SPI bug workaround to affected CIK chips</li>

				  <li>radeonsi: apply a TC L1 write corruption workaround for SI</li>

				  <li>radeonsi: apply a tessellation bug workaround for SI</li>

				  <li>radeonsi: add a tess+GS hang workaround for VI dGPUs</li>

				  <li>radeonsi: apply the double EVENT_WRITE_EOP workaround to VI as well</li>

				  <li>cso: don't release sampler states that are bound</li>

				  <li>radeonsi: always restore sampler states when unbinding sampler views</li>

				  <li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>

				  <li>radeonsi: allow specifying simm16 of emit_waitcnt at call sites</li>

				  <li>radeonsi: wait for outstanding memory instructions in TCS barriers</li>

				  <li>tgsi: fix the src type of TGSI_OPCODE_MEMBAR</li>

				  <li>radeonsi: wait for outstanding LDS instructions in memory barriers if needed</li>

				  <li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>

				</ul>

				<p>Matt Turner (3):</p>

				<ul>

				  <li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>

				  <li>i965/fs: Add unit tests for copy propagation pass.</li>

				  <li>i965/fs: Reject copy propagation into SEL if not min/max.</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>mesa/fbobject: Update CubeMapFace when reusing textures</li>

				</ul>

				<p>Nicolai Hähnle (4):</p>

				<ul>

				  <li>radeonsi: fix isolines tess factor writes to control ring</li>

				  <li>radeonsi: update all GSVS ring descriptors for new buffer allocations</li>

				  <li>radeonsi: do not kill GS with memory writes</li>

				  <li>radeonsi: fix an off-by-one error in the bounds check for max_vertices</li>

				</ul>

				<p>Rhys Kidd (1):</p>

				<ul>

				  <li>glsl: Add pthread libs to cache_test</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>mesa: fix active subroutine uniforms properly</li>

				  <li>Revert "nir: Turn imov/fmov of undef into undef."</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/13.0.4.html
									
												
				@@ -0,0 +1,255 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.4 Release Notes / February 1, 2017</h1>

				<p>

				Mesa 13.0.4 is a bug fix release which fixes bugs found since the 13.0.3 release.

				</p>

				<p>

				Mesa 13.0.4 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				a78518030b0b7d77a6c426ac3ff40f4b27fb0e2cdb0dfbe685024a46cae59bad  mesa-13.0.4.tar.gz

				a95d7ce8f7bd5f88585e4be3144a341236d8c0fc91f6feaec59bb8ba3120e726  mesa-13.0.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94512">Bug 94512</a> - X segfaults with glx-tls enabled in a x32 environment</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94900">Bug 94900</a> - HD6950 GPU lockup loop with various steam games (octodad[always], saints row 4[always], dead island[always], grid autosport[sometimes])</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with &quot;Fatal error: Cannot set display mode.&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98914">Bug 98914</a> - mesa-vdpau-drivers: breaks vdpau for mpeg2video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98975">Bug 98975</a> - Wasteland 2 Directors Cut: Hangs. GPU fault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99085">Bug 99085</a> - [EGL] dEQP-EGL.functional.sharing.gles2.multithread intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99097">Bug 99097</a> - [vulkancts] dEQP-VK.image.store regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99100">Bug 99100</a> - [SKL,BDW,BSW,KBL] dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99144">Bug 99144</a> - Incorrect rendering using glDrawArraysInstancedBaseInstance and first != 0 on Skylake</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99154">Bug 99154</a> - Link time error when using multiple builtin functions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99158">Bug 99158</a> - vdpau segfaults and gpu locks with kodi on R9285</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99185">Bug 99185</a> - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99188">Bug 99188</a> - dEQP-EGL.functional.create_context_ext.robust_gl_30.rgb565_no_depth_no_stencil</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99210">Bug 99210</a> - ES3-CTS.functional.texture.mipmap.cube.generate.rgba5551_*</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99450">Bug 99450</a> - [amdgpu] Payday 2 visual glitches on some models</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99451">Bug 99451</a> - polygon offset use after free</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Rodriguez (2):</p>

				<ul>

				  <li>vulkan/wsi: clarify the severity of lack of DRI3 v2</li>

				  <li>radv: fix include order for installed headers v2</li>

				</ul>

				<p>Arda Coskunses (2):</p>

				<ul>

				  <li>vulkan/wsi/x11: don't crash on null visual</li>

				  <li>vulkan/wsi/x11: don't crash on null wsi x11 connection</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Support loader interface version 3.</li>

				</ul>

				<p>Chad Versace (10):</p>

				<ul>

				  <li>egl: Check config's surface types in eglCreate*Surface()</li>

				  <li>dri: Add __DRI_IMAGE_FORMAT_ARGB1555</li>

				  <li>mesa/texformat: Handle GL_RGBA + GL_UNSIGNED_SHORT_5_5_5_1</li>

				  <li>egl: Emit correct error when robust context creation fails</li>

				  <li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>

				  <li>mesa/shaderobj: Fix races on refcounts</li>

				  <li>meta: Disable dithering during glGenerateMipmap</li>

				  <li>vulkan: Add new cast macros for VkIcd types</li>

				  <li>vulkan: Update vk_icd.h to interface version 3</li>

				  <li>anv: Support loader interface version 3 (patch v2)</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>vl/zscan: fix "Fix trivial sign compare warnings"</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>glx: Add missing glproto dependency for gallium-xlib glx</li>

				</ul>

				<p>Damien Grassart (1):</p>

				<ul>

				  <li>anv: return count of queue families written</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv: flush smem for uniform buffer bit.</li>

				</ul>

				<p>Emil Velikov (10):</p>

				<ul>

				  <li>docs: add sha256 checksums for 13.0.3</li>

				  <li>cherry-ignore: add couple of intel_miptree_copy related patches</li>

				  <li>cherry-ignore: add radv: Call nir_lower_constant_initializers."</li>

				  <li>get-typod-pick-list.sh: add new script</li>

				  <li>cherry-ignore: add "_mesa_ClampColor extension/version fix"</li>

				  <li>cherry-ignore: add wayland race condition fix</li>

				  <li>egl/wayland: use the destroy_window_callback for swrast</li>

				  <li>automake: use shared llvm libs for make distcheck</li>

				  <li>get-pick-list.sh: Require explicit "13.0" for nominating stable patches</li>

				  <li>Update version to 13.0.4</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>anv: Fix uniform and storage buffer offset alignment limits.</li>

				</ul>

				<p>Fredrik Höglund (2):</p>

				<ul>

				  <li>radv: fix dual source blending</li>

				  <li>dri3: Fix MakeCurrent without a default framebuffer</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>mapi: update the asm code to support x32</li>

				</ul>

				<p>Heiko Przybyl (1):</p>

				<ul>

				  <li>r600/sb: Fix loop optimization related hangs on eg</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nouveau: take extra push space into account for pushbuf_space calls</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>i965/generator/tex: Handle an immediate sampler with an indirect texture</li>

				  <li>anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8</li>

				  <li>nir/search: Only allow matching SSA values</li>

				  <li>isl: Mark A4B4G4R4_UNORM as supported on gen8</li>

				</ul>

				<p>Jonas Ådahl (1):</p>

				<ul>

				  <li>egl/wayland: Cleanup private display connection when init fails</li>

				</ul>

				<p>Kenneth Graunke (7):</p>

				<ul>

				  <li>i965: Don't bail on vertex element processing if we need draw params.</li>

				  <li>i965: Fix last slot calculations</li>

				  <li>i965: Fix texturing in the vec4 TCS and GS backends.</li>

				  <li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>

				  <li>i965: Make BLORP disable the NP Z PMA stall fix.</li>

				  <li>glsl: Use ir_var_temporary when generating inline functions.</li>

				  <li>i965: Properly flush in hsw_pause_transform_feedback().</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>vdpau: call texture_get_handle while the mutex is being held</li>

				  <li>va: call texture_get_handle while the mutex is being held</li>

				  <li>radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+</li>

				  <li>radeonsi: don't forget to add HTILE to the buffer list for texturing</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>

				</ul>

				<p>Nanley Chery (3):</p>

				<ul>

				  <li>anv/cmd_buffer: Fix arrayed depth/stencil attachments</li>

				  <li>anv/cmd_buffer: Fix programmed HiZ qpitch</li>

				  <li>anv/image: Disable HiZ for depth buffer arrays</li>

				</ul>

				<p>Nayan Deshmukh (1):</p>

				<ul>

				  <li>st/va: delay calling begin_frame until we have all parameters</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno: some fence cleanup</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>gallium/hud: add missing break in hud_cpufreq_graph_install()</li>

				</ul>

				<p>Timothy Arceri (3):</p>

				<ul>

				  <li>nir: Turn imov/fmov of undef into undef</li>

				  <li>glsl: fix opt_minmax redundancy checks against baserange</li>

				  <li>util: fix list_is_singular()</li>

				</ul>

				<p>Zachary Michaels (1):</p>

				<ul>

				  <li>radeonsi: Always leave poly_offset in a valid state</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/13.0.5.html
									
												
				@@ -0,0 +1,210 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.5 Release Notes / February 20, 2017</h1>

				<p>

				Mesa 13.0.5 is a bug fix release which fixes bugs found since the 13.0.4 release.

				</p>

				<p>

				Mesa 13.0.5 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				7e45e3812078726eabca6d9384364bf035a3c4279024ec9090dd1b19a8989926  mesa-13.0.5.tar.gz

				bfcea7e2c801525a60895c8aff11aa68457ee9aa35d01a4638e1f310a3f5ef87  mesa-13.0.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98421">Bug 98421</a> - src/loader/loader.c:111:40: error: unknown type name ‘drmDevicePtr’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98526">Bug 98526</a> - glsl/tests/general-ir-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: ‘const struct API_STATE’ has no member named ‘linkageCount’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>

				</ul>

				<h2>Changes</h2>

				<p>Bartosz Tomczyk (2):</p>

				<ul>

				  <li>r600: Fix stack overflow</li>

				  <li>r600/sb: Fix memory leak</li>

				</ul>

				<p>Bruce Cherniak (1):</p>

				<ul>

				  <li>swr: [rasterizer core] Remove dead code Clipper::ClipScalar()</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>radv: change base aligmment for allocated memory.</li>

				  <li>radv: fix cik macroModeIndex.</li>

				  <li>radv: adopt some init config workarounds from radeonsi.</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/dri2: add image_loader_extension back into loader extensions for wayland</li>

				</ul>

				<p>Emil Velikov (26):</p>

				<ul>

				  <li>docs: add sha256 checksums for 13.0.4</li>

				  <li>configure.ac: list radeon in --with-vulkan-drivers help string</li>

				  <li>i965: automake: correctly set MKDIR_GEN</li>

				  <li>freedreno: automake: correctly set MKDIR_GEN</li>

				  <li>i965: automake: include builddir prior to srcdir</li>

				  <li>i915: automake: include builddir prior to srcdir</li>

				  <li>egl: automake: include builddir prior to srcdir</li>

				  <li>clover: automake: include builddir prior to srcdir</li>

				  <li>st/dri: automake: include builddir prior to srcdir</li>

				  <li>d3dadapter9: automake: include builddir prior to srcdir</li>

				  <li>glx: automake: include builddir prior to srcdir</li>

				  <li>glx/apple: automake: include builddir prior to srcdir</li>

				  <li>glx/windows: automake: include builddir prior to srcdir</li>

				  <li>loader: automake: include builddir prior to srcdir</li>

				  <li>mapi: automake: include builddir prior to srcdir</li>

				  <li>radeon, r200: automake: include builddir prior to srcdir</li>

				  <li>dri/swrast: automake: include builddir prior to srcdir</li>

				  <li>dri/osmesa: automake: include builddir prior to srcdir</li>

				  <li>mesa/tests: automake: include builddir prior to srcdir</li>

				  <li>bin/get-extra-pick-list: use git merge-base to get the branchpoint</li>

				  <li>bin/get-extra-pick-list: rework to use already_picked list</li>

				  <li>bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed</li>

				  <li>bin/get-pick-list.sh: limit `git grep ...' only as needed</li>

				  <li>bin/get-pick-list.sh: remove ancient way of nominating patches</li>

				  <li>bin/get-fixes-pick-list.sh: add new script</li>

				  <li>Update version to 13.0.5</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: Avoid emitting small immediates for UBO indirect load address guards.</li>

				</ul>

				<p>Hans de Goede (1):</p>

				<ul>

				  <li>glx/glvnd: Fix GLXdispatchIndex sorting</li>

				</ul>

				<p>Ian Romanick (11):</p>

				<ul>

				  <li>linker: Slight code rearrange to prevent duplication in the next commit</li>

				  <li>linker: Accurately track gl_uniform_block::stageref</li>

				  <li>glsl: Split process_block_array into two functions</li>

				  <li>glsl: Fix wonkey indentation left from previous commit</li>

				  <li>glsl: Track the linearized array index for each UBO instance array element</li>

				  <li>glsl: Use simpler visitor to determine which UBO and SSBO blocks are used</li>

				  <li>glsl: Add tracking for elements of an array-of-arrays that have been accessed</li>

				  <li>glsl: Add structures to track accessed elements of a single array</li>

				  <li>glsl: Mark a set of array elements as accessed using a list of array_deref_range</li>

				  <li>glsl: Walk a list of ir_dereference_array to mark array elements as accessed</li>

				  <li>linker: Accurately mark a uniform block instance array element as used in a stage</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>vbo: process buffer binding state changes on draw when recording</li>

				  <li>st/mesa: MAX_VARYING is the max supported number of patch varyings, not min</li>

				  <li>nvc0: disable linked tsc mode in compute launch descriptor</li>

				</ul>

				<p>Jason Ekstrand (11):</p>

				<ul>

				  <li>nir/search: Use the correct bit size for integer comparisons</li>

				  <li>i965/blorp: Use the correct ISL format for combined depth/stencil</li>

				  <li>intel/blorp: Handle clearing of A4B4G4R4 on all platforms</li>

				  <li>isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell</li>

				  <li>anv: Flush render cache before STATE_BASE_ADDRESS on gen7</li>

				  <li>anv: Improve flushing around STATE_BASE_ADDRESS</li>

				  <li>vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats</li>

				  <li>vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes</li>

				  <li>vulkan/wsi: Lower the maximum image sizes</li>

				  <li>i965/sampler_state: Pass texObj into update_sampler_state</li>

				  <li>i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.</li>

				</ul>

				<p>Lionel Landwerlin (5):</p>

				<ul>

				  <li>anv: don't require render target isl bit for depth/stencil surfaces</li>

				  <li>anv: set command buffer to NULL when allocations fail</li>

				  <li>anv: fix descriptor pool internal size allocation</li>

				  <li>spirv: handle OpUndef as part of the variable parsing pass</li>

				  <li>spirv: handle undefined components for OpVectorShuffle</li>

				</ul>

				<p>Marc-André Lureau (1):</p>

				<ul>

				  <li>tgsi-dump: dump label if instruction has one</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>radeonsi: always set the TCL1_ACTION_ENA when invalidating L2</li>

				  <li>gallium/radeon: fix performance of buffer readbacks</li>

				</ul>

				<p>Topi Pohjolainen (2):</p>

				<ul>

				  <li>i965: Make depth clear flushing more explicit</li>

				  <li>i965/gen6: Issue direct depth stall and flush after depth clear</li>

				</ul>

				<p>Vinson Lee (2):</p>

				<ul>

				  <li>scons: Require libdrm &gt;= 2.4.66 for DRM.</li>

				  <li>util: Fix Clang trivial destructor check.</li>

				</ul>

				</div>

				</body>

				</html>

									
										287

docs/relnotes/13.0.6.html
									
										Normal file
									
												
				@@ -0,0 +1,287 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.6 Release Notes / March 20, 2017</h1>

				<p>

				Mesa 13.0.6 is a bug fix release which fixes bugs found since the 13.0.5 release.

				</p>

				<p>

				Mesa 13.0.6 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				1076590f29103f022a2cd87e6dff6ae77072013745603d06b0410c373ab2bb1a  mesa-13.0.6.tar.gz

				29ef104a7fc082d352b1599bd6cb1d040be424ccd22f5e0eb7ee9b0e9acd3597  mesa-13.0.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">Bug 97102</a> - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98869">Bug 98869</a> - Electronic Super Joy graphic artefacts (regression,bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99401">Bug 99401</a> - [g33] regression: piglit.spec.!opengl 1_0.gl-1_0-beginend-coverage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: &quot;Note: Buggy applications may crash, if they do please report to vendor&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - &quot;ralloc: Make sure ralloc() allocations match malloc()'s alignment.&quot; causes seg fault in 32bit build</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (2):</p>

				<ul>

				  <li>radv: Emit pending flushes before executing a secondary command buffer</li>

				  <li>radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer</li>

				</ul>

				<p>Bartosz Tomczyk (1):</p>

				<ul>

				  <li>glsl: fix heap-buffer-overflow</li>

				</ul>

				<p>Bas Nieuwenhuizen (8):</p>

				<ul>

				  <li>radv: Pass CMASK alignment to application.</li>

				  <li>radv: Pass DCC alignment to application.</li>

				  <li>radv: Never try to create more than max_sets descriptor sets.</li>

				  <li>radv: Reset emitted compute pipeline when calling secondary cmd buffer.</li>

				  <li>radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.</li>

				  <li>radv: Use correct size for availability flag.</li>

				  <li>radv: Disable HTILE for textures with multiple layers/levels.</li>

				  <li>radv: Emit cache flushes before CP DMA.</li>

				</ul>

				<p>Ben Crocker (3):</p>

				<ul>

				  <li>gallivm: Improve debug output (V2)</li>

				  <li>gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)</li>

				  <li>gallivm: Reenable PPC VSX (v3)</li>

				</ul>

				<p>Brendan King (1):</p>

				<ul>

				  <li>egl/dri3: implement query surface hook</li>

				</ul>

				<p>Bruce Cherniak (1):</p>

				<ul>

				  <li>swr: Prune empty nodes in CalculateProcessorTopology.</li>

				</ul>

				<p>Connor Abbott (1):</p>

				<ul>

				  <li>anv: fix Get*MemoryRequirements for !LLC</li>

				</ul>

				<p>Dave Airlie (13):</p>

				<ul>

				  <li>radv: program a default point size.</li>

				  <li>radv: handle transfer_write as a dst flag.</li>

				  <li>radv/ac: handle nir irem opcode.</li>

				  <li>radv/ac: implement txs for buffer textures.</li>

				  <li>radv/ac: correctly size shared memory usage.</li>

				  <li>radv/ac: avoid the fmask path when doing txs.</li>

				  <li>radv: pass FMASK alignment to application</li>

				  <li>tgsi: fix memory leak in tgsi sanity check</li>

				  <li>radv: fix depth format in blit2d.</li>

				  <li>radv: fix txs for sampler buffers</li>

				  <li>radv: drop Z24 support.</li>

				  <li>radv: disable mip point pre clamping.</li>

				  <li>radv: setup llvm target data layout</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 13.0.5</li>

				  <li>Revert "get-pick-list.sh: Require explicit "13.0" for nominating stable patches"</li>

				  <li>cherry-ignore: don't pick nir_op_pack_double optimisation fix</li>

				  <li>i965: move brw_define.h ifndef guard to the top</li>

				  <li>cherry-ignore: add ANV fast clears related fixes</li>

				  <li>Update version to 13.0.6</li>

				</ul>

				<p>Fredrik Höglund (2):</p>

				<ul>

				  <li>radv: fix the dynamic buffer index in vkCmdBindDescriptorSets</li>

				  <li>radv/ac: fix multiple descriptor sets with dynamic buffers</li>

				</ul>

				<p>George Kyriazis (1):</p>

				<ul>

				  <li>swr: Align query results allocation</li>

				</ul>

				<p>Grazvydas Ignotas (3):</p>

				<ul>

				  <li>r300g: only allow byteswapped formats on big endian</li>

				  <li>gallium/u_queue: fix a crash with atexit handlers</li>

				  <li>gallium/u_queue: set num_threads correctly if not all threads start</li>

				</ul>

				<p>Gregory Hainaut (1):</p>

				<ul>

				  <li>glapi: fix typo in count_scale</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>mesa: Don't advertise GL_OES_read_format in core profile</li>

				</ul>

				<p>Ilia Mirkin (8):</p>

				<ul>

				  <li>nvc0: increase number of ubo binding points</li>

				  <li>nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute</li>

				  <li>nvc0/ir: fix ubo max clamp, reset file index</li>

				  <li>gm107/ir: fix address offset bitfield for ATOMS</li>

				  <li>nvc0: set the render condition in the compute object</li>

				  <li>st/mesa: don't pass compare mode for stencil-sampled textures</li>

				  <li>nvc0: take extra pushbuf space into account for pushbuf_space calls</li>

				  <li>nvc0: increase alignment to 256 for texture buffers on fermi</li>

				</ul>

				<p>Jacob Lifshay (1):</p>

				<ul>

				  <li>vulkan/wsi: Improve the DRI3 error message</li>

				</ul>

				<p>Jason Ekstrand (11):</p>

				<ul>

				  <li>i965: Use a better guardband calculation.</li>

				  <li>intel/blorp: Swizzle clear colors on the CPU</li>

				  <li>i965/fs: Remove the inline pack_double_2x32 optimization</li>

				  <li>anv: Add an invalidate_range helper</li>

				  <li>anv/query: clflush the bo map on non-LLC platforms</li>

				  <li>genxml: Make MI_STORE_DATA_IMM more consistent</li>

				  <li>anv/query: Perform CmdResetQueryPool on the GPU</li>

				  <li>blorp/exec: Use uint32_t for copying varying data</li>

				  <li>intel/blorp: Explicitly flush all allocated state</li>

				  <li>anv: Accurately advertise dynamic descriptor limits</li>

				  <li>anv: Properly handle destroying NULL devices and instances</li>

				</ul>

				<p>Jonas Pfeil (1):</p>

				<ul>

				  <li>ralloc: Make sure ralloc() allocations match malloc()'s alignment.</li>

				</ul>

				<p>Jose Maria Casanova Crespo (1):</p>

				<ul>

				  <li>glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1</li>

				</ul>

				<p>Kenneth Graunke (7):</p>

				<ul>

				  <li>i965: Fix fast depth clears for surfaces with a dimension of 16384.</li>

				  <li>i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.</li>

				  <li>i965: Fix check for negative pitch in can_do_fast_copy_blit().</li>

				  <li>i965: Support the force_glsl_version driconf option.</li>

				  <li>i965: Combine the Gen6 SF and Clip viewport atoms.</li>

				  <li>mesa: Do (TCS &amp;&amp; !TES) draw time validation in ES as well.</li>

				  <li>egl: Ensure ResetNotificationStrategy matches for shared contexts.</li>

				</ul>

				<p>Lionel Landwerlin (3):</p>

				<ul>

				  <li>spirv: don't assert with location decorations on non i/o variables</li>

				  <li>anv: wsi: report presentation error per image request</li>

				  <li>i965/fs: fix uninitialized memory access</li>

				</ul>

				<p>Marc Di Luzio (1):</p>

				<ul>

				  <li>glsl: correct compute shader checks for memoryBarrier functions</li>

				</ul>

				<p>Marek Olšák (10):</p>

				<ul>

				  <li>st/mesa: destroy pipe_context before destroying st_context (v2)</li>

				  <li>radeonsi: don't invoke DCC decompression in update_all_texture_descriptors</li>

				  <li>radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)</li>

				  <li>gallium/util: remove unused u_index_modify helpers</li>

				  <li>gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally</li>

				  <li>gallium/u_queue: fix random crashes when the app calls exit()</li>

				  <li>st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops</li>

				  <li>st/mesa: set blend state for PBO readbacks</li>

				  <li>radeonsi: fix broken tessellation on Carrizo and Stoney</li>

				  <li>radeonsi: mark all bound shader buffer ranges as initialized</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>clover: Work around build failure with AltiVec.</li>

				</ul>

				<p>Nicolai Hähnle (12):</p>

				<ul>

				  <li>mesa/main: fix meta caller of _mesa_ClampColor</li>

				  <li>radeonsi: fix texture gather on stencil textures</li>

				  <li>glsl: split DIV_TO_MUL_RCP into single- and double-precision flags</li>

				  <li>glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion</li>

				  <li>glx/dri3: guard in_current_context against a disappeared drawable</li>

				  <li>glx: guard swap-interval functions against destroyed drawables</li>

				  <li>dri/common: clear the loaderPrivate pointer in driDestroyDrawable</li>

				  <li>winsys/amdgpu: reduce max_alloc_size based on GTT limits</li>

				  <li>radeonsi: handle MultiDrawIndirect in si_get_draw_start_count</li>

				  <li>radeonsi: fix UINT/SINT clamping for 10-bit formats on &lt;= CIK</li>

				  <li>st/glsl_to_tgsi: avoid iterating past the head of the instruction list</li>

				  <li>st/mesa: inform the driver of framebuffer changes before compute dispatches</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (6):</p>

				<ul>

				  <li>glsl: fix heap-use-after-free in ast_declarator_list::hir()</li>

				  <li>i965/fs: mark last DF uniform array element as 64 bit live one</li>

				  <li>i965/fs: detect different bit size accesses to uniforms to push them in proper locations</li>

				  <li>i965/fs: fix indirect load DF uniforms on BSW/BXT</li>

				  <li>i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles</li>

				  <li>i965/fs: emit MOV_INDIRECT with the source with the right register type</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()</li>

				</ul>

				</div>

				</body>

				</html>

									
										285

docs/relnotes/17.0.0.html
									
										Normal file
									
												
				@@ -0,0 +1,285 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.0.0 Release Notes / February 13, 2017</h1>

				<p>

				Mesa 17.0.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 17.0.1.

				</p>

				<p>

				Mesa 17.0.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				696578f0b83796470511a88a95fff15a2a25fa201a9e487716f2ca20c177c3ab  mesa-17.0.0.tar.gz

				39db3d59700159add7f977307d12a7dfe016363e760ad82280ac4168ea668481  mesa-17.0.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_post_depth_coverage on i965/gen9+</li>

				<li>GL_KHR_blend_equation_advanced on nvc0</li>

				<li>GL_INTEL_conservative_rasterization on i965/gen9+</li>

				<li>GL_NV_image_formats on any driver supporting GL_ARB_shader_image_load_store (i965, nvc0, radeonsi, softpipe)</li>

				<li>GL_ARB_gpu_shader_fp64 in i965/haswell</li>

				<li>GL_ARB_vertex_attrib_64bit in i965/haswell</li>

				<li>GL_ARB_shader_precision in i965/haswell</li>

				<li>Intel Haswell now supports OpenGL 4.2</li>

				<li>GL_OES_geometry_shader on i965/haswell</li>

				<li>GL_OES_texture_cube_map_array on i965/haswell</li>

				<li>GL_OES_viewport_array on i965/haswell</li>

				<li>Vulkan Float64 capability support on Intel's ANV driver</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70623">Bug 70623</a> - libglx.so: undefined symbol: _glapi_tls_Context</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72902">Bug 72902</a> - [IVB/HSW/BDW] DOTA2 segfaults unless Mesa is configured with (non-default) --enable-glx-tls</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73778">Bug 73778</a> - _glapi_tls_Dispatch undefined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89043">Bug 89043</a> - undefined symbol: _glapi_tls_Dispatch</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91281">Bug 91281</a> - Tonga VCE 2160p encode fails with  BO to small for addr</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92760">Bug 92760</a> - Add FP64 support to the i965 shader backends</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92925">Bug 92925</a> - Incorrect GEN for ASTC in Surface Format Table</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93551">Bug 93551</a> - Divinity: Original Sin Enhanced Edition(Native) crash on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94512">Bug 94512</a> - X segfaults with glx-tls enabled in a x32 environment</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94900">Bug 94900</a> - HD6950 GPU lockup loop with various steam games (octodad[always], saints row 4[always], dead island[always], grid autosport[sometimes])</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94904">Bug 94904</a> - [vulkan, BSW] dEQP-VK.api.object_management.multithreaded_per_thread_device intermittent crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95460">Bug 95460</a> - Please add more drivers (freedreno, virgl) to features.txt status document</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96959">Bug 96959</a> - nop.sat generated by pow workaround?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">Bug 97102</a> - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97232">Bug 97232</a> - Line rendering broken in Dolphin when using gl_ClipDistance</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97287">Bug 97287</a> - GL45-CTS.vertex_attrib_binding.basic-inputL-case1 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97321">Bug 97321</a> - Query INFO_LOG_LENGTH for empty info log should return 0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97420">Bug 97420</a> - &quot;#version 0&quot; crashes glsl_compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97422">Bug 97422</a> - trying to call a number as a function results into a crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97447">Bug 97447</a> - GL 3.0 compatibility context exposes GL_ARB_compute_shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97473">Bug 97473</a> - Memory corruption when uploading DXT5 cubemap faces</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97715">Bug 97715</a> - [ILK,G45,G965] piglit.spec.arb_separate_shader_objects.misc api error checks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97804">Bug 97804</a> - Later precision statement isn't overriding earlier one</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97952">Bug 97952</a> - /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97967">Bug 97967</a> - glsl/tests/cache-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98005">Bug 98005</a> - VCE dual instance encoding inconsistent since st/va: enable dual instances encode by sync surface</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98012">Bug 98012</a> - [IVB] Segfault when running Dolphin twice with Vulkan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98172">Bug 98172</a> - Concurrent call to glClientWaitSync results in segfault in one of the waiters.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98238">Bug 98238</a> - witcher 2: objects are black when changing lod</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98243">Bug 98243</a> - dEQP mismatched UBO precision qualifiers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98245">Bug 98245</a> - GLES3.1 link negative dEQP &quot;expected linking to fail, but passed.&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98250">Bug 98250</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texparameterIiv/texparameterIuiv failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with &quot;Fatal error: Cannot set display mode.&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98297">Bug 98297</a> - Can't configure a desktop with 3x4k monitors in one row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98299">Bug 98299</a> - Compute shaders generate stupid divides</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98307">Bug 98307</a> - &quot;st/glsl_to_tgsi: explicitly track all input and output declaration&quot; broke flightgear colors on rs780</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98327">Bug 98327</a> - [dEQP, EGL] dEQP-EGL.functional.resize not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98328">Bug 98328</a> - [dEQP, EGL] luminance tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98330">Bug 98330</a> - [dEQP, EGL] dEQP-EGL.functional.buffer_age.no_preserve fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98339">Bug 98339</a> - dEQP-EGL: Got EGL_BAD_MATCH: eglCreateSyncKHR()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98343">Bug 98343</a> - dEQP-EGL: GL_INVALID_ENUM at teglCreateContextExtTests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98421">Bug 98421</a> - src/loader/loader.c:111:40: error: unknown type name ‘drmDevicePtr’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98431">Bug 98431</a> - UnrealEngine v4 demos startup fails to blorp blit assert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98480">Bug 98480</a> - Support R8 image texture in ES 3.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98512">Bug 98512</a> - radeon r600 vdpau: Invalid command stream: texture bo too small</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98518">Bug 98518</a> - [r600g, bisected] regression: NI/Turks MSAA texture corruption with FreeCAD and Wine games</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98526">Bug 98526</a> - glsl/tests/general-ir-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98595">Bug 98595</a> - glsl: ralloc assertion &quot;info-&gt;canary == CANARY&quot; failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98599">Bug 98599</a> - xterm menus corrupt since tgsi/scan: handle indirect image indexing correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98632">Bug 98632</a> - Fix build on Hurd without PATH_MAX</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98681">Bug 98681</a> - ir_builder_print_visitor.cpp:401:67: error: expected ')' before 'PRIx64'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98694">Bug 98694</a> - &quot;(5=2)?1:1&quot; as array size decleration crashes glsl_compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98740">Bug 98740</a> - bitcode.cpp:102:8: error: ‘Error’ is not a member of ‘llvm’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98767">Bug 98767</a> - [swrast] ralloc.c:84: get_header: Assertion `info-&gt;canary == CANARY' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98774">Bug 98774</a> - glsl/tests/warnings-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98815">Bug 98815</a> - [SKL/BDW GT2] large perf regression in TessMark</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98840">Bug 98840</a> - nir clone test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98893">Bug 98893</a> - [SKL] piglit.spec.arb_shader_image_load_store.semantics intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98914">Bug 98914</a> - mesa-vdpau-drivers: breaks vdpau for mpeg2video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98917">Bug 98917</a> - [BDW SKL BSW KBL] Tessellation CTS tests regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98975">Bug 98975</a> - Wasteland 2 Directors Cut: Hangs. GPU fault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99010">Bug 99010</a> - --disable-gallium-llvm no longer recognized</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99013">Bug 99013</a> - [regression, bisected] radeonsi: commit 4c8c13b3  &quot;Use amdgcn intrinsics for fs interpolation&quot; makes system unusable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99038">Bug 99038</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.negative_api.create_pixmap_surface crashes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99072">Bug 99072</a> - [byt,ivb,snb] ES3-CTS.gtf.GL3Tests.shadow regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99085">Bug 99085</a> - [EGL] dEQP-EGL.functional.sharing.gles2.multithread intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99097">Bug 99097</a> - [vulkancts] dEQP-VK.image.store regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99100">Bug 99100</a> - [SKL,BDW,BSW,KBL] dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99119">Bug 99119</a> - swr_fence_work.cpp(42): error: argument of type &quot;std::nullptr_t&quot; is incompatible with parameter of type &quot;unsigned long&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99144">Bug 99144</a> - Incorrect rendering using glDrawArraysInstancedBaseInstance and first != 0 on Skylake</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99154">Bug 99154</a> - Link time error when using multiple builtin functions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99158">Bug 99158</a> - vdpau segfaults and gpu locks with kodi on R9285</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99185">Bug 99185</a> - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99188">Bug 99188</a> - dEQP-EGL.functional.create_context_ext.robust_gl_30.rgb565_no_depth_no_stencil</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99210">Bug 99210</a> - ES3-CTS.functional.texture.mipmap.cube.generate.rgba5551_*</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99214">Bug 99214</a> - Crash in library libswrAVX.so when assigning vertex buffer object pointers with elements of type GL_DOUBLE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99219">Bug 99219</a> - The Stanley Parable GPU hang when starting a new game</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99229">Bug 99229</a> - [G33] thousands of tests crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99231">Bug 99231</a> - [HSW][i965] Crash in upload_3dstate_streamout()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99287">Bug 99287</a> - piglit.spec.glsl-1_10.execution.vs-nested-return-sibling-loop regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99303">Bug 99303</a> - [REGRESSION][BISECTED] DMs are crashing on start with &quot;radeon&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99314">Bug 99314</a> - [g33] glsl regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99339">Bug 99339</a> - Blender line rendering broken after removing XY clipping of lines</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99389">Bug 99389</a> - Mesa build broken: sid_tables.h</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99391">Bug 99391</a> - [ILK,G45,G965] piglit regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99401">Bug 99401</a> - [g33] regression: piglit.spec.!opengl 1_0.gl-1_0-beginend-coverage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99419">Bug 99419</a> - Crash(Segmentation fault) si_shader_select in Master Of Orion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99450">Bug 99450</a> - [amdgpu] Payday 2 visual glitches on some models</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99451">Bug 99451</a> - polygon offset use after free</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: ‘const struct API_STATE’ has no member named ‘linkageCount’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99637">Bug 99637</a> - VLC video has corrupted colors when using VDPAU output on Radeon SI</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Building RADV requires --enable-gallium-llvm</li>

				<li>The vulkan headers vk_platform.h and vulkan.h are no longer installed</li>

				<li>The configure options --with-sha1 and --disable-shader-cache are

				removed alongside their respective library requirements</li>

				</ul>

				</div>

				</body>

				</html>

									
										221

docs/relnotes/17.0.1.html
									
										Normal file
									
												
				@@ -0,0 +1,221 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.0.1 Release Notes / March 4, 2017</h1>

				<p>

				Mesa 17.0.1 is a bug fix release which fixes bugs found since the 17.0.0 release.

				</p>

				<p>

				Mesa 17.0.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e819bd3e515dac26faf9836d8f27a4ddf05323b9b23afb6c06536d4ac82e2743  mesa-17.0.1.tar.gz

				96fd70ef5f31d276a17e424e7e1bb79447ccbbe822b56844213ef932e7ad1b0c  mesa-17.0.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98869">Bug 98869</a> - Electronic Super Joy graphic artefacts (regression,bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (4):</p>

				<ul>

				  <li>radv: Never try to create more than max_sets descriptor sets.</li>

				  <li>radv: Reset emitted compute pipeline when calling secondary cmd buffer.</li>

				  <li>radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.</li>

				  <li>radv: Use correct size for availability flag.</li>

				</ul>

				<p>Ben Crocker (3):</p>

				<ul>

				  <li>gallivm: Reenable PPC VSX (v3)</li>

				  <li>gallivm: Improve debug output (V2)</li>

				  <li>gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)</li>

				</ul>

				<p>Brendan King (1):</p>

				<ul>

				  <li>egl/dri3: implement query surface hook</li>

				</ul>

				<p>Christian Gmeiner (2):</p>

				<ul>

				  <li>etnaviv: move pctx initialisation to avoid a null dereference</li>

				  <li>etnaviv: remove number of pixel pipes validation</li>

				</ul>

				<p>Connor Abbott (1):</p>

				<ul>

				  <li>anv: fix Get*MemoryRequirements for !LLC</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>egl/wayland: Don't use DRM format codes for SHM</li>

				</ul>

				<p>Dave Airlie (6):</p>

				<ul>

				  <li>tgsi: fix memory leak in tgsi sanity check</li>

				  <li>radv: change base aligmment for allocated memory.</li>

				  <li>radv: fix cik macroModeIndex.</li>

				  <li>radv: adopt some init config workarounds from radeonsi.</li>

				  <li>radv: fix depth format in blit2d.</li>

				  <li>radv: fix txs for sampler buffers</li>

				</ul>

				<p>Emil Velikov (8):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.0.0</li>

				  <li>bin/get-extra-pick-list: use git merge-base to get the branchpoint</li>

				  <li>bin/get-extra-pick-list: rework to use already_picked list</li>

				  <li>bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed</li>

				  <li>bin/get-pick-list.sh: limit `git grep ...' only as needed</li>

				  <li>bin/get-pick-list.sh: remove ancient way of nominating patches</li>

				  <li>bin/get-fixes-pick-list.sh: add new script</li>

				  <li>Update version to 17.0.1</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: Avoid emitting small immediates for UBO indirect load address guards.</li>

				</ul>

				<p>Grazvydas Ignotas (3):</p>

				<ul>

				  <li>r300g: only allow byteswapped formats on big endian</li>

				  <li>gallium/u_queue: fix a crash with atexit handlers</li>

				  <li>gallium/u_queue: set num_threads correctly if not all threads start</li>

				</ul>

				<p>Hans de Goede (1):</p>

				<ul>

				  <li>glx/glvnd: Fix GLXdispatchIndex sorting</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>gm107/ir: fix address offset bitfield for ATOMS</li>

				  <li>nvc0: set the render condition in the compute object</li>

				  <li>st/mesa: don't pass compare mode for stencil-sampled textures</li>

				  <li>nvc0: disable linked tsc mode in compute launch descriptor</li>

				</ul>

				<p>Jason Ekstrand (10):</p>

				<ul>

				  <li>i965/sampler_state: Clamp min/max LOD to 14 on gen7+</li>

				  <li>i965/sampler_state: Pass texObj into update_sampler_state</li>

				  <li>i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge</li>

				  <li>intel/blorp: Swizzle clear colors on the CPU</li>

				  <li>i965/fs: Fix the inline nir_op_pack_double optimization</li>

				  <li>anv: Add an invalidate_range helper</li>

				  <li>anv/query: clflush the bo map on non-LLC platforms</li>

				  <li>genxml: Make MI_STORE_DATA_IMM more consistent</li>

				  <li>anv/query: Perform CmdResetQueryPool on the GPU</li>

				  <li>intel/blorp: Explicitly flush all allocated state</li>

				</ul>

				<p>Jose Maria Casanova Crespo (1):</p>

				<ul>

				  <li>glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>mesa: Do (TCS &amp;&amp; !TES) draw time validation in ES as well.</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>configure.ac: check require_basic_egl only if egl enabled</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>anv: wsi: report presentation error per image request</li>

				  <li>i965/fs: fix uninitialized memory access</li>

				</ul>

				<p>Marek Olšák (6):</p>

				<ul>

				  <li>radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)</li>

				  <li>gallium/util: remove unused u_index_modify helpers</li>

				  <li>gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally</li>

				  <li>gallium/u_queue: fix random crashes when the app calls exit()</li>

				  <li>radeonsi: fix broken tessellation on Carrizo and Stoney</li>

				  <li>amd/common: fix ASICREV_IS_POLARIS11_M for Polaris12</li>

				</ul>

				<p>Mauro Rossi (2):</p>

				<ul>

				  <li>android: radeonsi: fix sid_table.h generated header include path</li>

				  <li>android: glsl: build shader cache sources</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>configure.ac: Drop LLVM compiler flags more radically</li>

				</ul>

				<p>Nicolai Hähnle (3):</p>

				<ul>

				  <li>winsys/amdgpu: reduce max_alloc_size based on GTT limits</li>

				  <li>radeonsi: handle MultiDrawIndirect in si_get_draw_start_count</li>

				  <li>radeonsi: fix UINT/SINT clamping for 10-bit formats on &lt;= CIK</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>glsl: fix heap-use-after-free in ast_declarator_list::hir()</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>android: fix droid_create_image_from_prime_fd_yuv for YV12</li>

				</ul>

				</div>

				</body>

				</html>

									
										185

docs/relnotes/17.0.2.html
									
										Normal file
									
												
				@@ -0,0 +1,185 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.0.2 Release Notes / March 20, 2017</h1>

				<p>

				Mesa 17.0.2 is a bug fix release which fixes bugs found since the 17.0.1 release.

				</p>

				<p>

				Mesa 17.0.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2e0f41e7974ba7a36ca32bbeaf8ebcd65c8fd4d2dc9872f04d4becbd5e7a8cb5  mesa-17.0.2.tar.gz

				f8f191f909e01e65de38d5bdea5fb057f21649a3aed20948be02348e77a689d4  mesa-17.0.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97988">Bug 97988</a> - [radeonsi] playing back videos with VDPAU exhibits deinterlacing/anti-aliasing issues not visible with VA-API</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99484">Bug 99484</a> - Crusader Kings 2 - Loading bars, siege bars, morale bars, etc. do not render correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: &quot;Note: Buggy applications may crash, if they do please report to vendor&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - &quot;ralloc: Make sure ralloc() allocations match malloc()'s alignment.&quot; causes seg fault in 32bit build</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (3):</p>

				<ul>

				  <li>radv: Emit pending flushes before executing a secondary command buffer</li>

				  <li>radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer</li>

				  <li>radv/ac: Fix shared memory offset calculation</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Disable HTILE for textures with multiple layers/levels.</li>

				  <li>radv: Emit cache flushes before CP DMA.</li>

				  <li>Revert "radv: Emit cache flushes before CP DMA."</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>radv: drop Z24 support.</li>

				  <li>radv: disable mip point pre clamping.</li>

				  <li>radv: setup llvm target data layout</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.0.1</li>

				  <li>cherry-ignore: add the swizzle blorp_clear fix</li>

				  <li>i965: move brw_define.h ifndef guard to the top</li>

				  <li>Update version to 17.0.2</li>

				</ul>

				<p>Fredrik Höglund (2):</p>

				<ul>

				  <li>radv: fix the dynamic buffer index in vkCmdBindDescriptorSets</li>

				  <li>radv/ac: fix multiple descriptor sets with dynamic buffers</li>

				</ul>

				<p>Gregory Hainaut (1):</p>

				<ul>

				  <li>glapi: fix typo in count_scale</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nvc0: take extra pushbuf space into account for pushbuf_space calls</li>

				  <li>nvc0: increase alignment to 256 for texture buffers on fermi</li>

				</ul>

				<p>Jacob Lifshay (1):</p>

				<ul>

				  <li>vulkan/wsi: Improve the DRI3 error message</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>radv: Fix using more than 4 bound descriptor sets</li>

				</ul>

				<p>Jason Ekstrand (7):</p>

				<ul>

				  <li>anv/blorp/clear_subpass: Only set surface clear color for fast clears</li>

				  <li>anv: Accurately advertise dynamic descriptor limits</li>

				  <li>anv: Stall before fast-clear operations</li>

				  <li>anv: Properly handle destroying NULL devices and instances</li>

				  <li>anv/blorp: Turn off AUX after doing a CCS_D resolve</li>

				  <li>anv/blorp: Only set a clear color for resolves if fast-cleared</li>

				  <li>nir/intrinsics: Make load_barycentric_input take a 2-component coor</li>

				</ul>

				<p>Jonas Pfeil (1):</p>

				<ul>

				  <li>ralloc: Make sure ralloc() allocations match malloc()'s alignment.</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>egl: Ensure ResetNotificationStrategy matches for shared contexts.</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops</li>

				  <li>st/mesa: set blend state for PBO readbacks</li>

				  <li>radeonsi: mark all bound shader buffer ranges as initialized</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>clover: Work around build failure with AltiVec.</li>

				</ul>

				<p>Nanley Chery (2):</p>

				<ul>

				  <li>anv/pass: Avoid accessing attachment array out of bounds</li>

				  <li>anv/image: Remove extra dependency on HiZ-specific variable</li>

				</ul>

				<p>Nicolai Hähnle (2):</p>

				<ul>

				  <li>st/glsl_to_tgsi: avoid iterating past the head of the instruction list</li>

				  <li>st/mesa: inform the driver of framebuffer changes before compute dispatches</li>

				</ul>

				<p>Robert Foss (1):</p>

				<ul>

				  <li>mesa: Avoid read of uninitialized variable</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (5):</p>

				<ul>

				  <li>i965/fs: mark last DF uniform array element as 64 bit live one</li>

				  <li>i965/fs: detect different bit size accesses to uniforms to push them in proper locations</li>

				  <li>i965/fs: fix indirect load DF uniforms on BSW/BXT</li>

				  <li>i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles</li>

				  <li>i965/fs: emit MOV_INDIRECT with the source with the right register type</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radeonsi: disable sinking common instructions down to the end block</li>

				</ul>

				</div>

				</body>

				</html>

									
										189

docs/relnotes/17.0.3.html
									
										Normal file
									
												
				@@ -0,0 +1,189 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.0.3 Release Notes / April 1, 2017</h1>

				<p>

				Mesa 17.0.3 is a bug fix release which fixes bugs found since the 17.0.2 release.

				</p>

				<p>

				Mesa 17.0.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				8253edf1bdd7b14ab63d5982349143a5c9ac3767f39a63257cc9d7e7d92f60f1  mesa-17.0.3.tar.gz

				ca646f5075a002d60ef9123c8a4331cede155c01712ef945a65c59a5e69fe7ed  mesa-17.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96743">Bug 96743</a> - [BYT, HSW, SKL, BXT, KBL] GPU hangs with GfxBench 4.0 CarChase</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99246">Bug 99246</a> - [d3dadapter+radeonsi &amp; bisect] EVE-Online : hang on wormhole sight</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100061">Bug 100061</a> - LODQ instruction generated with invalid dst mask</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100182">Bug 100182</a> - Flickering in The Talos Principle on Sky Lake GT4.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100201">Bug 100201</a> - Windows scons build with MSVC toolchain and LLVM 4.0 fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>radeonsi: add new polaris12 pci id</li>

				</ul>

				<p>Andres Gomez (5):</p>

				<ul>

				  <li>glsl: on UBO/SSBOs link error reset the number of active blocks to 0</li>

				  <li>cherry-ignore: add the Invalidate L2 for TRANSFER_WRITE barriers fix</li>

				  <li>cherry-ignore: add the Flush after unmap in gbm/dri fix</li>

				  <li>cherry-ignore: corrected typo in the Flush after unmap in gbm/dri fix</li>

				  <li>Update version to 17.0.3</li>

				</ul>

				<p>Axel Davy (2):</p>

				<ul>

				  <li>st/nine: Resolve deadlock in surface/volume dtors when using csmt</li>

				  <li>st/nine: Use atomics for available_texture_mem</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: flush DB cache before and after HTILE decompress.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv: fix primitive reset index emission</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.0.2</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>st/mesa: set result writemask based on ir type</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>clover: use pipe_resource references</li>

				</ul>

				<p>Jason Ekstrand (9):</p>

				<ul>

				  <li>anv/query: Invalidate the correct range</li>

				  <li>anv/GetQueryPoolResults: Actually implement the spec</li>

				  <li>anv/image: Return early when unbinding an image</li>

				  <li>anv/query: Fix the location of timestamp availability</li>

				  <li>anv: Make anv_get_layerCount a macro</li>

				  <li>anv/blorp: Use anv_get_layerCount everywhere</li>

				  <li>anv/cmd_buffer: Apply flush operations prior to executing secondaries</li>

				  <li>anv/cmd_buffer: Fix bad indentation</li>

				  <li>anv: Flush caches prior to PIPELINE_SELECT on all gens</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>c11/threads: Include thr/xtimec.h for xtime definition when building with MSVC.</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>tests/cache_test: allow crossing mount points</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nvc0/ir: treat FMA like MAD for operand propagation</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't hang on shader compile failure</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>i965/fs: Don't emit SEL instructions for type-converting MOVs.</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>intel: Correct the BDW surface state size</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>mesa/main: fix MultiDrawElements[BaseVertex] validation of primcount</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno: fix memory leak</li>

				</ul>

				<p>Tim Rowley (1):</p>

				<ul>

				  <li>swr: [rasterizer jitter] fix llvm &gt;= 5.0 build break</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>glsl: fix lower jumps for returns when loop is inside an if</li>

				  <li>mesa: update lower_jumps tests after bug fix</li>

				</ul>

				<p>Topi Pohjolainen (1):</p>

				<ul>

				  <li>i965/gen8+: Do full stall when switching pipeline</li>

				</ul>

				<p>Xu Randy (2):</p>

				<ul>

				  <li>anv/blorp: Fix a crash in CmdClearColorImage</li>

				  <li>anv/genX: Solve the vkCreateGraphicsPipelines crash</li>

				</ul>

				</div>

				</body>

				</html>

									
										156

docs/relnotes/17.0.4.html
									
										Normal file
									
												
				@@ -0,0 +1,156 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.0.4 Release Notes / April 17, 2017</h1>

				<p>

				Mesa 17.0.4 is a bug fix release which fixes bugs found since the 17.0.3 release.

				</p>

				<p>

				Mesa 17.0.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				c4c34ba05d48f76b45bc05bc4b6e9242077f403d63c4f0c355c7b07786de233e  mesa-17.0.4.tar.gz

				1269dc8545a193932a0779b2db5bce9be4a5f6813b98c38b93b372be8362a346  mesa-17.0.4.tar.xz

				</pre>

				<h2>Next release</h2>

				<p>

				Mesa 17.0.5 is expected in approximately two weeks. See the release

				<a href="../release-calendar.html#calendar" target="_parent">calendar</a>

				for details.

				</p>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99515">Bug 99515</a> - SIGSEGV MAPERR on Android nougat-x86 with mesa 17.0.0rc</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100391">Bug 100391</a> - SachaWillems deferredmultisampling asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100452">Bug 100452</a> - push_constants host memory leak when resetting command buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100582">Bug 100582</a> - [GEN8+] piglit.spec.arb_stencil_texturing.glblitframebuffer corrupts state.gl_texture* assertions</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>radeonsi: add new polaris10 pci id</li>

				</ul>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Invalidate L2 for TRANSFER_WRITE barriers</li>

				</ul>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.0.3</li>

				</ul>

				<p>Craig Stout (1):</p>

				<ul>

				  <li>anv/cmd_buffer: fix host memory leak</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>Revert "cherry-ignore: add the Flush after unmap in gbm/dri fix"</li>

				  <li>Revert "freedreno: fix memory leak"</li>

				  <li>Update version to 17.0.4</li>

				</ul>

				<p>Fabio Estevam (1):</p>

				<ul>

				  <li>loader: Move non-error message to debug level</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>nvc0/ir: fix LSB/BFE/BFI implementations</li>

				  <li>nvc0/ir: fix overwriting of offset register with interpolateAtOffset</li>

				  <li>nvc0: increase texture buffer object alignment to 256 for pre-GM107</li>

				  <li>nouveau: when mapping a persistent buffer, synchronize on former xfers</li>

				</ul>

				<p>Jason Ekstrand (5):</p>

				<ul>

				  <li>i965/fs: Always provide a default LOD of 0 for TXS and TXL</li>

				  <li>anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex</li>

				  <li>anv/blorp: Align vertex buffers to 64B</li>

				  <li>i965/blorp: Align vertex buffers to 64B</li>

				  <li>i965/blorp: Bump the batch space estimate</li>

				</ul>

				<p>Jerome Duval (2):</p>

				<ul>

				  <li>haiku: build fixes around debug defines</li>

				  <li>haiku/winsys: fix dt prototype args</li>

				</ul>

				<p>Julien Isorce (4):</p>

				<ul>

				  <li>winsys/radeon: check null in radeon_cs_create_fence</li>

				  <li>winsys/radeon: check null return from radeon_cs_create_fence in cs_flush</li>

				  <li>radeon: initialize hole variable before calling container_of</li>

				  <li>radeon_drm_bo: explicitly check return value of drmCommandWriteRead</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>i965: Document the sad story of the kernel command parser.</li>

				  <li>i965: Set screen-&gt;cmd_parser_version to 0 if we can't write registers.</li>

				  <li>i965: Skip register write detection when possible.</li>

				  <li>i965: Set kernel features before computing max GL version.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>targets: export radeon winsys_create functions to silence LLVM warning</li>

				</ul>

				<p>Michal Srb (1):</p>

				<ul>

				  <li>st: Add cubeMapFace parameter to st_finalize_texture.</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>gbm/dri: Flush after unmap</li>

				</ul>

				</div>

				</body>

				</html>

									
										144

docs/relnotes/17.0.5.html
									
										Normal file
									
												
				@@ -0,0 +1,144 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.0.5 Release Notes / April 28, 2017</h1>

				<p>

				Mesa 17.0.5 is a bug fix release which fixes bugs found since the 17.0.4 release.

				</p>

				<p>

				Mesa 17.0.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				7510eee0d0077860b250d30d73305048c2df4ba09ea8fc04e4f3eec7beece301  mesa-17.0.5.tar.gz

				668efa445d2f57a26e5c096b1965a685733a3b57d9c736f9d6460263847f9bfe  mesa-17.0.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97524">Bug 97524</a> - Samplers referring to the same texture unit with different types should raise GL_INVALID_OPERATION</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (16):</p>

				<ul>

				  <li>cherry-ignore: Add the pci_id into the shader cache UUID</li>

				  <li>cherry-ignore: fix crash if ctx torn down with no rendering</li>

				  <li>cherry-ignore: Fix typos.</li>

				  <li>cherry-ignore: Revert "etnaviv: Cannot render to rb-swapped formats"</li>

				  <li>cherry-ignore: Revert "i965/fs: Don't emit SEL instructions for type-converting MOVs."</li>

				  <li>cherry-ignore: fix typo in a2b10g10r10 fast clear calculation</li>

				  <li>cherry-ignore: remove unused anv_dispatch_table dtable</li>

				  <li>cherry-ignore: remove unused radv_dispatch_table dtable</li>

				  <li>cherry-ignore: make radv_resolve_entrypoint static</li>

				  <li>cherry-ignore: vulkan: add support for libmesa_vulkan_util</li>

				  <li>cherry-ignore: r600: fix libmesa_amd_common dependency</li>

				  <li>cherry-ignore: remove dead brw_new_shader() declaration</li>

				  <li>cherry-ignore: remove i965_symbols_test reference from .gitignore</li>

				  <li>cherry-ignore: automake: ensure that the destination directory is created</li>

				  <li>cherry-ignore: provide required gem stubs for the tests</li>

				  <li>Update version to 17.0.5</li>

				</ul>

				<p>Boyan Ding (2):</p>

				<ul>

				  <li>nvc0/ir: Properly handle a "split form" of predicate destination</li>

				  <li>nir: Destination component count of shader_clock intrinsic is 2</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.0.4</li>

				  <li>winsys/sw/dri: don't use GNU void pointer arithmetic</li>

				  <li>st/clover: add space between &lt; and ::</li>

				  <li>configure.ac: check require_basic_egl only if egl enabled</li>

				  <li>st/mesa: automake: honour the vdpau header install location</li>

				</ul>

				<p>Francisco Jerez (2):</p>

				<ul>

				  <li>intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy.</li>

				  <li>intel/fs: Take into account amount of data read in spilling cost heuristic.</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>radv: report timestampPeriod correctly</li>

				</ul>

				<p>Jason Ekstrand (5):</p>

				<ul>

				  <li>anv/blorp: Flush the texture cache in UpdateBuffer</li>

				  <li>anv/cmd_buffer: Flush the VF cache at the top of all primaries</li>

				  <li>anv/cmd_buffer: Always set up a null surface state</li>

				  <li>anv/cmd_buffer: Use the null surface state for ATTACHMENT_UNUSED</li>

				  <li>anv/blorp: Properly handle VK_ATTACHMENT_UNUSED</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: invalidate the readpix cache in st_indirect_draw_vbo</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>anv/cmd_buffer: Disable CCS on BDW input attachments</li>

				</ul>

				<p>Nicolai Hähnle (4):</p>

				<ul>

				  <li>mesa: fix remaining xfb prims check for GLES with multiple instances</li>

				  <li>mesa: extract need_xfb_remaining_prims_check</li>

				  <li>mesa: move glMultiDrawArrays to vbo and fix error handling</li>

				  <li>vbo: fix gl_DrawID handling in glMultiDrawArrays</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>util/queue: don't hang at exit</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: validate sampler type across the whole program</li>

				</ul>

				</div>

				</body>

				</html>

									
										82

docs/relnotes/17.1.0.html
									
										Normal file
									
												
				@@ -0,0 +1,82 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.1.0 Release Notes / TBD</h1>

				<p>

				Mesa 17.1.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for

				<a href="../release-calendar.html#calendar" target="_parent">Mesa 17.1.1</a>.

				</p>

				<p>

				Mesa 17.1.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 4.2 on i965/ivb</li>

				<li>GL_ARB_gpu_shader_fp64 on i965/ivybridge</li>

				<li>GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe</li>

				<li>GL_ARB_shader_ballot on nvc0, radeonsi</li>

				<li>GL_ARB_shader_clock on nv50, nvc0, radeonsi</li>

				<li>GL_ARB_shader_group_vote on radeonsi</li>

				<li>GL_ARB_shader_precision on i965/ivb</li>

				<li>GL_ARB_shader_viewport_layer_array on radeonsi</li>

				<li>GL_ARB_sparse_buffer on radeonsi/CIK+</li>

				<li>GL_ARB_transform_feedback2 on i965/gen6</li>

				<li>GL_ARB_transform_feedback_overflow_query on i965/gen6+</li>

				<li>GL_ARB_vertex_attrib_64bit on i965/ivb</li>

				<li>GL_NV_fill_rectangle on nvc0</li>

				<li>Geometry shaders enabled on swr</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Removed the ilo gallium driver.</li>

				<li>The configure option --enable-gallium-llvm is superseded by --enable-llvm.</li>

				<li>The swr driver now requires LLVM &gt;= 3.9.0 and a C++14 capable compiler.</li>

				<li>The radeonsi driver now requires LLVM 3.8.0.</li>

				<li>The MESA_GLSL=opt and MESA_GLSL=no_opt environment vars have been removed.</li>

				<li>The --with-egl-platforms configure option is deprecated. Use --with-platforms instead.</li>

				</ul>

				</div>

				</body>

				</html>

									
										66

docs/relnotes/17.2.0.html
									
										Normal file
									
												
				@@ -0,0 +1,66 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.0 Release Notes / TBD</h1>

				<p>

				Mesa 17.2.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 17.2.1.

				</p>

				<p>

				Mesa 17.2.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_shader_viewport_layer_array on nvc0 (GM200+)</li>

				<li>GL_AMD_vertex_shader_layer on nvc0 (GM200+)</li>

				<li>GL_AMD_vertex_shader_viewport_index on nvc0 (GM200+)</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				TBD

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>GL_APPLE_vertex_array_object support removed.</li>

				</ul>

				</div>

				</body>

				</html>

									
										2

docs/relnotes/6.5.2.html
									
												
				@@ -57,7 +57,7 @@ copy texturing).

				<li>New Intel i965 DRI driver

				<li>New <code>minstall</code> script to replace normal install program

				<li>Faster fragment program execution in software

				<li>Added (or fixed) support for <a href="http://www.opengl.org/registry/specs/SGI/make_current_read.txt">

				<li>Added (or fixed) support for <a href="https://www.khronos.org/registry/OpenGL/extensions/SGI/GLX_SGI_make_current_read.txt">

				    GLX_SGI_make_current_read</a> to the following drivers:

				    <ul>

				    <li>radeon</li>

									
										2

docs/relnotes/7.11.html
									
												
				@@ -226,7 +226,7 @@ did not exist in the 7.10 release series at all.</p>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36086">Bug 36086</a> - [wine] Segfault r300_resource_copy_region with some wine apps and RADEON_HYPERZ</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36182">Bug 36182</a> - Game Trine from http://www.humblebundle.com/ needs ATI_draw_buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36182">Bug 36182</a> - Game Trine from https://www.humblebundle.com/ needs ATI_draw_buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36268">Bug 36268</a> - [r300g, bisected] minor flickering in Unigine Sanctuary</li>

									
										2

docs/relnotes/7.5.1.html
									
												
				@@ -21,7 +21,7 @@ Mesa 7.5.1 is a bug-fix release fixing issues found since the 7.5 release.

				</p>

				<p>

				The main new feature of Mesa 7.5.x is the

				<a href="http://wiki.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.

				<a href="https://www.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.

				</p>

				<p>

				Mesa 7.5.1 implements the OpenGL 2.1 API, but the version reported by

									
										2

docs/relnotes/7.5.2.html
									
												
				@@ -21,7 +21,7 @@ Mesa 7.5.2 is a bug-fix release fixing issues found since the 7.5.1 release.

				</p>

				<p>

				The main new feature of Mesa 7.5.x is the

				<a href="http://wiki.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.

				<a href="https://www.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.

				</p>

				<p>

				Mesa 7.5.2 implements the OpenGL 2.1 API, but the version reported by

									
										2

docs/relnotes/7.5.html
									
												
				@@ -23,7 +23,7 @@ with the 7.4.x branch or wait for Mesa 7.5.1.

				</p>

				<p>

				The main new feature of Mesa 7.5 is the

				<a href="http://wiki.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.

				<a href="https://www.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.

				</p>

				<p>

				Mesa 7.5 implements the OpenGL 2.1 API, but the version reported by

									
										2

docs/relnotes/9.0.html
									
												
				@@ -90,7 +90,7 @@ The two supported build methods are now autoconf/automake and SCons.

				<li>Removed support for GL_ARB_shadow_ambient extension</li>

				<li>Removed Gallium3D - nvfx driver (use nv30 instead)</li>

				<li>

				libGLU has been moved into its own repository, found at <a href="http://cgit.freedesktop.org/mesa/glu/">http://cgit.freedesktop.org/mesa/glu/</a>

				libGLU has been moved into its own repository, found at <a href="https://cgit.freedesktop.org/mesa/glu/">https://cgit.freedesktop.org/mesa/glu/</a>

				</li>

				</ul>

									
										4

docs/relnotes/9.1.2.html
									
												
				@@ -68,9 +68,9 @@ b1ae5a4d9255953980bc9254f5323420  MesaLib-9.1.2.zip

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62434">Bug 62434</a> - [bisected] 3284.073] (EE) AIGLX error: dlopen of /usr/lib/xorg/modules/dri/r600_dri.so failed (/usr/lib/libllvmradeon9.2.0.so: undefined symbol: lp_build_tgsi_intrinsic)</li>

				<li><a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=349437">Debian bug #349437</a> - mesa - FTBFS: error: 'IEEE_ONE' undeclared</li>

				<li><a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=349437">Debian bug #349437</a> - mesa - FTBFS: error: 'IEEE_ONE' undeclared</li>

				<li><a href="http://bugzilla.redhat.com/show_bug.cgi?id=918661">Redhat bug #918661</a> - crash in routine Avogadro UI manipulation</li>

				<li><a href="https://bugzilla.redhat.com/show_bug.cgi?id=918661">Redhat bug #918661</a> - crash in routine Avogadro UI manipulation</li>

				</ul>

									
										21

docs/repository.html
									
												
				@@ -17,13 +17,13 @@

				<h1>Code Repository</h1>

				<p>

				Mesa uses <a href="http://git-scm.com">git</a>

				Mesa uses <a href="https://git-scm.com">git</a>

				as its source code management system.

				</p>

				<p>

				The master git repository is hosted on

				<a href="http://www.freedesktop.org">freedesktop.org</a>.

				<a href="https://www.freedesktop.org">freedesktop.org</a>.

				</p>

				<p>

				@@ -35,9 +35,9 @@ You may access the repository either as an

				<p>

				You may also 

				<a href="http://cgit.freedesktop.org/mesa/mesa/"

				<a href="https://cgit.freedesktop.org/mesa/mesa/"

				>browse the main Mesa git repository</a> and the

				<a href="http://cgit.freedesktop.org/mesa/demos"

				<a href="https://cgit.freedesktop.org/mesa/demos"

				>Mesa demos and tests git repository</a>.

				</p>

				@@ -73,9 +73,10 @@ follow this procedure:

				</p>

				<ol>

				<li>Subscribe to the

				<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>

				<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>

				mailing list.

				<li>Start contributing to the project by posting patches / review requests to

				<li>Start contributing to the project by

				<a href="submittingpatches.html" target="_parent">submitting patches</a> to

				the mesa-dev list.  Specifically,

				<ul>

				<li>Use <code>git send-email</code> to post your patches to mesa-dev.

				@@ -91,7 +92,7 @@ only if they're being supervised by another Mesa developer at the same

				organization and planning to work in a limited area of the code or on a

				separate branch.

				<li>To apply for an account, follow

				<a href="http://www.freedesktop.org/wiki/AccountRequests">these directions</a>.

				<a href="https://www.freedesktop.org/wiki/AccountRequests">these directions</a>.

				It's also appreciated if you briefly describe what you intend to do (work

				on a particular driver, add a new extension, etc.) in the bugzilla record.

				</ol>

				@@ -120,7 +121,7 @@ Once your account is established:

				<h2>Windows Users</h2>

				<p>

				If you're <a href="http://git.wiki.kernel.org/index.php/WindowsInstall">

				If you're <a href="https://git.wiki.kernel.org/index.php/WindowsInstall">

				using git on Windows</a> you'll want to enable automatic CR/LF conversion in

				your local copy of the repository:

				</p>

				@@ -143,7 +144,7 @@ Unix users don't need to set this option.

				<p>

				At any given time, there may be several active branches in Mesa's

				repository.

				Generally, the trunk contains the latest development (unstable)

				Generally, <tt>master</tt> contains the latest development (unstable)

				code while a branch has the latest stable code.

				</p>

				@@ -234,7 +235,7 @@ If you want the rebase action to be the default action, then

				    git config --global branch.autosetuprebase always

				</pre>

				<p>

				See <a href="http://www.eecs.harvard.edu/~cduan/technical/git/">Understanding Git Conceptually</a> for a fairly clear explanation about all of this.

				See <a href="https://www.eecs.harvard.edu/~cduan/technical/git/">Understanding Git Conceptually</a> for a fairly clear explanation about all of this.

				</p>

				</ol>

									
										19

docs/shading.html
									
												
				@@ -18,7 +18,7 @@

				<p>

				This page describes the features and status of Mesa's support for the

				<a href="http://opengl.org/documentation/glsl/">

				<a href="https://opengl.org/documentation/glsl/">

				OpenGL Shading Language</a>.

				</p>

				@@ -49,8 +49,7 @@ execution.  These are generally used for debugging.

				<li><b>log</b> - log all GLSL shaders to files.

				    The filenames will be "shader_X.vert" or "shader_X.frag" where X is

				    the shader ID.

				<li><b>nopt</b> - disable compiler optimizations

				<li><b>opt</b> - force compiler optimizations

				<li><b>cache_info</b> - print debug information about shader cache

				<li><b>uniform</b> - print message to stdout when glUniform is called

				<li><b>nopvert</b> - force vertex shaders to be a simple shader that just transforms

				    the vertex position with ftransform() and passes through the color and

				@@ -172,7 +171,7 @@ This tool is useful for:

				</ul>

				<p>

				After building Mesa, the compiler can be found at src/glsl/glsl_compiler

				After building Mesa, the compiler can be found at src/compiler/glsl/glsl_compiler

				</p>

				<p>

				@@ -180,7 +179,7 @@ Here's an example of using the compiler to compile a vertex shader and

				emit GL_ARB_vertex_program-style instructions:

				</p>

				<pre>

				    src/glsl/glsl_compiler --dump-ast myshader.vert

				    src/compiler/glsl/glsl_compiler --version XXX --dump-ast myshader.vert

				</pre>

				Options include

				@@ -188,7 +187,11 @@ Options include

				<li><b>--dump-ast</b> - dump the abstract syntax tree (AST)

				<li><b>--dump-hir</b> - dump high-level IR code

				<li><b>--dump-lir</b> - dump low-level IR code

				<li><b>--link</b> - ???

				<li><b>--dump-builder</b> - dump GLSL IR code

				<li><b>--link</b> - link shaders

				<li><b>--just-log</b> - display only shader / linker info if it exists,

				without any header or separator

				<li><b>--version</b> - [Mandatory] define the GLSL version to use

				</ul>

				@@ -196,7 +199,7 @@ Options include

				<p>

				The source code for Mesa's shading language compiler is in the

				<code>src/glsl/</code> directory.

				<code>src/compiler/glsl/</code> directory.

				</p>

				<p>

				@@ -217,7 +220,7 @@ regressions.

				</p>

				<p>

				The <a href="http://piglit.freedesktop.org/">Piglit</a> project

				The <a href="https://piglit.freedesktop.org/">Piglit</a> project

				has many GLSL tests.

				</p>

									
										7

docs/sourcedocs.html
									
												
				@@ -31,7 +31,7 @@ the <code>doxygen</code> directory and run <code>make</code>.

				<p>

				For an example of Doxygen usage in Mesa, see a recent source file

				such as <a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/bufferobj.c">bufferobj.c</a>.

				such as <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/bufferobj.c">bufferobj.c</a>.

				</p>

				@@ -41,6 +41,11 @@ run the doxygen scripts, you can read the documentation

				<a href="../doxygen/main/index.html">here</a>

				</p>

				<p>

				Gallium is also documented using Sphinx. The generated output can be found

				<a href="https://gallium.readthedocs.io">on Gallium.ReadTheDocs.io</a>.

				</p>

				</div>

				</body>

				</html>

									
										28

docs/sourcetree.html
									
												
				@@ -27,14 +27,18 @@ each directory.

				<li><b>include</b> - Public OpenGL header files

				<li><b>src</b>

				  <ul>

				  <li><b>compiler</b> - Common utility sources for different compilers.

				    <ul>

				    <li><b>glsl</b> - the GLSL IR and compiler

				    <li><b>nir</b> - the NIR IR and compiler

				    <li><b>spirv</b> - the SPIR-V compiler

				    </ul>

				  <li><b>egl</b> - EGL library sources

				    <ul>

				    <li><b>docs</b> - EGL documentation

				    <li><b>drivers</b> - EGL drivers

				    <li><b>main</b> - main EGL library implementation.  This is where all

				        the EGL API functions are implemented, like eglCreateContext().

				    </ul>

				  <li><b>glsl</b> - the GLSL compiler

				  <li><b>mapi</b> - Mesa APIs

				    <li><b>glapi</b> - OpenGL API dispatch layer.  This is where all the

				        GL entrypoints like glClear, glBegin, etc. are generated, as well as

				@@ -94,7 +98,8 @@ each directory.

				      <ul>

				      <li><b>i915</b> - Driver for Intel i915/i945.

				      <li><b>llvmpipe</b> - Software driver using LLVM for runtime code generation.

				      <li><b>nv*</b> - Drivers for NVIDIA GPUs.

				      <li><b>nouveau</b> - Driver for NVIDIA GPUs.

				      <li><b>radeon</b> - Shared module for the r600 and radeonsi drivers.

				      <li><b>radeonsi</b> - Driver for AMD Southern Island.

				      <li><b>r300</b> - Driver for ATI R300 - R500.

				      <li><b>r600</b> - Driver for ATI/AMD R600 - Northern Island.

				@@ -128,16 +133,19 @@ each directory.

				          to another.

				      <li><b>util</b> - assorted utilities for arithmetic, hashing, surface

				          creation, memory management, 2D blitting, simple rendering, etc.

				      <li>XXX more

				      </ul>

				    <li><b>state_trackers</b> -

				       <ul>

				       <li><b>clover</b> - OpenCL state tracker

				       <li><b>dri</b> - Meta state tracker for DRI drivers

				       <li><b>glx</b> - Meta state tracker for GLX

				       <li><b>vdpau</b> - VDPAU state tracker

				       <li><b>wgl</b> -

				       <li><b>xorg</b> - Meta state tracker for Xorg video drivers

				       <li><b>wgl</b> - Windows WGL state tracker

				       <li><b>xa</b> - XA state tracker

				       <li><b>xvmc</b> - XvMC state tracker

				       <li><b>vdpau</b> - VDPAU state tracker

				       <li><b>va</b> - VA-API state tracker

				       <li><b>omx</b> - OpenMAX state tracker

				       </ul>

				    <li><b>winsys</b> -

				       <ul>

				@@ -148,11 +156,11 @@ each directory.

				    </ul>

				  </ul>

				  <ul>

				  <li><b>glx</b> - The GLX library code for building libGL.  This is used for

				         direct rendering drivers.  It will dynamically load one of the 

				         xxx_dri.so drivers.

				  <li><b>glx</b> - The GLX library code for building libGL using DRI drivers.

				  </ul>

				<li><b>lib</b> - where the GL libraries are placed

				<li><b>lib</b> - hardlinks to most binaries as produced by <strong>make</strong>.

				        These (shortcuts) are used for development purposes in conjunction with

				        LD_LIBRARY_PATH and/or LIBGL_DRIVERS_PATH.

				</ul>

				</div>

98

docs/specs/EGL_MESA_drm_image_formats.txt Normal file


@@ -0,0 +1,98 @@
 Name
     MESA_drm_image_formats
 Name Strings
     EGL_MESA_drm_image_formats
 Contributors
     Nicolai Hähnle <Nicolai.Haehnle@amd.com>
     Qiang Yu <Qiang.Yu@amd.com>
 Contact
     Nicolai Hähnle <Nicolai.Haehnle@amd.com>
 Status
     Proposal
 Version
     Version 1, January 26, 2017
 Number
     EGL Extension #??
 Dependencies
     This extension requires the EGL_MESA_drm_image extension.
     This extension is written against the wording of EGL_MESA_drm_image
     specification.
 Overview
     This extension extends the functionality of EGL_MESA_drm_image by adding
     additional formats required by Glamor for use with DRM buffers.
 IP Status
     Open-source; freely implementable.
 New Procedures and Functions
     None
 New Tokens
     Accepted as values for the EGL_IMAGE_FORMAT_MESA attribute:
         EGL_DRM_BUFFER_FORMAT_ARGB2101010_MESA  0x3290
         EGL_DRM_BUFFER_FORMAT_ARGB1555_MESA     0x3291
         EGL_DRM_BUFFER_FORMAT_RGB565_MESA       0x3292
 Additions to the EGL_MESA_drm_image Specification:
     Remove the sentence "The only format specified ..." from the paragraph
     describing eglCreateDRMImageMESA and add the following paragraph:
         The formats specified for use with EGL_DRM_BUFFER_FORMAT_MESA are:
       * EGL_DRM_BUFFER_FORMAT_ARGB32_MESA, where each pixel is a CPU-endian
         32-bit quantity, with alpha in the upper 8 bits, then red, then green,
         then blue,
       * EGL_DRM_BUFFER_FORMAT_ARGB2101010_MESA, where each pixel is a CPU-
         endian, 32-bit quantity, with alpha in the most significant 2 bits,
         followed by 10 bits each for red, green, and blue,
       * EGL_DRM_BUFFER_FORMAT_ARGB1555_MESA, where each pixel is a CPU-endian
         16-bit quantity, with alpha in the most significant bit, followed by
         5 bits each for red, green, and blue, and
       * EGL_DRM_BUFFER_FORMAT_RGB565_MESA, where each pixel is a CPU-endian
         16-bit quantity, with red in the 5 most significant bits, followed by
         6 bits of green and 5 bits of blue.
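
     The bit layouts listed above can be illustrated with a small sketch. It
     is not part of the extension or of Mesa; the function names are
     hypothetical and chosen only for this example. Each function returns the
     pixel as an integer value, so the byte order in memory would still depend
     on the CPU's endianness.

```python
# Illustrative only: pack channel values into the integer layouts described
# in the format list. Function names are hypothetical, not Mesa or EGL API.

def pack_argb32(a, r, g, b):
    """32-bit pixel: alpha in the upper 8 bits, then red, green, blue."""
    return (a & 0xFF) << 24 | (r & 0xFF) << 16 | (g & 0xFF) << 8 | (b & 0xFF)

def pack_argb2101010(a, r, g, b):
    """32-bit pixel: 2 bits of alpha, then 10 bits each of red, green, blue."""
    return (a & 0x3) << 30 | (r & 0x3FF) << 20 | (g & 0x3FF) << 10 | (b & 0x3FF)

def pack_argb1555(a, r, g, b):
    """16-bit pixel: 1 bit of alpha, then 5 bits each of red, green, blue."""
    return (a & 0x1) << 15 | (r & 0x1F) << 10 | (g & 0x1F) << 5 | (b & 0x1F)

def pack_rgb565(r, g, b):
    """16-bit pixel: 5 bits of red, 6 bits of green, 5 bits of blue."""
    return (r & 0x1F) << 11 | (g & 0x3F) << 5 | (b & 0x1F)
```

     For instance, pack_rgb565(31, 0, 0) sets only the five most significant
     bits of the 16-bit value, which matches the position the format list
     gives for red.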
 Issues
     1. Should we expose the full set of channel permutations for the formats,
        e.g. ABGR2101010, RGBA1010102, and BGRA1010102 in addition to
        ARGB2101010?
        RESOLVED: No.
        DISCUSSION: The original extension sets a precedent of only exposing one
        of the possible permutations of 8-bit channel formats. It is also not
        clear where the additional permutations would be used. For example,
        Glamor has a fixed mapping from pixmap/screen depth to format that
        doesn't allow for the other permutations.
 Revision History
     Version 1, January, 2017
         Initial draft (Nicolai Hähnle)

120

docs/specs/EGL_MESA_platform_surfaceless.txt Normal file


@@ -0,0 +1,120 @@
 Name
     MESA_platform_surfaceless
 Name Strings
     EGL_MESA_platform_surfaceless
 Contributors
     Chad Versace <chadversary@google.com>
     Haixia Shi <hshi@google.com>
     Stéphane Marchesin <marcheu@google.com>
     Zach Reizner <zachr@chromium.org>
     Gurchetan Singh <gurchetansingh@google.com>
 Contacts
     Chad Versace <chadversary@google.com>
 Status
     DRAFT
 Version
     Version 2, 2016-10-13
 Number
     EGL Extension #TODO
 Extension Type
     EGL client extension
 Dependencies
     Requires EGL 1.5 or later; or EGL 1.4 with EGL_EXT_platform_base.
     This extension is written against the EGL 1.5 Specification (draft
     20140122).
     This extension interacts with EGL_EXT_platform_base as follows. If the
     implementation supports EGL_EXT_platform_base, then text regarding
     eglGetPlatformDisplay applies also to eglGetPlatformDisplayEXT;
     eglCreatePlatformWindowSurface to eglCreatePlatformWindowSurfaceEXT; and
     eglCreatePlatformPixmapSurface to eglCreatePlatformPixmapSurfaceEXT.
 Overview
     This extension defines a new EGL platform, the "surfaceless" platform. This
     platform's defining property is that it has no native surfaces, and hence
     neither eglCreatePlatformWindowSurface nor eglCreatePlatformPixmapSurface
     can be used. The platform is independent of any native window system.
     The platform's intended use case is for enabling OpenGL and OpenGL ES
     applications on systems where no window system exists. However, the
     platform's permitted usage is not restricted to this case.  Since the
     platform is independent of any native window system, it may also be used on
     systems where a window system is present.
 New Types
     None
 New Procedures and Functions
     None
 New Tokens
     Accepted as the <platform> argument of eglGetPlatformDisplay:
         EGL_PLATFORM_SURFACELESS_MESA           0x31DD
 Additions to the EGL Specification
     None.
 New Behavior
     To determine if the EGL implementation supports this extension, clients
     should query the EGL_EXTENSIONS string of EGL_NO_DISPLAY.
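
     A minimal sketch of that check follows. It assumes the client extension
     string has already been obtained, for example from
     eglQueryString(EGL_NO_DISPLAY, EGL_EXTENSIONS) in a C program; the helper
     name has_extension and the sample string are hypothetical. The point of
     the sketch is that the string is a space-separated list, so the name
     should be compared against whole tokens rather than searched for as a
     substring.

```python
# Illustrative only: whole-token lookup in a space-separated extension string.
# `has_extension` and `sample_client_extensions` are hypothetical names; a
# real program would take the string from the EGL implementation.

def has_extension(extension_string, name):
    """Return True if `name` is one of the space-separated tokens."""
    return name in extension_string.split()

sample_client_extensions = (
    "EGL_EXT_client_extensions EGL_EXT_platform_base "
    "EGL_MESA_platform_surfaceless"
)

surfaceless_supported = has_extension(
    sample_client_extensions, "EGL_MESA_platform_surfaceless"
)
```

     A plain substring test would also accept a longer, unrelated name that
     merely starts with the same characters, which is why the sketch splits
     the string first.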
     To obtain an EGLDisplay on the surfaceless platform, call
     eglGetPlatformDisplay with <platform> set to EGL_PLATFORM_SURFACELESS_MESA.
     The <native_display> parameter must be EGL_DEFAULT_DISPLAY.
     eglCreatePlatformWindowSurface fails when called with a <display> that
     belongs to the surfaceless platform. It returns EGL_NO_SURFACE and
     generates EGL_BAD_NATIVE_WINDOW. The justification for this unconditional
     failure is that the surfaceless platform has no native windows, and
     therefore the <native_window> parameter is always invalid.
     Likewise, eglCreatePlatformPixmapSurface also fails when called with a
     <display> that belongs to the surfaceless platform.  It returns
     EGL_NO_SURFACE and generates EGL_BAD_NATIVE_PIXMAP.
     The surfaceless platform imposes no platform-specific restrictions on the
     creation of pbuffers, as eglCreatePbufferSurface has no native surface
     parameter.  Specifically, if the EGLDisplay advertises an EGLConfig whose
     EGL_SURFACE_TYPE attribute contains EGL_PBUFFER_BIT, then the EGLDisplay
     permits the creation of pbuffers with that config.
 Issues
     None.
 Revision History
     Version 2, 2016-10-13 (Chad Versace)
         - Assign enum values
         - Define interactions with EGL 1.4 and EGL_EXT_platform_base.
         - Add Gurchetan as contributor, as he implemented the pbuffer support.
     Version 1, 2016-09-23 (Chad Versace)
         - Initial version
         - Posted for review at
           https://lists.freedesktop.org/archives/mesa-dev/2016-September/129549.html

									
										8

docs/specs/MESA_configless_context.spec
									
												
				@@ -12,11 +12,12 @@ Contact

				Status

				    Proposal

				    Superseded by the functionally identical EGL_KHR_no_config_context

				    extension.

				Version

				    Version 1, February 28, 2014

				    Version 2, September 9, 2016

				Number

				@@ -121,5 +122,8 @@ Issues

				Revision History

				    Version 2, September 9, 2016

				        Defer to EGL_KHR_no_config_context (Adam Jackson)

				    Version 1, February 28, 2014

				        Initial draft (Neil Roberts)

522

docs/specs/MESA_shader_integer_functions.txt Normal file


@@ -0,0 +1,522 @@
 Name
     MESA_shader_integer_functions
 Name Strings
     GL_MESA_shader_integer_functions
 Contact
     Ian Romanick <ian.d.romanick@intel.com>
 Contributors
     All the contributors of GL_ARB_gpu_shader5
 Status
     Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later
 Version
     Version 3, March 31, 2017
 Number
     OpenGL Extension #495
 Dependencies
     This extension is written against the OpenGL 3.2 (Compatibility Profile)
     Specification.
     This extension is written against Version 1.50 (Revision 09) of the OpenGL
     Shading Language Specification.
     GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.
     This extension interacts with ARB_gpu_shader5.
     This extension interacts with ARB_gpu_shader_fp64.
     This extension interacts with NV_gpu_shader5.
 Overview
     GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
     added functionality requires significant hardware support.  There are many
     aspects, however, that can be easily implemented on any GPU with "real"
     integer support (as opposed to simulating integers using floating point
     calculations).
     This extension provides a set of new features to the OpenGL Shading
     Language to support capabilities of these GPUs, extending the
     capabilities of version 1.30 of the OpenGL Shading Language and version
     3.00 of the OpenGL ES Shading Language.  Shaders using the new
     functionality provided by this extension should enable this
     functionality via the construct
       #extension GL_MESA_shader_integer_functions : require   (or enable)
     This extension provides a variety of new features for all shader types,
     including:
       * support for implicitly converting signed integer types to unsigned
         types, as well as more general implicit conversion and function
         overloading infrastructure to support new data types introduced by
         other extensions;
       * new built-in functions supporting:
         * splitting a floating-point number into a significand and exponent
           (frexp), or building a floating-point number from a significand and
           exponent (ldexp);
         * integer bitfield manipulation, including functions to find the
           position of the most or least significant set bit, count the number
           of one bits, and bitfield insertion, extraction, and reversal;
         * extended integer precision math, including add with carry, subtract
           with borrow, and extended multiplication;
     The resulting extension is a strict subset of GL_ARB_gpu_shader5.
 IP Status
     No known IP claims.
 New Procedures and Functions
     None
 New Tokens
     None
 Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
 (OpenGL Operation)
     None.
 Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Rasterization)
     None.
 Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Per-Fragment Operations and the Frame Buffer)
     None.
 Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Special Functions)
     None.
 Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
 (State and State Requests)
     None.
 Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
 Specification (Invariance)
     None.
 Additions to the AGL/GLX/WGL Specifications
     None.
 Modifications to The OpenGL Shading Language Specification, Version 1.50
 (Revision 09)
     Including the following line in a shader can be used to control the
     language features described in this extension:
       #extension GL_MESA_shader_integer_functions : <behavior>
     where <behavior> is as specified in section 3.3.
     New preprocessor #defines are added to the OpenGL Shading Language:
       #define GL_MESA_shader_integer_functions        1
     Modify Section 4.1.10, Implicit Conversions, p. 27
     (modify table of implicit conversions)
                                 Can be implicitly
         Type of expression        converted to
         ---------------------   -----------------
         int                     uint, float
         ivec2                   uvec2, vec2
         ivec3                   uvec3, vec3
         ivec4                   uvec4, vec4
         uint                    float
         uvec2                   vec2
         uvec3                   vec3
         uvec4                   vec4
     (modify second paragraph of the section) No implicit conversions are
     provided to convert from unsigned to signed integer types or from
     floating-point to integer types.  There are no implicit array or structure
     conversions.
     (insert before the final paragraph of the section) When performing
     implicit conversion for binary operators, there may be multiple data types
     to which the two operands can be converted.  For example, when adding an
     int value to a uint value, both values can be implicitly converted to uint
     and float.  In such cases, a floating-point type is chosen if either
     operand has a floating-point type.  Otherwise, an unsigned integer type is
     chosen if either operand has an unsigned integer type.  Otherwise, a
     signed integer type is chosen.
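
The selection order in the paragraph above can be sketched in C. This is only an illustration of the rule, not Mesa code; the enum and the function name are hypothetical.

```c
/* Hypothetical model of the scalar base types involved; not a Mesa API. */
enum base_type { BASE_INT, BASE_UINT, BASE_FLOAT };

/* Pick the common type for a binary operator, following the order stated
 * in the text: float wins over uint, and uint wins over int. */
enum base_type common_base_type(enum base_type a, enum base_type b)
{
    if (a == BASE_FLOAT || b == BASE_FLOAT)
        return BASE_FLOAT;
    if (a == BASE_UINT || b == BASE_UINT)
        return BASE_UINT;
    return BASE_INT;
}
```

Under this model, adding an int to a uint selects uint, even though both operands could also be converted to float.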
     Modify Section 5.9, Expressions, p. 57
     (modify bulleted list as follows, adding support for implicit conversion
     between signed and unsigned types)
     Expressions in the shading language are built from the following:
     * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
       types, and all matrix types.
     ...
     * The operator modulus (%) operates on signed or unsigned integer scalars
       or vectors.  If the fundamental types of the operands do not match, the
       conversions from Section 4.1.10 "Implicit Conversions" are applied to
       produce matching types.  ...
     Modify Section 6.1, Function Definitions, p. 63
     (modify description of overloading, beginning at the top of p. 64)
      Function names can be overloaded.  The same function name can be used for
      multiple functions, as long as the parameter types differ.  If a function
      name is declared twice with the same parameter types, then the return
      types and all qualifiers must also match, and it is the same function
      being declared.  For example,
        vec4 f(in vec4 x, out vec4  y);   // (A)
        vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
        vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type
        int  f(in vec4 x, out ivec4 y);  // error, only return type differs
        vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
        vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs
      When function calls are resolved, an exact type match for all the
      arguments is sought.  If an exact match is found, all other functions are
      ignored, and the exact match is used.  If no exact match is found, then
      the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
      applied to find a match.  Mismatched types on input parameters (in or
      inout or default) must have a conversion from the calling argument type
      to the formal parameter type.  Mismatched types on output parameters (out
      or inout) must have a conversion from the formal parameter type to the
      calling argument type.
      If implicit conversions can be used to find more than one matching
      function, a single best-matching function is sought.  To determine a best
      match, the conversions between calling argument and formal parameter
      types are compared for each function argument and pair of matching
      functions.  After these comparisons are performed, each pair of matching
      functions are compared.  A function definition A is considered a better
      match than function definition B if:
        * for at least one function argument, the conversion for that argument
          in A is better than the corresponding conversion in B; and
        * there is no function argument for which the conversion in B is better
          than the corresponding conversion in A.
      If a single function definition is considered a better match than every
      other matching function definition, it will be used.  Otherwise, a
      semantic error occurs and the shader will fail to compile.
      To determine whether the conversion for a single argument in one match is
      better than that for another match, the following rules are applied, in
      order:
        1. An exact match is better than a match involving any implicit
           conversion.
        2. A match involving an implicit conversion from float to double is
           better than a match involving any other implicit conversion.
        3. A match involving an implicit conversion from either int or uint to
           float is better than a match involving an implicit conversion from
           either int or uint to double.
      If none of the rules above apply to a particular pair of conversions,
      neither conversion is considered better than the other.
      For the function prototypes (A), (B), and (C) above, the following
      examples show how the rules apply to different sets of calling argument
      types:
        f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
        f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
        f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
                              //   (C) not relevant, can't convert vec4 to
                              //   ivec4.  (A) better than (B) for 2nd
                              //   argument (rule 2), same on first argument.
        f(ivec4, vec4);       // NOT matched.  All three match by implicit
                              //   conversion.  (C) is better than (A) and (B)
                              //   on the first argument.  (A) is better than
                              //   (B) and (C) on the second argument.
     Modify Section 8.3, Common Functions, p. 84
     (add support for single-precision frexp and ldexp functions)
     Syntax:
       genType frexp(genType x, out genIType exp);
       genType ldexp(genType x, in genIType exp);
     The function frexp() splits each single-precision floating-point number in
     <x> into a binary significand, a floating-point number in the range [0.5,
     1.0), and an integral exponent of two, such that:
       x = significand * 2 ^ exponent
     The significand is returned by the function; the exponent is returned in
     the parameter <exp>.  For a floating-point value of zero, the significand
     and exponent are both zero.  For a floating-point value that is an
     infinity or is not a number, the results of frexp() are undefined.
     If the input <x> is a vector, this operation is performed in a
     component-wise manner; the value returned by the function and the value
     written to <exp> are vectors with the same number of components as <x>.
     The function ldexp() builds a single-precision floating-point number from
     each significand component in <x> and the corresponding integral exponent
     of two in <exp>, returning:
       significand * 2 ^ exponent
     If this product is too large to be represented as a single-precision
     floating-point value, the result is considered undefined.
     If the input <x> is a vector, this operation is performed in a
     component-wise manner; the value passed in <exp> and returned by the
     function are vectors with the same number of components as <x>.
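
The C library functions frexpf() and ldexpf() use the same convention (a significand in [0.5, 1.0), and zero mapping to a zero significand and exponent), so the component-wise behavior can be sketched as below. The struct types and function names are hypothetical stand-ins for vec2 and ivec2, and the sketch does not model the cases this extension leaves undefined (infinities, NaNs, and overflow in ldexp).

```c
#include <math.h>

/* Hypothetical two-component types standing in for GLSL vec2 / ivec2. */
struct vec2  { float x, y; };
struct ivec2 { int   x, y; };

/* Component-wise frexp: returns the significands, writes the exponents. */
struct vec2 glsl_frexp2(struct vec2 v, struct ivec2 *exp_out)
{
    struct vec2 sig;
    sig.x = frexpf(v.x, &exp_out->x);
    sig.y = frexpf(v.y, &exp_out->y);
    return sig;
}

/* Component-wise ldexp: rebuilds significand * 2^exponent. */
struct vec2 glsl_ldexp2(struct vec2 sig, struct ivec2 e)
{
    struct vec2 r;
    r.x = ldexpf(sig.x, e.x);
    r.y = ldexpf(sig.y, e.y);
    return r;
}
```

For example, 8.0 splits into significand 0.5 and exponent 4, and -3.0 into -0.75 and 2; passing those pairs to the ldexp sketch recovers the inputs.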
     (add support for new integer built-in functions)
     Syntax:
       genIType bitfieldExtract(genIType value, int offset, int bits);
       genUType bitfieldExtract(genUType value, int offset, int bits);
       genIType bitfieldInsert(genIType base, genIType insert, int offset,
                               int bits);
       genUType bitfieldInsert(genUType base, genUType insert, int offset,
                               int bits);
       genIType bitfieldReverse(genIType value);
       genUType bitfieldReverse(genUType value);
       genIType bitCount(genIType value);
       genIType bitCount(genUType value);
       genIType findLSB(genIType value);
       genIType findLSB(genUType value);
       genIType findMSB(genIType value);
       genIType findMSB(genUType value);
     The function bitfieldExtract() extracts bits <offset> through
     <offset>+<bits>-1 from each component in <value>, returning them in the
     least significant bits of corresponding component of the result.  For
     unsigned data types, the most significant bits of the result will be set
     to zero.  For signed data types, the most significant bits will be set to
     the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
     zero.  The result will be undefined if <offset> or <bits> is negative, or
     if the sum of <offset> and <bits> is greater than the number of bits used
     to store the operand.  Note that for vector versions of bitfieldExtract(),
     a single pair of <offset> and <bits> values is shared for all components.
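
A minimal scalar sketch of these rules in C, assuming 32-bit operands. The function names are hypothetical, callers are expected to stay within the defined range (non-negative <offset> and <bits>, with a sum of at most 32), and the final cast to int32_t assumes the usual two's-complement conversion.

```c
#include <stdint.h>

/* Unsigned extract: bits above the field are zero. */
uint32_t glsl_bitfield_extract_u(uint32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0;
    /* bits > 0 implies offset <= 31, so this shift is well defined. */
    uint32_t shifted = value >> offset;
    if (bits == 32)
        return shifted;               /* offset must be 0 in this case */
    return shifted & ((UINT32_C(1) << bits) - 1);
}

/* Signed extract: bits above the field copy bit offset+bits-1. */
int32_t glsl_bitfield_extract_i(int32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = glsl_bitfield_extract_u((uint32_t)value, offset, bits);
    /* Sign-extend from the top bit of the extracted field. */
    uint32_t sign = UINT32_C(1) << (bits - 1);
    return (int32_t)((field ^ sign) - sign);
}
```

Extracting 4 bits at offset 4 from 0xF0 gives 15 in the unsigned case and -1 in the signed case, because bit 7 of the input is set.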
     The function bitfieldInsert() inserts the <bits> least significant bits of
     each component of <insert> into the corresponding component of <base>.
     The result will have bits numbered <offset> through <offset>+<bits>-1
     taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
     directly from the corresponding bits of <base>.  If <bits> is zero, the
     result will simply be <base>.  The result will be undefined if <offset> or
     <bits> is negative, or if the sum of <offset> and <bits> is greater than
     the number of bits used to store the operand.  Note that for vector
     versions of bitfieldInsert(), a single pair of <offset> and <bits> values
     is shared for all components.
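
The insertion rule can be sketched the same way for a single 32-bit component. The function name is hypothetical, and inputs are assumed to be within the defined range.

```c
#include <stdint.h>

/* Replace bits offset..offset+bits-1 of base with the low bits of insert. */
uint32_t glsl_bitfield_insert(uint32_t base, uint32_t insert,
                              int offset, int bits)
{
    if (bits == 0)
        return base;
    /* A field of 32 bits needs a special case: 1 << 32 is undefined in C. */
    uint32_t field = (bits == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << bits) - 1);
    uint32_t mask = field << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

Only the masked field changes: inserting 0xAB as 8 bits at offset 8 into 0xFFFF0000 yields 0xFFFFAB00.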
     The function bitfieldReverse() reverses the bits of <value>.  The bit
     numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
     <value>, where <bits> is the total number of bits used to represent
     <value>.
     The function bitCount() returns the number of one bits in the binary
     representation of <value>.
     The function findLSB() returns the bit number of the least significant one
     bit in the binary representation of <value>.  If <value> is zero, -1 will
     be returned.
     The function findMSB() returns the bit number of the most significant bit
     in the binary representation of <value>.  For positive integers, the
     result will be the bit number of the most significant one bit.  For
     negative integers, the result will be the bit number of the most
     significant zero bit.  For a <value> of zero or negative one, -1 will be
     returned.
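
The four functions above can be sketched for scalar 32-bit values with simple loops. These are illustrative reference versions with hypothetical names, written for clarity rather than speed; real hardware or drivers would normally use dedicated instructions.

```c
#include <stdint.h>

uint32_t glsl_bitfield_reverse(uint32_t value)
{
    uint32_t result = 0;
    for (int n = 0; n < 32; n++) {
        /* Bit n of the result comes from bit 31 - n of the input. */
        result |= ((value >> (31 - n)) & 1u) << n;
    }
    return result;
}

int glsl_bit_count(uint32_t value)
{
    int count = 0;
    for (; value != 0; value &= value - 1)   /* clears the lowest set bit */
        count++;
    return count;
}

int glsl_find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;
    int n = 0;
    while (((value >> n) & 1u) == 0)
        n++;
    return n;
}

int glsl_find_msb_u(uint32_t value)
{
    int n = 31;
    while (n >= 0 && ((value >> n) & 1u) == 0)
        n--;
    return n;                                /* -1 when value is zero */
}

int glsl_find_msb_i(int32_t value)
{
    /* For negative inputs, search for the most significant zero bit. */
    uint32_t u = (uint32_t)value;
    return glsl_find_msb_u(value < 0 ? ~u : u);
}
```

Inverting a negative input turns its most significant zero bit into the most significant one bit, which is why the signed variant can reuse the unsigned search and still return -1 for both zero and negative one.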
     (support for unsigned integer add/subtract with carry-out)
     Syntax:
       genUType uaddCarry(genUType x, genUType y, out genUType carry);
       genUType usubBorrow(genUType x, genUType y, out genUType borrow);
     The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
     <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
     the sum was less than 2^32, or one otherwise.
     The function usubBorrow() subtracts the 32-bit unsigned integer or vector
     <y> from <x>, returning the difference if non-negative or 2^32 plus the
     difference, otherwise.  The value <borrow> is set to zero if x >= y, or
     one otherwise.
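
A scalar sketch of the carry and borrow rules in C, relying on the fact that unsigned arithmetic in C wraps modulo 2^32 for uint32_t. The function names are hypothetical.

```c
#include <stdint.h>

uint32_t glsl_uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;            /* wraps modulo 2^32 */
    /* The true sum reached 2^32 exactly when the wrapped sum is below x. */
    *carry = (sum < x) ? 1u : 0u;
    return sum;
}

uint32_t glsl_usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x >= y) ? 0u : 1u;
    return x - y;                    /* equals 2^32 + (x - y) when x < y */
}
```

For instance, adding 1 to 0xFFFFFFFF returns 0 with a carry of 1, and subtracting 5 from 3 returns 0xFFFFFFFE with a borrow of 1.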
     (support for signed and unsigned multiplies, with 32-bit inputs and a
     64-bit result spanning two 32-bit outputs)
     Syntax:
       void umulExtended(genUType x, genUType y, out genUType msb,
                         out genUType lsb);
       void imulExtended(genIType x, genIType y, out genIType msb,
                         out genIType lsb);
     The functions umulExtended() and imulExtended() multiply 32-bit unsigned
     or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
     32 least significant bits are returned in <lsb>; the 32 most significant
     bits are returned in <msb>.
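
Where a 64-bit integer type is available, the split into <msb> and <lsb> can be sketched directly in C. The function names are hypothetical, and the casts back to int32_t assume the usual two's-complement conversion.

```c
#include <stdint.h>

void glsl_umul_extended(uint32_t x, uint32_t y,
                        uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * y;      /* full 64-bit product */
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

void glsl_imul_extended(int32_t x, int32_t y,
                        int32_t *msb, int32_t *lsb)
{
    int64_t product = (int64_t)x * y;        /* cannot overflow 64 bits */
    uint64_t bits = (uint64_t)product;       /* split the bit pattern */
    *msb = (int32_t)(uint32_t)(bits >> 32);
    *lsb = (int32_t)(uint32_t)bits;
}
```

As a check, 0xFFFFFFFF times 0xFFFFFFFF is 0xFFFFFFFE00000001, so the unsigned sketch returns 0xFFFFFFFE in <msb> and 1 in <lsb>.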
 GLX Protocol
     None.
 Dependencies on ARB_gpu_shader_fp64
     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
     of implicit conversions supported in the OpenGL Shading Language.  If more
     than one of these extensions is supported, an expression of one type may
     be converted to another type if that conversion is allowed by any of these
     specifications.
     If ARB_gpu_shader_fp64 or a similar extension introducing new data types
     is not supported, the function overloading rule in the GLSL specification
     preferring promotion of an input parameter from a smaller type to a larger
     type is never applicable, as all data types are of the same size.  That rule
     and the example referring to "double" should be removed.
 Dependencies on NV_gpu_shader5
     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
     of implicit conversions supported in the OpenGL Shading Language.  If more
     than one of these extensions is supported, an expression of one type may
     be converted to another type if that conversion is allowed by any of these
     specifications.
     If NV_gpu_shader5 is supported, integer data types are supported with four
     different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
     types are supported with three different precisions (16-, 32-, and
     64-bit).  The extension adds the following rule for output parameters,
     which is similar to the one present in this extension for input
     parameters:
 . If the formal parameters in both matches are output parameters, a
           conversion from a type with a larger number of bits per component is
           better than a conversion from a type with a smaller number of bits
           per component.  For example, a conversion from an "int16_t" formal
           parameter type to "int"  is better than one from an "int8_t" formal
           parameter type to "int".
     Such a rule is not provided in this extension because there is no
     combination of types in this extension and ARB_gpu_shader_fp64 where this
     rule has any effect.
 Errors
     None
 New State
     None
 New Implementation Dependent State
     None
 Issues
     (1) What should this extension be called?
       UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
       some sort of a play on that name would be viable.  However, nothing in
       this extension should require SM5 hardware, so such a name would be a
       little misleading and weird.
       Since the primary purpose is to add integer related functions from
       GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
       for now.
     (2) Why is some of the formatting in this extension weird?
       RESOLVED: This extension is formatted to minimize the differences (as
       reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
       specification.
     (3) Should ldexp and frexp be included?
       RESOLVED: Yes.  Few GPUs have native instructions to implement these
       functions.  These are generally implemented using existing GLSL built-in
       functions and the other functions provided by this extension.
     (4) Should umulExtended and imulExtended be included?
       RESOLVED: Yes.  These functions should be implementable on any GPU that
       can support the rest of this extension, but the implementation may be
       complex.  The implementation on a GPU that only supports 32bit x 32bit =
       32bit multiplication would be quite expensive.  However, many GPUs
       (including OpenGL 4.0 GPUs that already support this function) have a
       32bit x 16bit = 48bit multiplier.  The implementation there is only
       trivially more expensive than regular 32bit multiplication.
     (5) Should the pack and unpack functions be included?
       RESOLVED: No.  These functions are already available via
       GL_ARB_shading_language_packing.
     (6) Should the "BitsTo" functions be included?
       RESOLVED: No.  These functions are already available via
       GL_ARB_shader_bit_encoding.
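
To illustrate the cost discussed in issue (4), the sketch below builds the unsigned extended multiply from four 16-bit x 16-bit partial products, each of which fits in 32 bits. This is one possible decomposition, shown under the assumption that only 32-bit results are available; it is not taken from any Mesa driver, and the function name is hypothetical.

```c
#include <stdint.h>

void umul_extended_16bit_parts(uint32_t x, uint32_t y,
                               uint32_t *msb, uint32_t *lsb)
{
    uint32_t xl = x & 0xFFFFu, xh = x >> 16;
    uint32_t yl = y & 0xFFFFu, yh = y >> 16;

    /* Each partial product is at most 0xFFFE0001, so none overflows. */
    uint32_t ll = xl * yl;
    uint32_t lh = xl * yh;
    uint32_t hl = xh * yl;
    uint32_t hh = xh * yh;

    /* Sum the contributions to bits 16..31; at most 0x2FFFC. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (mid << 16) | (ll & 0xFFFFu);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

Four multiplies plus the carry handling replace a single instruction, which gives a sense of why the issue text describes this path as expensive.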
 Revision History
     Rev.      Date     Author    Changes
     ----  -----------  --------  -----------------------------------------
      3    31-Mar-2017  Jon Leech Add ES support (OpenGL-Registry/issues/3)
      2     7-Jul-2016  idr       Fix typo in #extension line
      1    20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.

									

docs/specs/MESA_texture_array.spec
									
				@@ -76,9 +76,9 @@ Overview

				    References:

				        http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011557

				        http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=000516

				        http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011903

				        https://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011557

				        https://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=000516

				        https://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011903

				        http://www.delphi3d.net/articles/viewarticle.php?article=terraintex.htm

				New Procedures and Functions


src/egl/docs/EGL_MESA_screen_surface → docs/specs/OLD/EGL_MESA_screen_surface.txt


docs/specs/WL_bind_wayland_display.spec
									
				@@ -75,6 +75,7 @@ New Tokens

				        EGL_TEXTURE_Y_U_V_WL                    0x31D7

				        EGL_TEXTURE_Y_UV_WL                     0x31D8

				        EGL_TEXTURE_Y_XUXV_WL                   0x31D9

				        EGL_TEXTURE_EXTERNAL_WL                 0x31DA

				    Accepted in the <attribute> parameter of eglQueryWaylandBufferWL:

				@@ -148,6 +149,10 @@ Additions to the EGL 1.4 Specification:

				                Two planes, samples Y from the first plane to r in

				                the shader, U and V from the second plane to g and a.

				        EGL_TEXTURE_EXTERNAL_WL

				                Treated as a single plane texture, but sampled with

				                samplerExternalOES according to OES_EGL_image_external

				    After querying the wl_buffer layout, create EGLImages for the

				    planes by calling eglCreateImageKHR with wl_buffer as

				    EGLClientBuffer, EGL_WAYLAND_BUFFER_WL as the target, NULL


docs/specs/enums.txt


@@ -1,10 +1,18 @@
 The definitive source for enum values and reserved ranges are the XML files in
 the Khronos registry:
 See the OpenGL ARB enum registry at http://www.opengl.org/registry/api/enum.spec
     https://github.com/KhronosGroup/EGL-Registry/blob/master/api/egl.xml
     https://github.com/KhronosGroup/OpenGL-Registry/blob/master/xml/gl.xml
     https://github.com/KhronosGroup/OpenGL-Registry/blob/master/xml/glx.xml
     https://github.com/KhronosGroup/OpenGL-Registry/blob/master/xml/wgl.xml
 Blocks allocated to Mesa:
 GL blocks allocated to Mesa:
    0x8750-0x875F
    0x8BB0-0x8BBF
 EGL blocks allocated to Mesa:
    0x31D0-0x31DF
    0x3290-0x329F
 GL_MESA_packed_depth_stencil
 	GL_DEPTH_STENCIL_MESA            0x8750
@@ -13,7 +21,7 @@ GL_MESA_packed_depth_stencil
 	GL_UNSIGNED_SHORT_15_1_MESA      0x8753
 	GL_UNSIGNED_SHORT_1_15_REV_MESA  0x8754
 GL_MESA_trace.spec:
 GL_MESA_trace:
 	GL_TRACE_ALL_BITS_MESA           0xFFFF
 	GL_TRACE_OPERATIONS_BIT_MESA     0x0001
 	GL_TRACE_PRIMITIVES_BIT_MESA     0x0002
@@ -24,12 +32,12 @@ GL_MESA_trace.spec:
 	GL_TRACE_MASK_MESA               0x8755
 	GL_TRACE_NAME_MESA               0x8756
 MESA_ycbcr_texture.spec:
 GL_MESA_ycbcr_texture:
 	GL_YCBCR_MESA                    0x8757
 	GL_UNSIGNED_SHORT_8_8_MESA       0x85BA /* same as Apple's */
 	GL_UNSIGNED_SHORT_8_8_REV_MESA   0x85BB /* same as Apple's */
 GL_MESA_pack_invert.spec
 GL_MESA_pack_invert:
 	GL_PACK_INVERT_MESA              0x8758
 GL_MESA_shader_debug.spec: (obsolete)
@@ -37,7 +45,7 @@ GL_MESA_shader_debug.spec: (obsolete)
         GL_DEBUG_PRINT_MESA              0x875A
         GL_DEBUG_ASSERT_MESA             0x875B
 GL_MESA_program_debug.spec: (obsolete)
 GL_MESA_program_debug: (obsolete)
 	GL_FRAGMENT_PROGRAM_CALLBACK_MESA      0x????
 	GL_VERTEX_PROGRAM_CALLBACK_MESA        0x????
 	GL_FRAGMENT_PROGRAM_POSITION_MESA      0x????
@@ -55,3 +63,29 @@ GL_MESAX_texture_stack:
 	GL_TEXTURE_1D_STACK_BINDING_MESAX    0x875D
 	GL_TEXTURE_2D_STACK_BINDING_MESAX    0x875E
 EGL_MESA_drm_image
         EGL_DRM_BUFFER_FORMAT_MESA		0x31D0
         EGL_DRM_BUFFER_USE_MESA			0x31D1
         EGL_DRM_BUFFER_FORMAT_ARGB32_MESA	0x31D2
         EGL_DRM_BUFFER_MESA			0x31D3
         EGL_DRM_BUFFER_STRIDE_MESA		0x31D4
 EGL_MESA_platform_gbm
         EGL_PLATFORM_GBM_MESA                   0x31D7
 EGL_MESA_platform_surfaceless
         EGL_PLATFORM_SURFACELESS_MESA           0x31DD
 EGL_MESA_drm_image
         EGL_DRM_BUFFER_FORMAT_ARGB2101010_MESA  0x3290
         EGL_DRM_BUFFER_FORMAT_ARGB1555_MESA     0x3291
         EGL_DRM_BUFFER_FORMAT_RGB565_MESA       0x3292
 EGL_WL_bind_wayland_display
         EGL_TEXTURE_FORMAT                      0x3080
         EGL_WAYLAND_BUFFER_WL                   0x31D5
         EGL_WAYLAND_PLANE_WL                    0x31D6
         EGL_TEXTURE_Y_U_V_WL                    0x31D7
         EGL_TEXTURE_Y_UV_WL                     0x31D8
         EGL_TEXTURE_Y_XUXV_WL                   0x31D9
         EGL_WAYLAND_Y_INVERTED_WL               0x31DB

									

docs/submittingpatches.html
									
				@@ -0,0 +1,374 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Submitting patches</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Submitting patches</h1>

				<ul>

				<li><a href="#guidelines">Basic guidelines</a>

				<li><a href="#formatting">Patch formatting</a>

				<li><a href="#testing">Testing Patches</a>

				<li><a href="#mailing">Mailing Patches</a>

				<li><a href="#reviewing">Reviewing Patches</a>

				<li><a href="#nominations">Nominating a commit for a stable branch</a>

				<li><a href="#criteria">Criteria for accepting patches to the stable branch</a>

				<li><a href="#backports">Sending backports for the stable branch</a>

				<li><a href="#gittips">Git tips</a>

				</ul>

				<h2 id="guidelines">Basic guidelines</h2>

				<ul>

				<li>Patches should not mix code changes with code formatting changes (except,

				perhaps, in very trivial cases.)

				<li>Code patches should follow Mesa

				<a href="codingstyle.html" target="_parent">coding conventions</a>.

				<li>Whenever possible, patches should only affect individual Mesa/Gallium

				components.

				<li>Patches should never introduce build breaks and should be bisectable (see

				<code>git bisect</code>.)

				<li>Patches should be properly <a href="#formatting">formatted</a>.

				<li>Patches should be sufficiently <a href="#testing">tested</a> before submitting.

				<li>Patches should be submitted to <a href="#mailing">mesa-dev</a>

				for <a href="#reviewing">review</a> using <code>git send-email</code>.

				</ul>

				<h2 id="formatting">Patch formatting</h2>

				<ul>

				<li>Lines should be limited to 75 characters or less so that git logs

				displayed in 80-column terminals avoid line wrapping.  Note that git

				log uses 4 spaces of indentation (4 + 75 &lt; 80).

				<li>The first line should be a short, concise summary of the change prefixed

				with a module name.  Examples:

				<pre>

				    mesa: Add support for querying GL_VERTEX_ATTRIB_ARRAY_LONG

				    gallium: add PIPE_CAP_DEVICE_RESET_STATUS_QUERY

				    i965: Fix missing type in local variable declaration.

				</pre>

				<li>Subsequent patch comments should describe the change in more detail,

				if needed.  For example:

				<pre>

				    i965: Remove end-of-thread SEND alignment code.

				    This was present in Eric's initial implementation of the compaction code

				    for Sandybridge (commit 077d01b6). There is no documentation saying this

				    is necessary, and removing it causes no regressions in piglit on any

				    platform.

				</pre>

				<li>A "Signed-off-by:" line is not required, but not discouraged either.

				<li>If a patch addresses a bugzilla issue, that should be noted in the

				patch comment.  For example:

				<pre>

				   Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89689

				</pre>

				<li>If a patch addresses an issue introduced with an earlier commit, that should be

				noted in the patch comment.  For example:

				<pre>

				   Fixes: d7b3707c612 "util/disk_cache: use stat() to check if entry is a directory"

				</pre>

				<li>If there have been several revisions to a patch during the review

				process, they should be noted such as in this example:

				<pre>

				    st/mesa: add ARB_texture_stencil8 support (v4)

				    if we support stencil texturing, enable texture_stencil8

				    there is no requirement to support native S8 for this,

				    the texture can be converted to x24s8 fine.

				    v2: fold fixes from Marek in:

				       a) put S8 last in the list

				       b) fix renderable to always test for d/s renderable

				        fixup the texture case to use a stencil only format

				        for picking the format for the texture view.

				    v3: hit fallback for getteximage

				    v4: put s8 back in front, it shouldn't get picked now (Ilia)

				</pre>

				<li>If someone tested your patch, document it with a line like this:

				<pre>

				    Tested-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				<li>If the patch was reviewed (usually the case) or acked by someone,

				that should be documented with:

				<pre>

				    Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;

				    Acked-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				<li>If sending a later revision of a patch, add all the tags - ack, r-b,

				Cc: mesa-stable and/or other. This provides reviewers with quick feedback if the

				patch has already been reviewed.

				<li>In order for your patch to reach the prospective reviewer easier/faster,

				use the script scripts/get_reviewer.pl to get a list of individuals and include

				them in the CC list.

				<br>

				Please use common sense and do <strong>not</strong> blindly add everyone.

				<br>

				<pre>

				    $ scripts/get_reviewer.pl --help # to get the help screen

				    $ scripts/get_reviewer.pl -f src/egl/drivers/dri2/platform_android.c

				    Rob Herring &lt;robh@kernel.org&gt; (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)

				    Tomasz Figa &lt;tfiga@chromium.org&gt; (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)

				    Emil Velikov &lt;emil.l.velikov@gmail.com&gt; (authored:13/41=32%,removed_lines:76/283=27%)

				</pre>

				</ul>

				<h2 id="testing">Testing Patches</h2>

				<p>

				It should go without saying that patches must be tested.  In general,

				do whatever testing is prudent.

				</p>

				<p>

				You should always run the Mesa test suite before submitting patches.

				The test suite can be run using the 'make check' command. All tests

				must pass before patches will be accepted; this may mean you have

				to update the tests themselves.

				</p>

				<p>

				Whenever possible and applicable, test the patch with

				<a href="https://piglit.freedesktop.org">Piglit</a> and/or

				<a href="https://android.googlesource.com/platform/external/deqp/">dEQP</a>

				to check for regressions.

				</p>

				<h2 id="mailing">Mailing Patches</h2>

				<p>

				Patches should be sent to the mesa-dev mailing list for review:

				<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">

				mesa-dev@lists.freedesktop.org</a>.

				When submitting a patch make sure to use

				<a href="https://git-scm.com/docs/git-send-email">git send-email</a>

				rather than attaching patches to emails. Sending patches as

				attachments prevents people from being able to provide in-line review

				comments.

				</p>

				<p>

				When submitting follow-up patches you can use --in-reply-to to make v2, v3,

				etc patches show up as replies to the originals. This usually works well

				when you're sending out updates to individual patches (as opposed to

				re-sending the whole series). Using --in-reply-to makes

				it harder for reviewers to accidentally review old patches.

				</p>

				<p>

				When submitting follow-up patches you should also login to

				<a href="https://patchwork.freedesktop.org">patchwork</a> and change the

				state of your old patches to Superseded.

				</p>

				<p>

				Some companies' mail servers automatically append a legal disclaimer,

				usually containing something along the lines of "The information in this

				email is confidential" and "distribution is strictly prohibited".<br/>

				These legal notices prevent us from being able to accept your patch,

				rendering the whole process pointless. Please make sure these are

				disabled before sending your patches. (Note that you may need to contact

				your email administrator for this.)

				</p>

				<h2 id="reviewing">Reviewing Patches</h2>

				<p>

				When you've reviewed a patch on the mailing list, please be unambiguous

				about your review.  That is, state either

				</p>

				<pre>

				    Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				or

				<pre>

				    Acked-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				<p>

				Rather than saying just "LGTM" or "Seems OK".

				</p>

				<p>

				If small changes are suggested, it's OK to say something like:

				</p>

				<pre>

				   With the above fixes, Reviewed-by: Joe Hacker &lt;jhacker@foo.com&gt;

				</pre>

				<p>

				which tells the patch author that the patch can be committed, as long

				as the issues are resolved first.

				</p>

				<h2 id="nominations">Nominating a commit for a stable branch</h2>

				<p>

				There are three ways to nominate a patch for inclusion in the stable branch and

				release.

				</p>

				<ul>

				<li> By adding the Cc: mesa-stable@ tag as described below.

				<li> Sending the commit ID (as seen in master branch) to the mesa-stable@ mailing list.

				<li> Forwarding the patch from the mesa-dev@ mailing list.

				</li>

				</ul>

				<p>

				Note: resending a patch identical to one on mesa-dev@ or one that differs only

				by the extra mesa-stable@ tag is <strong>not</strong> recommended.

				</p>

				<h3 id="thetag">The stable tag</h3>

				<p>

				If you want a commit to be applied to a stable branch,

				you should add an appropriate note to the commit message.

				</p>

				<p>

				Here are some examples of such a note:

				</p>

				<ul>

				  <li>CC: &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				</ul>

				Simply adding the CC to the mesa-stable list address is adequate to nominate

				the commit for all the active stable branches. If the commit is not applicable

				to a given branch, the stable-release manager will reply stating so.

				This "CC" syntax for patch nomination will cause patches to automatically be

				copied to the mesa-stable@ mailing list when you use "git send-email" to send

				patches to the mesa-dev@ mailing list. If you prefer using --suppress-cc that

				won't have any negative effect on the patch nomination.

				<p>

				Note: by removing the tag [as the commit is pushed] the patch is

				<strong>explicitly</strong> rejected from inclusion in the stable branch(es).

				<br>

				Thus, drop the line <strong>only</strong> if you want to cancel the nomination.

				</p>

				Alternatively, if one uses the "Fixes" tag as described in the "Patch formatting"

				section, it nominates a commit for all active stable branches that include the

				commit that is referred to.

				<h2 id="criteria">Criteria for accepting patches to the stable branch</h2>

				Mesa has a designated release manager for each stable branch, and the release

				manager is the only developer that should be pushing changes to these branches.

				Everyone else should nominate patches using the mechanism described above.

				The following rules define which patches are accepted and which are not. The

				stable-release manager is also given broad discretion in rejecting patches

				that have been nominated.

				<ul>

				  <li>Patch must conform with the <a href="#guidelines">Basic guidelines</a></li>

				  <li>Patch must have landed in master first. In cases where the original

				  patch is too large and/or otherwise conflicts with the rules set out here, a

				  backport is appropriate.</li>

				  <li>It must not introduce a regression - be that build or runtime wise.

				  Note: If the regression is due to a faulty piglit/dEQP/CTS/other test, the

				  latter must be fixed first. A reference to the offending test(s) and

				  respective fix(es) should be provided in the nominated patch.</li>

				  <li>Patch cannot be larger than 100 lines.</li>

				  <li>Patches that move code around with no functional change should be

				  rejected.</li>

				  <li>Patch must be a bug fix and not a new feature.

				  Note: Hardware-enabling "features" are an exception to this rule. For

				  example, <a href="#backports">backports</a> of new code to support a

				  newly-developed hardware product can be accepted if they can be reasonably

				  determined not to have effects on other hardware.</li>

				  <li>Patch must be reviewed. For example, the commit message has Reviewed-by,

				  Signed-off-by, or Tested-by tags from someone other than the author.</li>

				  <li>Performance patches are considered only if they provide information

				  about the hardware, program in question and observed improvement. Use numbers

				  to represent your measurements.</li>

				</ul>

				If the patch complies with the rules it will be

				<a href="releasing.html#pickntest">cherry-picked</a>. Alternatively the release

				manager will reply to the patch in question stating why the patch has been

				rejected or would request a backport.

				A summary of all the picked/rejected patches will be presented in the

				<a href="releasing.html#prerelease">pre-release</a> announcement.

				The stable-release manager may at times need to force-push changes to the

				stable branches (for example, to drop a previously-picked patch that was later

				identified as causing a regression). These force-pushes may cause changes to

				be lost from the stable branch if developers push things directly. Consider

				yourself warned.

				<h2 id="backports">Sending backports for the stable branch</h2>

				By default, merge conflicts are resolved by the stable-release manager, who

				should then describe the changes required alongside the

				<code>Conflicts</code> section. A summary of these will be provided in the

				<a href="releasing.html#prerelease">pre-release</a> announcement.

				<br>

				Developers interested in sending backports are encouraged to use either a

				<code>[BACKPORT #branch]</code> subject prefix or provide similar information

				within the commit summary.

				<h2 id="gittips">Git tips</h2>

				<ul>

				<li><code>git rebase -i ...</code> is your friend. Don't be afraid to use it.

				<li>Apply a fixup to commit FOO.

				<pre>

				    git add ...

				    git commit --fixup=FOO

				    git rebase -i --autosquash ...

				</pre>

				<li>Test for build breakage between patches, e.g. across the last 8 commits.

				<pre>

				    git rebase -i --exec="make -j4" HEAD~8

				</pre>

				<li>Set the default mailing address for your repo.

				<pre>

				    git config --local sendemail.to mesa-dev@lists.freedesktop.org

				</pre>

				<li> Add a version to the subject line of a patch series before sending, in this

				case for the last 8 commits.

				<pre>

				    git send-email --subject-prefix="PATCH v4" HEAD~8

				    git send-email -v4 @~8 # shorter version, inherited from git format-patch

				</pre>

				<li> Configure git to use the get_reviewer.pl script interactively. Thus you

				can avoid adding the world to the CC list.

				<pre>

				    git config sendemail.cccmd "./scripts/get_reviewer.pl -i"

				</pre>

				</ul>

				</div>

				</body>

				</html>

									
										8

docs/systems.html
									
												
				@@ -36,10 +36,10 @@ Hardware drivers include:

				  <li>Intel i965, i945, i915.

				    See <a href="https://01.org/linuxgraphics">Intel's website</a></li>

				  <li>AMD Radeon series.

				  See <a href="http://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>

				  See <a href="https://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>

				  <li>NVIDIA GPUs.

				  See <a href="http://nouveau.freedesktop.org">Nouveau Wiki</a></li>

				  <li><a href="http://www.x.org/wiki/vmware">VMware virtual GPU</a></li>

				  See <a href="https://nouveau.freedesktop.org">Nouveau Wiki</a></li>

				  <li><a href="https://www.x.org/wiki/vmware">VMware virtual GPU</a></li>

				</ul>

				<p>

				@@ -57,7 +57,7 @@ Additional driver information:

				</p>

				<ul>

				<li><a href="http://dri.freedesktop.org/"> DRI hardware

				<li><a href="https://dri.freedesktop.org/"> DRI hardware

				drivers</a> for the X Window System

				<li><a href="xlibdriver.html">Xlib / swrast driver</a> for the X Window System

				and Unix-like operating systems

									
										13

docs/thanks.html
									
												
				@@ -24,7 +24,7 @@ This list is far from complete and somewhat dated, unfortunately.

				<ul>

				<li>Early Mesa development was done while Brian was part of the

				<a href="http://www.ssec.wisc.edu/~billh/vis.html">

				<a href="https://www.ssec.wisc.edu/~billh/vis.html">

				SSEC Visualization Project</a> at the University of

				Wisconsin. He'd like to thank Bill Hibbard for letting him work on

				Mesa as part of that project.

				@@ -40,14 +40,9 @@ Tungsten Graphics, Inc. have supported the ongoing development of Mesa.

				<br>

				<br>

				<li>The

				<a href="http://www.mesa3d.org">Mesa</a>

				website is hosted by

				<a href="http://sourceforge.net">sourceforge.net</a>.

				<br>

				<br>

				<li>The Mesa git repository is hosted by

				<a href="http://freedesktop.org/">freedesktop.org</a>.

				<a href="https://www.mesa3d.org">Mesa</a>

				website and git repository are hosted by

				<a href="https://freedesktop.org/">freedesktop.org</a>.

				<br>

				<br>

									
										6

docs/utilities.html
									
												
				@@ -17,11 +17,11 @@

				<h1>Development Utilities</h1>

				<dl>

				  <dt><a href="http://cgit.freedesktop.org/mesa/demos">Mesa demos collection</a></dt>

				  <dt><a href="https://cgit.freedesktop.org/mesa/demos">Mesa demos collection</a></dt>

				  <dd>includes several utility routines in the <code>src/util/</code>

				  directory.</dd>

				  <dt><a href="http://piglit.freedesktop.org">Piglit</a></dt>

				  <dt><a href="https://piglit.freedesktop.org">Piglit</a></dt>

				  <dd>is an open-source test suite for OpenGL implementations.</dd>

				  <dt><a href="https://github.com/apitrace/apitrace">ApiTrace</a></dt>

				@@ -31,7 +31,7 @@

				  <dd>is a very useful tool for tracking down

				  memory-related problems in your code.</dd>

				  <dt><a href="http://scan.coverity.com/projects/mesa">Coverity</a><dt>

				  <dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a><dt>

				  <dd>provides static code analysis of Mesa.  If you create an account

				  you can see the results and try to fix outstanding issues.</dd>

				</dl>

									
										8

docs/viewperf.html
									
												
				@@ -18,7 +18,7 @@

				<p>

				This page lists known issues with

				<a href="http://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>

				<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>

				and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a>

				when running on Mesa-based drivers.

				</p>

				@@ -66,10 +66,10 @@ either in Viewperf or the Mesa driver.

				<p>

				These tests use features of the

				<a href="http://www.opengl.org/registry/specs/NV/fragment_program2.txt"

				<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt"

				target="_main">

				GL_NV_fragment_program2</a> and

				<a href="http://www.opengl.org/registry/specs/NV/vertex_program3.txt"

				<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt"

				target="_main">

				GL_NV_vertex_program3</a> extensions without checking if the driver supports

				them.

				@@ -86,7 +86,7 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.

				<p>

				These tests depend on the

				<a href="http://www.opengl.org/registry/specs/NV/primitive_restart.txt"

				<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt"

				target="_main">GL_NV_primitive_restart</a> extension.

				</p>

									
										10

docs/vmware-guest.html
									
												
				@@ -18,7 +18,7 @@

				<p>

				This page describes how to build, install and use the

				<a href="http://www.vmware.com/">VMware</a> guest GL driver

				<a href="https://www.vmware.com/">VMware</a> guest GL driver

				(aka the SVGA or SVGA3D driver) for Linux using the latest source code.

				This driver gives a Linux virtual machine access to the host's GPU for

				hardware-accelerated 3D.

				@@ -62,9 +62,9 @@ these instructions explain what to do.

				For more information about the X components see these wiki pages at x.org:

				</p>

				<ul>

				<li><a href="http://wiki.x.org/wiki/vmware">

				<li><a href="https://wiki.x.org/wiki/vmware">

				Driver Overview</a>

				<li><a href="http://wiki.x.org/wiki/vmware/vmware3D">

				<li><a href="https://wiki.x.org/wiki/vmware/vmware3D">

				xf86-video-vmware Details</a>

				</ul>

				@@ -82,8 +82,8 @@ The components involved in this include:

				<p>

				All of these components reside in the guest Linux virtual machine.

				On the host, all you're doing is running VMware

				<a href="http://www.vmware.com/products/workstation/">Workstation</a> or

				<a href="http://www.vmware.com/products/fusion/">Fusion</a>.

				<a href="https://www.vmware.com/products/workstation/">Workstation</a> or

				<a href="https://www.vmware.com/products/fusion/">Fusion</a>.

				</p>

									
										7

docs/xlibdriver.html
									
												
				@@ -171,9 +171,8 @@ drawn with glDrawPixels.

				</p>

				<p>

				For more information about gamma correction see:

				<a href="http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html">

				the Gamma FAQ</a>

				For more information about gamma correction, see the

				<a href="https://en.wikipedia.org/wiki/Gamma_correction">Wikipedia article</a>

				</p>

				@@ -199,7 +198,7 @@ This incurs a small performance penalty.

				<h2>Extensions</h2>

				<p>

				The following MESA-specific extensions are implemented in the Xlib driver.

				The following Mesa-specific extensions are implemented in the Xlib driver.

				</p>

				<h3>GLX_MESA_pixmap_colormap</h3>

									
										2

include/D3D9/.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,2 @@

				[*.h]

				indent_style = tab

									
										24

include/EGL/egl.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2017 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -31,14 +31,14 @@ extern "C" {

				** This header is generated from the Khronos OpenGL / OpenGL ES XML

				** API Registry. The current version of the Registry, generator scripts

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**   http://www.opengl.org/registry/egl

				**

				** Khronos $Revision: 31039 $ on $Date: 2015-05-04 17:01:57 -0700 (Mon, 04 May 2015) $

				** Khronos $Revision$ on $Date$

				*/

				#include <EGL/eglplatform.h>

				/* Generated on date 20150504 */

				/* Generated on date 20161230 */

				/* Generated C header for:

				 * API: egl

				@@ -78,7 +78,7 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);

				#define EGL_CONFIG_ID                     0x3028

				#define EGL_CORE_NATIVE_ENGINE            0x305B

				#define EGL_DEPTH_SIZE                    0x3025

				#define EGL_DONT_CARE                     ((EGLint)-1)

				#define EGL_DONT_CARE                     EGL_CAST(EGLint,-1)

				#define EGL_DRAW                          0x3059

				#define EGL_EXTENSIONS                    0x3055

				#define EGL_FALSE                         0

				@@ -95,9 +95,9 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);

				#define EGL_NONE                          0x3038

				#define EGL_NON_CONFORMANT_CONFIG         0x3051

				#define EGL_NOT_INITIALIZED               0x3001

				#define EGL_NO_CONTEXT                    ((EGLContext)0)

				#define EGL_NO_DISPLAY                    ((EGLDisplay)0)

				#define EGL_NO_SURFACE                    ((EGLSurface)0)

				#define EGL_NO_CONTEXT                    EGL_CAST(EGLContext,0)

				#define EGL_NO_DISPLAY                    EGL_CAST(EGLDisplay,0)

				#define EGL_NO_SURFACE                    EGL_CAST(EGLSurface,0)

				#define EGL_PBUFFER_BIT                   0x0001

				#define EGL_PIXMAP_BIT                    0x0002

				#define EGL_READ                          0x305A

				@@ -197,7 +197,7 @@ typedef void *EGLClientBuffer;

				#define EGL_RGB_BUFFER                    0x308E

				#define EGL_SINGLE_BUFFER                 0x3085

				#define EGL_SWAP_BEHAVIOR                 0x3093

				#define EGL_UNKNOWN                       ((EGLint)-1)

				#define EGL_UNKNOWN                       EGL_CAST(EGLint,-1)

				#define EGL_VERTICAL_RESOLUTION           0x3091

				EGLAPI EGLBoolean EGLAPIENTRY eglBindAPI (EGLenum api);

				EGLAPI EGLenum EGLAPIENTRY eglQueryAPI (void);

				@@ -224,7 +224,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglWaitClient (void);

				#ifndef EGL_VERSION_1_4

				#define EGL_VERSION_1_4 1

				#define EGL_DEFAULT_DISPLAY               ((EGLNativeDisplayType)0)

				#define EGL_DEFAULT_DISPLAY               EGL_CAST(EGLNativeDisplayType,0)

				#define EGL_MULTISAMPLE_RESOLVE_BOX_BIT   0x0200

				#define EGL_MULTISAMPLE_RESOLVE           0x3099

				#define EGL_MULTISAMPLE_RESOLVE_DEFAULT   0x309A

				@@ -266,7 +266,7 @@ typedef void *EGLImage;

				#define EGL_FOREVER                       0xFFFFFFFFFFFFFFFFull

				#define EGL_TIMEOUT_EXPIRED               0x30F5

				#define EGL_CONDITION_SATISFIED           0x30F6

				#define EGL_NO_SYNC                       ((EGLSync)0)

				#define EGL_NO_SYNC                       EGL_CAST(EGLSync,0)

				#define EGL_SYNC_FENCE                    0x30F9

				#define EGL_GL_COLORSPACE                 0x309D

				#define EGL_GL_COLORSPACE_SRGB            0x3089

				@@ -283,7 +283,7 @@ typedef void *EGLImage;

				#define EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Z 0x30B7

				#define EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z 0x30B8

				#define EGL_IMAGE_PRESERVED               0x30D2

				#define EGL_NO_IMAGE                      ((EGLImage)0)

				#define EGL_NO_IMAGE                      EGL_CAST(EGLImage,0)

				EGLAPI EGLSync EGLAPIENTRY eglCreateSync (EGLDisplay dpy, EGLenum type, const EGLAttrib *attrib_list);

				EGLAPI EGLBoolean EGLAPIENTRY eglDestroySync (EGLDisplay dpy, EGLSync sync);

				EGLAPI EGLint EGLAPIENTRY eglClientWaitSync (EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout);
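
Note on the EGL_CAST changes in this file: the hunks above replace hand-written casts such as ((EGLint)-1) with EGL_CAST(EGLint,-1). The following is a minimal sketch of how such a helper can behave, assuming it expands to a C-style cast in C and to static_cast in C++. The typedefs are stand-ins so the sketch compiles without the EGL headers, and the two function names are hypothetical and not part of EGL.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in typedefs (assumption): the real ones come from the EGL headers. */
typedef int32_t EGLint;
typedef void *EGLContext;

/* Assumed shape of the cast helper: static_cast in C++, C-style cast in C. */
#if defined(__cplusplus)
#define EGL_CAST(type, value) (static_cast<type>(value))
#else
#define EGL_CAST(type, value) ((type) (value))
#endif

/* New-style definitions, as they appear in the hunks above. */
#define EGL_DONT_CARE  EGL_CAST(EGLint, -1)
#define EGL_NO_CONTEXT EGL_CAST(EGLContext, 0)

/* Hypothetical helper: returns 1 when the macro forms equal the old casts. */
static int egl_cast_matches_old_form(void)
{
    return EGL_DONT_CARE == ((EGLint)-1) &&
           EGL_NO_CONTEXT == ((EGLContext)0);
}

/* Hypothetical helper: aborts if the two forms ever disagree. */
static void verify_egl_cast(void)
{
    assert(egl_cast_matches_old_form());
}
```

Under these assumptions the change is value-preserving for C callers; the macro form mainly matters for C++ translation units, where a C-style cast can trigger warnings.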

									
										312

include/EGL/eglext.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2017 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -31,14 +31,14 @@ extern "C" {

				** This header is generated from the Khronos OpenGL / OpenGL ES XML

				** API Registry. The current version of the Registry, generator scripts

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**   http://www.opengl.org/registry/egl

				**

				** Khronos $Revision$ on $Date$

				*/

				#include <EGL/eglplatform.h>

				#define EGL_EGLEXT_VERSION 20150508

				#define EGL_EGLEXT_VERSION 20161230

				/* Generated C header for:

				 * API: egl

				@@ -77,6 +77,13 @@ EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSync64KHR (EGLDisplay dpy, EGLenum type,

				#define EGL_VG_ALPHA_FORMAT_PRE_BIT_KHR   0x0040

				#endif /* EGL_KHR_config_attribs */

				#ifndef EGL_KHR_context_flush_control

				#define EGL_KHR_context_flush_control 1

				#define EGL_CONTEXT_RELEASE_BEHAVIOR_NONE_KHR 0

				#define EGL_CONTEXT_RELEASE_BEHAVIOR_KHR  0x2097

				#define EGL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_KHR 0x2098

				#endif /* EGL_KHR_context_flush_control */

				#ifndef EGL_KHR_create_context

				#define EGL_KHR_create_context 1

				#define EGL_CONTEXT_MAJOR_VERSION_KHR     0x3098

				@@ -99,6 +106,33 @@ EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSync64KHR (EGLDisplay dpy, EGLenum type,

				#define EGL_CONTEXT_OPENGL_NO_ERROR_KHR   0x31B3

				#endif /* EGL_KHR_create_context_no_error */

				#ifndef EGL_KHR_debug

				#define EGL_KHR_debug 1

				typedef void *EGLLabelKHR;

				typedef void *EGLObjectKHR;

				typedef void (EGLAPIENTRY  *EGLDEBUGPROCKHR)(EGLenum error,const char *command,EGLint messageType,EGLLabelKHR threadLabel,EGLLabelKHR objectLabel,const char* message);

				#define EGL_OBJECT_THREAD_KHR             0x33B0

				#define EGL_OBJECT_DISPLAY_KHR            0x33B1

				#define EGL_OBJECT_CONTEXT_KHR            0x33B2

				#define EGL_OBJECT_SURFACE_KHR            0x33B3

				#define EGL_OBJECT_IMAGE_KHR              0x33B4

				#define EGL_OBJECT_SYNC_KHR               0x33B5

				#define EGL_OBJECT_STREAM_KHR             0x33B6

				#define EGL_DEBUG_MSG_CRITICAL_KHR        0x33B9

				#define EGL_DEBUG_MSG_ERROR_KHR           0x33BA

				#define EGL_DEBUG_MSG_WARN_KHR            0x33BB

				#define EGL_DEBUG_MSG_INFO_KHR            0x33BC

				#define EGL_DEBUG_CALLBACK_KHR            0x33B8

				typedef EGLint (EGLAPIENTRYP PFNEGLDEBUGMESSAGECONTROLKHRPROC) (EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDEBUGKHRPROC) (EGLint attribute, EGLAttrib *value);

				typedef EGLint (EGLAPIENTRYP PFNEGLLABELOBJECTKHRPROC) (EGLDisplay display, EGLenum objectType, EGLObjectKHR object, EGLLabelKHR label);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLint EGLAPIENTRY eglDebugMessageControlKHR (EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list);

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryDebugKHR (EGLint attribute, EGLAttrib *value);

				EGLAPI EGLint EGLAPIENTRY eglLabelObjectKHR (EGLDisplay display, EGLenum objectType, EGLObjectKHR object, EGLLabelKHR label);

				#endif

				#endif /* EGL_KHR_debug */

				#ifndef EGL_KHR_fence_sync

				#define EGL_KHR_fence_sync 1

				typedef khronos_utime_nanoseconds_t EGLTimeKHR;

				@@ -161,7 +195,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglGetSyncAttribKHR (EGLDisplay dpy, EGLSyncKHR sy

				#define EGL_KHR_image 1

				typedef void *EGLImageKHR;

				#define EGL_NATIVE_PIXMAP_KHR             0x30B0

				#define EGL_NO_IMAGE_KHR                  ((EGLImageKHR)0)

				#define EGL_NO_IMAGE_KHR                  EGL_CAST(EGLImageKHR,0)

				typedef EGLImageKHR (EGLAPIENTRYP PFNEGLCREATEIMAGEKHRPROC) (EGLDisplay dpy, EGLContext ctx, EGLenum target, EGLClientBuffer buffer, const EGLint *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYIMAGEKHRPROC) (EGLDisplay dpy, EGLImageKHR image);

				#ifdef EGL_EGLEXT_PROTOTYPES

				@@ -223,6 +257,16 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurface64KHR (EGLDisplay dpy, EGLSurface s

				#endif

				#endif /* EGL_KHR_lock_surface3 */

				#ifndef EGL_KHR_mutable_render_buffer

				#define EGL_KHR_mutable_render_buffer 1

				#define EGL_MUTABLE_RENDER_BUFFER_BIT_KHR 0x1000

				#endif /* EGL_KHR_mutable_render_buffer */

				#ifndef EGL_KHR_no_config_context

				#define EGL_KHR_no_config_context 1

				#define EGL_NO_CONFIG_KHR                 EGL_CAST(EGLConfig,0)

				#endif /* EGL_KHR_no_config_context */

				#ifndef EGL_KHR_partial_update

				#define EGL_KHR_partial_update 1

				#define EGL_BUFFER_AGE_KHR                0x313D

				@@ -265,7 +309,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglSetDamageRegionKHR (EGLDisplay dpy, EGLSurface

				#define EGL_SYNC_REUSABLE_KHR             0x30FA

				#define EGL_SYNC_FLUSH_COMMANDS_BIT_KHR   0x0001

				#define EGL_FOREVER_KHR                   0xFFFFFFFFFFFFFFFFull

				#define EGL_NO_SYNC_KHR                   ((EGLSyncKHR)0)

				#define EGL_NO_SYNC_KHR                   EGL_CAST(EGLSyncKHR,0)

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSIGNALSYNCKHRPROC) (EGLDisplay dpy, EGLSyncKHR sync, EGLenum mode);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglSignalSyncKHR (EGLDisplay dpy, EGLSyncKHR sync, EGLenum mode);

				@@ -278,7 +322,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglSignalSyncKHR (EGLDisplay dpy, EGLSyncKHR sync,

				typedef void *EGLStreamKHR;

				typedef khronos_uint64_t EGLuint64KHR;

				#ifdef KHRONOS_SUPPORT_INT64

				#define EGL_NO_STREAM_KHR                 ((EGLStreamKHR)0)

				#define EGL_NO_STREAM_KHR                 EGL_CAST(EGLStreamKHR,0)

				#define EGL_CONSUMER_LATENCY_USEC_KHR     0x3210

				#define EGL_PRODUCER_FRAME_KHR            0x3212

				#define EGL_CONSUMER_FRAME_KHR            0x3213

				@@ -306,6 +350,24 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryStreamu64KHR (EGLDisplay dpy, EGLStreamKHR

				#endif /* KHRONOS_SUPPORT_INT64 */

				#endif /* EGL_KHR_stream */

				#ifndef EGL_KHR_stream_attrib

				#define EGL_KHR_stream_attrib 1

				#ifdef KHRONOS_SUPPORT_INT64

				typedef EGLStreamKHR (EGLAPIENTRYP PFNEGLCREATESTREAMATTRIBKHRPROC) (EGLDisplay dpy, const EGLAttrib *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSETSTREAMATTRIBKHRPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLenum attribute, EGLAttrib value);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSTREAMATTRIBKHRPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLenum attribute, EGLAttrib *value);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSTREAMCONSUMERACQUIREATTRIBKHRPROC) (EGLDisplay dpy, EGLStreamKHR stream, const EGLAttrib *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSTREAMCONSUMERRELEASEATTRIBKHRPROC) (EGLDisplay dpy, EGLStreamKHR stream, const EGLAttrib *attrib_list);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLStreamKHR EGLAPIENTRY eglCreateStreamAttribKHR (EGLDisplay dpy, const EGLAttrib *attrib_list);

				EGLAPI EGLBoolean EGLAPIENTRY eglSetStreamAttribKHR (EGLDisplay dpy, EGLStreamKHR stream, EGLenum attribute, EGLAttrib value);

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryStreamAttribKHR (EGLDisplay dpy, EGLStreamKHR stream, EGLenum attribute, EGLAttrib *value);

				EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerAcquireAttribKHR (EGLDisplay dpy, EGLStreamKHR stream, const EGLAttrib *attrib_list);

				EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerReleaseAttribKHR (EGLDisplay dpy, EGLStreamKHR stream, const EGLAttrib *attrib_list);

				#endif

				#endif /* KHRONOS_SUPPORT_INT64 */

				#endif /* EGL_KHR_stream_attrib */

				#ifndef EGL_KHR_stream_consumer_gltexture

				#define EGL_KHR_stream_consumer_gltexture 1

				#ifdef EGL_KHR_stream

				@@ -325,7 +387,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerReleaseKHR (EGLDisplay dpy, EGLSt

				#define EGL_KHR_stream_cross_process_fd 1

				typedef int EGLNativeFileDescriptorKHR;

				#ifdef EGL_KHR_stream

				#define EGL_NO_FILE_DESCRIPTOR_KHR        ((EGLNativeFileDescriptorKHR)(-1))

				#define EGL_NO_FILE_DESCRIPTOR_KHR        EGL_CAST(EGLNativeFileDescriptorKHR,-1)

				typedef EGLNativeFileDescriptorKHR (EGLAPIENTRYP PFNEGLGETSTREAMFILEDESCRIPTORKHRPROC) (EGLDisplay dpy, EGLStreamKHR stream);

				typedef EGLStreamKHR (EGLAPIENTRYP PFNEGLCREATESTREAMFROMFILEDESCRIPTORKHRPROC) (EGLDisplay dpy, EGLNativeFileDescriptorKHR file_descriptor);

				#ifdef EGL_EGLEXT_PROTOTYPES

				@@ -402,11 +464,28 @@ EGLAPI void EGLAPIENTRY eglSetBlobCacheFuncsANDROID (EGLDisplay dpy, EGLSetBlobF

				#endif

				#endif /* EGL_ANDROID_blob_cache */

				#ifndef EGL_ANDROID_create_native_client_buffer

				#define EGL_ANDROID_create_native_client_buffer 1

				#define EGL_NATIVE_BUFFER_USAGE_ANDROID   0x3143

				#define EGL_NATIVE_BUFFER_USAGE_PROTECTED_BIT_ANDROID 0x00000001

				#define EGL_NATIVE_BUFFER_USAGE_RENDERBUFFER_BIT_ANDROID 0x00000002

				#define EGL_NATIVE_BUFFER_USAGE_TEXTURE_BIT_ANDROID 0x00000004

				typedef EGLClientBuffer (EGLAPIENTRYP PFNEGLCREATENATIVECLIENTBUFFERANDROIDPROC) (const EGLint *attrib_list);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLClientBuffer EGLAPIENTRY eglCreateNativeClientBufferANDROID (const EGLint *attrib_list);

				#endif

				#endif /* EGL_ANDROID_create_native_client_buffer */

				#ifndef EGL_ANDROID_framebuffer_target

				#define EGL_ANDROID_framebuffer_target 1

				#define EGL_FRAMEBUFFER_TARGET_ANDROID    0x3147

				#endif /* EGL_ANDROID_framebuffer_target */

				#ifndef EGL_ANDROID_front_buffer_auto_refresh

				#define EGL_ANDROID_front_buffer_auto_refresh 1

				#define EGL_FRONT_BUFFER_AUTO_REFRESH_ANDROID 0x314C

				#endif /* EGL_ANDROID_front_buffer_auto_refresh */

				#ifndef EGL_ANDROID_image_native_buffer

				#define EGL_ANDROID_image_native_buffer 1

				#define EGL_NATIVE_BUFFER_ANDROID         0x3140

				@@ -424,6 +503,15 @@ EGLAPI EGLint EGLAPIENTRY eglDupNativeFenceFDANDROID (EGLDisplay dpy, EGLSyncKHR

				#endif

				#endif /* EGL_ANDROID_native_fence_sync */

				#ifndef EGL_ANDROID_presentation_time

				#define EGL_ANDROID_presentation_time 1

				typedef khronos_stime_nanoseconds_t EGLnsecsANDROID;

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLPRESENTATIONTIMEANDROIDPROC) (EGLDisplay dpy, EGLSurface surface, EGLnsecsANDROID time);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglPresentationTimeANDROID (EGLDisplay dpy, EGLSurface surface, EGLnsecsANDROID time);

				#endif

				#endif /* EGL_ANDROID_presentation_time */

				#ifndef EGL_ANDROID_recordable

				#define EGL_ANDROID_recordable 1

				#define EGL_RECORDABLE_ANDROID            0x3142

				@@ -457,6 +545,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu

				#define EGL_FIXED_SIZE_ANGLE              0x3201

				#endif /* EGL_ANGLE_window_fixed_size */

				#ifndef EGL_ARM_implicit_external_sync

				#define EGL_ARM_implicit_external_sync 1

				#define EGL_SYNC_PRIOR_COMMANDS_IMPLICIT_EXTERNAL_ARM 0x328A

				#endif /* EGL_ARM_implicit_external_sync */

				#ifndef EGL_ARM_pixmap_multisample_discard

				#define EGL_ARM_pixmap_multisample_discard 1

				#define EGL_DISCARD_SAMPLES_ARM           0x3286

				@@ -482,7 +575,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurfacePointerANGLE (EGLDisplay dpy, EGLSu

				#ifndef EGL_EXT_device_base

				#define EGL_EXT_device_base 1

				typedef void *EGLDeviceEXT;

				#define EGL_NO_DEVICE_EXT                 ((EGLDeviceEXT)(0))

				#define EGL_NO_DEVICE_EXT                 EGL_CAST(EGLDeviceEXT,0)

				#define EGL_BAD_DEVICE_EXT                0x322B

				#define EGL_DEVICE_EXT                    0x322C

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDEVICEATTRIBEXTPROC) (EGLDeviceEXT device, EGLint attribute, EGLAttrib *value);

				@@ -515,6 +608,21 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint a

				#define EGL_EXT_device_query 1

				#endif /* EGL_EXT_device_query */

				#ifndef EGL_EXT_gl_colorspace_bt2020_linear

				#define EGL_EXT_gl_colorspace_bt2020_linear 1

				#define EGL_GL_COLORSPACE_BT2020_LINEAR_EXT 0x333F

				#endif /* EGL_EXT_gl_colorspace_bt2020_linear */

				#ifndef EGL_EXT_gl_colorspace_bt2020_pq

				#define EGL_EXT_gl_colorspace_bt2020_pq 1

				#define EGL_GL_COLORSPACE_BT2020_PQ_EXT   0x3340

				#endif /* EGL_EXT_gl_colorspace_bt2020_pq */

				#ifndef EGL_EXT_gl_colorspace_scrgb_linear

				#define EGL_EXT_gl_colorspace_scrgb_linear 1

				#define EGL_GL_COLORSPACE_SCRGB_LINEAR_EXT 0x3350

				#endif /* EGL_EXT_gl_colorspace_scrgb_linear */

				#ifndef EGL_EXT_image_dma_buf_import

				#define EGL_EXT_image_dma_buf_import 1

				#define EGL_LINUX_DMA_BUF_EXT             0x3270

				@@ -541,6 +649,27 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint a

				#define EGL_YUV_CHROMA_SITING_0_5_EXT     0x3285

				#endif /* EGL_EXT_image_dma_buf_import */

				#ifndef EGL_EXT_image_dma_buf_import_modifiers

				#define EGL_EXT_image_dma_buf_import_modifiers 1

				#define EGL_DMA_BUF_PLANE3_FD_EXT         0x3440

				#define EGL_DMA_BUF_PLANE3_OFFSET_EXT     0x3441

				#define EGL_DMA_BUF_PLANE3_PITCH_EXT      0x3442

				#define EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT 0x3443

				#define EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT 0x3444

				#define EGL_DMA_BUF_PLANE1_MODIFIER_LO_EXT 0x3445

				#define EGL_DMA_BUF_PLANE1_MODIFIER_HI_EXT 0x3446

				#define EGL_DMA_BUF_PLANE2_MODIFIER_LO_EXT 0x3447

				#define EGL_DMA_BUF_PLANE2_MODIFIER_HI_EXT 0x3448

				#define EGL_DMA_BUF_PLANE3_MODIFIER_LO_EXT 0x3449

				#define EGL_DMA_BUF_PLANE3_MODIFIER_HI_EXT 0x344A

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDMABUFFORMATSEXTPROC) (EGLDisplay dpy, EGLint max_formats, EGLint *formats, EGLint *num_formats);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDMABUFMODIFIERSEXTPROC) (EGLDisplay dpy, EGLint format, EGLint max_modifiers, EGLuint64KHR *modifiers, EGLBoolean *external_only, EGLint *num_modifiers);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryDmaBufFormatsEXT (EGLDisplay dpy, EGLint max_formats, EGLint *formats, EGLint *num_formats);

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryDmaBufModifiersEXT (EGLDisplay dpy, EGLint format, EGLint max_modifiers, EGLuint64KHR *modifiers, EGLBoolean *external_only, EGLint *num_modifiers);

				#endif

				#endif /* EGL_EXT_image_dma_buf_import_modifiers */

				#ifndef EGL_EXT_multiview_window

				#define EGL_EXT_multiview_window 1

				#define EGL_MULTIVIEW_VIEW_COUNT_EXT      0x3134

				@@ -550,8 +679,8 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribEXT (EGLDisplay dpy, EGLint a

				#define EGL_EXT_output_base 1

				typedef void *EGLOutputLayerEXT;

				typedef void *EGLOutputPortEXT;

				#define EGL_NO_OUTPUT_LAYER_EXT           ((EGLOutputLayerEXT)0)

				#define EGL_NO_OUTPUT_PORT_EXT            ((EGLOutputPortEXT)0)

				#define EGL_NO_OUTPUT_LAYER_EXT           EGL_CAST(EGLOutputLayerEXT,0)

				#define EGL_NO_OUTPUT_PORT_EXT            EGL_CAST(EGLOutputPortEXT,0)

				#define EGL_BAD_OUTPUT_LAYER_EXT          0x322D

				#define EGL_BAD_OUTPUT_PORT_EXT           0x322E

				#define EGL_SWAP_INTERVAL_EXT             0x322F

				@@ -588,6 +717,13 @@ EGLAPI const char *EGLAPIENTRY eglQueryOutputPortStringEXT (EGLDisplay dpy, EGLO

				#define EGL_OPENWF_PORT_ID_EXT            0x3239

				#endif /* EGL_EXT_output_openwf */

				#ifndef EGL_EXT_pixel_format_float

				#define EGL_EXT_pixel_format_float 1

				#define EGL_COLOR_COMPONENT_TYPE_EXT      0x3339

				#define EGL_COLOR_COMPONENT_TYPE_FIXED_EXT 0x333A

				#define EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT 0x333B

				#endif /* EGL_EXT_pixel_format_float */

				#ifndef EGL_EXT_platform_base

				#define EGL_EXT_platform_base 1

				typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETPLATFORMDISPLAYEXTPROC) (EGLenum platform, void *native_display, const EGLint *attrib_list);

				@@ -616,9 +752,13 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurfaceEXT (EGLDisplay dpy,

				#define EGL_PLATFORM_X11_SCREEN_EXT       0x31D6

				#endif /* EGL_EXT_platform_x11 */

				#ifndef EGL_EXT_protected_content

				#define EGL_EXT_protected_content 1

				#define EGL_PROTECTED_CONTENT_EXT         0x32C0

				#endif /* EGL_EXT_protected_content */

				#ifndef EGL_EXT_protected_surface

				#define EGL_EXT_protected_surface 1

				#define EGL_PROTECTED_CONTENT_EXT         0x32C0

				#endif /* EGL_EXT_protected_surface */

				#ifndef EGL_EXT_stream_consumer_egloutput

				@@ -629,6 +769,20 @@ EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerOutputEXT (EGLDisplay dpy, EGLStr

				#endif

				#endif /* EGL_EXT_stream_consumer_egloutput */

				#ifndef EGL_EXT_surface_SMPTE2086_metadata

				#define EGL_EXT_surface_SMPTE2086_metadata 1

				#define EGL_SMPTE2086_DISPLAY_PRIMARY_RX_EXT 0x3341

				#define EGL_SMPTE2086_DISPLAY_PRIMARY_RY_EXT 0x3342

				#define EGL_SMPTE2086_DISPLAY_PRIMARY_GX_EXT 0x3343

				#define EGL_SMPTE2086_DISPLAY_PRIMARY_GY_EXT 0x3344

				#define EGL_SMPTE2086_DISPLAY_PRIMARY_BX_EXT 0x3345

				#define EGL_SMPTE2086_DISPLAY_PRIMARY_BY_EXT 0x3346

				#define EGL_SMPTE2086_WHITE_POINT_X_EXT   0x3347

				#define EGL_SMPTE2086_WHITE_POINT_Y_EXT   0x3348

				#define EGL_SMPTE2086_MAX_LUMINANCE_EXT   0x3349

				#define EGL_SMPTE2086_MIN_LUMINANCE_EXT   0x334A

				#endif /* EGL_EXT_surface_SMPTE2086_metadata */

				#ifndef EGL_EXT_swap_buffers_with_damage

				#define EGL_EXT_swap_buffers_with_damage 1

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSWITHDAMAGEEXTPROC) (EGLDisplay dpy, EGLSurface surface, EGLint *rects, EGLint n_rects);

				@@ -697,6 +851,12 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurfaceHI (EGLDisplay dpy, EGLConfi

				#define EGL_CONTEXT_PRIORITY_LOW_IMG      0x3103

				#endif /* EGL_IMG_context_priority */

				#ifndef EGL_IMG_image_plane_attribs

				#define EGL_IMG_image_plane_attribs 1

				#define EGL_NATIVE_BUFFER_MULTIPLANE_SEPARATE_IMG 0x3105

				#define EGL_NATIVE_BUFFER_PLANE_OFFSET_IMG 0x3106

				#endif /* EGL_IMG_image_plane_attribs */

				#ifndef EGL_MESA_drm_image

				#define EGL_MESA_drm_image 1

				#define EGL_DRM_BUFFER_FORMAT_MESA        0x31D0

				@@ -729,6 +889,11 @@ EGLAPI EGLBoolean EGLAPIENTRY eglExportDMABUFImageMESA (EGLDisplay dpy, EGLImage

				#define EGL_PLATFORM_GBM_MESA             0x31D7

				#endif /* EGL_MESA_platform_gbm */

				#ifndef EGL_MESA_platform_surfaceless

				#define EGL_MESA_platform_surfaceless 1

				#define EGL_PLATFORM_SURFACELESS_MESA     0x31DD

				#endif /* EGL_MESA_platform_surfaceless */

				#ifndef EGL_NOK_swap_region

				#define EGL_NOK_swap_region 1

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGIONNOKPROC) (EGLDisplay dpy, EGLSurface surface, EGLint numRects, const EGLint *rects);

				@@ -812,6 +977,129 @@ EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV (EGLDisplay dpy, EGLSurface sur

				#endif

				#endif /* EGL_NV_post_sub_buffer */

				#ifndef EGL_NV_robustness_video_memory_purge

				#define EGL_NV_robustness_video_memory_purge 1

				#define EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV 0x334C

				#endif /* EGL_NV_robustness_video_memory_purge */

				#ifndef EGL_NV_stream_consumer_gltexture_yuv

				#define EGL_NV_stream_consumer_gltexture_yuv 1

				#define EGL_YUV_PLANE0_TEXTURE_UNIT_NV    0x332C

				#define EGL_YUV_PLANE1_TEXTURE_UNIT_NV    0x332D

				#define EGL_YUV_PLANE2_TEXTURE_UNIT_NV    0x332E

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSTREAMCONSUMERGLTEXTUREEXTERNALATTRIBSNVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLAttrib *attrib_list);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerGLTextureExternalAttribsNV (EGLDisplay dpy, EGLStreamKHR stream, EGLAttrib *attrib_list);

				#endif

				#endif /* EGL_NV_stream_consumer_gltexture_yuv */

				#ifndef EGL_NV_stream_cross_display

				#define EGL_NV_stream_cross_display 1

				#define EGL_STREAM_CROSS_DISPLAY_NV       0x334E

				#endif /* EGL_NV_stream_cross_display */

				#ifndef EGL_NV_stream_cross_object

				#define EGL_NV_stream_cross_object 1

				#define EGL_STREAM_CROSS_OBJECT_NV        0x334D

				#endif /* EGL_NV_stream_cross_object */

				#ifndef EGL_NV_stream_cross_partition

				#define EGL_NV_stream_cross_partition 1

				#define EGL_STREAM_CROSS_PARTITION_NV     0x323F

				#endif /* EGL_NV_stream_cross_partition */

				#ifndef EGL_NV_stream_cross_process

				#define EGL_NV_stream_cross_process 1

				#define EGL_STREAM_CROSS_PROCESS_NV       0x3245

				#endif /* EGL_NV_stream_cross_process */

				#ifndef EGL_NV_stream_cross_system

				#define EGL_NV_stream_cross_system 1

				#define EGL_STREAM_CROSS_SYSTEM_NV        0x334F

				#endif /* EGL_NV_stream_cross_system */

				#ifndef EGL_NV_stream_fifo_next

				#define EGL_NV_stream_fifo_next 1

				#define EGL_PENDING_FRAME_NV              0x3329

				#define EGL_STREAM_TIME_PENDING_NV        0x332A

				#endif /* EGL_NV_stream_fifo_next */

				#ifndef EGL_NV_stream_fifo_synchronous

				#define EGL_NV_stream_fifo_synchronous 1

				#define EGL_STREAM_FIFO_SYNCHRONOUS_NV    0x3336

				#endif /* EGL_NV_stream_fifo_synchronous */

				#ifndef EGL_NV_stream_frame_limits

				#define EGL_NV_stream_frame_limits 1

				#define EGL_PRODUCER_MAX_FRAME_HINT_NV    0x3337

				#define EGL_CONSUMER_MAX_FRAME_HINT_NV    0x3338

				#endif /* EGL_NV_stream_frame_limits */

				#ifndef EGL_NV_stream_metadata

				#define EGL_NV_stream_metadata 1

				#define EGL_MAX_STREAM_METADATA_BLOCKS_NV 0x3250

				#define EGL_MAX_STREAM_METADATA_BLOCK_SIZE_NV 0x3251

				#define EGL_MAX_STREAM_METADATA_TOTAL_SIZE_NV 0x3252

				#define EGL_PRODUCER_METADATA_NV          0x3253

				#define EGL_CONSUMER_METADATA_NV          0x3254

				#define EGL_PENDING_METADATA_NV           0x3328

				#define EGL_METADATA0_SIZE_NV             0x3255

				#define EGL_METADATA1_SIZE_NV             0x3256

				#define EGL_METADATA2_SIZE_NV             0x3257

				#define EGL_METADATA3_SIZE_NV             0x3258

				#define EGL_METADATA0_TYPE_NV             0x3259

				#define EGL_METADATA1_TYPE_NV             0x325A

				#define EGL_METADATA2_TYPE_NV             0x325B

				#define EGL_METADATA3_TYPE_NV             0x325C

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDISPLAYATTRIBNVPROC) (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSETSTREAMMETADATANVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLint n, EGLint offset, EGLint size, const void *data);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSTREAMMETADATANVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLenum name, EGLint n, EGLint offset, EGLint size, void *data);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribNV (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);

				EGLAPI EGLBoolean EGLAPIENTRY eglSetStreamMetadataNV (EGLDisplay dpy, EGLStreamKHR stream, EGLint n, EGLint offset, EGLint size, const void *data);

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryStreamMetadataNV (EGLDisplay dpy, EGLStreamKHR stream, EGLenum name, EGLint n, EGLint offset, EGLint size, void *data);

				#endif

				#endif /* EGL_NV_stream_metadata */

				#ifndef EGL_NV_stream_remote

				#define EGL_NV_stream_remote 1

				#define EGL_STREAM_STATE_INITIALIZING_NV  0x3240

				#define EGL_STREAM_TYPE_NV                0x3241

				#define EGL_STREAM_PROTOCOL_NV            0x3242

				#define EGL_STREAM_ENDPOINT_NV            0x3243

				#define EGL_STREAM_LOCAL_NV               0x3244

				#define EGL_STREAM_PRODUCER_NV            0x3247

				#define EGL_STREAM_CONSUMER_NV            0x3248

				#define EGL_STREAM_PROTOCOL_FD_NV         0x3246

				#endif /* EGL_NV_stream_remote */

				#ifndef EGL_NV_stream_reset

				#define EGL_NV_stream_reset 1

				#define EGL_SUPPORT_RESET_NV              0x3334

				#define EGL_SUPPORT_REUSE_NV              0x3335

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLRESETSTREAMNVPROC) (EGLDisplay dpy, EGLStreamKHR stream);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglResetStreamNV (EGLDisplay dpy, EGLStreamKHR stream);

				#endif

				#endif /* EGL_NV_stream_reset */

				#ifndef EGL_NV_stream_socket

				#define EGL_NV_stream_socket 1

				#define EGL_STREAM_PROTOCOL_SOCKET_NV     0x324B

				#define EGL_SOCKET_HANDLE_NV              0x324C

				#define EGL_SOCKET_TYPE_NV                0x324D

				#endif /* EGL_NV_stream_socket */

				#ifndef EGL_NV_stream_socket_inet

				#define EGL_NV_stream_socket_inet 1

				#define EGL_SOCKET_TYPE_INET_NV           0x324F

				#endif /* EGL_NV_stream_socket_inet */

				#ifndef EGL_NV_stream_socket_unix

				#define EGL_NV_stream_socket_unix 1

				#define EGL_SOCKET_TYPE_UNIX_NV           0x324E

				#endif /* EGL_NV_stream_socket_unix */

				#ifndef EGL_NV_stream_sync

				#define EGL_NV_stream_sync 1

				#define EGL_SYNC_NEW_FRAME_NV             0x321F

				@@ -838,7 +1126,7 @@ typedef khronos_utime_nanoseconds_t EGLTimeNV;

				#define EGL_SYNC_TYPE_NV                  0x30ED

				#define EGL_SYNC_CONDITION_NV             0x30EE

				#define EGL_SYNC_FENCE_NV                 0x30EF

				#define EGL_NO_SYNC_NV                    ((EGLSyncNV)0)

				#define EGL_NO_SYNC_NV                    EGL_CAST(EGLSyncNV,0)

				typedef EGLSyncNV (EGLAPIENTRYP PFNEGLCREATEFENCESYNCNVPROC) (EGLDisplay dpy, EGLenum condition, const EGLint *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLDESTROYSYNCNVPROC) (EGLSyncNV sync);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLFENCENVPROC) (EGLSyncNV sync);

Compare commits

9470 Commits mesa-12.0. ... chadv/revi

9 .dir-locals.el
35 .editorconfig Normal file
1 .gitignore vendored
12 .mailmap
364 .travis.yml
61 Android.common.mk
20 Android.mk
11 Makefile.am
16 REVIEWERS
2 VERSION
12 appveyor.yml
5 bin/.cherry-ignore
3 bin/.editorconfig Normal file
38 bin/bugzilla_mesa.sh
30 bin/get-extra-pick-list.sh
71 bin/get-fixes-pick-list.sh Executable file
7 bin/get-pick-list.sh
42 bin/get-typod-pick-list.sh Executable file
0 bin/perf-annotate-jit → bin/perf-annotate-jit.py
4 bin/shortlog_mesa.sh
5 common.py
1450 configure.ac
2 docs/README.WIN32
2 docs/application-issues.html
33 docs/autoconf.html
2 docs/bugs.html
142 docs/codingstyle.html Normal file
19 docs/contents.html
6 docs/developers.html
647 docs/devinfo.html
45 docs/download.html
10 docs/egl.html
108 docs/envvars.html
24 docs/faq.html
253 docs/GL3.txt → docs/features.txt
20 docs/helpwanted.html
156 docs/index.html
115 docs/install.html
83 docs/intro.html
6 docs/license.html
22 docs/lists.html
28 docs/llvmpipe.html
11 docs/mangling.html
4 docs/opengles.html
4 docs/patents.txt
2 docs/postprocess.html
10 docs/precompiled.html
94 docs/release-calendar.html Normal file
551 docs/releasing.html Normal file
20 docs/relnotes.html
2 docs/relnotes/12.0.1.html
3 docs/relnotes/12.0.2.html
71 docs/relnotes/12.0.3.html Normal file
321 docs/relnotes/12.0.4.html Normal file
138 docs/relnotes/12.0.5.html Normal file
148 docs/relnotes/12.0.6.html Normal file
311 docs/relnotes/13.0.0.html Normal file
188 docs/relnotes/13.0.1.html Normal file
189 docs/relnotes/13.0.2.html Normal file
177 docs/relnotes/13.0.3.html Normal file
255 docs/relnotes/13.0.4.html Normal file
210 docs/relnotes/13.0.5.html Normal file
287 docs/relnotes/13.0.6.html Normal file
285 docs/relnotes/17.0.0.html Normal file
221 docs/relnotes/17.0.1.html Normal file
185 docs/relnotes/17.0.2.html Normal file
189 docs/relnotes/17.0.3.html Normal file
156 docs/relnotes/17.0.4.html Normal file
144 docs/relnotes/17.0.5.html Normal file
82 docs/relnotes/17.1.0.html Normal file
66 docs/relnotes/17.2.0.html Normal file
2 docs/relnotes/6.5.2.html
2 docs/relnotes/7.11.html
2 docs/relnotes/7.5.1.html
2 docs/relnotes/7.5.2.html
2 docs/relnotes/7.5.html
2 docs/relnotes/9.0.html
4 docs/relnotes/9.1.2.html

21 docs/repository.html