Compare commits

...

491 Commits

Author SHA1 Message Date
Emil Velikov
879d24c497 docs: add sha256 checksums for 13.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-20 11:54:35 +00:00
Emil Velikov
fcef88d13a docs: add release notes for 13.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-20 11:42:19 +00:00
Emil Velikov
069f00bf92 Update version to 13.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 18:13:30 +00:00
Matt Turner
3b782f6bc4 clover: Work around build failure with AltiVec.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504
Acked-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 7d1195c1e4)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	configure.ac
2017-03-15 23:59:57 +00:00
Alex Smith
97d68c863c radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer
Need to flush before updating the buffer to ensure that the copy is
ordered after previous accesses (assuming the app has performed the
appropriate barriers).

This fixes potential issues due to draws prior to an update reading
the new buffer content, despite having the necessary barriers between
them.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e0cc32b85b)
2017-03-15 18:02:32 +00:00
Bas Nieuwenhuizen
01c264c35c radv: Emit cache flushes before CP DMA.
The flushes could be due to TRANSFER barriers.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cce43f6d8c)
2017-03-15 18:02:32 +00:00
Dave Airlie
1d143f0018 radv: setup llvm target data layout
Ported from radeonsi, pointed out by Tom.

"This prevents LLVM from using sext instructions for local memory
offsets and allows the backend to fold immediate offsets into the
instruction. This also prevents some incorrect code generation for
ptrtoint and inttoptr instructions."

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <tstellar@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b8ee70384a)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/common/ac_nir_to_llvm.c
2017-03-15 18:02:32 +00:00
Marek Olšák
dac86c5d3c radeonsi: mark all bound shader buffer ranges as initialized
This should prevent cases when a buffer was incorrectly mapped without
synchronization just because this wasn't done.

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 71a2e4e945)
2017-03-15 18:02:32 +00:00
Emil Velikov
0fbac2d641 cherry-ignore: add ANV fast clears related fixes
There is no ANV fast_clear support in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-15 18:02:32 +00:00
Ben Crocker
56044c43a8 gallivm: Reenable PPC VSX (v3)
Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1,
now that LLVM bug 26775 and its corollary, 25503, are fixed.

Amendment: remove extraneous spaces in macro def & invocations.

We would prefer a runtime check, e.g. via an LLVMQueryString
(analogous to glGetString, eglQueryString) or LLVMGetVersion API,
but no such API exists at this time.

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
[Emil Velikov: remove LLVM_VERSION macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 3f1b6ef2aa)
2017-03-15 18:02:32 +00:00
Ilia Mirkin
5e5c720c6a nvc0: increase alignment to 256 for texture buffers on fermi
When binding as textures, the alignment can be 16. However when binding
as an image, the address has to be aligned to 256. (Also when binding as
an RT, but that can't happen with GL or current gallium APIs.)

Reported-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 32dd8d59b6)
2017-03-15 18:02:31 +00:00
Gregory Hainaut
f42e8a76e5 glapi: fix typo in count_scale
2*4=8

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2ab5eccf5d)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-15 18:02:31 +00:00
Grazvydas Ignotas
d348151c99 gallium/u_queue: set num_threads correctly if not all threads start
If i-th thread could not be created it means we have i threads,
not i+1, because we start from 0.

Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7f268cf12b)
2017-03-15 18:02:31 +00:00
Grazvydas Ignotas
ce5770728a gallium/u_queue: fix a crash with atexit handlers
Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls
exit()") added a atexit handler which calls
util_queue_killall_and_wait() for each queue to stop the threads.
However the app is also free to use atexit handlers to clean up things,
leading to util_queue_destroy() call which will also call
util_queue_killall_and_wait() for the same queue again, causing threads
being joined twice, and that is undefined. This happens with libglut,
for example. A simple fix is to just set num_threads to 0 as there are
no more valid threads after util_queue_killall_and_wait() returns.

Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9936121935)
2017-03-15 18:02:31 +00:00
Grazvydas Ignotas
0d8d249f08 r300g: only allow byteswapped formats on big endian
They cause regressions on little endian.

Fixes: 172bfdaa9e ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 66d1cb587a)
2017-03-15 18:02:31 +00:00
Jacob Lifshay
c0c964489b vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling them that they could probably
enable DRI3.  More importantly, it includes a little heuristic to check
to see if we're running on AMD or NVIDIA's proprietary X11 drivers and,
if we are, doesn't emit the warning.  This way, users with both a discrete
card and Intel graphics don't get the warning when they're just running
on the discrete card.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715
Co-authored-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Rene Lindsay <rjklindsay@hotmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 3d8feb38e8)
2017-03-15 18:02:31 +00:00
Jason Ekstrand
010f672d10 anv: Properly handle destroying NULL devices and instances
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit e3d33a23e6)
2017-03-15 18:02:31 +00:00
Jason Ekstrand
357f50f0e2 anv: Accurately advertise dynamic descriptor limits
The number of dynamic descriptors is limited by both the number of
descriptors and the total number of dynamic things.  Because there isn't
a single "maximum dynamic things" limit, we need to divide by two so
that they can create the maximum of both UBOs and SSBOs.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5e44ef4a76)
2017-03-15 18:02:31 +00:00
Emil Velikov
cea0fa44ae i965: move brw_define.h ifndef guard to the top
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 077078ce77)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_defines.h
2017-03-15 18:02:31 +00:00
Dave Airlie
6f3579008d radv: disable mip point pre clamping.
No idea what this does, but disabling it fixes a bunch
of failing CTS tests in the lod area, so let's go with that.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d81bd2f754)
2017-03-15 18:02:30 +00:00
Fredrik Höglund
e3d65d779a radv/ac: fix multiple descriptor sets with dynamic buffers
The dynamic_offset_offset in the descriptor set binding layout is
relative to the dynamic_offset_start for the set in the pipeline
layout.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 162beb2abb)
2017-03-15 18:02:30 +00:00
Fredrik Höglund
51f48cbaa2 radv: fix the dynamic buffer index in vkCmdBindDescriptorSets
This fixes the wrong dynamic buffer descriptors being updated when
firstSet > 0.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 0941d1a574)
2017-03-15 18:02:30 +00:00
Bas Nieuwenhuizen
f0c61906a2 radv: Disable HTILE for textures with multiple layers/levels.
It has issues and the fix I'm working on is too complicated for stable,
so disable for now.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ab2dd361f)
2017-03-15 18:02:30 +00:00
Alex Smith
8fc606b0f4 radv: Emit pending flushes before executing a secondary command buffer
If we have any pending flushes on the primary command buffer, these
must be performed before executing the secondary buffer.

This fixes potential corruption when the contents of a subpass which
clears any of its render targets are given in a secondary buffer: the
flushes after a fast clear would not have been performed until the
vkCmdEndRenderPass call.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 290d7e892d)
2017-03-15 18:02:30 +00:00
Dave Airlie
62ff763329 radv: drop Z24 support.
This isn't exposed in -pro, the hw docs say it is deprecated,
so let's not bother with it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cc59e24a6b)
2017-03-15 18:02:30 +00:00
Ilia Mirkin
9932d54418 nvc0: take extra pushbuf space into account for pushbuf_space calls
See detailed explanation of why this is needed in commit eb60a89bc3.
This spot was missed/overlooked. Basically as a result of the fact
that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8
to the requested amount, we have to be mindful of that when doing bare
nouveau_pushbuf_space calls.

Reportedly this fixes some crashes when replaying a hitman trace taken
on radeonsi.

Fixes: eb60a89bc3 ("nouveau: take extra push space into account for pushbuf_space calls")
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8e6d67685e)
2017-03-15 18:02:30 +00:00
Jonas Pfeil
b10859ec41 ralloc: Make sure ralloc() allocations match malloc()'s alignment.
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.

Fixes SIGBUS on Raspberry Pi at high optimization levels.

This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.

v2: Commit message reword/typo fix, and add a bigger explanation in the
    code (by anholt)

Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cd2b55e536)

Squashed with commit:

ralloc: don't leave out the alignment factor

Experimentation shows that without alignment factor gcc and clang choose
a factor of 16 even on IA-32, which doesn't match what malloc() uses (8).
The problem is it makes gcc assume the pointer is 16 byte aligned, so
with -O3 it starts using aligned SSE instructions that later fault,
so always specify a suitable alignment factor.

Cc: Jonas Pfeil <pfeiljonas@gmx.de>
Fixes: cd2b55e5 "ralloc: Make sure ralloc() allocations match malloc()'s alignment."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Tested by: Mike Lothian <mike@fireburn.co.uk>
Tested by: Jonas Pfeil <pfeiljonas@gmx.de>

(cherry picked from commit ff494fe999)
2017-03-15 18:02:30 +00:00
Kenneth Graunke
132fc9a975 egl: Ensure ResetNotificationStrategy matches for shared contexts.
Fixes:
dEQP-EGL.functional.robustness.negative_context.invalid_robust_shared_context_creation

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4061bbccf2)
2017-03-15 14:39:57 +00:00
Nicolai Hähnle
4c8810f335 st/mesa: inform the driver of framebuffer changes before compute dispatches
Even though compute shaders cannot access the framebuffer, there is a
synchronization issue when a compute dispatch accesses a texture that
was previously bound and drawn to as a framebuffer.

Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of
the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior
results if the texture is still attached to the currently bound
framebuffer. However, the feedback loop is broken when the application
changes the framebuffer binding before a compute dispatch, and the
state tracker needs to let the driver known about this.

Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons.

Cc: mesa-stable@lists.freedesktop.org

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 40c77bbf83)
2017-03-15 14:39:57 +00:00
Nicolai Hähnle
b18e5cf340 st/glsl_to_tgsi: avoid iterating past the head of the instruction list
exec_node::get_prev() does not guard against going past the beginning
of the list, so we need to add explicit checks here.

Found by ASAN in piglit arb_shader_storage_buffer_object-rendering.

Cc: mesa-stable@lists.freedesktop.org

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 911391bd70)
2017-03-15 14:39:57 +00:00
Samuel Iglesias Gonsálvez
c89ca598ed i965/fs: emit MOV_INDIRECT with the source with the right register type
This was hiding bugs as it retyped the source to destination's type.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 0dddad5b1b)
2017-03-15 14:39:57 +00:00
Samuel Iglesias Gonsálvez
afa603aedf i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles
When generating the MOV INDIRECT instruction, the source type is ignored
and it is set to destination's type. However, this is going to change in a
later patch, so we need to explicitly set the proper source type.

brw_vec8_grf() creates an float type's fs_reg by default, when the
ICP handle is actually unsigned. This patch fixes these cases before
applying the aforementioned patch.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit d8122128bc)
2017-03-15 14:39:57 +00:00
Samuel Iglesias Gonsálvez
842ea1f3d7 i965/fs: fix indirect load DF uniforms on BSW/BXT
The lowered BSW/BXT indirect move instructions had incorrect
source types, which luckily wasn't causing incorrect assembly to be
generated due to the bug fixed in the next patch, but would have
confused the remaining back-end IR infrastructure due to the mismatch
between the IR source types and the emitted machine code.

v2:
- Improve commit log (Curro)
- Fix read_size (Curro)
- Fix DF uniform array detection in assign_constant_locations() when
  it is acceded with 32-bit MOV_INDIRECTs in BSW/BXT.

v3:
- Move changes in assign_constant_locations() to other patch.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 56266df7ed)
2017-03-15 14:39:57 +00:00
Samuel Iglesias Gonsálvez
89f653e4ae i965/fs: detect different bit size accesses to uniforms to push them in proper locations
Previously, if we had accesses with different sizes to the same uniform, we might not
push it aligned with the bigger one. This is a problem in BSW/BXT when we access
an array of DF uniform with both direct and indirect addressing because for the latter
we use 32-bit MOV INDIRECT instructions. However this problem can happen with other
generations and bitsizes.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit a497ab6838)
2017-03-15 14:39:57 +00:00
Samuel Iglesias Gonsálvez
7466ffa468 i965/fs: mark last DF uniform array element as 64 bit live one
This bug can make that we don't detect the end of a contiguous area
correctly and push larger areas than the real ones.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 7427425247)
2017-03-15 14:39:57 +00:00
Dave Airlie
9919331f4d radv: fix txs for sampler buffers
I messed this up when I wrote it, this fixes:
dEQP-VK.memory.pipeline_barrier.*uniform_texel_buffer.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e66be3d3bb)
2017-03-15 14:39:57 +00:00
Brendan King
f3dd22c9ff egl/dri3: implement query surface hook
This is a DRI3 version of a change made for DRI2
(4d6d4f939e, "egl/dri2: implement query surface hook"),
that fixed failures in dEQP-EGL.functional.resize.surface_size.grow
and dEQP-EGL.functional.resize.surface_size.shrink.

Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 884f65e185)
2017-03-15 14:39:57 +00:00
Dave Airlie
915a56ff5c radv: fix depth format in blit2d.
For blitting we need to use the depth or stencil format, never
the combined.

This fixes:
dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint
and a few others.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 800b82ea13)
2017-03-15 14:39:56 +00:00
Bas Nieuwenhuizen
aed7340a86 radv: Use correct size for availability flag.
Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the
availability flag is an integer. We apparently handled this correctly
already for the copy to buffer case.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d833ae97)
2017-03-15 14:39:56 +00:00
Bas Nieuwenhuizen
08cc01e796 radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.
PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only
calls it when in a primary command buffer and we change
GetQueryPoolResults to not need it. CmdCopyQueryPoolResults
still needs it so we break that behavior for secondary command buffers.
However, that would hang already and using an unitialized value is
better than a hang.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ea34a98c0)
2017-03-15 14:39:56 +00:00
Bas Nieuwenhuizen
da55203a24 radv: Reset emitted compute pipeline when calling secondary cmd buffer.
Otherwise if the new compute pipeline is the same as the last used
pipeline before the call, we don't emit it again.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb878db7eb)
2017-03-15 14:39:56 +00:00
Marek Olšák
4d011d1a48 radeonsi: fix broken tessellation on Carrizo and Stoney
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 35915af6c9)
2017-03-15 14:39:56 +00:00
Marek Olšák
fc4e02c925 st/mesa: set blend state for PBO readbacks
v2: restore the state

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit cc2f92b09f)
2017-03-15 14:39:56 +00:00
Marek Olšák
def8f8360e st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit a40b76143d)
2017-03-15 14:39:56 +00:00
Samuel Iglesias Gonsálvez
f7e3209fb7 glsl: fix heap-use-after-free in ast_declarator_list::hir()
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.

However, the same 'var' is referenced later in ast_declarator_list::hir().

This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.

This error was detected by Address Sanitizer.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a73a618933)
2017-03-15 14:39:56 +00:00
Marek Olšák
f9de12cf8a gallium/u_queue: fix random crashes when the app calls exit()
This fixes:
    vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
    llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
    "Pass ID not registered!"' failed.

v2: use list_head, switch the call order in destroy

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4aea8fe7e0)
2017-03-15 14:39:56 +00:00
Jason Ekstrand
3b9e01c52c intel/blorp: Explicitly flush all allocated state
Found by inspection.  However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 075ed20614)
2017-03-15 14:39:56 +00:00
Jason Ekstrand
cfe0bcccf8 blorp/exec: Use uint32_t for copying varying data
Some things may not be floats and intel CPUs are known for mangling bits
when a float type is used for copying integers.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 8c8095c260)
2017-03-15 14:39:56 +00:00
Jason Ekstrand
71e4003981 anv/query: Perform CmdResetQueryPool on the GPU
This fixes a some rendering corruption in The Talos Principle

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 40087bcb51)
2017-03-15 14:39:55 +00:00
Jason Ekstrand
f43bed413a genxml: Make MI_STORE_DATA_IMM more consistent
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc9abd0e6b)
2017-03-15 14:39:55 +00:00
Jason Ekstrand
a157a8db37 anv/query: clflush the bo map on non-LLC platforms
Found by inspection

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3788cd3239)
2017-03-15 14:39:55 +00:00
Jason Ekstrand
7da210dc3f anv: Add an invalidate_range helper
This is similar to clflush_range except that it puts the mfence on the
other side to ensure caches are flushed prior to reading.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8582ab2d6e)
2017-03-15 14:39:55 +00:00
Nicolai Hähnle
64537aa6b4 radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.

Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 066a117be7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
	src/gallium/drivers/radeonsi/si_state_shaders.c
2017-03-15 14:39:55 +00:00
Nicolai Hähnle
5c1a970aa1 radeonsi: handle MultiDrawIndirect in si_get_draw_start_count
Also handle the GL_ARB_indirect_parameters case where the count itself
is in a buffer.

Use transfers rather than mapping the buffers directly. This anticipates
the possibility that the buffers are sparse (once ARB_sparse_buffer is
implemented), in which case they cannot be mapped directly.

Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type
on <= CIK.

v2:
- unmap the indirect buffer correctly
- handle the corner case where we have indirect draws, but all of them
  have count 0.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6a1d9684f4)
2017-03-15 14:39:55 +00:00
Nicolai Hähnle
f9673f8213 winsys/amdgpu: reduce max_alloc_size based on GTT limits
Allocating huge buffers in VRAM is not a problem, but when those buffers
start being migrated, the kernel runs into errors because it cannot split
those buffer up for moving through GTT.

This should fix intermittent failures of
GL45-CTS.texture_buffer.texture_buffer_max_size

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 550125e1e7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
2017-03-15 14:39:55 +00:00
Ben Crocker
9130d79f7a gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)
If llvm::sys::getHostCPUName() returns "generic", override
it with "pwr8" (on PPC64LE).

This is a work-around for a bug in LLVM: a table entry for "POWER8NVL"
is missing, resulting in (big-endian) "generic" being returned on
little-endian Power8NVL systems.  The result is that code that
attempts to load the least significant 32 bits of a 64-bit quantity in
memory loads the wrong half.

This omission should be fixed in the next version of LLVM (4.0),
but this work-around should be left in place in case some
future version of POWER<n> also ends up unrepresented in LLVM's table.

This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion
tests on POWER8NVL processors.

(V4: add similar comment in the code.)

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b934aae364)
2017-03-15 14:39:55 +00:00
Ben Crocker
f17a153497 gallivm: Improve debug output (V2)
Improve debug output from gallivm_compile_module and
lp_build_create_jit_compiler_for_module, printing the
-mcpu and -mattr options passed to LLC.

V2: enclose MAttrs debug_printf block and llc -mcpu debug_printf
in "if (gallivm_debug & <flags>)..."

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v2)
[Emil Velikov: rebase]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit a8e9c630f3)
2017-03-15 14:39:55 +00:00
Marek Olšák
1441b436e7 gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally
It's OK for r300g (because r300g can't write to buffers via the GPU), but
not later hardware. This issue was spotted randomly.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit c8ef512398)
(cherry picked from commit bc8d047068)
2017-03-15 14:39:55 +00:00
Marek Olšák
65a6a9fab8 gallium/util: remove unused u_index_modify helpers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 42297c862f)
2017-03-15 14:39:54 +00:00
Marek Olšák
ea6ae74cb0 radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)
start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.

v2: Also fix the util_shorten_ubyte_elts_to_userptr call.
    Tested with the new piglit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a264fee624)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_draw.c
2017-03-15 14:39:54 +00:00
Lionel Landwerlin
c1c6aa8151 i965/fs: fix uninitialized memory access
Found while running shader-db under valgrind.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0ac118398)
2017-03-15 14:39:54 +00:00
Bas Nieuwenhuizen
212c3399f0 radv: Never try to create more than max_sets descriptor sets.
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79
(cherry picked from commit f448701622)
2017-03-15 14:39:53 +00:00
Emil Velikov
db1231a572 cherry-ignore: don't pick nir_op_pack_double optimisation fix
The optimisation itself is broken and thus removed with previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-15 14:38:37 +00:00
Jason Ekstrand
1966fa2b7f i965/fs: Remove the inline pack_double_2x32 optimization
It's broken in a number of ways.  In particular, a bunch of the
conditions are backwards so it doesn't actually detect what it's
supposed to detect.  Since it's been broken, it hasn't actually been
helping anything so just deleting it isn't a regression.

This (and removing another optimization) were done on master in commit
b073811617.

Cc: "Kenneth Grunke" <kenneth@whitecape.org>
Cc: "Mark Janes" <mark.a.janes@intel.com>

[Emil Velikov: patch is a backport of the below "cherry pick"]
Fixes: a4393bd97f ("i965/fs: Fix the inline nir_op_pack_double optimization")

(cherry picked from commit b073811617)
2017-03-15 14:36:10 +00:00
Lionel Landwerlin
ae241b73e6 anv: wsi: report presentation error per image request
vkQueuePresentKHR() takes VkPresentInfoKHR pointer and includes a
pResults fields which must holds the results of all the images
requested to be presented. Currently we're not filling this field.

Also as a side effect we probably want to go through all the images
rather than stopping on the first error.

This commit also makes the QueuePresentKHR() implementation return the
first error encountered.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0fcb92c17d)
2017-03-14 00:13:16 +00:00
Connor Abbott
aede9fc7ae anv: fix Get*MemoryRequirements for !LLC
Even though we supported both coherent and non-coherent memory types, we
effectively forced apps to use the coherent types by accident. Found by
inspection, only compile tested.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6319bfc2a6)
2017-03-14 00:13:16 +00:00
Dave Airlie
6e0ad8b73e tgsi: fix memory leak in tgsi sanity check
This just fixes this without repeating the code.

Reported-by: Li Qiang
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 69fc7a2c82)
2017-03-14 00:13:16 +00:00
Jason Ekstrand
b8098045ac intel/blorp: Swizzle clear colors on the CPU
It's trivial to swizzle clear colors on the CPU, easily deals with the
hardware restrictions for render target swizzles, and makes swizzled
clears work on all hardware as opposed to just HSW+.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e233db6e93)
2017-03-14 00:13:16 +00:00
Kenneth Graunke
95c038ee4e mesa: Do (TCS && !TES) draw time validation in ES as well.
Now that we have OES_tessellation_shader, the same situation can occur
in ES too, not just GL core profile.

Having a TCS but no TES may confuse drivers - i965 crashes, for example.

This prevents regressions in
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
with some SSO pipeline validation changes I'm making.

v2: Add an ES spec citation (suggested by Alejandro)

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 05a56893aa)
2017-03-14 00:13:16 +00:00
Ilia Mirkin
457a408134 st/mesa: don't pass compare mode for stencil-sampled textures
Fixes dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3970257cef)
2017-03-14 00:13:16 +00:00
Ilia Mirkin
37ca4e2432 nvc0: set the render condition in the compute object
Fixes GL45-CTS.compute_shader.conditional-dispatching

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 48f04862c1)
2017-03-14 00:13:16 +00:00
Ilia Mirkin
0b78455be3 gm107/ir: fix address offset bitfield for ATOMS
Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7e75f0913a)
2017-03-14 00:13:16 +00:00
Jose Maria Casanova Crespo
97b2fb4d3e glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":

"If an array is declared as the last member of a shader storage block
 and the size is not specified at compile-time, it is sized at run-time.
 In all other cases, arrays are sized only at compile-time."

In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.

With this patch Mesa reports a compilation error as glslang does with
the following shader:

buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}

Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5bc222ebaf)
2017-03-14 00:13:15 +00:00
Ilia Mirkin
97745d0f17 nvc0/ir: fix ubo max clamp, reset file index
We just increased the max UBO, so we should also increase the clamp that
we do for robustness. Similarly, as we're including the fileIndex in the
new indirect value, we should reset fileIndex to 0 so that it is not
added in a second time.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c95f821cb4)
2017-03-14 00:13:15 +00:00
Ilia Mirkin
d51c4f7c13 nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute
Kepler and up unfortunately only support up to 8 constbufs. We work
around this by loading from constbufs as if they were storage buffers.
However we were not consistently applying limits to loads from these
buffers. Make sure to do the same thing we do for storage buffers.

Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1acdd62847)
2017-03-14 00:13:15 +00:00
Ilia Mirkin
384b14b6d2 nvc0: increase number of ubo binding points
Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but
it's unclear what it refers to). We need to expose an extra binding
point for the "program parameters", which means this must be 15. Remove
the last vestige of the "use c14 for immediates" idea.

Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 59ca352fc5)
2017-03-14 00:13:15 +00:00
Marc Di Luzio
60da51f8cf glsl: correct compute shader checks for memoryBarrier functions
As per the spec -
"The functions memoryBarrierShared() and groupMemoryBarrier() are
available only in compute shaders; the other functions are available
in all shader types."

Conform to this by adding another delegate to check for compute
shader support instead of only whether the current stage is compute

This allows some fragment shaders in Dirt Rally to compile

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 21efe2528c)
2017-03-14 00:13:15 +00:00
Jason Ekstrand
39becf9d58 i965: Use a better guardband calculation.
(Patch co-authored by Jason and Ken.)

We scaled the guardband based on the viewport size, but failed to
take into account the translation portion of the viewport transform.

This meant the guardband was always centered around the origin.
We want it to be centered around the screen-space drawing area,
which is the intersection of the viewport and the render target.

At best, getting this wrong would reduce the guardband's effectiveness
in some cases.  At worst, it might break things - objects outside of the
guardband are trivially rejected, so getting the guardband in the wrong
place and leaving guardband clipping enabled could cause problems.

v2: drop clamping of positive maximums.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f3c068c5c8)
2017-03-14 00:13:15 +00:00
Kenneth Graunke
66964e9f7e i965: Combine the Gen6 SF and Clip viewport atoms.
The next patch will make the guardband calculation dependent on the
transformation matrix.  Instead of computing it in both atoms, just
combine them into a single atom.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit 89ad7f1be6)
2017-03-14 00:13:15 +00:00
Dave Airlie
9851e392a0 radv: pass FMASK alignment to application
As was done for dcc and cmask.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 90ac2285f0)
2017-03-14 00:13:15 +00:00
Bas Nieuwenhuizen
dc94e70e27 radv: Pass DCC alignment to application.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
(cherry picked from commit 47ca0f537d)
2017-03-14 00:13:15 +00:00
Bas Nieuwenhuizen
ba725de721 radv: Pass CMASK alignment to application.
CMASK alignment can be greater than image data alignment, so pass
it to the app so that it knows what alignment to backing memory
should have.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit eb01b20cc4)
2017-03-14 00:13:14 +00:00
Dave Airlie
a085b5e051 radv/ac: avoid the fmask path when doing txs.
This fixes the vulkan samples deferredmultisampling test.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a864ef7f48)
2017-03-14 00:13:14 +00:00
Nicolai Hähnle
63b68840e9 dri/common: clear the loaderPrivate pointer in driDestroyDrawable
The GLX specification says about glXDestroyPixmap:

    "The storage for the GLX pixmap will be freed when it is not current
     to any client."

We're not really following this language to the letter: some of the storage
is freed immediately (in particular, the dri3_drawable, which contains both
GLXDRIdrawable and loader_dri3_drawable). So we NULL out the pointers to
that freed storage; the previous patches added the corresponding NULL-pointer
checks.

This fixes memory corruption in piglit
./bin/glx-visuals-depth/stencil -pixmap -auto

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7be0e602ed)
2017-03-14 00:13:14 +00:00
Nicolai Hähnle
c0de475acc glx: guard swap-interval functions against destroyed drawables
The GLX specification says about glXDestroyPixmap:

    "The storage for the GLX pixmap will be freed when it is not current
     to any client."

So arguably, functions like glXSwapIntervalMESA can be called after
glXDestroyPixmap has been called for the currently bound GLXPixmap.
In that case, the GLXDRIDrawable no longer exists, and so we just skip
those calls.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f446f3fb33)
2017-03-14 00:13:14 +00:00
Nicolai Hähnle
fae270d3ba glx/dri3: guard in_current_context against a disappeared drawable
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 21ec35566b)
2017-03-14 00:13:14 +00:00
Nicolai Hähnle
c974cb118e glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion
With a subsequent patch, we might see NULL loaderPrivates, e.g. when
a DRIdrawable is flushed whose corresponding GLXDRIdrawable was destroyed.
This resulted in a crash, since the loader vs. DRI3 drawable structures
have a non-zero offset.

Fixes glx-visuals-{depth,stencil} -pixmap

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 40c304fc06)
2017-03-14 00:13:14 +00:00
Dave Airlie
fb58b03434 radv/ac: correctly size shared memory usage.
We count the number of slots used, but slots are vec4 sized,
so we have to scale by 16 not 4.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a1a8aef4c9)
2017-03-14 00:13:14 +00:00
Samuel Pitoiset
346081e7ad winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()
cs can be NULL when it comes from r600_buffer_map_sync_with_rings()
to avoid doing the same checks. It was checked for write mappings
but not for read mappings.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit af303abcdb)
2017-03-14 00:13:14 +00:00
Bartosz Tomczyk
eca9916254 glsl: fix heap-buffer-overflow
The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit fc27181f9e)
2017-03-14 00:13:14 +00:00
Dave Airlie
ce748ec4eb radv/ac: implement txs for buffer textures.
This fixes a bunch of buffer related:
dEQP-VK.memory.pipeline_barrier.*
tests, that were crashing in LLVM due to this being missing.

Reviewed-by: Andres Rodriguez<andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ecd426490)
2017-03-14 00:13:14 +00:00
Dave Airlie
32ff925529 radv/ac: handle nir irem opcode.
This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opsrem.*

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org"
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ecc3fa3ba3)
2017-03-14 00:13:13 +00:00
Dave Airlie
0218055e9a radv: handle transfer_write as a dst flag.
It appears we can get image barriers like:
    srcStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dstStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dependencyFlags:                VkDependencyFlags = 0
    memoryBarrierCount:             uint32_t = 0
    pMemoryBarriers:                const VkMemoryBarrier* = NULL
    bufferMemoryBarrierCount:       uint32_t = 0
    pBufferMemoryBarriers:          const VkBufferMemoryBarrier* = NULL
    imageMemoryBarrierCount:        uint32_t = 1
    pImageMemoryBarriers:           const VkImageMemoryBarrier* = 0x7ffc882367b0
        pImageMemoryBarriers[0]:        const VkImageMemoryBarrier = 0x7ffc882367b0:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0x2df55e0
            subresourceRange:               VkImageSubresourceRange = 0x7ffc882367e0:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 1
                baseArrayLayer:                 uint32_t = 0
                layerCount:                     uint32_t = 1

This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here,
not sure if this is a too large hammer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a1c1ba7d56)
2017-03-14 00:13:13 +00:00
Marek Olšák
26456ad340 radeonsi: don't invoke DCC decompression in update_all_texture_descriptors
This fixes a bug uncovered by the 17-part patch series, specifically:
  "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"

If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a0740d59aa)
2017-03-14 00:13:13 +00:00
Kenneth Graunke
080433321c i965: Support the force_glsl_version driconf option.
Gallium drivers have had this for a while.  It makes sense to support
it consistently across drivers, so expose it in i965 as well.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2f7a7ae131)
2017-03-14 00:13:13 +00:00
Kenneth Graunke
6a6da88868 i965: Fix check for negative pitch in can_do_fast_copy_blit().
At this point, the pitch is in bytes.  We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K.  This
means the bit 15 trick won't work.

The caller now has signed integers anyway, so just pass those through
and do the obvious check.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 02216a1ddf)
2017-03-14 00:13:13 +00:00
Kenneth Graunke
ccff7fcc17 i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register.  With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky.  Use a UW type to avoid this.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit fcf723b647)
2017-03-14 00:13:13 +00:00
Kenneth Graunke
fc6d1cbd2e i965: Fix fast depth clears for surfaces with a dimension of 16384.
I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong.  But it turns out that
there is a legitimate reason to use it, so let's do so.

The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5106df85da)
2017-03-14 00:13:13 +00:00
Dave Airlie
38c9a9abad radv: program a default point size.
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.

This preempt fixes a bunch of geom shader tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2ab2be092d)
2017-03-14 00:13:13 +00:00
Marek Olšák
e212f91046 st/mesa: destroy pipe_context before destroying st_context (v2)
If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.

Firefox with WebGL2 enabled hits this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456

v2: protect against a double destroy in st_create_context_priv and callers.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d9ef549238)
2017-03-14 00:13:12 +00:00
Ian Romanick
d41c9c3a49 mesa: Don't advertise GL_OES_read_format in core profile
OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.

The functionality is also included in GL_ARB_ES2_compatibility.  Ever
OpenGL core-profile driver currently exposes both extensions.  I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems.  No functionality
is removed.

I have left this extension in place for compatibility profile.  There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.

Three other other alternatives considered:

 - Remove the string from compatibility profile drivers but leave the
   functionality in place.

 - Add a flag to expose the extension string, and set it in every OpenGL
   driver that does not expose GL_ARB_ES2_compatibility (and those
   drivers only).  I tried this.  You can't have two instances of an
   extension in the extension table (one dummy_true for ES1 and one with
   a flag for compatibility profile), so the implementation requires a
   bit of effort.

 - Only expose the extension in compatibility if the version is less
   than 2.0.  I didn't see an easy way to do this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c4a0c1efff)
2017-03-14 00:13:12 +00:00
George Kyriazis
1665761bb0 swr: Align query results allocation
Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.

Fixes crash when compiling with clang.

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 00847e4f14)
2017-03-14 00:13:12 +00:00
Bruce Cherniak
59f76392a5 swr: Prune empty nodes in CalculateProcessorTopology.
CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes.  There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating and empty "0" node and causing a
crash in CreateThreadPool.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b829206b07)
2017-03-14 00:13:12 +00:00
Nicolai Hähnle
77b794d85b glsl: split DIV_TO_MUL_RCP into single- and double-precision flags
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b71c415c3d)
2017-03-14 00:13:12 +00:00
Lionel Landwerlin
83cf2d72f7 spirv: don't assert with location decorations on non i/o variables
Some applications might add location decoration to samplers. Rather
than raising an error it seems it would make more sense to just
discard these decorations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8a28e764d0)
2017-03-13 18:43:09 +00:00
Nicolai Hähnle
b6ec9c2098 radeonsi: fix texture gather on stencil textures
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.

A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.

Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.

Fixes GL45-CTS.texture_cube_map_array.sampling.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 3cd092c415)
2017-03-13 18:43:09 +00:00
Nicolai Hähnle
caef71832c mesa/main: fix meta caller of _mesa_ClampColor
Since _mesa_ClampColor properly checks for support of the API function
now, it's meta callers need to check support as well.

Fixes: 963311b71f ("mesa/main: fix version/extension checks in _mesa_ClampColor")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99401
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a7c635ec65)
2017-03-13 18:43:08 +00:00
Emil Velikov
572584bbf0 Revert "get-pick-list.sh: Require explicit "13.0" for nominating stable patches"
This reverts commit 9185a3385b.

As per the updated documentation - see commit 9e4248b206
("docs/submittingpatches.html: remove version tag for nominations")
the explicit version tag is no longer needed.

Conflicts:
	bin/get-pick-list.sh
2017-03-01 18:42:23 +00:00
Emil Velikov
112e75f51b docs: add sha256 checksums for 13.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 11:54:07 +00:00
Emil Velikov
71f3ff57fa docs: add release notes for 13.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 11:43:27 +00:00
Emil Velikov
8d622e91d4 Update version to 13.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 10:01:46 +00:00
Ilia Mirkin
1d561d8147 nvc0: disable linked tsc mode in compute launch descriptor
Empirically, this makes things work. Presumably this was originally
copied from the blob, which does make use of linked tsc mode.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 956556b3c3)
2017-02-17 18:29:05 +00:00
Bartosz Tomczyk
9f66954047 r600/sb: Fix memory leak
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 94262e5f5d)
Fixes: e933246013 ("r600/sb: Fix loop optimization related hangs on eg")
Nominated-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-17 18:27:40 +00:00
Derek Foreman
bebf672fc7 egl/dri2: add image_loader_extension back into loader extensions for wayland
before commit f871946594
image_loader_extension was always present in dri2_dpy->extensions,
after that commit it is only present for render nodes.

Its removal broke partial render based on buffer age on (at least)
raspberry pi.

Fixes: f871946594 "egl/dri2: rework dri2_egl_display::extensions storage"
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 534ea2b5ba)
2017-02-16 17:59:43 +00:00
Vinson Lee
1b0715f05f util: Fix Clang trivial destructor check.
Check for Clang before GCC.

Clang defines __GNUC__ == 4 and __GNUC_MINOR__ == 2 and matches the GCC
check but not the GCC version for trivial destructor.

Fixes: 98ab905af0 ("mesa: Define introspection macro to determine
whether a type is trivially destructible.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98526
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

(cherry picked from commit ed6694d511)
2017-02-16 17:59:43 +00:00
Vinson Lee
06b96072c7 scons: Require libdrm >= 2.4.66 for DRM.
configure.ac already requires 2.4.66.

Fix SCons build. drmDevicePtr is not available until libdrm 2.4.65.

  Compiling src/loader/loader.c ...
src/loader/loader.c:111:40: error: unknown type name ‘drmDevicePtr’
 static char *drm_construct_id_path_tag(drmDevicePtr device)
                                        ^

Fixes: 4a183f4d06 ("scons: loader: use libdrm when available")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98421
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
(cherry picked from commit f2770fb3d5)
2017-02-16 17:59:43 +00:00
Emil Velikov
2248d24509 bin/get-fixes-pick-list.sh: add new script
The script parses the "Fixes" tags and nominates respective commit if
applicable.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 389478c4e9)
2017-02-16 17:59:42 +00:00
Emil Velikov
32cf2344c1 bin/get-pick-list.sh: remove ancient way of nominating patches
The old way of nominating patches [NOTE: .*[Cc]andidate] was
deprecated and has been unused for approx. 3 years.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit f1b0b75099)
2017-02-16 17:59:42 +00:00
Emil Velikov
a05089106b bin/get-pick-list.sh: limit `git grep ...' only as needed
Analogous to previous commit.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit d6b1d11d4f)
2017-02-16 17:59:42 +00:00
Emil Velikov
dfe79fc3f1 bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed
The currently used range HEAD..origin/master is far too broad. It looks
for nominations within the already_landed list (branchpoint..HEAD).

Similarly we look for already_landed whiting the [possible] nominations
Rand branchpoint..origin/master.

Improve things by limiting the look ups to the branch point.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit d292f12d94)
2017-02-16 17:59:42 +00:00
Emil Velikov
dad81f4cd0 bin/get-extra-pick-list: rework to use already_picked list
Currently we loop (git log --grep) to check if the fix has landed. We
can simplify and make things faster by storing the already_picked list
and grep ping through it.

Slim down the message while we're here.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 71e00d62ed)
2017-02-16 17:59:42 +00:00
Emil Velikov
61d7a9dc21 bin/get-extra-pick-list: use git merge-base to get the branchpoint
Since mesa development history is linear and the only diversion is at
the branchpoint. Thus we can drop the ad-hoc parsing and use git
merge-base to retrieve it.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit cb1947eac7)
2017-02-16 17:59:42 +00:00
Hans de Goede
279604b4fa glx/glvnd: Fix GLXdispatchIndex sorting
Commit 8bca8d89ef ("glx/glvnd: Fix dispatch function names and indices")
fixed the sorting of the array initializers in g_glxglvnddispatchfuncs.c
because FindGLXFunction's binary search needs these to be sorted
alphabetically.

That commit also mostly fixed the sorting of the DI_foo defines in
g_glxglvnddispatchindices.h, which is what actually matters as the
arrays are initialized using "[DI_foo] = glXfoo," but a small error
crept in which at least causes glXGetVisualFromFBConfigSGIX to not
resolve, breaking games such as "The Binding of Isaac: Rebirth" and
"Crypt of the NecroDancer" from Steam not working and possible causes
other problems too.

This commit fixes the last of the sorting errors, fixing these mentioned
games not working.

Fixes: 8bca8d89ef ("glx/glvnd: Fix dispatch function names and indices")
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 4c66f529a8)
2017-02-16 17:59:42 +00:00
Dave Airlie
d0e460c7b7 radv: adopt some init config workarounds from radeonsi.
Just one bonaire fix.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 09bf5491c4)
2017-02-16 17:59:42 +00:00
Dave Airlie
f00bc877a2 radv: fix cik macroModeIndex.
This just a CIK fix ported from radeonsi.

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0f1a4220a6)
2017-02-16 17:59:42 +00:00
Dave Airlie
c3365b06ac radv: change base aligmment for allocated memory.
On some CIK (Hawaii) this needs to be at least 64k, I'm not 100% sure
it doesn't need to be 128k.

This was causing fast clear eliminate to overwrite the previous buffer,
which since my gfx init code, was the indirect buffer.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99692
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 06ffd29925)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/amd/vulkan/radv_device.c
2017-02-16 17:59:42 +00:00
Jason Ekstrand
2bdb22fdaa i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge
Fixes two GL ES 3.0 CTS tests on Sandy Bridge:

ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear
ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c59d1ea51b)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_sampler_state.c
2017-02-16 17:59:42 +00:00
Jason Ekstrand
bbe50d9b03 i965/sampler_state: Pass texObj into update_sampler_state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c4f8f395b2)
2017-02-16 17:59:42 +00:00
Eric Anholt
0683feb18c vc4: Avoid emitting small immediates for UBO indirect load address guards.
The kernel will reject our shader if we emit one here, and having 4, 8, or
12 as the top end of our UBO clamp rare is enough that it's not worth
making the kernel let us.

Fixes piglit fs-const-array-of-struct and
fs-const-array-of-struct-of-array since recent GLSL linking changes made
us get this as an indirect load of a uniform, instead of a tempoary.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b230939303)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/vc4/vc4_opt_small_immediates.c
2017-02-16 17:59:42 +00:00
Marek Olšák
7d172c6c35 gallium/radeon: fix performance of buffer readbacks
We want cached GTT for all non-persistent read mappings.
Set level = 0 on purpose.

Use dma_copy, because resource_copy_region causes a failure in the PBO
read of piglit/getteximage-luminance.

If Rocket League used the READ flag, it should get cached GTT.

v2: mask out UNSYNCHRONIZED

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d86099df0a)
2017-02-16 17:59:41 +00:00
Marc-André Lureau
f562f57646 tgsi-dump: dump label if instruction has one
The instruction has an associated label when Instruction.Label == 1,
as can be seen in ureg_emit_label() or tgsi_build_full_instruction().

This fixes dump generating extra :0 labels on conditionals, and virgl
parsing more than the expected tokens and eventually reaching "Illegal
command buffer" (when parsing more than a safety margin of 10 we
currently have).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dc2d9b8da1)
2017-02-16 17:59:41 +00:00
Lionel Landwerlin
f4e2c60858 spirv: handle undefined components for OpVectorShuffle
Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bbe8705c57)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/spirv/spirv_to_nir.c
2017-02-16 17:59:41 +00:00
Ian Romanick
b18c791a64 linker: Accurately mark a uniform block instance array element as used in a stage
Now that information about which array-of-arrays elements are accessed
is tracked, use that information to only mark an instance array element
as used-by-stage if, in fact, it is.

Fixes GL45-CTS.program_interface_query.uniform-block-types.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ceea514d91)
2017-02-16 17:46:15 +00:00
Ian Romanick
010886b120 glsl: Walk a list of ir_dereference_array to mark array elements as accessed
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d32956935e)
2017-02-16 17:46:14 +00:00
Ian Romanick
473319075b glsl: Mark a set of array elements as accessed using a list of array_deref_range
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e92935089b)
2017-02-16 17:46:14 +00:00
Ian Romanick
0288655ce7 glsl: Add structures to track accessed elements of a single array
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8d499f60c8)
2017-02-16 17:46:14 +00:00
Ian Romanick
64f24795b9 glsl: Add tracking for elements of an array-of-arrays that have been accessed
If there's a better way to provide access to ir_array_refcount_entry
private members to the test functions, I am very interested to know
about it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b7053b80f2)
2017-02-16 17:46:14 +00:00
Ian Romanick
784b362767 glsl: Use simpler visitor to determine which UBO and SSBO blocks are used
Very soon this visitor will get more complicated.  The users of the
existing ir_variable_refcount visitor won't need the coming
functionality, and this use doesn't need much of the functionality of
ir_variable_refcount.

v2: ir_array_refcount_visitor::get_variable_entry cannot return NULL, so
don't check it.  Suggested by Timothy.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5085b64031)
2017-02-16 17:46:14 +00:00
Ian Romanick
3abc968236 glsl: Track the linearized array index for each UBO instance array element
v2: Set linearizer_array_index in process_block_array_leaf.  Suggested
by Timothy.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d56bd07bb3)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/link_uniform_blocks.cpp
2017-02-16 17:46:14 +00:00
Ian Romanick
efe15de566 glsl: Fix wonkey indentation left from previous commit
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 300de78ab1)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/link_uniform_blocks.cpp
2017-02-16 17:46:14 +00:00
Ian Romanick
91df6e8aed glsl: Split process_block_array into two functions
One for the array parts and one for the leaf members.  This will
simplify later changes.

The indentation is wonkey after this patch.  This was done to make it
more obvious that the function is just getting split.  The next patch
will fix the indentation.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8862fefba0)
2017-02-16 17:46:14 +00:00
Ian Romanick
00d2299007 linker: Accurately track gl_uniform_block::stageref
As the linked per-stage shaders are processed, mark any block that has a
field that is accessed as referenced.  When combining all the linked
shaders, combine the per-stage stageref masks.

This fixes a number of GLES CTS tests:

    ES31-CTS.core.geometry_shader.program_resource.program_resource
    ES32-CTS.core.geometry_shader.program_resource.program_resource
    ESEXT-CTS.geometry_shader.program_resource.program_resource
    piglit.gl45-cts.geometry_shader.program_resource.program_resource

However, it makes quite a few more fail:

    ES31-CTS.functional.program_interface_query.buffer_variable.random.6
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.compute.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.separable_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment_only_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment_only_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment_only_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment_only_fragment.unnamed_block.float
    ES31-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.random.6
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.compute.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.separable_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment_only_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment_only_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_geo_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment_only_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment_only_fragment.unnamed_block.float
    ES32-CTS.functional.program_interface_query.buffer_variable.referenced_by.vertex_tess_geo_fragment.unnamed_block.float

I have diagnosed the failures, but I'm not sure whether we or the
tests are wrong.  After optimizations are applied, all of the tests
are of the form:

    buffer X {
        float f;
    } x;

    void main()
    {
        x.f = x.f;
    }

The test then queries that x is referenced by that shader stage.  We
eliminate the assignment of x.f to itself, and that removes the last
reference to x.  We report that x is not referenced, and the test fails.
I do not know whether or not we are allowed to eliminate that assignment
of x.f to itself.

After discussions with the OpenGL ES group in Khronos, we believe that
Mesa's behavior is correct.  I will provide patches to the CTS tests
to Khronos.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 084105c213)
[Emil Velikov: nominate considering the above fixes and it's a
requirement for the following nine patches]
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-16 17:46:14 +00:00
Ian Romanick
ed48242e05 linker: Slight code rearrange to prevent duplication in the next commit
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 392fabcfee)
2017-02-16 17:46:09 +00:00
Chad Versace
01ac2d3c5c i965/mt: Disable HiZ when sharing depth buffer externally (v2)
intel_miptree_make_shareable() discarded and disabled CCS. Fix it so
that it discards and disables HiZ too.

Fixes dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer
on Skylake.

v2: Actually do what the commit message says. Discard the HiZ buffer.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com
Cc: Haixia Shi <hshi@chromium.org>
(cherry picked from commit 42011be1e2)
[Emil Velikov: patch is a backport by Chad of above commit]
2017-02-14 13:32:38 +00:00
Topi Pohjolainen
1770ba4d8f i965/gen6: Issue direct depth stall and flush after depth clear
instead of calling unconditionally brw_emit_mi_flush() which
does:

   brw_emit_pipe_control_flush(brw,
                                PIPE_CONTROL_DEPTH_CACHE_FLUSH |
                                PIPE_CONTROL_RENDER_TARGET_FLUSH |
                                PIPE_CONTROL_CS_STALL);

   brw_emit_pipe_control_flush(brw,
                                PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
                                PIPE_CONTROL_CONST_CACHE_INVALIDATE);

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 46b346899d)
2017-02-10 01:08:39 +00:00
Topi Pohjolainen
60662cf26e i965: Make depth clear flushing more explicit
Current blorp logic issues unconditional "flush everything"
(see brw_emit_mi_flush()) after each render. For example, all
blits issue this unconditionally which shouldn't be needed if
they set render cache properly so that subsequent renders do
necessary flushing before drawing.

In case of piglit:

ext_framebuffer_multisample-accuracy all_samples depth_draw small

intel_hiz_exec() is always preceded by blorb blit and the
unconditional flush looks to hide the lack of stall and flushes
in depth clears. By removing the brw_emit_mi_flush() I get gpu
hangs.

This patch adds the stalls and flushes mandated by the spec
and gets rid of those hangs.

v2 (Jason, Ken): Document the rational for separating
                 depth cache flush and stall on Gen7.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e6da6943fe)
2017-02-10 01:08:39 +00:00
Emil Velikov
fada7e5fda mesa/tests: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 091f2b8c98)
2017-02-09 23:46:34 +00:00
Emil Velikov
dc5ac1404c dri/osmesa: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6ba96bdcab)
2017-02-09 23:46:31 +00:00
Emil Velikov
5c7fcaacd9 dri/swrast: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ede4ff9adc)
2017-02-09 23:46:29 +00:00
Emil Velikov
61e3b6d309 radeon, r200: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5a0ba1e5de)
2017-02-09 23:46:25 +00:00
Emil Velikov
de2402aafb mapi: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee5de93269)
2017-02-09 23:46:22 +00:00
Emil Velikov
6086a15f1a loader: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit af860850a0)
2017-02-09 23:46:17 +00:00
Emil Velikov
70a8715ea7 glx/windows: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 912b4f5472)
2017-02-09 23:46:14 +00:00
Emil Velikov
0d934e4a39 glx/apple: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
(cherry picked from commit 5b874cee09)
2017-02-09 23:46:11 +00:00
Emil Velikov
40a92fd518 glx: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d66f9e6d93)
2017-02-09 23:46:08 +00:00
Emil Velikov
e1815ff6a2 d3dadapter9: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d221bf9b91)
2017-02-09 23:46:05 +00:00
Emil Velikov
f332448ef5 st/dri: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 517f34b4be)
2017-02-09 23:46:03 +00:00
Emil Velikov
a8ba39aba6 clover: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 65d5a60cac)
2017-02-09 23:45:59 +00:00
Emil Velikov
f8230e841a egl: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c5921ae0d2)
2017-02-09 23:45:56 +00:00
Emil Velikov
c1ad22360d i915: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 90ac5c339e)
2017-02-09 23:45:54 +00:00
Emil Velikov
585d8777b0 i965: automake: include builddir prior to srcdir
The latter can contain stale generated file, which, as-is, we'll end up
using.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4622c75dfb)
2017-02-09 23:45:51 +00:00
Emil Velikov
3479dde5c6 freedreno: automake: correctly set MKDIR_GEN
Analogous to previous commit.

Fixes: 4610e5ef28 "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
(cherry picked from commit a922c82125)
2017-02-09 23:45:48 +00:00
Emil Velikov
df67d0590a i965: automake: correctly set MKDIR_GEN
Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5eed48d237)
2017-02-09 23:45:45 +00:00
Jason Ekstrand
129f6089a0 vulkan/wsi: Lower the maximum image sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit d6397dd625)
2017-02-09 23:45:34 +00:00
Jason Ekstrand
426b1156c7 vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 659edd9f5c)
2017-02-09 23:45:31 +00:00
Jason Ekstrand
92c061289e vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit dc578ef060)
2017-02-09 23:42:19 +00:00
Bruce Cherniak
c50346e58b swr: [rasterizer core] Remove dead code Clipper::ClipScalar()
Clipper::ClipScalar() is dead code and should be removed.  It is causing
an error with gcc-7 because it references a now defunct member.

v2: includes bugzilla reference, same code change

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633
CC: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit bf29495dcd)
2017-02-09 23:41:56 +00:00
Ilia Mirkin
eb73e3c6d0 st/mesa: MAX_VARYING is the max supported number of patch varyings, not min
This fixes
GL45-CTS.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
on nouveau. We only support 30 patch varyings (as 2 vec4 slots end up
being used for tess level settings), but were getting 32 exposed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d3f9ed71c)
2017-02-09 23:41:56 +00:00
Ilia Mirkin
f6813d5dda vbo: process buffer binding state changes on draw when recording
The VBO module keeps track of any vbo buffers. It updates this list when
receiving an InvalidateState call, however this never happens when
recording draws right now. Make sure that we do all the usual state
updates when recording draws so that the VBO list may be kept up to
date.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99631
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e73f87fcbd)
2017-02-09 23:41:56 +00:00
Jason Ekstrand
0624d6c2a4 anv: Improve flushing around STATE_BASE_ADDRESS
It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is.  We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards.  The only thing the PRM says is that, on gen8+
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS
but experimentation has indicated that doing so does nothing whatsoever.

Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again.  In particular:

 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
    really know whether or not it's pipelined.

 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
    is a compute shader.

 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
    because the state may be getting cached there (we don't really know).

Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92128590bc)
2017-02-09 23:41:56 +00:00
Jason Ekstrand
b5bfc9bcc0 anv: Flush render cache before STATE_BASE_ADDRESS on gen7
We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed.  Recently, when trying update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related.  This commit fixes them.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f1f9794118)
2017-02-09 23:41:56 +00:00
Jason Ekstrand
01044bf446 isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell
This causes hangs on Broadwell if you try to render to it.  I have no
idea how we managed to not hit this earlier.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4871930451)
2017-02-09 23:41:56 +00:00
Jason Ekstrand
cf94d126b6 intel/blorp: Handle clearing of A4B4G4R4 on all platforms
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0348b5a0b)
2017-02-09 23:41:56 +00:00
Bartosz Tomczyk
697cf3c720 r600: Fix stack overflow
Commit 7b5878ee04 increased number of
outputs to 64, but left output array intact. This caused stack overflow
when number of outputs is bigger then 32. Found by ASAN.

Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a41f2527ae)
2017-02-09 23:41:55 +00:00
Kenneth Graunke
68d18cccff i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.
Applications may delete a shader program, create a new one, and bind it
before the next draw.  With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program.  In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:

      if (brw->vertex_program != ctx->VertexProgram._Current) {
         brw->vertex_program = ctx->VertexProgram._Current;
         brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
      }

Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on.  This causes utter chaos.

As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them.  Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.

The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none.  This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time.  Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...

Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.

Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.

Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7c5629a269)
2017-02-09 23:41:55 +00:00
Emil Velikov
fa5d8de838 configure.ac: list radeon in --with-vulkan-drivers help string
Analogous to what we do for the dri and gallium drivers.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@colllabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit cb6be5c8c0)
2017-02-09 23:41:55 +00:00
Lionel Landwerlin
58c7c9d438 spirv: handle OpUndef as part of the variable parsing pass
Looking at the following bit of SPIRV shader :

...
%zero        = OpConstant %i32 0
%ivec3_0     = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef  = OpUndef %ivec3
%sc_0        = OpSpecConstant %i32 0
%sc_1        = OpSpecConstant %i32 0
%sc_2        = OpSpecConstant %i32 0
...

Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit df7063cba3)
2017-02-08 16:01:34 +00:00
Lionel Landwerlin
5210d157c4 anv: fix descriptor pool internal size allocation
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
on when setting up the internals.

Fixes a crash due to out of bound memory access in:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

v2: Drop debug traces (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c3421106ec)
2017-02-08 15:59:58 +00:00
Lionel Landwerlin
f69b8510ba anv: set command buffer to NULL when allocations fail
The spec section 5.2 says:

   "vkAllocateCommandBuffers can be used to create multiple command
   buffers. If the creation of any of those command buffers fails, the
   implementation must destroy all successfully created command buffer
   objects from this command, set all entries of the pCommandBuffers
   array to VK_NULL_HANDLE and return the error."

Fixes:
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25e21cb8d0)
2017-02-08 15:58:30 +00:00
Jason Ekstrand
94278a48d1 i965/blorp: Use the correct ISL format for combined depth/stencil
In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format.  In this case, we handle
stencil properly but we give blorp the wrong ISL format.  Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT which is the wrong
size was causing GPU hangs.

Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c180f9633)
2017-02-08 15:56:57 +00:00
Marek Olšák
70afb711fa radeonsi: always set the TCL1_ACTION_ENA when invalidating L2
Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 573bf0940a)
2017-02-08 15:55:17 +00:00
Jason Ekstrand
a545a33b1a nir/search: Use the correct bit size for integer comparisons
The previous code always compared integers as 64-bit.  Due to variations
in sign-extension in the code generated by nir_opt_algebraic.py, this
meant that nir_search doesn't always do what you want.  Instead, 32-bit
values should be matched as 32-bit and 64-bit values should be matched
as 64-bit.  While we're here we unify the unsigned and signed paths.
Now that we're using the right bit size, they should be the same since
the only difference we had before was sign extension.

This gets the UE4 bitfield_extract optimization working again.  It had
stopped working due to the constant 0xff00ff00 getting sign-extended
when it shouldn't have.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb96b03461)
2017-02-08 15:53:20 +00:00
Lionel Landwerlin
f084d3c7aa anv: don't require render target isl bit for depth/stencil surfaces
Blorp can deal with depth/stencil surfaces blits/copies without the
render target requirement. Also having both render target and
depth/stencil requirement is incompatible from isl's point of view.

This fixes an image creation issue in the high level quality settings
of the Unity3D player, which requires a depth texture with src/dst
transfer & 4x multisampling.

v2: Simply aspect checking condition (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74c23bde5b)
2017-02-08 15:30:38 +00:00
Emil Velikov
6bfc352f5a docs: add sha256 checksums for 13.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-01 11:19:37 +00:00
Emil Velikov
3255d10da4 docs: add release notes for 13.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-01 10:10:38 +00:00
Emil Velikov
c6c7e98208 Update version to 13.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-01 10:04:14 +00:00
Emil Velikov
9185a3385b get-pick-list.sh: Require explicit "13.0" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 17.0 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-25 13:43:44 +00:00
Nayan Deshmukh
b602f4f5bd st/va: delay calling begin_frame until we have all parameters
If begin_frame is called before setting intra_matrix and
non_intra_matrix it leads to segmentation faults when
vl_mpeg12_decoder.c is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92634
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 4b0e9babc6)

Squashed with commit:

st/va: make sure that we call begin_frame() only once v2

This fixes "st/va: delay calling begin_frame until we have all parameters".

v2: call begin frame after decoder (re)creation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
(cherry picked from commit 1338d912f5)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/state_trackers/va/va_private.h
2017-01-25 13:43:22 +00:00
Heiko Przybyl
12ba860584 r600/sb: Fix loop optimization related hangs on eg
Make sure unused ops and their references are removed, prior to entering
the GCM (global code motion) pass, to stop GCM from breaking the loop
logic and thus hanging the GPU.

Turns out, that sb has problems with loops and node optimizations
regarding associative folding:

- the global code motion (gcm) pass moves ops up a loop level/basic block
until they've fulfilled their total usage count
- if there are ops folded into others, the usage count won't be
fulfilled and thus the op moved way up to the top
- within GCM the op would be visited and their deps would be moved
alongside it, to fulfill the src constaints
- in a loop, an unused op is moved out of the loop and GCM would move
the src value ops up as well
- now here arises the problem: if the loop counter is one of the src
values it would get moved up as well, the loop break condition would
never get hit and the shader turn into an endless loop, resulting in the
GPU hanging and being reset

A reduced (albeit nonsense) piglit example would be:

[require]
GLSL >= 1.20

[fragment shader]

uniform int SIZE;
uniform vec4 lights[512];

void main()
{
    float x = 0;
    for(int i = 0; i < SIZE; i++)
        x += lights[2*i+1].x;
}

[test]
uniform int SIZE 1
draw rect -1 -1 2 2

Which gets optimized to:

===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 42 dw ===== 1 gprs ===== 2 stack =========================================
ALU 3 @24
     1      y: MOV                R0.y,  0
            t: MULLO_UINT         R0.w,  [0x00000002 2.8026e-45].x, R0.z

LOOP_START_DX10 @22
PUSH @6
ALU 1 @30 KC0[CB0:0-15]
     2 M    x: PRED_SETGE_INT     __.x,  R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 2 @32
     3      x: ADD_INT            R0.x,  R0.w, [0x00000002 2.8026e-45].x

TEX 1 @36
               VFETCH             R0.x___, R0.x,   RID:0   MFC:16 UCF:0 FMT[..]
ALU 1 @40
     4      y: ADD                R0.y,  R0.y, R0.x
LOOP_END @4
EXPORT_DONE        PIXEL 0     R0.____  EOP
===== SHADER_END ===============================================================

Notice R0.z being the loop counter/break condition relevant register
and being never incremented at all. Also some of the loop content
has been moved out of it, to fulfill the requirements for the one unused
op.

With a debug build of mesa this would produce an error like
error at : PRED_SETGE_INT     __, __, EM.2,    R1.x.2||FP@R0.z, C0.x
  : operand value R1.x.2||FP@R0.z was not previously written to its gpr
and the compilation would fail due to this. On a release build it gets
passed to the GPU.

When using this patch, the loop remains intact:

===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 48 dw ===== 1 gprs ===== 2 stack =========================================
ALU 2 @24
     1      y: MOV                R0.y,  0
            z: MOV                R0.z,  0
LOOP_START_DX10 @22
PUSH @6
ALU 1 @28 KC0[CB0:0-15]
     2 M    x: PRED_SETGE_INT     __.x,  R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 4 @30
     3      t: MULLO_UINT         T0.x,  [0x00000002 2.8026e-45].x, R0.z

     4      x: ADD_INT            R0.x,  T0.x, [0x00000002 2.8026e-45].x

TEX 1 @40
               VFETCH             R0.x___, R0.x,   RID:0   MFC:16 UCF:0 FMT[..]
ALU 2 @44
     5      y: ADD                R0.y,  R0.y, R0.x
            z: ADD_INT            R0.z,  R0.z, 1
LOOP_END @4
EXPORT_DONE        PIXEL 0     R0.____  EOP
===== SHADER_END ===============================================================

Piglit: ./piglit summary console -d results/*_gpu_noglx
        name: unpatched_gpu_noglx patched_gpu_noglx
        ----  ------------------- -----------------
        pass:               18016             18021
        fail:                 748               743
        crash:                  7                 7
        skip:                1124              1124
        timeout:                0                 0
        warn:                  13                13
        incomplete:             0                 0
        dmesg-warn:             0                 0
        dmesg-fail:             0                 0
        changes:                0                 5
        fixes:                  0                 5
        regressions:            0                 0
        total:              19908             19908

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94900
Tested-by: Heiko Przybyl <lil_tux@web.de>
Tested-on: Barts PRO HD6850
Signed-off-by: Heiko Przybyl <lil_tux@web.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e933246013)
Nominated-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-01-24 01:13:33 +00:00
Kenneth Graunke
b9fd9a693b i965: Properly flush in hsw_pause_transform_feedback().
Fixes a number of transform feedback tests when run with Linux 4.8,
which allows us to use the MI_LOAD_REGISTER_REG command, at which point
we started using this new broken path.

ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.*
and Piglit's arb_transform_feedback2/draw-auto are both fixed by this
patch, for example.

Thanks to Chris Wilson for catching this mistake!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 2138347a45)
2017-01-24 01:13:33 +00:00
Andres Rodriguez
830b1051ab radv: fix include order for installed headers v2
In situations where libdrm_amdgpu and mesa are installed to the same
location, the mesa installed headers will take precedence over the git
source headers.

This is due to the AMDGPU_CFLAGS containing the install directory.

This situation can cause build errors if the git version of a header is
newer than the currently installed version of a header (e.g. git pull
updates vulkan.h)

Note: using the same install prefix for mesa and libdrm is probably a
common occurrence since it is described in the radeonBuildHowTo wiki:
https://www.x.org/wiki/radeonBuildHowTo/

v2: added sign-off

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a3ad6a34c6)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:33 +00:00
Andres Rodriguez
15432c29be vulkan/wsi: clarify the severity of lack of DRI3 v2
The current message sounds like a small warning, clarify that it can
result in lack of presentation support and application crashes.

v2: add "if they do" (Bas)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98263
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Jason ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e0674e740b)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:32 +00:00
Rob Clark
c148b51a83 freedreno: some fence cleanup
Prep-work for next patch, mostly move to tracking last_fence as a
pipe_fence_handle (created now only in fd_gmem_render_tiles()), and a
bit of superficial renaming.

Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 16f6ceaca9)

Fixes a glxgears issues.
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Nominated-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
2017-01-24 01:13:32 +00:00
Kenneth Graunke
704072afed glsl: Use ir_var_temporary when generating inline functions.
We were using ir_var_auto for the inlined function parameter variables,
which is wrong, as it suggests that those are real variables declared
by the program.

Normally this doesn't matter.  However, if you called built-ins at
global scope, it would pollute the global variable namespace with
these new parameter temporaries.  If the shader already had variables
with those names, the linker might see contradictory global variable
declarations and raise an error.

Making them temporaries indicates that these are just things generated
by the compiler internally.  This avoids confusing the linker.

Fixes a new Piglit test: glsl-fs-multiple-builtins.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99154
Reported-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 62b8bcda1c)
2017-01-24 01:13:32 +00:00
Emil Velikov
d88fee9df0 automake: use shared llvm libs for make distcheck
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 23dcce0c03)
2017-01-24 01:13:32 +00:00
Jason Ekstrand
ae28120652 isl: Mark A4B4G4R4_UNORM as supported on gen8
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 4e7958fb13)
2017-01-24 01:13:32 +00:00
Samuel Pitoiset
501c380d87 gallium/hud: add missing break in hud_cpufreq_graph_install()
Fixes: e99b9395be "gallium/hud: Add support for CPU frequency monitoring"
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 383fc8e9f3)
2017-01-24 01:13:32 +00:00
Marek Olšák
645b47c32b radeonsi: don't forget to add HTILE to the buffer list for texturing
This fixes VM faults. Discovered by Samuel Pitoiset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit e490b7812c)
2017-01-24 01:13:31 +00:00
Zachary Michaels
d446f45567 radeonsi: Always leave poly_offset in a valid state
This commit makes si_update_poly_offset set poly_offset to NULL if
uses_poly_offset is false. This way poly_offset either points into the
currently queued rasterizer, or it is NULL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d7d32b3bfe)
2017-01-24 01:13:31 +00:00
Emil Velikov
ced6fd3508 egl/wayland: use the destroy_window_callback for swrast
As described in commit 690ead4a13 ("egl/wayland-egl: Fix for segfault
in dri2_wl_destroy_surface.") if we attempt to destroy a EGL surface
attached to already destroyed Wayland window we'll get a segfault.

v2: set the correct callback alongside the window->private. (Dan)

Cc: Daniel Stone <daniels@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit bfd6314350)
2017-01-24 01:13:31 +00:00
Marek Olšák
5dd74b9c40 radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+
Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 57f18623fb)
2017-01-24 01:13:31 +00:00
Kenneth Graunke
ee1118d1df i965: Make BLORP disable the NP Z PMA stall fix.
This may fix GPU hangs on Gen8.  I don't know if it does though.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7a2b65a1d7)
2017-01-24 01:13:31 +00:00
Bas Nieuwenhuizen
f9cac64aee radv: Support loader interface version 3.
Port of 1e41d7f7b0:
"anv: Support loader interface version 3 (patch v2)"

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6d2fb04f09)
2017-01-24 01:13:30 +00:00
Emil Velikov
259e74f682 cherry-ignore: add wayland race condition fix
The commit resolves the issue by using new Wayland API thus bumping the
Wayland build requirement from 1.2 to 1.11. Allowing requirement changes
[for stable] is a rare exception and in this case we jump over 3 years
worth, which is not cool.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:30 +00:00
Jonas Ådahl
bf556db1c9 egl/wayland: Cleanup private display connection when init fails
When failing to initializing the Wayland EGL driver, don't leak the
display server connection if it was us who created it.

Signed-off-by: Jonas Ådahl <jadahl@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 361796651c)
2017-01-24 01:13:30 +00:00
Emil Velikov
c0709db9cf cherry-ignore: add "_mesa_ClampColor extension/version fix"
As requested by Nicolai, on the mailing list.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:30 +00:00
Emil Velikov
c865a39636 get-typod-pick-list.sh: add new script
Typos do happen as people nominate patches for stable. This script aims
to catch most of those.

Due to the subtle nature of things, one has to pay special attention to
the output, similar to get-extra-pick-list.sh.

At the moment only the following is handled:
 grep -i "CC:.*mesa-dev"

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f0bdd13fdb)
2017-01-24 01:13:30 +00:00
Ilia Mirkin
83a83a89ea nouveau: take extra push space into account for pushbuf_space calls
Ever since a long time ago when I messed around with fences, I ensure
that after a PUSH_SPACE call there is enough space to write a fence out
into the pushbuf.

However the PUSH_SPACE macro is not all-knowing, and so sometimes we
have to invoke nouveau_pushbuf_space manually with the relocs/pushes
args set. If we don't take the extra allocation from PUSH_SPACE into
account, then we will end up accidentally flushing when the code was not
expecting a flush. This can lead to various runtime and rendering
failures.

The amount of extra allocation isn't that important - it has to be at
least 8 based on the current nouveau_winsys.h setting, but even more
won't hurt. I just rounded up to powers of 2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit eb60a89bc3)
2017-01-24 01:13:29 +00:00
Grazvydas Ignotas
eba4c8fea0 mapi: update the asm code to support x32
Fixes crashes when both glx-tls and asm are enabled on x32.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94512
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=575458
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 8945836658)
2017-01-24 01:13:29 +00:00
Chuck Atkins
8a1113ce3f glx: Add missing glproto dependency for gallium-xlib glx
Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Signed-of-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e9a4ec4bd8)
2017-01-24 01:13:29 +00:00
Emil Velikov
fdb01cdb92 cherry-ignore: add radv: Call nir_lower_constant_initializers."
Depends on nir_lower_constant_initializers() which isn't in stable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:29 +00:00
Chad Versace
f33d1100b4 anv: Support loader interface version 3 (patch v2)
This patch implements vk_icdNegotiateLoaderICDInterfaceVersion(), which
brings us to loader interface v3.

v2:
  - Drop the pragmas. [emil]
  - Advertise v3 instead of v2. Anvil supported more than I
    thought.  [jason]
  - s/Surface/SurfaceKHR/ in comments. [emil]

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1e41d7f7b0)
2017-01-24 01:13:29 +00:00
Chad Versace
023e380ccc vulkan: Update vk_icd.h to interface version 3
Import from commit f2aeefec on branch 'master'
of https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 98cf089849)
2017-01-24 01:13:29 +00:00
Chad Versace
213791e86c vulkan: Add new cast macros for VkIcd types
We can't import the latest vk_icd.h because the new header breaks the
Mesa build. This patch defines new casting macros,
ICD_DEFINE_NONDISP_HANDLE_CASTS() and ICD_FROM_HANDLE(), which can
handle both the old and new vk_icd.h, and will prevent the build from
breaking when we update the header.

In the old vk_icd.h, types were defined as:

  typedef struct _VkIcdFoo {
    ...
  } VkIcdFoo;

Commit 6ebba1f6 in the Vulkan loader changed the above to

  typedef {
    ...
  } VkIcdFoo;

because the old definitions violated the C and C++ specs. According to
the specs, identifiers that begins with an underscore followed by an
uppercase letter are reserved. (It's pedantic, I know), See the Github
issue referenced below.

References: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/7
References: 6ebba1f630
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c085bfcec9)
2017-01-24 01:13:28 +00:00
Timothy Arceri
0c54aa1568 util: fix list_is_singular()
Currently its dependant on the user calling and checking the result
of list_empty() before using the result of list_is_singular().

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0252ba26c5)
2017-01-24 01:13:28 +00:00
Nanley Chery
e2a522b755 anv/image: Disable HiZ for depth buffer arrays
We currently don't perform clears or resolves on multiple array layers
with HiZ.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 5857858aa6)
2017-01-24 01:13:28 +00:00
Nanley Chery
d5a941b7d9 anv/cmd_buffer: Fix programmed HiZ qpitch
Match the comment above the field by using units of pixels and not HiZ
blocks.

Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 9f1d3a0c97)
2017-01-24 01:13:28 +00:00
Nanley Chery
8b3beddd03 anv/cmd_buffer: Fix arrayed depth/stencil attachments
Enable multiple layers of the depth/stencil buffers to be accessible.

Fixes the crucible test, func.depthstencil.arrayed_clear.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 61992e0afe)
2017-01-24 01:13:28 +00:00
Jason Ekstrand
fb51df52d0 nir/search: Only allow matching SSA values
This is more correct and should also be a tiny bit faster since we're
just comparing pointers instead of calling nir_src_equal.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c472568b4e)
2017-01-24 01:13:28 +00:00
Kenneth Graunke
f41850eaad spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.
vtn_ssa_value() can produce variable loads, and the cursor might
be after a return statement, causing nir_builder assert failures
about not inserting instructions after a jump.

This fixes:
dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_if
dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_switch

Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 203c128781)
2017-01-24 01:13:28 +00:00
Timothy Arceri
37dcca9a78 glsl: fix opt_minmax redundancy checks against baserange
Marking operations as redundant if they are equal to the base
range is fine when the tree structure is something like this:

        max
      /     \
     max     b
    /   \
   3    max
       /   \
      3     a

But the opt falls apart with a tree like this:

        max
     /       \
    max     max
   /   \   /   \
  3    a   b    3

The problem is that both branches are treated the same: descending in
the left branch will prune the constant, and then descending the right
branch will prune the constant there as well, because limits[0] wasn't
updated to take the change on the left branch into account, and so we
still get [3,\infty) as baserange.

In order to fix the bug we just disable the marking of redundant expressions
when they match the baserange.

NIR algebraic opt will clean up the first tree for anyway, hopefully
other backends are smart enough to do this also.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1edc53a66b)
2017-01-24 01:13:28 +00:00
Jason Ekstrand
17cd8edc37 anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8
Because border color is handled pre-swizzle, when we move the alpha
channel around in the format, the OPAQUE_BLACK border colors don't work
correctly on B4G4R4A4_UNORM_PACK16 with the hack.  This fixes the
following Vulkan CTS tests on Broadwell:

dEQP-VK.pipeline.sampler.view_type.2d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.1d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.2d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.1d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
dEQP-VK.pipeline.sampler.view_type.3d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d7bed6158)
2017-01-24 01:13:25 +00:00
Kenneth Graunke
512728415d i965: Fix texturing in the vec4 TCS and GS backends.
We were failing to zero m0.2 of the sampler message header for TCS and
GS messages in the simple case.  fs_generator has done this for about
a year now, but we missed it in vec4_generator.

Fixes ES31-CTS.core.texture_cube_map_array.sampling,
GL45-CTS.texture_cube_map_array.sampling, and many
dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests:
- dynamically_uniform.tessellation_control.isampler3d
- dynamically_uniform.tessellation_control.isamplercube
- dynamically_uniform.tessellation_control.sampler2d
- dynamically_uniform.tessellation_control.usamplercube
- dynamically_uniform.tessellation_control.sampler2darray
- dynamically_uniform.tessellation_control.isampler2darray
- dynamically_uniform.tessellation_control.usampler3d
- dynamically_uniform.tessellation_control.usampler2darray
- dynamically_uniform.tessellation_control.usampler2d
- dynamically_uniform.tessellation_control.sampler3d
- dynamically_uniform.tessellation_control.samplercube
- dynamically_uniform.tessellation_control.isampler2d
- uniform.tessellation_control.isampler3d
- uniform.tessellation_control.isamplercube
- uniform.tessellation_control.usampler2d
- uniform.tessellation_control.usampler3d
- uniform.tessellation_control.sampler2darray
- uniform.tessellation_control.isampler2darray
- uniform.tessellation_control.usampler2darray
- uniform.tessellation_control.sampler2d
- uniform.tessellation_control.usamplercube
- uniform.tessellation_control.sampler3d
- uniform.tessellation_control.samplercube
- uniform.tessellation_control.isampler2d

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4295af646f)
2017-01-24 01:13:25 +00:00
Fredrik Höglund
5afc33bcb3 dri3: Fix MakeCurrent without a default framebuffer
In OpenGL 3.0 and later it is legal to make a context current without
a default framebuffer.

This has been broken since DRI3 support was introduced.

Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b6670157d7)
2017-01-24 01:13:25 +00:00
Kenneth Graunke
32d50d0f75 i965: Fix last slot calculations
If the VUE map has slots at the end which the shader does not write,
then we'd "flush" (constructing an URB write) on the last output it
actually wrote.  Then, we'd construct another SEND with EOT, but with
no actual payload data.  That's not legal.

For example, SSO programs have clip distance slots allocated no matter
what, but the shader may not write them.  If it doesn't write any user
defined varyings, then the clip distance slots will be the last ones.

Found while debugging
dEQP-VK.tessellation.shader_input_output.gl_position_vs_to_tcs_to_tes

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 480d6c1653)
2017-01-24 01:13:25 +00:00
Marek Olšák
f7d13af063 va: call texture_get_handle while the mutex is being held
The context may be used by texture_get_handle.

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 89975e29d3)
2017-01-24 01:13:24 +00:00
Marek Olšák
071c058d9d vdpau: call texture_get_handle while the mutex is being held
The context may be used by texture_get_handle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99158

Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dbba4e03b1)
2017-01-24 01:13:24 +00:00
Chad Versace
b5ad01d579 meta: Disable dithering during glGenerateMipmap
Fixes tests 'dEQP-GLES3.functional.texture.mipmap.*.generate.rgba5551*' on
Intel Broadwell 0x1616.

The GL 4.5 spec describes the algorithm of glGenerateMipmap as:

    The contents of the derived images are computed by repeated, filtered
    reduction of the level base image.  [...] No particular filter algorithm is
    required, though a box filter is recommended as the default filter.

Consider a texture for which all pixels are identical at level 0.
From the spec's description above, one may reasonably assume that the "filtered
reduction" of level 0 produces a new miplevel for which again all pixels are
identical. For any 2x2 subspan of identical pixels, it is difficult to see how
the "filtered reduction" of that subspan can produce a pixel that differs from
the source pixels.

Dithering during _mesa_meta_GenerateMipmap() violated that reasonable
assumption.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99210
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c4b87f129e)
2017-01-24 01:13:24 +00:00
Christian König
a2ba1f4393 vl/zscan: fix "Fix trivial sign compare warnings"
The variable actually needs to be signed, otherwise converting it to a
float doesn't work as expected.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98914
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 1fb4179f92 ("vl: Fix trivial sign compare warnings")
(cherry picked from commit ac57bcda1e)
2017-01-24 01:13:24 +00:00
Chad Versace
84317907b1 mesa/shaderobj: Fix races on refcounts
Use atomic ops when updating gl_shader::RefCount.

Fixes intermittent failures and crashes in
'dEQP-EGL.functional.sharing.gles2.multithread.*'.
All tests in that group now pass except
'dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.textures.copyteximage2d_texsubimage2d_render'.

Tested with:
  mesa: branch 'master' at d6545f2
  deqp: branch 'nougat-cts-dev' at 4acf725 with additional local fixes
  DEQP_TARGET: x11_egl
  hw: Intel Broadwell 0x1616

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99085
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Haixia Shi <hshi@chromium.org>
(cherry picked from commit 464b23b1f2)
[Emil Velikov: add u_atomic.h include to resolve the build]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:24 +00:00
Chad Versace
9c4ebd16e9 anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0
The spec implicitly allows the incoming count to be 0. From the Vulkan
1.0.38 spec, Section 4.1 Physical Devices:

    If the value referenced by pQueueFamilyPropertyCount is not 0 [then
    do stuff].

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d6545f2345)
2017-01-24 01:13:24 +00:00
Chad Versace
0ca96e995e egl: Emit correct error when robust context creation fails
Fixes dEQP-EGL.functional.create_context_ext.robust_*
on Intel with GBM.

If the user sets the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR in
EGL_CONTEXT_FLAGS_KHR when creating an OpenGL ES context, then
EGL_KHR_create_context spec requires that we unconditionally emit
EGL_BAD_ATTRIBUTE because that flag does not exist for OpenGL ES. When
creating an OpenGL context, the spec requires that we emit EGL_BAD_MATCH
if we can't support the request; that error is generated in the egl_dri2
layer where the driver capability is actually checked.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99188
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit b85c0b569f)
2017-01-24 01:13:23 +00:00
Damien Grassart
23ecfe8f09 anv: return count of queue families written
The Vulkan spec indicates that
vkGetPhysicalDeviceQueueFamilyProperties() should overwrite
pQueueFamilyPropertyCount with the number of structures actually
written to pQueueFamilyProperties.

Signed-off-by: Damien Grassart <damien@grassart.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 75252826e8)
2017-01-24 01:13:23 +00:00
Chad Versace
dcf5c05499 mesa/texformat: Handle GL_RGBA + GL_UNSIGNED_SHORT_5_5_5_1
_mesa_choose_tex_format() already handles GL_RGBA + GL_UNSIGNED_SHORT_1_5_5_5_REV
by converting it to MESA_FORMAT_B5G5R5A1_UNORM. Teach it do the same for
the non-reversed type. Otherwise, the switch's fallthrough converts it
to an 8888 format, which has incompatible precision in the alpha
channel.

Patch 2/2 to fix dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8
on Intel.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99185
Cc: Haixia Shi <hshi@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3739810e3)
2017-01-24 01:13:23 +00:00
Chad Versace
ca81a9f9df dri: Add __DRI_IMAGE_FORMAT_ARGB1555
This allows eglCreateImage() to accept textures of said format.

Patch 1/2 to fix
dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8
on Intel.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99185
Cc: Haixia Shi <hshi@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9aa6ab0748)
2017-01-24 01:13:23 +00:00
Jason Ekstrand
e8a2e40260 i965/generator/tex: Handle an immediate sampler with an indirect texture
In this case we were dying when we tried to do SHL addr sampler imm(8)
because that puts an immediate in src0 of a two source instruction. This
fixes 2704 of the new separate sampler Vulkan CTS tests on Sky Lake.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 88b5acfa09)
2017-01-24 01:13:23 +00:00
Arda Coskunses
b5c7a1e2d7 vulkan/wsi/x11: don't crash on null wsi x11 connection
Without this check driver crash when application window
closed unexpectedly.

Acked-by: Edward O'Callaghan <funfunctor@folklore194.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 99de7b7525)
2017-01-24 01:13:23 +00:00
Arda Coskunses
94d521cd88 vulkan/wsi/x11: don't crash on null visual
When application window closed unexpectedly due to
lost window visualtypes getting invlaid parameters
which is causing a crash. Necessary check is added
to prevent the crash.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 01dd363e67)
2017-01-24 01:13:22 +00:00
Fredrik Höglund
8c8d819065 radv: fix dual source blending
Add the index to the location when assigning driver locations for
output variables.

Otherwise two fragment shader outputs declared as:

   layout (location = 0, index = 0) out vec4 output1;
   layout (location = 0, index = 1) out vec4 output2;

will end up aliasing one another.

Note that this patch will make the second output variable in the above
example alias a possible third output variable with location = 1 and
index = 0. But this shouldn't be a problem in practice since only one
color attachment is supported when dual-source blending is used.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 27a8aab882)
2017-01-24 01:13:22 +00:00
Dave Airlie
a83fb211c0 radv: flush smem for uniform buffer bit.
(cc'ing stable as I'd like to backport the ubo speedup as well)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9d23b8a18e)
2017-01-24 01:13:22 +00:00
Chad Versace
19d2f006da egl: Check config's surface types in eglCreate*Surface()
If the provided EGLConfig does not support the requested surface type,
then emit EGL_BAD_MATCH.

Fixes dEQP-EGL.functional.negative_api.create_pbuffer_surface
on GBM.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit fbb4af96c6)
2017-01-24 01:13:22 +00:00
Kenneth Graunke
1cc110517b i965: Don't bail on vertex element processing if we need draw params.
BaseVertex, BaseInstance, DrawID, and some edge flag conditions need
vertex buffer and elements structs.  We can't bail early in this case.

Gen4-7 already do this properly.  Gen8+ did not.

Thanks to Ilia Mirkin for helping track this down.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99144
Reported-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8fc5443a2b)
2017-01-24 01:13:22 +00:00
Michel Dänzer
0b63515f22 cso: Don't restore nr_samplers in cso_restore_fragment_samplers
If info->nr_samplers > ctx->nr_fragment_samplers_saved, the assignment
would prevent cso_single_sampler_done from unbinding the no longer used
samplers from the driver, which could result in use-after-free. This is
probably unlikely to happen in practice though.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 3d661a12be)
2017-01-24 01:13:22 +00:00
Francisco Jerez
02c3e9033e anv: Fix uniform and storage buffer offset alignment limits.
This fixes a regression in a bunch of image store vulkan CTS tests
from commit ad38ba1134, which started
using OWORD block read messages to implement UBO loads.  The reason
for the failure is that we were giving bogus buffer alignment limits
to the application (1B), so the CTS would happily come back with
descriptor sets pointing at not even word-aligned uniform buffer
addresses.

Surprisingly the sampler messages used to fetch pull constants before
that commit were able to cope with the non-texel aligned addresses,
but the dataport messages used to fetch pull constants after that
commit and the ones used to access storage buffers (before and after
the same commit) aren't as permissive with unaligned addresses.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 79d08ed3d2)
2017-01-24 01:13:21 +00:00
Timothy Arceri
1185212d79 nir: Turn imov/fmov of undef into undef
Reverting the previous attempt at this a5502a721f resulted in
the following Vulkan test failing.

dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex

This time we use the num_components from the alu dest rather than
num_inputs to the op to determine the size of the undef.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99100
(cherry picked from commit 3421b3f5a3)
2017-01-24 01:13:21 +00:00
Emil Velikov
a908680f3e cherry-ignore: add couple of intel_miptree_copy related patches
The commits which introduced intel_miptree_copy are invasive/large to be
considered for stable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:13:21 +00:00
Emil Velikov
c8ece92ded docs: add sha256 checksums for 13.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-05 15:59:07 +00:00
Emil Velikov
bec04114d2 docs: add release notes for 13.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-05 15:50:29 +00:00
Emil Velikov
a3d0bb354e Update version to 13.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-24 13:03:53 +00:00
Rhys Kidd
241dc4634f glsl: Add pthread libs to cache_test
Fixes the following compile error, present when the SHA1 library is libgcrypt:

  CCLD     glsl/tests/cache-test
glsl/.libs/libglsl.a(libmesautil_la-mesa-sha1.o): In function `call_once':
/mesa/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once'

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 5c73ecaac4)
2016-12-24 13:03:53 +00:00
Matt Turner
e851f27487 i965/fs: Reject copy propagation into SEL if not min/max.
We shouldn't ever see a SEL with conditional mod other than GE (for max)
or L (for min), but we might see one with predication and no conditional
mod.

total instructions in shared programs: 8241806 -> 8241902 (0.00%)
instructions in affected programs: 13284 -> 13380 (0.72%)
HURT: 62

total cycles in shared programs: 84165104 -> 84166244 (0.00%)
cycles in affected programs: 75364 -> 76504 (1.51%)
helped: 10
HURT: 34

Fixes generated code in at least Sanctum 2, Borderlands 2, Goat
Simulator, XCOM: Enemy Unknown, and Shogun 2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92234
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7bed52bb5f)
2016-12-24 13:03:53 +00:00
Matt Turner
4dd3f7c9a0 i965/fs: Add unit tests for copy propagation pass.
Pretty basic, but it's a start.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 091a8a04ad)
[Emil Velikov: nir_shader_create() has only three arguments]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-24 13:02:55 +00:00
Matt Turner
a4f301816b i965/fs: Rename opt_copy_propagate -> opt_copy_propagation.
Matches the vec4 backend, cmod propagation, and saturate propagation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6014da50ec)
2016-12-16 13:53:50 +00:00
Timothy Arceri
c682fdb77c Revert "nir: Turn imov/fmov of undef into undef."
This reverts commit 6aa730000f.

This was changing the size of the undef to always be 1 (the number of inputs
to imov and fmov) which is wrong, we could be moving a vec4 for example.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5502a721f)
2016-12-16 13:01:00 +00:00
Chad Versace
12618c1c90 egl: Fix crashes in eglCreate*Surface()
Don't dereference a null EGLDisplay.

Fixes tests
  dEQP-EGL.functional.negative_api.create_pbuffer_surface
  dEQP-EGL.functional.negative_api.create_pixmap_surface

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99038
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5e97b8f5ce)
2016-12-16 13:00:50 +00:00
Nanley Chery
63bdcc5c88 mesa/fbobject: Update CubeMapFace when reusing textures
Framebuffer attachments can be specified through FramebufferTexture*
calls. Upon specifying a depth (or stencil) framebuffer attachment that
internally reuses a texture, the cube map face of the new attachment
would not be updated (defaulting to TEXTURE_CUBE_MAP_POSITIVE_X).
Fix this issue by actually updating the CubeMapFace field.

This bug manifested itself in BindFramebuffer calls performed on
framebuffers whose stencil attachments internally reused a depth
texture.  When binding a framebuffer, we walk through the framebuffer's
attachments and update each one's corresponding gl_renderbuffer. Since
the framebuffer's depth and stencil attachments may share a
gl_renderbuffer and the walk visits the stencil attachment after
the depth attachment, the uninitialized CubeMapFace forced rendering
to TEXTURE_CUBE_MAP_POSITIVE_X.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77662
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 63318d34ac)
2016-12-16 12:01:23 +00:00
Jason Ekstrand
fb9f0a1197 spirv: Use a simpler and more correct implementaiton of tanh()
The new implementation is more correct because it clamps the incoming value
to 10 to avoid floating-point overflow.  It also uses a much reduced
version of the formula which only requires 1 exp() rather than 2.  This
fixes all of the dEQP-VK.glsl.builtin.precision.tanh.* tests.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit da1c49171d)
2016-12-15 16:46:28 +00:00
Haixia Shi
41c688a6c3 compiler/glsl: fix precision problem of tanh
Clamp input scalar value to range [-10, +10] to avoid precision problems
when the absolute value of input is too large.

Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
failures.

v2: added more explanation in the comment.
v3: fixed a typo in the comment.

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit d4983390a8)
2016-12-15 16:46:11 +00:00
Jason Ekstrand
0c2a66c5b6 anv/descriptor_set: Write the state offset in the surface state free list.
When Kristian reworked descriptor set allocation, somehow he forgot to
actually store the offset in the free list.  Somehow, this completely
missed CTS testing until now... This fixes all 2744 of the new
'dEQP-VK.texture.filtering.* tests in the latest CTS.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 37537b7d86)
2016-12-15 16:15:13 +00:00
Jason Ekstrand
626b85cc15 anv/device: Implicitly unmap memory objects in FreeMemory
From the Vulkan spec version 1.0.32 docs for vkFreeMemory:

   "If a memory object is mapped at the time it is freed, it is implicitly
   unmapped."

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit b1217eada9)
2016-12-15 16:14:55 +00:00
Jason Ekstrand
23f1e04abb anv/device: Return the right error for failed maps
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 920f34a2d9)
2016-12-15 15:57:53 +00:00
Nicolai Hähnle
cf07f78f7e radeonsi: fix an off-by-one error in the bounds check for max_vertices
The spec actually says that calling EmitStreamVertex is undefined when
you exceed max_vertices. But we do need to avoid trampling over memory
outside the GSVS ring.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 88509518b0)
2016-12-15 15:53:30 +00:00
Nicolai Hähnle
cf4316a9ce radeonsi: do not kill GS with memory writes
Vertex emits beyond the specified maximum number of vertices are supposed to
have no effect, which is why we used to always kill GS that reached the limit.

However, if the GS also writes to memory (SSBO, atomics, shader images), then
we must keep going and only skip the vertex emit itself.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7655bccce8)
2016-12-15 15:53:30 +00:00
Nicolai Hähnle
bc39170c33 radeonsi: update all GSVS ring descriptors for new buffer allocations
Fixes GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_geometry_instanced.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 7b5b3d63c5)
2016-12-15 15:53:30 +00:00
Chad Versace
4cc5e897b5 i965/mt: Disable aux surfaces after making miptree shareable
The entire goal of intel_miptree_make_shareable() is to permanently
disable the miptree's aux surfaces. So set
intel_mipmap_tree:disable_aux_buffers after the function's done with
discarding down the aux surfaces.

References: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Nanley Chery <nanley.g.chery@intel.com
Cc: Haixia Shi <hshi@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1c8be049be)
2016-12-14 19:31:30 +00:00
Dave Airlie
1f33823fc1 radv: add missing license file to radv_meta_bufimage.
Just noticed this file was missing license and any
explaination of what is in it.

(stable just for license header reasons)
Reviewed by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit 2a33049c70)
2016-12-14 19:06:49 +00:00
Marek Olšák
11b8d52dce radeonsi: disable the constant engine (CE) on Carrizo and Stoney
It must be disabled until the kernel bug is fixed, and then we'll enable CE
based on the DRM version.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 31f988a9d6)
2016-12-14 19:03:12 +00:00
Marek Olšák
18bb2d5c66 radeonsi: wait for outstanding LDS instructions in memory barriers if needed
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 13c34cf8ca)
2016-12-14 19:03:12 +00:00
Marek Olšák
3b956bdbcc tgsi: fix the src type of TGSI_OPCODE_MEMBAR
It's a literal integer. The next commit will need this.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 16ba04d6de)
2016-12-14 19:03:12 +00:00
Marek Olšák
2da119dfe9 radeonsi: wait for outstanding memory instructions in TCS barriers
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 16f49c16c7)
2016-12-14 19:03:12 +00:00
Marek Olšák
6f37d30679 radeonsi: allow specifying simm16 of emit_waitcnt at call sites
The next commit will use this.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 15e96c70b0)
2016-12-14 19:03:12 +00:00
Marek Olšák
1e8eb3ef80 radeonsi: fix incorrect FMASK checking in bind_sampler_states
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 38d4859b94)
2016-12-14 19:03:12 +00:00
Marek Olšák
27a11b6d26 radeonsi: always restore sampler states when unbinding sampler views
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b3a2aa9cba)
2016-12-14 19:03:12 +00:00
Marek Olšák
86b8bc7656 cso: don't release sampler states that are bound
This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and
probably many other games too.

cso_cache deletes sampler states when the cache size is too big and doesn't
check which sampler states are bound, causing use-after-free in drivers.
Because of that, radeonsi uploaded garbage sampler states and the hardware
went bananas. Other drivers may have experienced similar issues.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6dc96de303)
2016-12-14 19:03:12 +00:00
Nicolai Hähnle
7c813ce14e radeonsi: fix isolines tess factor writes to control ring
Fixes piglit arb_tessellation_shader/execution/isoline{_no_tcs}.shader_test.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d3931a355f)
[Emil Velikov: there is no si_shader_key::part in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-14 19:03:12 +00:00
Jason Ekstrand
983c38af2a genxml/gen9: Change the default of MI_SEMAPHORE_WAIT::RegisterPoleMode
We would really like it to be false as that's what you get on hardware that
doesn't have RegisterPoleMode (Sky Lake for example).  While we're at it,
we change it to a boolean.  This fixes dEQP-VK.synchronization.smoke.events
on Broxton.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb7b51d62a)
2016-12-14 19:03:11 +00:00
Kenneth Graunke
41c18889be i965: Allocate at least some URB space even when max_vertices = 0.
Allocating zero URB space is a really bad idea.  The hardware has to
give threads a handle to their URB space, and threads have to use that
to terminate the thread.  Having it be an empty region just breaks a
lot of assumptions.  Hence, why we asserted that it isn't possible.

Unfortunately, it /is/ possible prior to Gen8, if max_vertices = 0.
In theory a geometry shader could do SSBO/image access and maybe
still accomplish something.  In reality, this is tripped up by
conformance tests.

Gen8+ already avoids this problem by placing the vertex count DWord
in the URB entry header.  This fixes things on earlier generations.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit a41f5dcb14)
2016-12-14 19:03:11 +00:00
Dave Airlie
adda8b9eb6 radv: fix another regression since shadow fixes.
This fixes:
dEQP-VK.glsl.texture_gather.basic.2d.depth32f.*

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8033f78f94)
2016-12-14 19:03:11 +00:00
Ilia Mirkin
69a4fa0c35 mesa: only verify that enabled arrays have backing buffers
We were previously also verifying that no backing buffers were available
when an array wasn't enabled. This is has no basis in the spec, and it
causes GLupeN64 to fail as a result.

Fixes: c2e146f487 ("mesa: error out in indirect draw when vertex bindings mismatch")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 7c16552f8d)
2016-12-14 19:03:11 +00:00
Eric Anholt
403d106c9c vc4: In a loop break/continue, jump if everyone has taken the path.
This should be a win for most loops, which tend to have uniform control
flow.

More importantly, it exposes important information to live variables: that
the break/continue here means that our jump target may have access to
values that were live on our input.  Previously, we were just setting the
exec mask and letting control flow fall through, so an intervening def
between the break and the end of the loop would appear to live variables
as if it screened off the variable, when it didn't actually.

Fixes a regression in glsl-vs-loop-redundant-condition.shader_test when a
perturbing of register allocation caused a live variable to get stomped.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8e5ec33f11)
2016-12-14 19:03:11 +00:00
Marek Olšák
a539345c3e radeonsi: apply the double EVENT_WRITE_EOP workaround to VI as well
Internal docs don't mention it, but they also don't mention that the bug
has been fixed (like other CI bugs fixed in VI).

Vulkan does this too.

v2: also update r600_gfx_write_fence_dwords

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
(cherry picked from commit bacf9b4e73)
2016-12-14 19:03:11 +00:00
Marek Olšák
002fa13cfa radeonsi: add a tess+GS hang workaround for VI dGPUs
ported from Vulkan

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a816c7fe07)
2016-12-14 19:03:11 +00:00
Marek Olšák
590366320d radeonsi: apply a tessellation bug workaround for SI
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 78c4528ae7)
[Emil Velikov: resolve trivial conflict]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_state_draw.c
2016-12-14 19:03:11 +00:00
Marek Olšák
a30cbf5a70 radeonsi: apply a TC L1 write corruption workaround for SI
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 72e46c9889)
2016-12-14 19:03:11 +00:00
Marek Olšák
40e16eac75 radeonsi: apply a multi-wave workgroup SPI bug workaround to affected CIK chips
All codepaths are handled except for clover.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 72d48fcd8e)
2016-12-14 19:03:11 +00:00
Marek Olšák
3ece256629 radeonsi: consolidate max-work-group-size computation
The next commit will need this.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ec36c63b4f)
2016-12-14 19:03:11 +00:00
Marek Olšák
9275ed5595 radeonsi: disable RB+ blend optimizations for dual source blending
This fixes dual source blending on Stoney. The fix was copied from Vulkan.
The problem was discovered during internal testing.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5e5573b1bf)
2016-12-14 19:03:10 +00:00
Marek Olšák
ad374fb2a9 radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending
copied from Vulkan

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ff50c44a5f)
2016-12-14 19:03:10 +00:00
Marek Olšák
e444e1f235 radeonsi: always set all blend registers
better safe than sorry

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 87b208a54e)
2016-12-14 19:03:10 +00:00
Timothy Arceri
7f2ee55aac mesa: fix active subroutine uniforms properly
07fe2d565b introduced a big hack in order to return
NumSubroutineUniforms when querying ACTIVE_RESOURCES for
<shader>_SUBROUTINE_UNIFORM interfaces. However this is the
wrong fix we are meant to be returning the number of active
resources i.e. the count of subroutine uniforms in the
resource list which is what the code was previously doing,
anything else will cause trouble when trying to retrieve
the resource properties based on the ACTIVE_RESOURCES count.

The real problem is that NumSubroutineUniforms was counting
array elements as separate uniforms but the innermost array
is always considered a single uniform so we fix that count
instead which was counted incorrectly in 7fa0250f9.

Idealy we could probably completely remove
NumSubroutineUniforms and just compute its value when needed
from the resource list but this works for now.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0303201dfb)
[Emil Velikov: LinkStatus is in gl_shader_program]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/program_resource.c
2016-12-14 19:03:10 +00:00
Jason Ekstrand
8b9f8d3062 anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculation
The 1-D special case doesn't actually apply to depth or HiZ.  I discovered
this while converting BLORP over to genxml and ISL.  The reason is that the
1-D special case only applies to the new Sky Lake 1-D layout which is only
used for LINEAR 1-D images.  For tiled 1-D images, such as depth buffers,
the old gen4 2-D layout is used and the QPitch should be in rows.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f469235a6e)
2016-12-14 19:03:10 +00:00
Jason Ekstrand
59be849daf anv/image: Rename hiz_surface to aux_surface
(cherry picked from commit c3eb58664e)
2016-12-14 19:03:10 +00:00
Dave Airlie
7704d2ffd6 radv: set maxFragmentDualSrcAttachments to 1
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit eaf0768b8f)
2016-12-14 19:03:10 +00:00
Dave Airlie
5d60c22cb8 anv: set maxFragmentDualSrcAttachments to 1
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f9ab60202d)
2016-12-14 19:03:10 +00:00
Gwan-gyeong Mun
4cd5090578 vulkan/wsi: Fix resource leak in success path of wsi_queue_init()
It fixes leakage of pthread_condattr resource on wsi_queue_init()

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
(cherry picked from commit 65ea559465)
2016-12-14 19:03:10 +00:00
Gwan-gyeong Mun
eb62264769 anv: Update the teardown in reverse order of the anv_CreateDevice
This updates releasing of resource in reverse order of the anv_CreateDevice
to anv_DestroyDevice.
And it fixes resource leak in pthread_mutex, pthread_cond, anv_gem_context.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b178652b41)
2016-12-14 19:03:10 +00:00
Gwan-gyeong Mun
ef08616dcb anv: Add missing error-checking to anv_block_pool_init (v2)
When the memfd_create() and u_vector_init() fail on anv_block_pool_init(),
this patch makes to return VK_ERROR_INITIALIZATION_FAILED.
All of initialization success on anv_block_pool_init(), it makes to return
VK_SUCCESS.

CID 1394319

v2: Fixes from Emil's review:
  a) Add the return type for propagating the return value to caller.
  b) Changed anv_block_pool_init() to return VK_ERROR_INITIALIZATION_FAILED
     on failure of initialization.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ecc618b0d8)
2016-12-14 19:03:10 +00:00
Emil Velikov
6c1b7600e4 radv: don't leak the fd if radv_physical_device_init() succeeds
radv_amdgpu_winsys_create() does not take ownership of the fd, thus we
end up leaking it as we return with VK_SUCCESS.

Cc: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 78707a15f2)
2016-12-14 19:03:09 +00:00
Emil Velikov
deba381a85 anv: don't leak memory if anv_init_wsi() fails
brw_compiler_create() rzalloc-ates memory which we forgot to free.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a1cf494f77)
2016-12-14 19:03:09 +00:00
Emil Velikov
a5feaf22be anv: don't double-close the same fd
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3af8171547)
2016-12-14 19:03:09 +00:00
Jason Ekstrand
7dceb97604 anv/cmd_buffer: Re-emit MEDIA_CURBE_LOAD when CS push constants are dirty
This can happen even if the binding table isn't changed.  For instance, you
could have dynamic offsets with your descriptor set.  This fixes the new
stress.lots-of-surface-state.cs.dynamic cricible test.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 054e48ee0e)

Conflicts:
	src/intel/vulkan/genX_cmd_buffer.c

Squashed with commit:

anv/cmd_buffer: Emit CS push constants after binding tables

Emitting binding tables can cause push constants to be dirtied if the
shader uses images so we need to handle push constants later.

(cherry picked from commit 7a2cfd4adb)
2016-12-14 19:02:55 +00:00
Emil Velikov
2722144bed docs: add sha256 checksums for 13.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-28 15:28:01 +00:00
Emil Velikov
c9e993ba13 docs: add release notes for 13.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-28 15:06:08 +00:00
Emil Velikov
f92c2e3d2b Update version to 13.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-28 15:02:57 +00:00
Dave Airlie
02fd5a19b7 radv: fix 3D clears with baseMiplevel
This fixes:
dEQP-VK.api.image_clearing.clear_color_image.3d*

These were hitting an assert as the code wasn't taking the
baseMipLevel into account when minify the image depth.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 09c0c17bc3)
2016-11-28 12:56:34 +00:00
Dave Airlie
87b76f0e05 radv/ac/llvm: shadow samplers only return one value.
The intrinsic engine asserts in llvm due to this.

Reported-by: Christoph Haag <haagch+mesadev@frickel.club>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b56b54cbf1)

Squashed with commit:

radv/ac/llvm: fix regression with shadow samplers fix

This fixes b56b54cbf1:
radv/ac/llvm: shadow samplers only return one value

It makes sure we only do that for shadow sampling, as
opposed to sizing requests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b2e217369e)

Squashed with commit:

radv: brown-paper bag for a forgotten else.

This fixes the fix:
radv/ac/llvm: fix regression with shadow samplers fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 020978af12)
2016-11-28 12:56:18 +00:00
Dave Airlie
17dee709a9 radv/si: fix optimal micro tile selection
The same fix was posted for radeonsi, so port it here.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9838db8f64)
2016-11-28 12:56:17 +00:00
Emil Velikov
d653c84a68 radv: honour the number of properties available
Cap up-to the number of properties available while copying the data.
Otherwise we might crash and/or leak data.

Cc: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a025c5b2c7)
2016-11-28 12:56:17 +00:00
Dave Airlie
960a87fb17 radv: fix texel fetch offset with 2d arrays.
The code didn't limit the offsets to the number supplied, so
if we expected 3 but only got 2 we were accessing undefined memory.

This fixes random failures in:
dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bb8ac18340)
2016-11-28 12:56:17 +00:00
Jason Ekstrand
aa939d7d2a vulkan/wsi/x11: Implement FIFO mode.
This implements VK_PRESENT_MODE_FIFO_KHR for X11.  Unfortunately, due to
the way the present extension works, we have to manage the queue of
presented images in a separate thread.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e73d136a02)
2016-11-28 12:18:20 +00:00
Kevin Strasser
0aa527526c vulkan/wsi: Add a thread-safe queue implementation
In order to support FIFO mode without blocking the application on calls
to vkQueuePresentKHR it is necessary to enqueue the request and defer
calling the server until the next vblank period. The xcb present api
doesn't offer a way to register a callback, so we will have to spawn a
worker thread that will wait for a request to be added to the queue, call
to the server, and then make the image available for reuse.  This commit
introduces the queue data structure needed to implement this.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 932bb3f0dd)
2016-11-28 12:18:12 +00:00
Dave Airlie
6eceac3a02 vulkan/wsi/x11: add support for IMMEDIATE present mode
We shouldn't be using ASYNC here, that would be used
for immediate mode, so let's implement that.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ca035006c8)
2016-11-28 12:17:51 +00:00
Dave Airlie
2e3e5c0e73 vulkan/wsi: store present mode in swapchain base class
This just moves this up a level as x11 will need it to
implement things properly.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1cdca1eb16)
2016-11-28 12:17:46 +00:00
Dave Airlie
ae6e22e311 vulkan/wsi/x11: handle timeouts properly in next image acquire (v1.1)
For 0 timeout, just poll for an event, and if none, return
For UINT64_MAX timeout, just wait for special event blocked
For other timeouts get the xcb fd and block on it, decreasing
the timeout if we get woken up for non-special events.

v1.1: return VK_TIMEOUT for poll timeouts.
handle timeout going negative.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 787c172aed)
2016-11-28 12:17:41 +00:00
Eduardo Lima Mitev
f7b58a378c vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfacePresentModesKHR
x11_surface_get_present_modes() is currently asserting that the number of
elements in pPresentModeCount must be greater than or equal to the number
of present modes available. This is buggy because pPresentModeCount
elements are later copied from the internal modes' array, so if
pPresentModeCount is greater, it will overflow it.

On top of that, this assertion violates the spec. From the Vulkan 1.0
(revision 32, with KHR extensions), page 581 of the PDF:

    "If the value of pPresentModeCount is less than the number of
     presentation modes supported, at most pPresentModeCount values will be
     written. If pPresentModeCount is smaller than the number of
     presentation modes supported for the given surface, VK_INCOMPLETE
     will be returned instead of VK_SUCCESS to indicate that not all the
     available values were returned."

So, the correct behavior is: if pPresentModeCount is greater than the
internal number of formats, it is clamped to that many present modes. But
if it is lesser than that, then pPresentModeCount elements are copied,
and the call returns VK_INCOMPLETE.

This fix is similar (but simpler and more readable) than the one I provided
in 750d8cad72 for vkGetPhysicalDeviceSurfaceFormatsKHR, which was suffering
from the same problem.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit b677b99db5)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-24 16:34:42 +00:00
Eduardo Lima Mitev
28c6c8d09e vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfaceFormatsKHR
x11_surface_get_formats() is currently asserting that the number of
elements in pSurfaceFormats must be greater than or equal to the number
of formats available. This is buggy because pSurfaceFormatsCount
elements are later copied from the internal formats' array, so if
pSurfaceFormatCount is greater, it will overflow it.

On top of that, this assertion violates the spec. From the Vulkan 1.0
(revision 32, with KHR extensions), page 579 of the PDF:

    "If pSurfaceFormats is NULL, then the number of format pairs supported
     for the given surface is returned in pSurfaceFormatCount. Otherwise,
     pSurfaceFormatCount must point to a variable set by the user to the
     number of elements in the pSurfaceFormats array, and on return the
     variable is overwritten with the number of structures actually written
     to pSurfaceFormats. If the value of pSurfaceFormatCount is less than
     the number of format pairs supported, at most pSurfaceFormatCount
     structures will be written. If pSurfaceFormatCount is smaller than
     the number of format pairs supported for the given surface,
     VK_INCOMPLETE will be returned instead of VK_SUCCESS to indicate that
     not all the available values were returned."

So, the correct behavior is: if pSurfaceFormatCount is greater than the
internal number of formats, it is clamped to that many formats. But
if it is lesser than that, then pSurfaceFormatCount elements are copied,
and the call returns VK_INCOMPLETE.

Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 750d8cad72)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>

Squashed with commit:

vulkan/wsi/x11: Smplify implementation of vkGetPhysicalDeviceSurfaceFormatsKHR

This patch simplifies x11_surface_get_formats(). It is actually just a
readability improvement over the patch I provided earlier this week
(750d8cad72).

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 129da27426)
2016-11-24 16:34:42 +00:00
Iago Toral Quiroga
9eea4ba5ab anv/format: handle unsupported formats properly
According to the spec for vkGetPhysicalDeviceImageFormatProperties:

"If format is not a supported image format, or if the combination of format,
 type, tiling, usage, and flags is not supported for images, then
 vkGetPhysicalDeviceImageFormatProperties returns VK_ERROR_FORMAT_NOT_SUPPORTED."

Makes the following Vulkan CTS tests report 'Not Supported' instead of crashing:

dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_unorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_snorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_unorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_snorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r4g4_unorm_pack8
dEQP-VK.api.image_clearing.clear_color_image.1d_r8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_b5g5r5a1_unorm_pack16

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 35deeda66f)

Squashed with:

anv/format: handle unsupported formats earlier

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 277f868e66)
2016-11-24 16:34:42 +00:00
Emil Velikov
e692630755 anv: fix enumeration of properties
Driver should enumerate only up-to min2(num_available, num_requested)
properties and return VK_INCOMPLETE if the # of requested props is
smaller than the ones available.

Presently we assert out in such cases.

Inspired by a similar fix for RADV.

v2: Use MIN2 + typed_memcpy (Jason).

Should fix: dEQP-VK.api.info.device.extensions

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 5cc07d854c)
2016-11-24 16:34:42 +00:00
Lucas Stach
c10e1fb440 gbm: request correct version of the DRI2_FENCE extension
There is no version 2 of the DRI2_FENCE extension. So only a request
for version 1 has a chance to succeed.

Fixes: 74b1969d71 (gbm: wire up fence extension)
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d9a3ad94ca)
2016-11-24 16:34:42 +00:00
Jason Ekstrand
f77b097223 anv/cmd_buffer: Emit a CS stall before setting a CS pipeline
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f680a01ad4)
2016-11-24 16:34:42 +00:00
Jason Ekstrand
6c87a21497 anv/cmd_buffer: Handle running out of binding tables in compute shaders
If we try to allocate a binding table and fail, we have to get a new
binding table block, re-emit STATE_BASE_ADDRESS, and then try again.  We
already handle this correctly for 3D and blorp but it never got handled for
CS.  This fixes the new stress.lots-of-surface-state.cs.static crucible test.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 722ab3de9f)
2016-11-24 16:34:42 +00:00
Gwan-gyeong Mun
a39e535d6c anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTEL
Since both pCreateInfo->strideInBytes and pCreateInfo->extent.height
are of uint32_t type 32-bit arithmetic will be used.

Fix unintentional integer overflow by casting to uint64_t before
multifying.

CID 1394321

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
[Emil Velikov: cast only of the arguments]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit e074a08a6d)
2016-11-24 16:34:42 +00:00
Gwan-gyeong Mun
c19a331139 util/disk_cache: close a previously opened handle in disk_cache_put (v2)
We're missing the close() to the matching open().

CID 1373407

v2: Fixes from Emil Velikov's review
    Update the teardown in reverse order of the setup/init.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
(cherry picked from commit 69cc7d90f9)
2016-11-24 16:34:41 +00:00
Jordan Justen
d6964bbf54 i965/hsw: Set integer mode in sampling state for stencil texturing
Fixes:

ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot
ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot
ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil
ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 44c5ed02d1)
2016-11-24 16:34:41 +00:00
Nicolai Hähnle
63e2bb2f36 glsl/lower_output_reads: fix geometry shader output handling with conditional emit
Consider a geometry shader that contains code like this:

   some_out = expr;

   if (cond) {
      ...
      EmitVertex();
   } else {
      ...
      EmitVertex();
   }

Both branches should see the correct value of some_out.

Since this is a rather subtle and rare case, I'm submitting a piglit test
for this as well.

GLSL says that the values of output variables are undefined after
EmitVertex(). With this change, the values will now be defined and
unmodified. This may reduce optimization opportunities in the probably
quite rare case where subsequent compiler passes cannot prove that the
value of the output variable is overwritten.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0d383a79a8)
2016-11-24 16:34:41 +00:00
Nicolai Hähnle
3d5b40fa76 radeonsi: store group_size_variable in struct si_compute
For compute shaders, we free the selector after the shader has been
compiled, so we need to save this bit somewhere else.  Also, make sure that
this type of bug cannot re-appear, by NULL-ing the selector pointer after
we're done with it.

This bug has been there since the feature was added, but was only exposed
in piglit arb_compute_variable_group_size-local-size by commit
9bfee7047b (which is totally unrelated).

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 42d5e91a2a)
2016-11-24 16:34:41 +00:00
Jason Ekstrand
9581776d53 anv: Implement a depth stall restriction on gen7
Fixes around 60 Vulkan CTS tests on Haswell

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a8b85f1f77)
2016-11-24 16:34:41 +00:00
Dave Airlie
6a3b5f32c2 radv: spir-v allows texture size query with and without lod.
The translation to llvm was failing here due to required lod.

This fixes some new  SteamVR shaders.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b1340fd708)
2016-11-24 16:34:41 +00:00
Dave Airlie
32adfd509d radv: fix image view creation for depth and stencil only
This fixes the image view for sampling just the depth.

It removes some pointless swizzle code, and adds
a missing case for the x8_d24 format.

Fixes:
dEQP-VK.renderpass.formats.d32_sfloat_s8_uint.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6d7be52d90)
2016-11-24 16:34:41 +00:00
Dave Airlie
7e9bdb40f3 radv: make sure to flush input attachments correctly.
This fixes 9 of the
dEQP-VK.renderpass.attachment_allocation.input_output.*
tests.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 51a44c0021)
2016-11-24 16:34:41 +00:00
Kenneth Graunke
3c9e8660e9 i965: Fix GS push inputs with enhanced layouts.
We weren't taking first_component into account when handling GS push
inputs.  We hardly ever push GS inputs, so this was not caught by
existing tests.  When I started using component qualifiers for the
gl_ClipDistance arrays, glsl-1.50-transform-feedback-type-and-size
started catching this.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c4be6e0b8d)
2016-11-24 16:34:41 +00:00
Tapani Pälli
8691daef62 mesa: fix empty program log length
In case we have empty log (""), we should return 0. This fixes
Khronos WebGL conformance test 'program-infolog'.

From OpenGL ES 3.1 (and OpenGL 4.5 Core) spec:
   "If pname is INFO_LOG_LENGTH , the length of the info log, including
    a null terminator, is returned. If there is no info log, zero is
    returned."

v2: apply same fix for get_shaderiv and _mesa_GetProgramPipelineiv (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97321
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ec4e71f75e)
2016-11-24 16:34:41 +00:00
Kenneth Graunke
1809f17bda mesa: Drop PATH_MAX usage.
GNU/Hurd does not define PATH_MAX since it doesn't have such arbitrary
limitation, so this failed to compile.  Apparently glibc does not
enforce PATH_MAX restrictions anyway, so it's kind of a hoax:

https://www.gnu.org/software/libc/manual/html_node/Limits-for-Files.html

MSVC uses a different name (_MAX_PATH) as well, which is annoying.

We don't really need it.  We can simply asprintf() the filenames.
If the filename exceeds an OS path limit, presumably fopen() will
fail, and we already check that.  (We actually use ralloc_asprintf
because Mesa provides that everywhere, and it doesn't look like we've
provided an implementation of GNU's asprintf() for all platforms.)

Fixes the build on GNU/Hurd.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98632
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9bfee7047b)
[Emil Velikov: s|prog->Id|base->Id|]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/arbprogram.c
2016-11-24 16:34:41 +00:00
Kenneth Graunke
747052ee18 i965: Fix compute shader crash.
Fixes crashes when starting Deus Ex: Mankind Divided.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit ca76e6b521)
2016-11-24 16:34:40 +00:00
Jason Ekstrand
c94c804c29 anv/blorp: Ignore clears for attachments first used as resolve destinations
Otherwise, we'll try to clear it the first time it's used as a draw so if
you do some multisampled rendering, resolve to an attachment, and then draw
on top of the single-sampled attachment, we might accidentally clear it.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ccdf9af392)
2016-11-24 16:34:40 +00:00
Jason Ekstrand
90bf0cb313 nir/spirv: Fix handling of gl_PrimitiveId
Before, we were always treating it as an output which bogus.  The only
stage in which this it can be an output is the geometry stage.  In all
other stages, it's an input which, in the back-end, we actually want to be
a system value.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9557147592)
2016-11-24 16:34:40 +00:00
Jason Ekstrand
4c21d20dcf anv/fence: Handle ANV_FENCE_CREATE_SIGNALED_BIT
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1c97432ce8)
2016-11-24 16:34:40 +00:00
Jason Ekstrand
8dbdbc2191 anv: Handle null in all destructors
This fixes a bunch of new CTS tests which look for exactly this.  Even in
the cases where we just call vk_free to free a CPU data structure, we still
handle NULL explicitly.  This way we're less likely to forget to handle
NULL later should we actually do something less trivial.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 49f08ad77f)
[Emil Velikov: color_rt_surface_state is still around]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_image.c
2016-11-24 16:34:40 +00:00
Ben Widawsky
045420ea06 i965/glk: Add basic Geminilake support
v2: s/bdw/gen; Add the 2x6 config
v3: Add min_ds_entries

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2193fb0e1f)
2016-11-24 16:34:40 +00:00
Ben Widawsky
ee56f5577d i965: Reorder PCI ID list to match release order
I have some OCD...

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit ffd9060b23)
2016-11-24 16:34:40 +00:00
Ben Widawsky
d3de9f5cb9 i965: Add some APL and KBL SKU strings
We got a couple for products that exist on ark.intel.com, so let's just
put them in now.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit b8509c8936)

Squashed with commit:

i965: Fix KBL typo in string

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 19a01f8139)
2016-11-24 16:34:40 +00:00
Dave Airlie
145ecf60dd ac/nir/llvm: fix channel in texture gather lowering code.
This fixes a number of CTS tests like:
dEQP-VK.glsl.texture_gather.basic.2d.rgba8ui.size_npot.clamp_to_edge_repeat

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 713522fb8d)
2016-11-24 16:34:40 +00:00
Dave Airlie
7bbe351e49 radv: don't crash on null swapchain destroy.
Just return if the passed in swapchain is NULL.

Fixes: dEQP-VK.wsi.xlib.swapchain.destroy.null_handle

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 38ab625c5f)
2016-11-24 16:34:40 +00:00
Dave Airlie
a3f628ca25 wsi: fix VK_INCOMPLETE for vkGetSwapchainImagesKHR
This fixes the x11 and wayland backends to not assert:
dEQP-VK.wsi.xcb.swapchain.get_images.incomplete

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 253fa25d09)
2016-11-24 16:34:40 +00:00
Jordan Justen
154cb64721 isl: Fix height calculation in isl_msaa_interleaved_scale_px_to_sa
No known fixed tests, but it looks like a typo from:

commit 8ac99eabb6

    intel/isl: Add a helper for getting the size of an interleaved pixel

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0ac57afa6f)
2016-11-24 16:34:40 +00:00
Kenneth Graunke
607cac69f8 intel: Set min_ds_entries on Broxton.
This was missing.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
(cherry picked from commit 341fc0073a)
2016-11-24 16:34:39 +00:00
Lionel Landwerlin
4b2caa02f0 anv: fix multi level clears with VK_REMAINING_MIP_LEVELS
A commit from the CTS suite on the 1.0-dev branch started using
VK_REMAINING_MIP_LEVELS, we're not dealing with it properly for clears.

Fixes:
   dEQP-VK.api.image_clearing.clear_color_image.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a46bc3f70a)
2016-11-24 16:34:39 +00:00
Eric Anholt
fd5fe00f7b vc4: Fix register class handling of DDX/DDY arguments.
I had this exactly backwards, but apparently the piglit tests were all
landing in r0-r3 anyway.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 977d8b526b)
2016-11-24 16:34:39 +00:00
Steinar H. Gunderson
e3fe51dbee Fix races during _mesa_HashWalk().
There is currently no protection against walking a hash (using
_mesa_HashWalk()) and modifying it at the same time, for instance by inserting
or deleting elements. This leads to segfaults in multithreaded code if e.g.
someone calls glTexImage2D (which may have to walk the list of FBOs) while
another thread is calling glDeleteFramebuffers on another thread with the two
contexts sharing lists.

The reason for this is that _mesa_HashWalk() doesn't actually take the mutex
that normally protects the hash; it takes an entirely different mutex.
Thus, walks are only protected against other walks, and there is also no
outer lock taking this. There is an old comment saying that this is to fix
problems with deadlock if the callback needs to take a mutex; we solve this
by changing the mutex to be recursive.

A demonstration Helgrind hit from a real application:

==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1
==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400
==13412==    at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395)
==13412==    by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350)
==13412==    by 0x1EE98174: _mesa_HashRemove (hash.c:365)
==13412==    by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669)
==13412==    by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473)
==13412==    by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442)
[...]
==13412== This conflicts with a previous read of size 8 by thread #20
==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318
==13412==    at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415)
==13412==    by 0x1EE982A8: _mesa_HashWalk (hash.c:426)
==13412==    by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683)
==13412==    by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043)
==13412==    by 0x1EED9410: teximage (teximage.c:3073)
==13412==    by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105)
==13412==    by 0x166A68: operator() (mixer.cpp:454)

There are many more interactions than just these two possible.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Steinar H. Gunderson <steinar+mesa@gunderson.no>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 2e2562cabb)
2016-11-24 16:34:39 +00:00
Jason Ekstrand
9d5c3fc12b i965/gs: Allow primitive id to be a system value
This allows for gl_PrimitiveId to come in as a system value rather than as
an input.  This is the way it will come in from SPIR-V. We keeps the input
path working for now so we don't break GL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5e88e66e6)
[Emil Velikov: nir_shader::info is not a pointer in branch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
2016-11-24 16:34:39 +00:00
Jason Ekstrand
cf8b11fc6c vulkan/wsi: Report the correct min/maxImageCount
From the Vulkan spec 1.0.32 section 29.6 docs for vkAcquireNextImageKHR:

   "Let n be the total number of images in the swapchain, m be the value of
   VkSurfaceCapabilitiesKHR::minImageCount, and a be the number of
   presentable images that the application has currently acquired (i.e.
   images acquired with vkAcquireNextImageKHR, but not yet presented with
   vkQueuePresentKHR).  vkAcquireNextImageKHR can always succeed if a ≤ n -
   m at the time vkAcquireNextImageKHR is called. vkAcquireNextImageKHR
   should not be called if a > n - m with a timeout of UINT64_MAX; in such
   a case, vkAcquireNextImageKHR may block indefinitely."

With minImageCount == 2 (as it was previously, the client is allowed to
acquire all but one image withoutblocking.  If we really need 4 images for
mailbox mode + pageflipping, then we need to request a minimum of 4 images
up-front.  This is a bit unfortunate because it means we will always
consume 4 images.  In the future, we may be able to optimize this a bit by
waiting until the server starts to flip and returning OUT_OF_DATE to get
the client to re-allocate with more images or something like that.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4fa0ca80ee)
2016-11-23 18:49:50 +00:00
Dave Airlie
6520a64c4d radv: fix texturesamples to handle single sample case
We can only read the valid samples if this is an MSAA
texture, which means the type field must be 0x14 or 0x15.

This fixes:
dEQP-VK.glsl.texture_functions.query.texturesamples.*

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 2de85eb97a)
2016-11-23 14:01:43 +00:00
Ian Romanick
953030bbb3 glsl: Parse 0 as a preprocessor INTCONSTANT
This allows a more reasonable error message for '#version 0' of

    0:1(10): error: GLSL 0.00 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.00 ES, 3.00 ES, 3.10 ES, and 3.20 ES

instead of

    0:1(10): error: syntax error, unexpected $undefined, expecting INTCONSTANT

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Karol Herbst <karolherbst@gmail.com>
(cherry picked from commit c8c46641af)
2016-11-23 14:00:44 +00:00
Ian Romanick
dfd6b765ba glcpp: Handle '#version 0' and other invalid values
The #version directive can only handle decimal constants.  Enforce that
the value is a decimal constant.

Section 3.3 (Preprocessor) of the GLSL 4.50 spec says:

    The language version a shader is written to is specified by

        #version number profile opt

    where number must be a version of the language, following the same
    convention as __VERSION__ above.

The same section also says:

    __VERSION__ will substitute a decimal integer reflecting the version
    number of the OpenGL shading language.

Use a separate flag to track whether or not the #version line has been
encountered.  Any possible sentinel (0 is currently used) could be
specified in a #version directive.  This would lead to trying to
(internally) redefine __VERSION__.  Since there is no parser location
for this addition, NULL is passed.  This eventually results in a NULL
dereference and a segfault.

Attempts to use -1 as the sentinel would also fail if '#version
4294967295' or '#version 18446744073709551615' were used.  We should
have piglit tests for both of these.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Karol Herbst <karolherbst@gmail.com>
(cherry picked from commit e85a747e29)
2016-11-23 13:59:41 +00:00
Jason Ekstrand
a4b67f664e vulkan/wsi/wayland: Clean up some error handling paths
This gets rid of all the memory leaks reported by the WSI CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 302f641d14)
2016-11-23 13:58:45 +00:00
Jason Ekstrand
0a2c318d9c vulkan/wsi/wayland: Include pthread.h
We use pthreads and, for some reason, it wasn't getting included

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3b6abfc69a)
2016-11-23 13:57:49 +00:00
Jason Ekstrand
8dab75a2ee anv: Rework fences
Our previous fence implementation was very simple.  Fences had two states:
signaled and unsignaled.  However, this didn't properly handle all of the
edge-cases that we need to handle.  In order to handle the case where the
client calls vkGetFenceStatus on a fence that has not yet been submitted
via vkQueueSubmit, we need a three-status system.  In order to handle the
case where the client calls vkWaitForFences on fences which have not yet
been submitted, we need more complex logic and a condition variable.  It's
rather annoying but, so long as the client doesn't do that, we should still
hit the fast path and use i915_gem_wait to do all our waiting.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 843775bab7)
2016-11-23 13:56:49 +00:00
Jason Ekstrand
64c818d6a6 anv/wsi: Set the fence to signaled in AcquireNextImageKHR
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 73701be667)
2016-11-23 13:55:52 +00:00
Jason Ekstrand
1ba7f6ce38 anv/gen8: Stall when needed in Cmd(Set|Reset)Event
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 71397042fe)
2016-11-23 13:54:55 +00:00
Eric Anholt
64d7d70c5b vc4: Clamp the shadow comparison value.
Fixes piglit glsl-fs-shadow2D-clamp-z.

Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08d51487e3)
2016-11-23 13:52:19 +00:00
Eric Anholt
9a4206379b vc4: Don't abort when a shader compile fails.
It's much better to just skip the draw call entirely.  Getting this
information out of register allocation will also be useful for
implementing threaded fragment shaders, which will need to retry
non-threaded if RA fails.

Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d019bd703)
2016-11-23 13:07:00 +00:00
Emil Velikov
4685a724f5 cherry-ignore: add reverted LLVM_LIBDIR patch
The patch was reverted shortly after it was merged.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-23 13:03:53 +00:00
Emil Velikov
b47ce6ddb8 docs: add sha256 checksums for 13.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-14 11:37:03 +00:00
Emil Velikov
f2f487ebbb docs: add release notes for 13.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-14 10:58:11 +00:00
Emil Velikov
11b9cdfcf9 Update version to 13.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-14 10:43:20 +00:00
Darren Salt
42d221723b radv/pipeline: Don't dereference NULL dynamic state pointers
This is a port of commit a4a5917248:

   Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
   of pCreateInfo members are moved to the earliest points at which they
   should not be NULL.

This fixes a segfault, related to pColorBlendState, seen in Talos Principle
which I've observed after startup is completed and when exiting the menus,
depending on when Vulkan rendering is selected.

v2: moved the NULL check in radv_pipeline_init_blend_state to after the
declarations.
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

(cherry picked from commit 9b121512ac)
2016-11-14 09:36:25 +00:00
Steven Toth
d6bcbfb36c gallium/hud: protect against and initialization race
In the event that multiple threads attempt to install a graph
concurrently, protect the shared list.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 381edca826)
2016-11-14 09:35:08 +00:00
Steven Toth
e19ed2971f gallium/hud: close a previously opened handle
We're missing the closedir() to the matching opendir().

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5a58323064)
2016-11-14 09:35:03 +00:00
Steven Toth
5fa2b384f0 gallium/hud: fix a problem where objects are free'd while in use.
Instead of trying to maintain a reference counted list of valid HUD
objects, and freeing them accordingly, creating race conditions
between unanticipated multiple threads, simply accept they're
allocated once and never released until the process terminates.

They're a shared resource between multiple threads, so accept
they're always available for use.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ffed08679)
2016-11-14 09:34:12 +00:00
Kenneth Graunke
e7de2510e5 mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

                          Skylake      Broxton        Kabylake
                      GT1 GT2 GT3 GT4  2x6 3x6  GT1 GT1.5 GT2 GT3 GT4
Actual Slices          1   1   2   3    1   1    1    1    1   2   3
Total Subslices        3   3   6   9    2   3    2    3    3   6   9
Subsl. for PS Scratch  4   4   8   12   4   4    4    4    4   8   12

Note that Skylake GT1-3 already worked because we allocated 64 * 9
(trying to use a value that would work on GT4, with 9 subslices),
and the actual required values were 64 * 4 or 64 * 8.  However, all
others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
which can lead to scratch writes trashing random process memory,
and rendering corruption or GPU hangs.

Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
spill.  Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
now runs successfully with no hangs and renders correctly.  This may
fix problems on Broxton and Kabylake as well.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit aaee3daa90)
2016-11-11 22:20:07 +00:00
Jason Ekstrand
1a47251da4 anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4
This fixes hangs in Dota2

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a6c3d0f92b)
2016-11-11 22:19:51 +00:00
Jason Ekstrand
77dc3a5b7c anv/cmd_buffer: Take a command buffer instead of a batch in two helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1e3e347fd5)
2016-11-11 22:19:38 +00:00
Emil Velikov
3bb0415ab9 radv: Suffix the radeon_icd file with the host CPU
Port of the anv commit d96345de98 ("anv: Suffix the intel_icd file with
the host CPU").

v2: s/intel_icd/radeon_icd/ in commit summary (Gražvydas)

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC)
(cherry picked from commit 0f434a68a3)

Squashed with commit:

radv: automake: list correct file in the EXTRA_DIST

Earlier commit renamed the file radeon_icd.json{,.in} but missed one
reference of the file - in EXTRA_DIST.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 0f434a68a ("radv: Suffix the radeon_icd file with the host CPU")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b359f62456)
2016-11-10 22:07:29 +00:00
Emil Velikov
a4bc03fdfe radv: use correct .specVersion for extensions
Analogous to previous commit.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com> (IRC)
(cherry picked from commit abe110df01)
2016-11-10 22:07:04 +00:00
Emil Velikov
d8eea63121 anv: use correct .specVersion for extensions
Vulkan has introduced the consept of .specVersion which can be used to
attribute changes of the said extension.

The current loader does not check the value, thus it have gone unnoticed
that the driver exposes an old version of the following extensions:

VK_KHR_xcb_surface        (Rev 6)
VK_KHR_xlib_surface       (Rev 6)
VK_KHR_wayland_surface    (Rev 5)
- Updated the surface create function to take a pCreateInfo structure

VK_KHR_swapchain          (Rev 68)
- Moved the "validity" include for vkAcquireNextImage to be in its proper
  place, after the prototype and list of parameters.
...

According to the documentation:

  * pname:specVersion is the version of this extension.
    It is an integer, incremented with backward compatible changes.

Based on the history of vk.xml the above (latest) revision has been
available since Vulkan 1.0 so even if they were any backwards
incompatible change(s) [as hinted by the revision log] those should be
safe.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit f373a91a52)
2016-11-10 22:07:04 +00:00
Emil Velikov
3e616f77bd amd/addrlib: limit fastcall/regparm to GCC i386
The use of regparm causes an error on arm/arm64 builds with clang.
fastcall is allowed, but still throws a warning. As both options only
have effect on 32-bit x86 builds, limit them to that case.

v2: keep the __i386__ within GCC (Nicolai)

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Nicolai Hähnle <nhaehnle@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Herring <robh@kernel.org>
(cherry picked from commit 190bae7685)
2016-11-10 22:07:04 +00:00
Dave Airlie
49e093a2f5 radv: fix GetFenceStatus for signaled fences
if a fence is created pre-signaled we should return that
in GetFenceStatus even if it hasn't been submitted.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit fb50245ac1)
2016-11-09 23:48:41 +00:00
Dave Airlie
2bbf964af8 radv: enable conditional discard optimisation on radv.
This fixes a bunch of GPU hangs introduced in some CTS
tests like
dEQP-VK.memory.pipeline_barrier.host_write_uniform_buffer.65536

It works around an issue seen in the LLVM backend, but
also makes the radv code work more like the radeonsi stack.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3c9af7578f)
2016-11-09 23:46:32 +00:00
Dave Airlie
fa6c02787e nir: add conditional discard optimisation (v4)
This is ported from GLSL and converts

if (cond)
	discard;

into
discard_if(cond);

This removes a block, but also is needed by radv
to workaround a bug in the LLVM backend.

v2: handle if (a) discard_if(b) (nha)
cleanup and drop pointless loop (Matt)
make sure there are no dependent phis (Eric)
v3: make sure only one instruction in the then block.
v4: remove sneaky tabs, add cursor init (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b16dff2d88)
2016-11-09 23:41:48 +00:00
Dave Airlie
a65b6e12f3 ac/nir: add support for discard_if intrinsic (v2)
We are going to start lowering to this in NIR code,
so prepare radv for it.

v2: handle conversion to kilp properly (nha)

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dd77faeca2)
2016-11-09 23:39:46 +00:00
Kristian Høgsberg Kristensen
ce555a7d1f anv: Do relocations in userspace before execbuf ioctl
Since our surface state buffer is shared by all batches, the kernel does a
full stall and sync with the CPU between batches every time we call
execbuf2 because it refuses to do relocations on an active buffer.  Doing
them in userspace and passing the NO_RELOC flag to the kernel allows us to
perform the relocations without stalling.

This improves the performance of Dota 2 by around 30% on a Sky Lake GT2.

v2 (Jason Ekstrand):
 - Better comments (Chris Wilson)
 - Fixed write_reloc for correct canonical form (Chris Wilson)

v3 (Jason Ekstrand):
 - Skip relocations which aren't needed
 - Provide an environment variable to always use the kernel
 - More comments about correctness (Chris Wilson)

v4 (Jason Ekstrand):
 - More comments (Chris Wilson)

v5 (Jason Ekstrand):
 - Rebase on top of moving execbuf2 setup go QueueSubmit

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3a29f2e9e)
2016-11-09 23:38:12 +00:00
Jason Ekstrand
621b048734 anv: Move relocation handling from EndCommandBuffer to QueueSubmit
Ever since the early days of the Vulkan driver, we've been setting up the
lists of relocations at EndCommandBuffer time.  The idea behind this was to
move some of the CPU load out of QueueSubmit which the client is required
to lock around and into command buffer building which could be done in
parallel.  Then QueueSubmit basically just becomes a bunch of execbuf2
calls.

Technically, this works.  However, when you start to do more in QueueSubmit
than just execbuf2, you start to run into problems.  In particular, if a
block pool is resized between EndCommandBuffer and QueueSubmit, the list of
anv_bo's and the execbuf2 object list can get out of sync.  This can cause
problems if, for instance, you wanted to do relocations in userspace.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8b61c57049)
2016-11-09 23:36:32 +00:00
Jason Ekstrand
039a03d8d2 anv/batch: Move last_ss_pool_bo_offset to the command buffer
The original reason for putting it in the batch_bo was to allow primaries
to share it across secondaries or something like that.  However, the
relocation lists in secondary command buffers are are always left alone and
copied into the primary command buffer's relocation list.  This means that
the offset really applies at the command buffer level and putting it in the
batch_bo doesn't make sense.  This fixes a couple of potential bugs around
re-submission of command buffers that are not likely to be hit but are bugs
none the less.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 595400d577)
2016-11-09 23:35:15 +00:00
Jason Ekstrand
c64f655408 anv: Add an anv_execbuf helper struct
This commit adds a little helper struct for storing everything we use to
build an execbuf2 call.  Since the add_bo function really has nothing to do
with a command buffer, it makes sense to break it out a bit.  This also
reduces some of the churn in the next commit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0fe6829427)
2016-11-09 23:33:42 +00:00
Jason Ekstrand
d22958eecb anv/batch_chain: Improve write_reloc
The old version wasn't properly handling large addresses where we have to
sign-extend to get it into the "canonical form" expected by the hardware.
Also, the new version is capable of doing a clflush of the newly written
reloc if requested.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 095c48a496)
2016-11-09 23:32:31 +00:00
Jason Ekstrand
ab3aeab297 anv: Initialize anv_bo::offset to -1
Since -1 is an invalid GPU address, this lets us know whether or not we
have a valid address for a buffer.  We don't get a valid address until the
first time that buffer is used in an execbuf2 ioctl.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d46bfb6297)
2016-11-09 23:31:00 +00:00
Jason Ekstrand
5bdd4fc273 anv/allocator: Simplify anv_scratch_pool
The previous implementation was being overly clever and using the
anv_bo::size field as its mutex.  Scratch pool allocations don't happen
often, will happen at most a fixed number of times, and never happen in the
critical path (they only happen in shader compilation).  We can make this
much simpler by just using the device mutex.  This also means that we can
start using anv_bo_init_new directly on the bo and avoid setting fields
one-at-a-time.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd0f8d5070)
2016-11-09 23:29:42 +00:00
Jason Ekstrand
c4643f5f1e anv: Add a new bo_pool_init helper
This ensures that we're always setting all of the fields in anv_bo

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6283b6d56a)
2016-11-09 23:28:03 +00:00
Jason Ekstrand
ceefe979c6 anv: Don't presume to know what address is in a surface relocation
Because our relocation processing happens at EndCommandBuffer time and
because RENDER_SURFACE_STATE objects may be shared by batches, we really
have no clue whatsoever what address is actually written to the relocation
offset in the BO.  We need to stop making such claims to the kernel and
just let it relocate for us.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ba1eea4f95)
2016-11-09 23:26:39 +00:00
Jason Ekstrand
9eca84e052 anv: Add a cmd_buffer_execbuf helper
This puts the actual execbuf2 call in anv_batch_chain.c along with the
other relocation stuff.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit db9f4b2a2b)
2016-11-09 23:24:42 +00:00
Jason Ekstrand
6f55a66ad3 anv/device: Add an execbuf wrapper
This wrapper ensures that we always update all anv_bo::offset fields based
on the offsets returned by the kernel.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 07798c9c3e)
2016-11-09 22:05:22 +00:00
Anuj Phogat
678b4f6372 i965: Fix GPU hang related to multiple render targets and alpha testing
This patch should have been the part of commit e592f7df.
In a situation when there are multiple render targets with alpha testing
enabled, if fragment shader doesn't write to draw buffer zero, it causes
the GPU hang on SKL. No GPU hang is seen on HSW. Simulator gives a
warning for all gen6+ h/w:
"Illegal render target write message length 0xa expected 0xc"

This patch fixes the GPU hang as well as the simulator warning with
new piglit test fbo-mrt-alphatest-no-buffer-zero-write:
https://patchwork.freedesktop.org/patch/118212

No regressions in Jenkins CI system.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit b9df2251c1)
2016-11-09 14:17:31 +00:00
Dave Airlie
8ab9842d2e radv: emit correct last export when Z/stencil export is enabled
I was getting a random GPU hang in the renderpass simple tests,
it turns out sometimes radv emitted the wrong thing "last".

This fixes the logic to emit Z/stencil last if they occur,
and not mark a color output as last. Also this relies on the
Z/STENCIL being the first two fragment outputs, which they are
so yay.

Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs)
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit bafc75b437)
2016-11-09 14:16:03 +00:00
Jason Ekstrand
dba0abdc91 intel/blorp: Emit all the binding tables
At least on Sky Lake, after emitting 3DSTATE_CONSTANT_*, you are required
to re-emit the 3DSTATE_BINDING_TABLE_POINTERS packet for the corresponding
stage.  If you don't, double-buffering may fail and you may get the wrong
constants.  It turns out that you need to do this even if you have no push
constants to speak of or else the next 3DSTATE_CONSTANT packet you emit for
that stage may not work correctly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 406cd9d126)
2016-11-09 14:14:33 +00:00
Eric Anholt
a31947fbf9 vc4: Use Newton-Raphson on the 1/W write to fix glmark2 terrain.
The 1/W was apparently not accurate enough, and we were getting sparklies
in the distance.  The closed driver also did a N-R step here.

Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 283d4d18e5)
2016-11-09 14:13:01 +00:00
Nicolai Hähnle
f7efc0f0fc st/mesa: fix the layer of VDPAU surface samplers
A (latent) bug in VDPAU interop was exposed by commit
e5cc84dd43.

Before that commit, the st_vdpau code created samplers with
first_layer == last_layer == 1 that the general texture handling code
would immediately delete and re-create, because the layer does not match
the information in the GL texture object.

This was correct behavior at least in the DMABUF case, because the imported
resource is supposed to have the correct offset already applied.  In the
non-DMABUF case, this was just plain wrong but apparently nobody noticed.

After that commit, the state tracker assumes that an existing sampler is
correct at all times.  Existing samplers are supposed to be deleted when
they may become invalid, and they will be created on-demand.  This meant
that the sampler with first_layer == last_layer == 1 stuck around, leading
to rendering artefacts (on radeonsi), command stream failures (on r600), and
assertions (in debug builds everywhere).

This patch fixes the problem by simply not creating a sampler at all in
st_vdpau_map_surface.  We rely on the generic texture code to do the right
thing, adding the layer_override to make the non-DMABUF case work.

v2: add the layer_override

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98512
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Cc: Christian König <deathsimple@vodafone.de>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 322483f71b)
2016-11-09 14:11:18 +00:00
Dave Airlie
9c297c5487 Revert "st/vdpau: use linear layout for output surfaces"
This reverts commit d180de3532.

This is a radeon specific hack that causes problems on nouveau
when combined with the SHARED flag later. If radeonsi needs a fix
for this, please fix it in the driver.

[chk]
Using linear surfaces for this makes sense because tilling isn't
beneficial and the surfaces can potentially be shared with other GPUs
using the VDPAU OpenGL interop.

[airlied]
I think we need a flag that isn't SHARED/LINEAR that is more
SHARED_OTHER_GPU.

[mareko]
Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate
into linear ?

[mareko]
My only concern is decoding performance. If the decoder works in 64x1
blocks, tiling will hurt. That's the theory. I don't know how the
decoder works.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)
(cherry picked from commit d0d5f7600c)
2016-11-09 14:09:41 +00:00
Marek Olšák
aa947e7a63 radeonsi: fix an assertion failure in si_decompress_sampler_color_textures
This fixes a crash in Deus Ex: Mankind Divided. Release builds were
unaffected, so it's not too serious.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 00baaa4752)
2016-11-09 14:07:47 +00:00
Marek Olšák
d54699135f glx: make interop ABI visible again
This was broken when the GLAPI use was removed from mesa_glinterop.h.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 64c2593a5c)
2016-11-09 14:06:27 +00:00
Marek Olšák
a0d11b190a egl: make interop ABI visible again
This was broken when the GLAPI use was removed from mesa_glinterop.h.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee39d4456e)
2016-11-09 14:05:06 +00:00
Marek Olšák
aa60c7b1c1 egl: use util/macros.h
I need the definition of PUBLIC.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bf51b45313)
2016-11-09 14:03:39 +00:00
Nicolai Hähnle
b8f99c6b2f st/glsl_to_tgsi: fix dvec[34] loads from SSBO
When splitting up loads, we have to add 16 bytes to the offset for
the high components, just like already happens for stores.

Fixes arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-shader.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e4b378800e)
2016-11-09 14:02:14 +00:00
Francisco Jerez
2789bfdbb5 nir: Flip gl_SamplePosition in nir_lower_wpos_ytransform().
Assuming the hardware is set up to use a screen coordinate system
flipped vertically with respect to the GL's window coordinate system,
the SYSTEM_VALUE_SAMPLE_POS vector will also be flipped vertically
with respect to the value expected by the GL, so we need to give it
the same treatment as gl_FragCoord.  Fixes the following CTS tests on
i965:

 ES31-CTS.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer
 ES31-CTS.functional.shaders.sample_variables.sample_pos.correctness.default_framebuffer

when run with any multisample configuration, e.g. rgba8888d24s8ms4.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit f3d387867f)
2016-11-09 14:00:31 +00:00
Andreas Boll
da1ac6bc46 glx/windows: Add wgl.h to the sources list
Otherwise it won't be picked in the tarball and the build will fail.

Fixes: 533b3530c1 ("direct-to-native-GL for GLX clients on Cygwin
("Windows-DRI")")
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>

(cherry picked from commit f792f0687f)
2016-11-09 13:59:00 +00:00
Nicolai Hähnle
996c20208f glsl: fix lowering of UBO references of named blocks
When a UBO reference has the form block_name.foo where block_name refers
to a block where the first member has a non-zero offset, the base offset
was incorrectly added to the reference.

Fixes an assertion triggered in debug builds by
GL45-CTS.enhanced_layouts.uniform_block_layout_qualifier_conflict. That test
doesn't properly check for correct execution in this case, so I am also
going to send out a piglit test.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 37d646c1b3)
2016-11-09 13:57:21 +00:00
Kenneth Graunke
9397899aed glsl: Update deref types when resizing implicitly sized arrays.
At link time, we resolve the size of implicitly sized arrays.
When doing so, we update the type of the ir_variables.  However,
we neglected to update the type of ir_dereference nodes which
reference those variables.

It turns out array_resize_visitor (for GS/TCS/TES interface array
handling) already did 2/3 of the cases for this, so we can simply
refactor the code and reuse it.

This fixes:
GL45-CTS.shader_storage_buffer_object.basic-syntax
GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO

which have an SSBO containing an implicitly sized array, followed
by some other members.  setup_buffer_access uses the dereference
types to compute offsets to fields, and it had a stale type where
the implicitly sized array's length was still 0 instead of the
actual length.

While we're here, we can also fix update_array_sizes to properly
update deref types as well, fixing a FINISHME from 2010.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 8df4aebc94)
2016-11-09 13:55:41 +00:00
Timothy Arceri
bd3fde4068 mesa/glsl: delete previously linked shaders earlier when linking
This moves the delete linked shaders call to
_mesa_clear_shader_program_data() which makes sure we delete them
before returning due to any validation problems.

It also reduces some code duplication.

From the OpenGL 4.5 Core spec:

   "If LinkProgram failed, any information about a previous link of
   that program object is lost. Thus, a failed link does not restore
   the old state of program.

   ...

   If one of these commands is called with a program for which
   LinkProgram failed, no error is generated unless otherwise noted.
   Implementations may return information on variables and interface
   blocks that would have been active had the program been linked
   successfully. In cases where the link failed because the program
   required too many resources, these commands may help applications
   determine why limits were exceeded."

Therefore it's expected that we shouldn't be able to query the
program that failed to link and retrieve information about a
previously successful link.

Before this change the linker was doing validation before freeing
the previously linked shaders and therefore could exit on failure
before they were freed.

This change also fixes an issue in compat profile where a program
with no shaders attached is expect to fall back to fixed function
but was instead trying to relink IR from a previous link.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97715
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2861d682a)
2016-11-09 13:53:50 +00:00
Fredrik Höglund
2478cfe41d radv: add support for anisotropic filtering on VI+
Ported from radeonsi.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e7b9c5eb74)
2016-11-09 13:52:16 +00:00
Dave Airlie
4514ce8bc7 radv: fix dual source blending
Dolphin tried to use this, but we hadn't had any tests for it properly.

All that is required is the shader output format needs to be set
for 0 and 1 exports.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 73592b9284)
2016-11-09 13:50:47 +00:00
Adam Jackson
bc1d7a6ac4 glx/glvnd: Fix dispatch function names and indices
As this array was not actually sorted, FindGLXFunction's binary search
would only sometimes work.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 8bca8d89ef)
2016-11-09 13:49:18 +00:00
Adam Jackson
c08a62a0b1 glx/glvnd: Don't modify the dummy slot in the dispatch table
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit deb0eb1660)
2016-11-09 13:47:50 +00:00
Jason Ekstrand
81df3f63cb anv/pipeline: Properly cache prog_data::param
Before we were caching the prog data but we weren't doing anything with
brw_stage_prog_data::param so anything with push constants wasn't getting
cached properly.  This commit fixes that.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 71cc1e188d)
2016-11-09 13:46:12 +00:00
Jason Ekstrand
e016945bdd anv/pipeline: Put actual pointers in anv_shader_bin
While we can simply calculate offsets to get to things such as the
prog_data and the key, it's much more user-friendly if there are just
pointers.  Also, it's a bit more fool-proof.

While we're at it, we rework the pipeline cache API to use the
brw_stage_prog_data type directly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ff3185e3ba)
2016-11-09 13:44:32 +00:00
Jason Ekstrand
78fbafedf1 intel/blorp: Pass a brw_stage_prog_data to upload_shader
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4306c10a88)
2016-11-09 13:41:32 +00:00
Jason Ekstrand
5be463694b intel/blorp: Use wm_prog_data instead of hand-rolling our own
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 058304f081)
[Emil Velikov: brw_compile_fs() has different signature]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/blorp/blorp.c
2016-11-09 13:39:09 +00:00
Jason Ekstrand
88ebff8e25 anv: Better handle return codes from anv_physical_device_init
The case where we just want the loop to continue is INCOMPATIBLE_DRIVER
because that simply means that whatever FD we opened isn't a supported
Intel chip.  Other error codes such as OUT_OF_HOST_MEMORY are actual errors
and we should be returning early in that case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a5f8ff6ca1)
2016-11-09 13:20:37 +00:00
Jason Ekstrand
9c722e8a2e vulkan/wsi/x11: Clean up connections in finish_wsi
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit daeb21e478)
2016-11-09 13:16:22 +00:00
Jason Ekstrand
f622d33347 vulkan/wsi/x11: Better handle wsi_x11_connection_create failure
Without this fix, the function would still end up returning NULL but it
would put that NULL connection in the hash table which would be bad.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fc0e9e3e40)
2016-11-09 13:14:48 +00:00
Chih-Wei Huang
dd5e802d33 android: avoid using libdrm with host modules
Note LOCAL_CFLAGS and LOCAL_SHARED_LIBRARIES in Android.common.mk
are used by both host and target modules. However, commit 112e988
moved libdrm related flags to common. It causes the errors like:

error: 'out/host/linux-x86/obj32/SHARED_LIBRARIES/libdrm_intermediates/export_includes',
needed by 'out/host/linux-x86/obj32/EXECUTABLES/mesa_gen_matypes_intermediates/import_includes',
missing and no known rule to make it

No reason to use libdrm with host modules.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 112e988329 ("Android: move libdrm settings to top-level
Android.common.mk")
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit e3e5b1a488)
2016-11-09 13:13:19 +00:00
Nicolai Hähnle
ea07a57fc0 radeonsi: fix BFE/BFI lowering for GLSL semantics
Fixes spec/arb_gpu_shader5/execution/built-in-functions/*-bitfield{Extract,Insert}

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 5aef14932a)
2016-11-09 13:11:29 +00:00
Dave Airlie
620ef8e742 radv: expose xlib platform extension
I missed this when I added the xlib code, this allows
dolphin emu to start and crash later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 9f0726f3e5)
2016-11-09 13:09:45 +00:00
Jason Ekstrand
2f8b48d274 anv/device: Return DEVICE_LOST if execbuf2 fails
This makes more sense than OUT_OF_HOST_MEMORY.  Technically, you can
recover from a failed execbuf2 but the batch you just submitted didn't
fully execute so things are in an ill-defined state.  The app doesn't want
to continue from that point anyway.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c41ec1679f)
2016-11-09 12:48:39 +00:00
Emil Velikov
405dd26860 docs: add sha256 checksums for 13.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-01 16:05:32 +00:00
Emil Velikov
df1b0a5a86 docs: Update 13.0.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-01 15:55:24 +00:00
Emil Velikov
acc06a239a Update version to 13.0.0(final)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-01 15:47:33 +00:00
Vinson Lee
5ef2504759 util: Include string.h in bitscan.h.
Fix build error with clang.

  Compiling src/compiler/glsl/link_varyings.cpp ...
In file included from src/compiler/glsl/link_varyings.cpp:33:
In file included from src/compiler/glsl/glsl_symbol_table.h:34:
In file included from src/compiler/glsl/ir.h:33:
In file included from src/compiler/glsl_types.h:29:
/usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration
extern int ffs (int __i) __THROW __attribute__ ((__const__));
           ^
src/util/bitscan.h:51:13: note: expanded from macro 'ffs'
            ^
src/util/bitscan.h:96:18: note: previous declaration is here
   const int i = ffs(*mask) - 1;
                 ^
src/util/bitscan.h:51:13: note: expanded from macro 'ffs'
            ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97952
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 889ee4da05)
2016-11-01 13:21:51 +00:00
Leo Liu
8daa9b33c0 st/omx/dec: disable tunnel for size different case
When the video coded size is different from frame size, we need the result
buffers are same as coded size, which are not size compatible with encode
required size, so that simply use no tunnel for this case instead of frame
by frame converting.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 06e3cd6a45)
2016-11-01 12:55:49 +00:00
Leo Liu
16a4d76374 st/omx/dec: result buffers size should match codec decoder size
Otherwise fails the check of matching between decoder size and buffers
size in kernel.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d9b2c4048d)
2016-11-01 12:54:11 +00:00
Jason Ekstrand
4251e076d5 i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
The address immediate field is only 9 bits and, since the value is in
bytes, the highest GRF we can point to with it is g15.  This makes it
pretty close to useless for MOV_INDIRECT.  There were already piles of
restrictions preventing us from using it prior to Broadwell, so let's get
rid of the gen8+ code path entirely.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2a4a86862c)
2016-11-01 12:52:40 +00:00
Marek Olšák
6c55e33424 radeonsi: fix behavior of GLSL findLSB(0)
12.0 and older need the same fix but elsewhere.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 4bf45a6079)
2016-11-01 12:50:53 +00:00
Marek Olšák
2ec8ad91b3 radeonsi: set VGT_GS_ONCHIP_CNTL on CIK and later
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 11.2 12.0 13.0  <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e24dc43164)
2016-11-01 12:49:02 +00:00
Kenneth Graunke
0ff597c39b glsl: Improve accuracy of alpha scaling in advanced blend lowering.
When blending with GL_COLORBURN_KHR and these colors:

   dst = <0.372549027, 0.372549027, 0.372549027, 0.372549027>
   src = <0.09375, 0.046875, 0.0, 0.375>

the normalized dst value became 0.99999994 (due to precision problems
in the floating point divide of rgb by alpha).  This caused the color
burn equation to fail the dst >= 1.0 comparison.  The blue channel would
then fall through to the dst < 1.0 && src >= 0 comparison, which was
true, since src.b == 0.  This produced a factor of 0.0 instead of 1.0.

This is an inherent numerical instability in the color burn and dodge
equations - depending on the precision of alpha scaling, the value can
be either 0.0 or 1.0.  Technically, GLSL floating point division doesn't
even guarantee that 0.372549027 / 0.372549027 = 1.0.  So arguably, the
CTS should allow either value.  I've filed a bug at Khronos for further
discussion (linked below).

In the meantime, this patch improves the precision of alpha scaling by
replacing the division with (rgb == alpha ? 1.0 : rgb / alpha).  We may
not need this long term, but for now, it fixes the following CTS tests:

ES31-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR
ES31-CTS.blend_equation_advanced.blend_all.GL_COLORBURN_KHR_all_qualifier

Cc: currojerez@riseup.net
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16042
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit e6aeeace69)
2016-11-01 12:47:32 +00:00
Jason Ekstrand
89cefe6325 intel/blorp: Rework our usage of ralloc when compiling shaders
Previously, we were creating the shader with a NULL ralloc context and then
trusting in blorp_compile_fs to clean it up.  The only problem was that
blorp_compile_fs didn't clean up its context properly so we were leaking.
When I went to fix that, I realized that it couldn't because it has to
return the shader binary which is allocated off of that context and used by
the caller.  The solution is to make blorp_compile_fs take a ralloc
context, allocate the nir_shaders directly off that context, and clean it
all up in whatever function creates the shader and calls blorp_compile_fs.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dadb6edd)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/blorp/blorp_clear.c
2016-11-01 12:45:43 +00:00
Jason Ekstrand
75258017dd intel/blorp: Rename compile_nir_shader to compile_fs
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
(cherry picked from commit ab92480272)
2016-11-01 12:44:10 +00:00
Jason Ekstrand
875534e14c intel/blorp: Fix a couple asserts around image copy rectangles
With dealing with rectangles in compressed images, you can have a width or
height that isn't a multiple of the corresponding compression block
dimension but only if that edge of your rectangle is on the edge of the
image.  When we call convert_to_single_slice, it creates an 2-D image and a
set of tile offsets into that image.  When detecting the right-edge and
bottom-edge cases, we weren't including the tile offsets so the assert
would misfire.  This caused crashes in a few UE4 demos

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98431
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Tested-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 4964a5149b)
2016-11-01 12:31:43 +00:00
Samuel Pitoiset
06baf2cd86 nvc0/ir: fix emission of IMAD with NEG modifiers
The emitter tried to emit sub instead of subr when src0 has
actually a NEG modifier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 84e946380b)
2016-11-01 12:14:47 +00:00
Emil Velikov
91b2b925d1 Update version to 13.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-31 11:53:03 +00:00
Samuel Iglesias Gonsálvez
7a977612fc glsl: update default precision qualifier when it is set in the shader
Default precision qualifier for a data type could be set several times
inside a shader. This patch allows to update the default precision
qualifier for the given type that is saved in the symbol table.

If it is not in the symbol table, just add it.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97804
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 0e742926c6)
2016-10-27 19:47:20 +01:00
Samuel Iglesias Gonsálvez
95b5a69093 mesa/program: Add _mesa_symbol_table_replace_symbol()
This function allows to modify an existing symbol.

v2:
- Remove namespace usage now that it was deleted.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit dfbdb2c0b3)
Nominated-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-10-27 19:12:06 +01:00
Timothy Arceri
ea37a06037 glsl/mesa: remove unused namespace support from the symbol table
Namespace support seems to have been unused for a very long time.

Previously the hash table entry was never removed and the symbol name
wasn't freed until the symbol table was destroyed.

In theory this could reduced the number of times we need to copy a string
as duplicate names are reused. However in practice there is likely only a
limited number of symbols that are the same and this is likely to cause
other less than optimal behaviour such as the hash_table continuously
growing.

Along with dropping namespace support this change removes entries from
the hash table as they become unused.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
(cherry picked from commit 6dbe8a1b9f)
Nominated-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-10-27 19:11:47 +01:00
Kenneth Graunke
02d5e60ee0 glsl: Size TCS->TES unsized arrays to gl_MaxPatchVertices for queries.
SSO validation and other program interface queries want to see that
unsized (non-patch) TCS output/TES input arrays are implicitly sized
to gl_MaxPatchVertices.

By the time we create the program resource lists, we've sized the arrays
to their actual size.  (We try to create TCS output arrays to match the
output patch size right away, and at this point, we should have shrunk
TES input arrays.)  One option would be to keep them sized to
gl_MaxPatchVertices, and defer shrinking them.  But that's a big change,
and I don't think it's a good idea.

Instead, this patch introduces a new ir_variable flag which indicates
the variable is implicitly to gl_MaxPatchVertices.  Then, the linker
munges the types when creating the resource list, ignoring the size
in the IR's types.  Basically, lie about it for resource queries.
It's ugly, but I think it ought to work.

We probably could use var->data.implicit_sized_array for this, but
I opted for a separate bit to try and avoid convoluting the existing
SSBO handling.  They're similar in concept, but share none of the
same code...

Fixes:
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
and the ES32-CTS and ESEXT-CTS variants.

v2: Add a comment (requested by Timothy, written by me).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 173558445d)
2016-10-27 11:33:14 +01:00
Kenneth Graunke
649a47a834 glsl: Pass ctx to program interface query helper functions.
The next commit will use this in add_shader_variable - this just
separates out some of the mechanical changes for easier review.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 34fd2ffed8)
2016-10-27 11:31:59 +01:00
Tapani Pälli
d640b0d71b egl: set preserved behavior for surface only if config supports it
Otherwise we can end up with mismatching behavior between config and
surface when client queries surface attributes. As example, configs
for DRI3 do not support preserved behavior but here we were setting
preserved behavior for pixmap and pbuffer.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 2035930966)
2016-10-27 11:30:54 +01:00
Dave Airlie
d35c4d1512 radv/ac/llvm: trim texture return values
The intrinsic engine asserts in llvm due to this,
as we put a vec4 into a vec1, and the next instruction
isn't expecting it.

So trim the vector at the end before inserting it.

Reported-by: Christoph Haag <haagch+mesadev@frickel.club>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d548fa882b)
2016-10-27 11:29:49 +01:00
Samuel Pitoiset
5c00425354 nvc0/ir: fix emission of SHLADD with NEG modifiers
This affects GF100:GK110 chipsets, but not GM107+ where the
logic is a bit different. The emitters tried to emit sub
instead of subr when src0 has a NEG modifier.

This fixes the following piglit tests glsl-fs-loop-nested
and glsl-vs-loop-nested.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1ec7227d44)
2016-10-27 11:28:30 +01:00
Eric Engestrom
6458a9dc6c egl/dri2: swap_buffers_with_damage falls back to swap_buffers
Since commit 0a606a400f ("egl: add eglSwapBuffersWithDamageKHR"),
Android has been broken because the function eglSwapBuffersWithDamageKHR
is provided regardless of the extension being present. Also, the Android
meta-EGL always advertises the extension regardless of the underlying
EGL implementation. As there doesn't seem to be a simple way
conditionally make the EGL function ptr NULL, just implement a brain
dead version of eglSwapBuffersWithDamage{KHR,EXT}.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
CC: Rob Clark <robdclark@gmail.com>
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
[Emil Velikov: copy the original commit message from Rob's patch]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 4fa799ae04)
2016-10-27 11:27:23 +01:00
Marek Olšák
b1d02e7006 st/mesa: allow multiple concurrent waiters in ClientWaitSync
so->fence can be unreferenced by one thread while another thread is
somewhere in ClientWaitSync and expecting so->fence to be non-NULL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98172

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b687f766fd)
2016-10-27 11:26:18 +01:00
Marek Olšák
c29a37c444 st/mesa: unduplicate st_check_sync code
It's the same as st_client_wait_sync. Discovered by Michel.
This is needed to make the following fix simpler.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit f240ad98bc)
2016-10-27 11:25:14 +01:00
Marek Olšák
9c5bbfcbc8 winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures
Maybe this is why SDMA has been broken for many amdgpu users?

SDMA is the only block which is used with imported textures and relies
on this variable. DB also uses it, but it doesn't get imported textures,
so it's unaffected.

I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma
is changed to use imported textures.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ec3b2a4b1)
2016-10-27 11:24:02 +01:00
Marek Olšák
cf82ceb21e gallium/radeon: make sure the address of separate CMASK is aligned properly
This should fix random GPU hangs on Hawaii and Fiji.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit dce05b3423)
2016-10-27 11:22:44 +01:00
Marek Olšák
b214af38b9 gallium/radeon: fix incorrect bpe use in si_set_optimal_micro_tile_mode
Oh my god, I wonder what catastrophic issues this was causing on SI.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8a21f52d73)
2016-10-27 11:21:24 +01:00
Fredrik Höglund
fbfc01e654 vulkan/wsi/wayland: fix ARGB window support
Use an ARGB format for the DRM buffer when the compositeAlpha field
in VkSwapchainCreateInfoKHR is set to
VK_COMPOSITE_ALPHA_PRE_MULTIPLIED_BIT_KHR.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 68db0fe034)
2016-10-27 11:20:16 +01:00
Fredrik Höglund
100851b1f5 vulkan/wsi/x11: fix ARGB window support
Pass the correct depth to xcb_dri3_pixmap_from_buffer_checked().
Otherwise xcb_present_pixmap() fails with a BadMatch error.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 972670c200)
2016-10-27 11:19:09 +01:00
Fredrik Höglund
8ec30b87c0 radv: mark the fence as submitted and signalled in vkAcquireNextImageKHR
This stops the debug layers from complaining when fences are used to
throttle image acquisition.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0a153f4ee4)
2016-10-27 11:17:55 +01:00
Matt Turner
cc5995d9e6 radv: Replace "abi_versions" with correct "api_version".
git history shows "abi_versions" was used from the outset.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 14aac061e9)
2016-10-27 11:16:51 +01:00
Matt Turner
42de0666ec anv: Replace "abi_versions" with correct "api_version".
git history shows "abi_versions" was used from the outset.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07755237d3)
2016-10-27 11:15:44 +01:00
Samuel Pitoiset
4083feb939 nvc0: use correct bufctx when invalidating CP textures
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b2712c367)
2016-10-27 11:14:28 +01:00
Tapani Pälli
92a50b3d6e mesa: fix error handling in DrawBuffers
Patch rearranges error checking so that enum checking provided via
destmask happens before other checks. It needs to be done in this
order because other error checks do not work properly if there were
invalid enums passed.

Patch also refines one existing check and it's documentation to match
GLES 3.0 spec (also in later specs). This was somewhat mysteriously
referring to desktop GL but had a check for gles3.

Fixes following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers

no CI regressions observed.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98134
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a1652a059e)
2016-10-27 11:13:24 +01:00
Tapani Pälli
732b39507b egl: add check that eglCreateContext gets a valid config
Fixes following dEQP test:

   dEQP-EGL.functional.negative_api.create_context

v2: don't break EGL_KHR_no_config_context (Eric Engestrom)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5876f3c85a)
2016-10-27 11:12:19 +01:00
Tapani Pälli
8962e9a239 Revert "egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT"
This reverts commit b1d636aa00, previous
commit sets these values for all egl configs.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1ef7873397)
2016-10-27 11:11:13 +01:00
Tapani Pälli
29f70e8e09 egl/dri2: set max values for pbuffer width and height
While these max values were previously fixed for pbuffer creation, this
change makes also eglGetConfigAttrib() return correct values.

Fixes following dEQP tests:

   dEQP-EGL.functional.create_surface.pbuffer.rgb888_no_depth_no_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgb888_depth_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgba8888_no_depth_no_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgba8888_depth_stencil

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b91e1e38e8)
2016-10-27 11:10:08 +01:00
Kenneth Graunke
04bd51d7d0 i965: Drop nir_inputs from fs_visitor.
It's unused.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 41034abfe6)
2016-10-27 11:08:41 +01:00
Kenneth Graunke
f17450ff7e i965: Don't use nir_assign_var_locations for VS/TES/GS outputs.
Fixes spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3.

v2: Remove nir_outputs field from fs_visitor (caught by Tim and Iago).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 59864e8e02)
2016-10-27 11:07:08 +01:00
Kenneth Graunke
de826a10a7 i965: Make split_virtual_grfs() call compact_virtual_grfs().
Post-splitting, VGRFs have a maximum size (MAX_VGRF_SIZE).  This is
required by the register allocator, as we have to create classes for
each size of VGRF.

We can (and do) allocate virtual registers larger than MAX_VGRF_SIZE,
but we must ensure that they are splittable.  split_virtual_grfs()
asserts that the post-splitting register size is in range.

Unfortunately, these trip for completely dead registers which are too
large - we only set split points for live registers.  So dead ones are
never split, and if they happened to be too large, they'd trip asserts.

To fix this, call compact_virtual_grfs() to eliminate dead registers
before splitting.

v2: Add a comment written by Iago.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 27715c73ff)
2016-10-27 11:06:02 +01:00
Kenneth Graunke
188a866fd0 i965: Drop unnecessary switch statement in nir_setup_outputs()
TCS and FS are skipped above.  CS has no output variables.
All remaining cases take the same path.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 3728ee000a)
2016-10-27 11:04:57 +01:00
Axel Davy
a850e69b7e st/nine: Fix locking CubeTexture surfaces.
Only one face of Cubetextures was locked when in DEFAULT Pool.
Fixes:
https://github.com/iXit/Mesa-3D/issues/129

CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit eed605a473)
2016-10-27 11:03:50 +01:00
Axel Davy
d576a2b0e6 st/nine: Fix mistake in Volume9 UnlockBox
In the format fallback path,
the height was used instead of the depth.

CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit fe7bb46134)
2016-10-27 11:02:45 +01:00
Axel Davy
32caa7438a st/nine: Fix leak with integer and boolean constants
Leak introduced by:
a83dce0128

The patch also moves the part to
release changed.vs_const_i and changed.vs_const_b
before the if (!cb.buffer_size) check,
to avoid reuploading every draw call if
integer or boolean constants are dirty, but the shaders
use no constants.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
CC: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25beccb379)
2016-10-27 11:01:38 +01:00
Nicolai Hähnle
074ede8d4f st/mesa: cleanup and fix primitive restart for indirect draws
There are three intended functional changes here:

1. OpenGL 4.5 clarifies that primitive restart should only apply with index
   buffers, so make that change explicit in the indirect draw path.

2. Make PrimitiveRestartFixedIndex work with indirect draws.

3. The change where primitive_restart is only set when the restart index can
   actually have an effect (based on the size of indices) is also applied for
   indirect draws.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3d6b5dee3a)
2016-10-27 10:41:13 +01:00
Emil Velikov
497cf4a9d1 cherry-ignore: add mapi VISILITY_CFLAGS patch
Cherry-picked without -x

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-27 10:36:07 +01:00
Emil Velikov
f623a8be3e Update version to 13.0.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-24 12:09:15 +01:00
Jonathan Gray
af81cdfec0 mapi: automake: set VISIBILITY_CFLAGS for shared glapi
shared glapi was previously built without setting CFLAGS for
AM_CFLAGS and VISIBILITY_CFLAGS.

This resulted in symbols being exported that shouldn't be.

The x86 and sparc assembly versions of the dispatch table partially
mitigated this by using .hidden.  Otherwise shared_dispatch_stub_*
were being exported.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "11.2 12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-10-24 11:32:13 +01:00
Emil Velikov
990f395e00 anv: automake: cleanup the generated json file during make clean
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8df581520a)

Conflicts:
	src/intel/vulkan/Makefile.am
2016-10-24 11:31:53 +01:00
Stencel, Joanna
19e8270fe0 egl/wayland: add missing destroy_window callback
The original patch by Joanna added the function pointer and callback yet
things got only partially applied - the infra was added, but the
implementation was missing.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 690ead4a13 ("egl/wayland-egl: Fix for segfault in
dri2_wl_destroy_surface.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 2e0ab61e29)
2016-10-24 09:55:02 +01:00
Emil Velikov
cac49ee2cd automake: don't forget to pick wglext.h in the tarball
Earlier commit reworked the header install rules, to ensure that the
correct ones are installed only as needed.

By doing so it dropped a wildcard which was effectively including the
wglext.h header in the tarball.

Add the header to the top-level noinst_HEADERS, since the it is not
meant to be installed (autoconf is not used on Windows plaforms).

Fixes: a89faa2022 ("autoconf: Make header install distinct for various
APIs (v2)")
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

(cherry picked from commit 3511a86111)
2016-10-24 09:55:02 +01:00
Dave Airlie
0f8b7f90d1 radv: allow cmask transitions without fast clear
This fixes
dEQP-VK.pipeline.multisample.sampled_image*

These all render to multisampled image, and then
sample from it, so we must transition it correctly,
since we have a cmask and fmask this will cause
the correct transition.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a969548f59)
2016-10-24 09:55:02 +01:00
Jason Ekstrand
abf5327b86 anv: Suffix the intel_icd file with the host CPU
Vulkan has a multi-arch problem... The idea behind the Vulkan loader is
that you have a little json file on your disk that tells the loader where
to find drivers.  The loader looks for these json files in standard
locations, and then goes and loads the my_driver.so's that they specify.
This allows you as a driver implementer to put their driver wherever on the
disk they want so long as the ICD points in the right place.

For a multi-arch system, however, you may have multiple libvulkan_intel.so
files installed that the loader needs to pick depending on architecture.
Since the ICD file format does not specify any architecture information,
you can't tell the loader where to find the 32-bit version vs. the 64-bit
version.  The way that packagers have been dealing with this is to place
libvulkan_intel.so in the top level lib directory and provide just a name
(and no path) to the loader.  It will then use the regular system search
paths and find the correct driver.  While this solution works fine for
distro-installed Vulkan drivers, it doesn't work so well for user-installed
drivers because they may put it in /opt or $HOME/.local or some other more
exotic location.  In this case, you can't use an ICD json file with just a
library name because it doesn't know where to find it; you also have to add
that to your library lookup path via LD_LIBRARY_PATH or similar.

This patch handles both use-cases by taking advantage of the fact that the
loader dlopen()s each of the drivers and, if one dlopen() calls fails, it
silently continues on to open other drivers.  By suffixing the icd file, we
can provide two different json files: intel_icd.x86_64.json and
intel_icd.i686.json with different paths.  Since dlopen() will only succeed
on the libvulkan_intel.so of the right arch, the loader will happily ignore
the others and load that one.  This allows us to properly handle multi-arch
while still providing a full path so user installs will work fine.

I tested this on my Fedora 25 machine with 32 and 64-bit builds of our
Vulkan driver installed and 32 and 64-bit builds of crucible.  It seems to
work just fine.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d96345de98)

Squashed with commit:

anv: Always use the full driver path in the intel_icd.*.json

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7ea4ef8849)

Squashed with commit:

configure: Get rid of the --disable-vulkan-icd-full-driver-path flag

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3f05fc62f9)
2016-10-24 09:54:28 +01:00
Francisco Jerez
d0d3e721d0 Revert "Revert "mapi: export all GLES 3.2 functions in libGLESv2.so""
This reverts commit 85e9bbc14d.  The
previous commit should help with the scons build failure caused by the
original commit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 811eb7f178)
2016-10-24 09:10:01 +01:00
Francisco Jerez
293e458558 glapi: Move PrimitiveBoundingBox and BlendBarrier definitions into ES3.2 category.
These two GLES 3.2 entry points were being defined in the category of
the ARB_ES3_2_compatibility and KHR_blend_equation_advanced extensions
respectively instead of in the ES3.2 category.  Defining them in the
ES3.2 category makes sure that the gl_procs.py generator emits
declarations in the glprocs.h header file for the unsuffixed GLES-only
entry points that PrimitiveBoundingBoxARB and BlendBarrierKHR
respectively alias.  This should avoid a compilation failure during
scons builds in combination with "mapi: export all GLES 3.2 functions
in libGLESv2.so".

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
(cherry picked from commit 15a084a039)
2016-10-24 09:08:28 +01:00
Samuel Pitoiset
5798d602e0 nvc0: do not break 3D state by pushing MS coordinates on Fermi
Long story short, 3D and CP are aliased on Fermi and initializing
compute after pushing the MS sample coordinate offsets seems to
corrupt 3D state for weird reasons.

I still don't have the faintest clue what is going on, but
this seems to only affect Fermi generation. A possible fix
could be to use two different channels, one for 3D and one
for CP.

This fixes a bunch of regressions pinpointed by piglit.

Fixes: "nvc0: fix up image support for allowing multiple samples"
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 42273edf79)
2016-10-24 09:07:21 +01:00
Nicolai Hähnle
039d1e6f11 radeonsi: fix 64-bit loads from LDS
Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among
others.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4a2dbfff05)
2016-10-24 09:06:16 +01:00
Nicolai Hähnle
ba6efd48c3 st/mesa: only set primitive_restart when the restart index is in range
Even when enabled, primitive restart has no effect when the restart index
is larger than the representable values in the index buffer.

Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert
for radeonsi VI.

v2: add an explanatory comment

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
(cherry picked from commit bfa50f88ce)
2016-10-24 09:05:18 +01:00
Nicolai Hähnle
13f685cf11 st/glsl_to_tgsi: sort input and output decls by TGSI index
Fixes a regression introduced by commit 777dcf81b.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98307
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3d9b57e493)
2016-10-24 09:04:17 +01:00
Nicolai Hähnle
8f807e914f st/glsl_to_tgsi: fix block copies of arrays of structs
Use a full writemask in this case. This is relevant e.g. when a function
has an inout argument which is an array of structs.

v2: use C-style comment (Timothy Arceri)

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1895685f8)
2016-10-24 09:03:16 +01:00
Nicolai Hähnle
3581e21d5b st/glsl_to_tgsi: fix block copies of arrays of doubles
Set the type of the left-hand side to the same as the right-hand side,
so that when the base type is double, the writemask of the MOV instruction
is properly fixed up.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ca592af880)
2016-10-24 09:02:14 +01:00
Ilia Mirkin
52df379d6b nv50/ir: process texture offset sources as regular sources
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.

This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd45d758ff)
2016-10-24 09:01:03 +01:00
Ilia Mirkin
05b89cf40e nv50,nvc0: avoid reading out of bounds when getting bogus so info
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 313fba5ee1)
2016-10-24 08:59:57 +01:00
Eric Engestrom
4768b7353f wsi/wayland: fix error path
Fixes: 1720bbd353 ("anv/wsi: split image alloc/free out to separate fns.")
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8bf7717e1f)
2016-10-24 08:58:59 +01:00
Dave Airlie
554a99ebde radv: use emit_icmp for samples_identical
On a debug llvm build we'd assert on the next compare
when the return from samples_identical was i1 instead
of i32.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d842546ad1)
2016-10-24 08:51:57 +01:00
Emil Velikov
e45c4586c2 Update version to 13.0.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-19 19:12:42 +01:00
Emil Velikov
2ced8eb136 Revert Use absolute path in intel_icd.json and related patches.
This commit effectively reverts the following commits:

This reverts commit 0b6837a643.
This reverts commit 05f36435ef.
This reverts commit a2ae67aa47.

While the feature introduced is convinient for development it is not as
useful for distributions. Furthermore it even breaks things as one
wishes to have both 32 and 64 bit package installed on the same system.

Keep the functionality in development branch(es) and drop it from
distribution packages to avoid confusion and misuse.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-19 19:10:30 +01:00
334 changed files with 8724 additions and 3061 deletions

View File

@@ -82,11 +82,13 @@ LOCAL_CFLAGS += \
-D__STDC_LIMIT_MACROS
endif
ifneq ($(LOCAL_IS_HOST_MODULE),true)
# add libdrm if there are hardware drivers
ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
LOCAL_CFLAGS += -DHAVE_LIBDRM
LOCAL_SHARED_LIBRARIES += libdrm
endif
endif
LOCAL_CPPFLAGS += \
$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \

View File

@@ -40,7 +40,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-vdpau \
--enable-xa \
--enable-xvmc \
--disable-llvm-shared-libs \
--enable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \
@@ -62,6 +62,7 @@ noinst_HEADERS = \
include/c99_math.h \
include/c11 \
include/D3D9 \
include/GL/wglext.h \
include/HaikuGL \
include/no_extern_c.h \
include/pci_ids

View File

@@ -1 +1 @@
12.1.0-devel
13.0.6

28
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,28 @@
# Commit was picked with -x
907ace57986733add2aebfa9dd7c83c67efed70e mapi: automake: set VISIBILITY_CFLAGS for shared glapi
# Commit was reverted shortly after it landed in master
a39ad185932eab4f25a0cb2b112c10d8700ef242 configure.ac: honour LLVM_LIBDIR when linking against LLVM
# Commit fixes an earlier patch which is quite invasive to be considered for stable.
157971e450c34ec430c295ff922c2e597294aba3 i965/blit: Fix the src dimension sanity check in miptree_copy
# Similar to the above - depends on the series which introduce intel_miptree_copy
b18cd8ce2c07c2d1a666fbff1f0d92d17dd5b22c i965/miptree: Use intel_miptree_copy for maps
# The commit is a backport of an identical anv one. The latter is not in stable
# and so does this one since they depend on functionality which is not in stable.
65cbb993d33976d9ee24eff01ade8ed9013617ca radv: Call nir_lower_constant_initializers.
# Commit causes regression on i915, and Nicolai requested that we drop it all together.
963311b71fd9900351a4a9dd1cd5f5db391f7e1b mesa/main: fix version/extension checks in _mesa_ClampColor
# Misnominated (only previous commit was meant to be for stable)
36b9976e1f99e8070c67cb8a255793939db77d02 egl/wayland: Avoid race conditions when on non-main thread
# The optimisation itself is broken and was removed completely
a4393bd97fe62e8299273bae769201c5c9c816ea i965/fs: Fix the inline nir_op_pack_double optimization
# There is no ANV fast_clear support in branch
42b10b175d5e8dfb9c4c46edbc306e7fac6bd3ec anv/blorp/clear_subpass: Only set surface clear color for fast clears
6b644e571e2344691e4d58ff0bba3ddc059c1a5d anv: Stall before fast-clear operations

View File

@@ -10,26 +10,28 @@
# $ bin/get-extra-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
# XXX: there should be a better way for this
latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\
cut -c -8 |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# For each cherry-picked commit...
cat already_picked | cut -c -8 |\
while read sha
do
# Check if the original commit is referenced in master
# ... check if it's referenced (fixed by another) patch
git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\
cut -c -8 |\
while read candidate
do
# Check if the potential fix, hasn't landed in branch yet.
found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`
if test $found = 0
then
echo Commit $candidate might need to be picked, as it references $sha
# And flag up if it hasn't landed in branch yet.
if grep -q ^$candidate already_picked ; then
continue
fi
echo Commit $candidate references $sha
done
done
rm -f already_picked

61
bin/get-fixes-pick-list.sh Executable file
View File

@@ -0,0 +1,61 @@
#!/bin/bash
# Script for generating a list of candidates [referenced by a Fixes tag] for
# cherry-picking to a stable branch
#
# Usage examples:
#
# $ bin/get-fixes-pick-list.sh
# $ bin/get-fixes-pick-list.sh > picklist
# $ bin/get-fixes-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# List all the commits between day 1 and the branch point...
git log --reverse --pretty=%H $latest_branchpoint > already_landed
# ... and the ones cherry-picked.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits with Fixes tag
git log --reverse --pretty=%H -i --grep="fixes:" $latest_branchpoint..origin/master |\
while read sha
do
# For each one try to extract the tag
fixes_count=`git show $sha | grep -i "fixes:" | wc -l`
if [ "x$fixes_count" != x1 ] ; then
echo WARNING: Commit $sha has nore than one Fixes tag
fi
fixes=`git show $sha | grep -i "fixes:" | head -n 1`
# The following sed/cut combination is borrowed from GregKH
id=`echo ${fixes} | sed -e 's/^[ \t]*//' | cut -f 2 -d ':' | sed -e 's/^[ \t]*//' | cut -f 1 -d ' '`
# Bail out if we cannot find suitable id.
# Any specific validation the $id is valid and not some junk, is
# implied with the follow up code
if [ "x$id" = x ] ; then
continue
fi
# Check if the offending commit is in branch.
# Be that cherry-picked ...
# ... or landed before the branchpoint.
if grep -q ^$id already_picked ||
grep -q ^$id already_landed ; then
# Finally nominate the fix if it hasn't landed yet.
if grep -q ^$sha already_picked ; then
continue
fi
echo Commit $sha fixes $id
fi
done
rm -f already_picked
rm -f already_landed

View File

@@ -8,13 +8,16 @@
# $ bin/get-pick-list.sh > picklist
# $ bin/get-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable' $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

42
bin/get-typod-pick-list.sh Executable file
View File

@@ -0,0 +1,42 @@
#!/bin/sh
# Script for generating a list of candidates which have typos in the nomination line
#
# Usage examples:
#
# $ bin/get-typod-pick-list.sh
# $ bin/get-typod-pick-list.sh > picklist
# $ bin/get-typod-pick-list.sh | tee picklist
# NB:
# This script intentionally _never_ checks for specific version tag
# Should we consider folding it with the original get-pick-list.sh
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Check to see if it has already been picked over.
if grep -q ^$sha already_picked ; then
continue
fi
git log -n1 --pretty=oneline $sha | cat
done
rm -f already_picked

View File

@@ -1377,6 +1377,9 @@ AC_ARG_ENABLE([driglx-direct],
dnl
dnl libGL configuration per driver
dnl
if test "x$enable_glx" != xno; then
PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
fi
case "x$enable_glx" in
xxlib | xgallium-xlib)
# Xlib-based GLX
@@ -1390,7 +1393,6 @@ xxlib | xgallium-xlib)
;;
xdri)
# DRI-based GLX
PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
# find the DRI deps for libGL
dri_modules="x11 xext xdamage xfixes x11-xcb xcb xcb-glx >= $XCBGLX_REQUIRED"
@@ -1648,7 +1650,7 @@ fi
AC_ARG_WITH([vulkan-drivers],
[AS_HELP_STRING([--with-vulkan-drivers@<:@=DIRS...@:>@],
[comma delimited Vulkan drivers list, e.g.
"intel"
"intel,radeon"
@<:@default=no@:>@])],
[with_vulkan_drivers="$withval"],
[with_vulkan_drivers="no"])
@@ -1667,13 +1669,6 @@ AC_ARG_WITH([vulkan-icddir],
[VULKAN_ICD_INSTALL_DIR='${datarootdir}/vulkan/icd.d'])
AC_SUBST([VULKAN_ICD_INSTALL_DIR])
AC_ARG_ENABLE([vulkan-icd-full-driver-path],
[AS_HELP_STRING([--disable-vulkan-icd-full-driver-path],
[create Vulkan ICD files with just a .so name and no path])],
[vulkan_icd_driver_path="$enableval"],
[vulkan_icd_driver_path="yes"])
AM_CONDITIONAL(VULKAN_ICD_DRIVER_PATH, test "x$vulkan_icd_driver_path" = xyes)
if test -n "$with_vulkan_drivers"; then
VULKAN_DRIVERS=`IFS=', '; echo $with_vulkan_drivers`
for driver in $VULKAN_DRIVERS; do
@@ -1979,6 +1974,21 @@ if test "x$enable_opencl" = xyes; then
if test "x$have_libelf" != xyes; then
AC_MSG_ERROR([Clover requires libelf])
fi
if test "x${ac_cv_cxx_compiler_gnu}" = xyes; then
altivec_enabled=no
AC_COMPILE_IFELSE([AC_LANG_SOURCE([
#if !defined(__VEC__) || !defined(__ALTIVEC__)
#error "AltiVec not enabled"
#endif
])], altivec_enabled=yes)
if test "$altivec_enabled" = yes; then
CLOVER_STD_OVERRIDE="-std=gnu++11"
fi
AC_SUBST([CLOVER_STD_OVERRIDE])
fi
fi
AM_CONDITIONAL(HAVE_CLOVER, test "x$enable_opencl" = xyes)
AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$enable_opencl_icd" = xyes)

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.0 Release Notes / TBD</h1>
<h1>Mesa 13.0.0 Release Notes / November 1, 2016</h1>
<p>
Mesa 13.0.0 is a new development release.
@@ -33,7 +33,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD.
4a54d7cdc1a94a8dae05a75ccff48356406d51b0d6a64cbdc641c266e3e008eb mesa-13.0.0.tar.gz
94edb4ebff82066a68be79d9c2627f15995e1fe10f67ab3fc63deb842027d727 mesa-13.0.0.tar.xz
</pre>
@@ -74,11 +75,236 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
TBD.
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61907">Bug 61907</a> - Indirect rendering of multi-texture vertex arrays broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crahes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83036">Bug 83036</a> - [ILK]Piglit spec_ARB_copy_image_arb_copy_image-formats fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94561">Bug 94561</a> - [llvmpipe] PIPE_CAP_VIDEO_MEMORY reports negative value on 32 bits (with 16GB ram)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94627">Bug 94627</a> - Game Risen on wine black grass</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94681">Bug 94681</a> - dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95000">Bug 95000</a> - deqp: assert in dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives &quot;invalid floating point operation&quot; at startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95462">Bug 95462</a> - [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95529">Bug 95529</a> - [regression, bisected] Image corruption in Chrome</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96235">Bug 96235</a> - st_nir.h:34: error: redefinition of typedef nir_shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96285">Bug 96285</a> - Mesa build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96299">Bug 96299</a> - [vulkan] 64 regressions due to mesa d5f2f32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96343">Bug 96343</a> - oom since st/mesa: implement PBO downloads for ReadPixels</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96346">Bug 96346</a> - [SNB,CTS] es2-cts.gtf.gl.atan regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96349">Bug 96349</a> - [CTS,SKL,BSW,BDW,KBL,BXT] es31-cts.arrays_of_arrays.interactionuniformbuffers3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96351">Bug 96351</a> - [CTS,SKL,KBL,BXT] es2-cts.gtf.gl2extensiontests.egl_image.egl_image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regresion due to latest gles 3.1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96484">Bug 96484</a> - [vulkan] deqp-vk.glsl.builtin.precision.sin / cos regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96504">Bug 96504</a> - [vulkancts] compute tests crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] &quot;clover: Update OpenCL version string to match OpenGL&quot;: clover's build fails because of missing git_sha1.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96528">Bug 96528</a> - Location qualifier segfaults during shader compilation</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96541">Bug 96541</a> - Tonga Unreal elemental bad rendering since radeonsi: Decompress DCC textures in a render feedback loop</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96565">Bug 96565</a> - Clive Barker's Jericho displays strange,vivid colors when motion blur enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96629">Bug 96629</a> - dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width &gt;= 1' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96674">Bug 96674</a> - [SNB, ILK] spec.ext_image_dma_buf_import.ext_image_dma_buf_import-sample_nv1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96729">Bug 96729</a> - Wrong shader compilation error message</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96765">Bug 96765</a> - BindFragDataLocationIndexed on array fragment shader output.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef GLXContext</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96782">Bug 96782</a> - [regression bisected] R600 fp64 and glsl-4.00 piglit failures</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96878">Bug 96878</a> - [Bisected: cc2d0e6][HSW] &quot;GPU HANG&quot; msg after autologin to gnome-session</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96949">Bug 96949</a> - [regression] Piglit numSamples assertion failures with 9a23a177b90</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96950">Bug 96950</a> - Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97019">Bug 97019</a> - [clover] build failure in llvm/codegen/native.cpp:129:52</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97032">Bug 97032</a> - [BDW,SKL] piglit.spec.arb_gpu_shader5.arb_gpu_shader5-interpolateatcentroid-flat</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97033">Bug 97033</a> - [BDW,SKL] piglit.spec.arb_gpu_shader_fp64.varying-packing.simple regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97083">Bug 97083</a> - [IVB,BYT] GPU hang on deqp-gles31.functional.separate.shader.random</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97140">Bug 97140</a> - dd_draw.c:949:11: error: implicit declaration of function 'fmemopen' is invalid in C99 [-Werror,-Wimplicit-function-declaration]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97267">Bug 97267</a> - [BDW] GL45-CTS.texture_cube_map_array.sampling asserts inside brw_fs.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97278">Bug 97278</a> - [vulkancts,HSW] all vulkancts tests assert on HSW</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97285">Bug 97285</a> - Darkness in Dota 2 after Patch &quot;Make Gallium's BlitFramebuffer follow the GL 4.4 sRGB rules&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97286">Bug 97286</a> - `make check` fails uniform-initializer-test</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97305">Bug 97305</a> - Gallium: TBOs and images set the offset in elements, not bytes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97309">Bug 97309</a> - piglit.spec.glsl-1_30.compiler.switch-statement.switch-case-duplicated.vert regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97322">Bug 97322</a> - GenerateMipmap creates wrong mipmap for sRGB texture</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97413">Bug 97413</a> - BioShock Infinite crashes on startup with Mesa Git version, R7 370</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97448">Bug 97448</a> - [HSW] deqp-vk.api_.copy_and_blit.image_to_image_stencil regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97477">Bug 97477</a> - i915g: gl_FragCoord is always (0.0, max_y)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97513">Bug 97513</a> - clover reports wrong device pointer size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97587">Bug 97587</a> - make check nir/tests/control_flow_tests regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97761">Bug 97761</a> - es2-cts.gtf.gl2extensiontests.egl_image_external.testsimpleunassociated crashes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97773">Bug 97773</a> - New Mesa master now results in warnings in glrender (and subsurfaces and simple-egl), black screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97790">Bug 97790</a> - Vulkan cts regressions due to 24be63066</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97804">Bug 97804</a> - Later precision statement isn't overriding earlier one</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97808">Bug 97808</a> - &quot;tgsi/scan: don't set interp flags for inputs only used by INTERP instructions&quot; causes glitches in wine with gallium nine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97894">Bug 97894</a> - Crash in u_transfer_unmap_vtbl when unmapping a buffer mapped in different context</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97952">Bug 97952</a> - /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97969">Bug 97969</a> - [radeonsi, bisected: fb827c0] Video decoding shows green artifacts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97976">Bug 97976</a> - VCE regression BO to small for addr since winsys/amdgpu: enable buffer allocation from slabs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98005">Bug 98005</a> - VCE dual instance encoding inconsistent since st/va: enable dual instances encode by sync surface</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98128">Bug 98128</a> - nir/tests/control_flow_tests.cpp:79:73: error: nir_loop_first_cf_node was not declared in this scope</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98131">Bug 98131</a> - Compiler should reject lowp/mediump qualifiers on atomic_uints</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98133">Bug 98133</a> - GetSynciv should raise an error if bufSize &lt; 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98135">Bug 98135</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98167">Bug 98167</a> - [vulkan, radv] missing libgcrypt and openssl devel results in linker error in libvulkan_common</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98172">Bug 98172</a> - Concurrent call to glClientWaitSync results in segfault in one of the waiters.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98244">Bug 98244</a> - dEQP: textureOffset(sampler2DArrayShadow, ...) should not exist.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98264">Bug 98264</a> - Build broken for i965 due to multiple deifnitions of intelFenceExtension</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98307">Bug 98307</a> - &quot;st/glsl_to_tgsi: explicitly track all input and output declaration&quot; broke flightgear colors on rs780</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98431">Bug 98431</a> - UnrealEngine v4 demos startup fails to blorp blit assert</li>
</ul>
<h2>Changes</h2>
TBD.
Mesa no longer depends on libudev.
</div>
</body>

188
docs/relnotes/13.0.1.html Normal file
View File

@@ -0,0 +1,188 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.1 Release Notes / November 14, 2016</h1>
<p>
Mesa 13.0.1 is a bug fix release which fixes bugs found since the 13.0.0 release.
</p>
<p>
Mesa 13.0.1 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7cbb91dead05cde279ee95f86e8321c8e1c8fc9deb88f12e0f587672a10d88c5 mesa-13.0.1.tar.gz
71962fb2bf77d33b0ad4a565b490dbbeaf4619099c6d9722f04a73187957a731 mesa-13.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97715">Bug 97715</a> - [ILK,G45,G965] piglit.spec.arb_separate_shader_objects.misc api error checks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98012">Bug 98012</a> - [IVB] Segfault when running Dolphin twice with Vulkan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98512">Bug 98512</a> - radeon r600 vdpau: Invalid command stream: texture bo too small</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (2):</p>
<ul>
<li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>
<li>glx/glvnd: Fix dispatch function names and indices</li>
</ul>
<p>Andreas Boll (1):</p>
<ul>
<li>glx/windows: Add wgl.h to the sources list</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>
</ul>
<p>Chih-Wei Huang (1):</p>
<ul>
<li>android: avoid using libdrm with host modules</li>
</ul>
<p>Darren Salt (1):</p>
<ul>
<li>radv/pipeline: Don't dereference NULL dynamic state pointers</li>
</ul>
<p>Dave Airlie (8):</p>
<ul>
<li>radv: expose xlib platform extension</li>
<li>radv: fix dual source blending</li>
<li>Revert "st/vdpau: use linear layout for output surfaces"</li>
<li>radv: emit correct last export when Z/stencil export is enabled</li>
<li>ac/nir: add support for discard_if intrinsic (v2)</li>
<li>nir: add conditional discard optimisation (v4)</li>
<li>radv: enable conditional discard optimisation on radv.</li>
<li>radv: fix GetFenceStatus for signaled fences</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.0</li>
<li>amd/addrlib: limit fastcall/regparm to GCC i386</li>
<li>anv: use correct .specVersion for extensions</li>
<li>radv: use correct .specVersion for extensions</li>
<li>radv: Suffix the radeon_icd file with the host CPU</li>
<li>Update version to 13.0.1</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Use Newton-Raphson on the 1/W write to fix glmark2 terrain.</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>nir: Flip gl_SamplePosition in nir_lower_wpos_ytransform().</li>
</ul>
<p>Fredrik Höglund (1):</p>
<ul>
<li>radv: add support for anisotropic filtering on VI+</li>
</ul>
<p>Jason Ekstrand (21):</p>
<ul>
<li>anv/device: Return DEVICE_LOST if execbuf2 fails</li>
<li>vulkan/wsi/x11: Better handle wsi_x11_connection_create failure</li>
<li>vulkan/wsi/x11: Clean up connections in finish_wsi</li>
<li>anv: Better handle return codes from anv_physical_device_init</li>
<li>intel/blorp: Use wm_prog_data instead of hand-rolling our own</li>
<li>intel/blorp: Pass a brw_stage_prog_data to upload_shader</li>
<li>anv/pipeline: Put actual pointers in anv_shader_bin</li>
<li>anv/pipeline: Properly cache prog_data::param</li>
<li>intel/blorp: Emit all the binding tables</li>
<li>anv/device: Add an execbuf wrapper</li>
<li>anv: Add a cmd_buffer_execbuf helper</li>
<li>anv: Don't presume to know what address is in a surface relocation</li>
<li>anv: Add a new bo_pool_init helper</li>
<li>anv/allocator: Simplify anv_scratch_pool</li>
<li>anv: Initialize anv_bo::offset to -1</li>
<li>anv/batch_chain: Improve write_reloc</li>
<li>anv: Add an anv_execbuf helper struct</li>
<li>anv/batch: Move last_ss_pool_bo_offset to the command buffer</li>
<li>anv: Move relocation handling from EndCommandBuffer to QueueSubmit</li>
<li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>
<li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>glsl: Update deref types when resizing implicitly sized arrays.</li>
<li>mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>
</ul>
<p>Kristian Høgsberg (1):</p>
<ul>
<li>anv: Do relocations in userspace before execbuf ioctl</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>egl: use util/macros.h</li>
<li>egl: make interop ABI visible again</li>
<li>glx: make interop ABI visible again</li>
<li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>radeonsi: fix BFE/BFI lowering for GLSL semantics</li>
<li>glsl: fix lowering of UBO references of named blocks</li>
<li>st/glsl_to_tgsi: fix dvec[34] loads from SSBO</li>
<li>st/mesa: fix the layer of VDPAU surface samplers</li>
</ul>
<p>Steven Toth (3):</p>
<ul>
<li>gallium/hud: fix a problem where objects are free'd while in use.</li>
<li>gallium/hud: close a previously opened handle</li>
<li>gallium/hud: protect against and initialization race</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa/glsl: delete previously linked shaders earlier when linking</li>
</ul>
</div>
</body>
</html>

189
docs/relnotes/13.0.2.html Normal file
View File

@@ -0,0 +1,189 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.2 Release Notes / November 28, 2016</h1>
<p>
Mesa 13.0.2 is a bug fix release which fixes bugs found since the 13.0.1 release.
</p>
<p>
Mesa 13.0.2 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
6014233a5db6032ab8de4881384871bbe029de684502707794ce7b3e6beec308 mesa-13.0.2.tar.gz
a6ed622645f4ed61da418bf65adde5bcc4bb79023c36ba7d6b45b389da4416d5 mesa-13.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97321">Bug 97321</a> - Query INFO_LOG_LENGTH for empty info log should return 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97420">Bug 97420</a> - &quot;#version 0&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98632">Bug 98632</a> - Fix build on Hurd without PATH_MAX</li>
</ul>
<h2>Changes</h2>
<p>Ben Widawsky (3):</p>
<ul>
<li>i965: Add some APL and KBL SKU strings</li>
<li>i965: Reorder PCI ID list to match release order</li>
<li>i965/glk: Add basic Geminilake support</li>
</ul>
<p>Dave Airlie (14):</p>
<ul>
<li>radv: fix texturesamples to handle single sample case</li>
<li>wsi: fix VK_INCOMPLETE for vkGetSwapchainImagesKHR</li>
<li>radv: don't crash on null swapchain destroy.</li>
<li>ac/nir/llvm: fix channel in texture gather lowering code.</li>
<li>radv: make sure to flush input attachments correctly.</li>
<li>radv: fix image view creation for depth and stencil only</li>
<li>radv: spir-v allows texture size query with and without lod.</li>
<li>vulkan/wsi/x11: handle timeouts properly in next image acquire (v1.1)</li>
<li>vulkan/wsi: store present mode in swapchain base class</li>
<li>vulkan/wsi/x11: add support for IMMEDIATE present mode</li>
<li>radv: fix texel fetch offset with 2d arrays.</li>
<li>radv/si: fix optimal micro tile selection</li>
<li>radv/ac/llvm: shadow samplers only return one value.</li>
<li>radv: fix 3D clears with baseMiplevel</li>
</ul>
<p>Eduardo Lima Mitev (2):</p>
<ul>
<li>vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfaceFormatsKHR</li>
<li>vulkan/wsi/x11: Fix behavior of vkGetPhysicalDeviceSurfacePresentModesKHR</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.1</li>
<li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>
<li>anv: fix enumeration of properties</li>
<li>radv: honour the number of properties available</li>
<li>Update version to 13.0.2</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>vc4: Don't abort when a shader compile fails.</li>
<li>vc4: Clamp the shadow comparison value.</li>
<li>vc4: Fix register class handling of DDX/DDY arguments.</li>
</ul>
<p>Gwan-gyeong Mun (2):</p>
<ul>
<li>util/disk_cache: close a previously opened handle in disk_cache_put (v2)</li>
<li>anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTEL</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>anv/format: handle unsupported formats properly</li>
</ul>
<p>Ian Romanick (2):</p>
<ul>
<li>glcpp: Handle '#version 0' and other invalid values</li>
<li>glsl: Parse 0 as a preprocessor INTCONSTANT</li>
</ul>
<p>Jason Ekstrand (15):</p>
<ul>
<li>anv/gen8: Stall when needed in Cmd(Set|Reset)Event</li>
<li>anv/wsi: Set the fence to signaled in AcquireNextImageKHR</li>
<li>anv: Rework fences</li>
<li>vulkan/wsi/wayland: Include pthread.h</li>
<li>vulkan/wsi/wayland: Clean up some error handling paths</li>
<li>vulkan/wsi: Report the correct min/maxImageCount</li>
<li>i965/gs: Allow primitive id to be a system value</li>
<li>anv: Handle null in all destructors</li>
<li>anv/fence: Handle ANV_FENCE_CREATE_SIGNALED_BIT</li>
<li>nir/spirv: Fix handling of gl_PrimitiveId</li>
<li>anv/blorp: Ignore clears for attachments first used as resolve destinations</li>
<li>anv: Implement a depth stall restriction on gen7</li>
<li>anv/cmd_buffer: Handle running out of binding tables in compute shaders</li>
<li>anv/cmd_buffer: Emit a CS stall before setting a CS pipeline</li>
<li>vulkan/wsi/x11: Implement FIFO mode.</li>
</ul>
<p>Jordan Justen (2):</p>
<ul>
<li>isl: Fix height calculation in isl_msaa_interleaved_scale_px_to_sa</li>
<li>i965/hsw: Set integer mode in sampling state for stencil texturing</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>intel: Set min_ds_entries on Broxton.</li>
<li>i965: Fix compute shader crash.</li>
<li>mesa: Drop PATH_MAX usage.</li>
<li>i965: Fix GS push inputs with enhanced layouts.</li>
</ul>
<p>Kevin Strasser (1):</p>
<ul>
<li>vulkan/wsi: Add a thread-safe queue implementation</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>anv: fix multi level clears with VK_REMAINING_MIP_LEVELS</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>gbm: request correct version of the DRI2_FENCE extension</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>radeonsi: store group_size_variable in struct si_compute</li>
<li>glsl/lower_output_reads: fix geometry shader output handling with conditional emit</li>
</ul>
<p>Steinar H. Gunderson (1):</p>
<ul>
<li>Fix races during _mesa_HashWalk().</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: fix empty program log length</li>
</ul>
</div>
</body>
</html>

177
docs/relnotes/13.0.3.html Normal file
View File

@@ -0,0 +1,177 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.3 Release Notes / January 5, 2017</h1>
<p>
Mesa 13.0.3 is a bug fix release which fixes bugs found since the 13.0.2 release.
</p>
<p>
Mesa 13.0.3 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
55b07d056f9b855ba9d7c8b2ddc7d3b220a61c6ab1bdc73cbfc2f607721094c2 mesa-13.0.3.tar.gz
d9aa8be5c176d00d0cd503cb2f64a5a403ea471ec819c022581414860d7ba40e mesa-13.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99038">Bug 99038</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.negative_api.create_pixmap_surface crashes</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (2):</p>
<ul>
<li>i965/mt: Disable aux surfaces after making miptree shareable</li>
<li>egl: Fix crashes in eglCreate*Surface()</li>
</ul>
<p>Dave Airlie (4):</p>
<ul>
<li>anv: set maxFragmentDualSrcAttachments to 1</li>
<li>radv: set maxFragmentDualSrcAttachments to 1</li>
<li>radv: fix another regression since shadow fixes.</li>
<li>radv: add missing license file to radv_meta_bufimage.</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.2</li>
<li>anv: don't double-close the same fd</li>
<li>anv: don't leak memory if anv_init_wsi() fails</li>
<li>radv: don't leak the fd if radv_physical_device_init() succeeds</li>
<li>Update version to 13.0.3</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: In a loop break/continue, jump if everyone has taken the path.</li>
</ul>
<p>Gwan-gyeong Mun (3):</p>
<ul>
<li>anv: Add missing error-checking to anv_block_pool_init (v2)</li>
<li>anv: Update the teardown in reverse order of the anv_CreateDevice</li>
<li>vulkan/wsi: Fix resource leak in success path of wsi_queue_init()</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>compiler/glsl: fix precision problem of tanh</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>mesa: only verify that enabled arrays have backing buffers</li>
</ul>
<p>Jason Ekstrand (8):</p>
<ul>
<li>anv/cmd_buffer: Re-emit MEDIA_CURBE_LOAD when CS push constants are dirty</li>
<li>anv/image: Rename hiz_surface to aux_surface</li>
<li>anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculation</li>
<li>genxml/gen9: Change the default of MI_SEMAPHORE_WAIT::RegisterPoleMode</li>
<li>anv/device: Return the right error for failed maps</li>
<li>anv/device: Implicitly unmap memory objects in FreeMemory</li>
<li>anv/descriptor_set: Write the state offset in the surface state free list.</li>
<li>spirv: Use a simpler and more correct implementaiton of tanh()</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Allocate at least some URB space even when max_vertices = 0.</li>
</ul>
<p>Marek Olšák (17):</p>
<ul>
<li>radeonsi: always set all blend registers</li>
<li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>
<li>radeonsi: disable RB+ blend optimizations for dual source blending</li>
<li>radeonsi: consolidate max-work-group-size computation</li>
<li>radeonsi: apply a multi-wave workgroup SPI bug workaround to affected CIK chips</li>
<li>radeonsi: apply a TC L1 write corruption workaround for SI</li>
<li>radeonsi: apply a tessellation bug workaround for SI</li>
<li>radeonsi: add a tess+GS hang workaround for VI dGPUs</li>
<li>radeonsi: apply the double EVENT_WRITE_EOP workaround to VI as well</li>
<li>cso: don't release sampler states that are bound</li>
<li>radeonsi: always restore sampler states when unbinding sampler views</li>
<li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>
<li>radeonsi: allow specifying simm16 of emit_waitcnt at call sites</li>
<li>radeonsi: wait for outstanding memory instructions in TCS barriers</li>
<li>tgsi: fix the src type of TGSI_OPCODE_MEMBAR</li>
<li>radeonsi: wait for outstanding LDS instructions in memory barriers if needed</li>
<li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>
<li>i965/fs: Add unit tests for copy propagation pass.</li>
<li>i965/fs: Reject copy propagation into SEL if not min/max.</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>mesa/fbobject: Update CubeMapFace when reusing textures</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>radeonsi: fix isolines tess factor writes to control ring</li>
<li>radeonsi: update all GSVS ring descriptors for new buffer allocations</li>
<li>radeonsi: do not kill GS with memory writes</li>
<li>radeonsi: fix an off-by-one error in the bounds check for max_vertices</li>
</ul>
<p>Rhys Kidd (1):</p>
<ul>
<li>glsl: Add pthread libs to cache_test</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>mesa: fix active subroutine uniforms properly</li>
<li>Revert "nir: Turn imov/fmov of undef into undef."</li>
</ul>
</div>
</body>
</html>

255
docs/relnotes/13.0.4.html Normal file
View File

@@ -0,0 +1,255 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.4 Release Notes / February 1, 2017</h1>
<p>
Mesa 13.0.4 is a bug fix release which fixes bugs found since the 13.0.3 release.
</p>
<p>
Mesa 13.0.4 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
a78518030b0b7d77a6c426ac3ff40f4b27fb0e2cdb0dfbe685024a46cae59bad mesa-13.0.4.tar.gz
a95d7ce8f7bd5f88585e4be3144a341236d8c0fc91f6feaec59bb8ba3120e726 mesa-13.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94512">Bug 94512</a> - X segfaults with glx-tls enabled in a x32 environment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94900">Bug 94900</a> - HD6950 GPU lockup loop with various steam games (octodad[always], saints row 4[always], dead island[always], grid autosport[sometimes])</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with &quot;Fatal error: Cannot set display mode.&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98914">Bug 98914</a> - mesa-vdpau-drivers: breaks vdpau for mpeg2video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98975">Bug 98975</a> - Wasteland 2 Directors Cut: Hangs. GPU fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99085">Bug 99085</a> - [EGL] dEQP-EGL.functional.sharing.gles2.multithread intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99097">Bug 99097</a> - [vulkancts] dEQP-VK.image.store regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99100">Bug 99100</a> - [SKL,BDW,BSW,KBL] dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99144">Bug 99144</a> - Incorrect rendering using glDrawArraysInstancedBaseInstance and first != 0 on Skylake</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99154">Bug 99154</a> - Link time error when using multiple builtin functions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99158">Bug 99158</a> - vdpau segfaults and gpu locks with kodi on R9285</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99185">Bug 99185</a> - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99188">Bug 99188</a> - dEQP-EGL.functional.create_context_ext.robust_gl_30.rgb565_no_depth_no_stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99210">Bug 99210</a> - ES3-CTS.functional.texture.mipmap.cube.generate.rgba5551_*</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99450">Bug 99450</a> - [amdgpu] Payday 2 visual glitches on some models</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99451">Bug 99451</a> - polygon offset use after free</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (2):</p>
<ul>
<li>vulkan/wsi: clarify the severity of lack of DRI3 v2</li>
<li>radv: fix include order for installed headers v2</li>
</ul>
<p>Arda Coskunses (2):</p>
<ul>
<li>vulkan/wsi/x11: don't crash on null visual</li>
<li>vulkan/wsi/x11: don't crash on null wsi x11 connection</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Support loader interface version 3.</li>
</ul>
<p>Chad Versace (10):</p>
<ul>
<li>egl: Check config's surface types in eglCreate*Surface()</li>
<li>dri: Add __DRI_IMAGE_FORMAT_ARGB1555</li>
<li>mesa/texformat: Handle GL_RGBA + GL_UNSIGNED_SHORT_5_5_5_1</li>
<li>egl: Emit correct error when robust context creation fails</li>
<li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>
<li>mesa/shaderobj: Fix races on refcounts</li>
<li>meta: Disable dithering during glGenerateMipmap</li>
<li>vulkan: Add new cast macros for VkIcd types</li>
<li>vulkan: Update vk_icd.h to interface version 3</li>
<li>anv: Support loader interface version 3 (patch v2)</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>vl/zscan: fix "Fix trivial sign compare warnings"</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>glx: Add missing glproto dependency for gallium-xlib glx</li>
</ul>
<p>Damien Grassart (1):</p>
<ul>
<li>anv: return count of queue families written</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: flush smem for uniform buffer bit.</li>
</ul>
<p>Emil Velikov (10):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.3</li>
<li>cherry-ignore: add couple of intel_miptree_copy related patches</li>
<li>cherry-ignore: add radv: Call nir_lower_constant_initializers."</li>
<li>get-typod-pick-list.sh: add new script</li>
<li>cherry-ignore: add "_mesa_ClampColor extension/version fix"</li>
<li>cherry-ignore: add wayland race condition fix</li>
<li>egl/wayland: use the destroy_window_callback for swrast</li>
<li>automake: use shared llvm libs for make distcheck</li>
<li>get-pick-list.sh: Require explicit "13.0" for nominating stable patches</li>
<li>Update version to 13.0.4</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>anv: Fix uniform and storage buffer offset alignment limits.</li>
</ul>
<p>Fredrik Höglund (2):</p>
<ul>
<li>radv: fix dual source blending</li>
<li>dri3: Fix MakeCurrent without a default framebuffer</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>mapi: update the asm code to support x32</li>
</ul>
<p>Heiko Przybyl (1):</p>
<ul>
<li>r600/sb: Fix loop optimization related hangs on eg</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: take extra push space into account for pushbuf_space calls</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/generator/tex: Handle an immediate sampler with an indirect texture</li>
<li>anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8</li>
<li>nir/search: Only allow matching SSA values</li>
<li>isl: Mark A4B4G4R4_UNORM as supported on gen8</li>
</ul>
<p>Jonas Ådahl (1):</p>
<ul>
<li>egl/wayland: Cleanup private display connection when init fails</li>
</ul>
<p>Kenneth Graunke (7):</p>
<ul>
<li>i965: Don't bail on vertex element processing if we need draw params.</li>
<li>i965: Fix last slot calculations</li>
<li>i965: Fix texturing in the vec4 TCS and GS backends.</li>
<li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>
<li>i965: Make BLORP disable the NP Z PMA stall fix.</li>
<li>glsl: Use ir_var_temporary when generating inline functions.</li>
<li>i965: Properly flush in hsw_pause_transform_feedback().</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>vdpau: call texture_get_handle while the mutex is being held</li>
<li>va: call texture_get_handle while the mutex is being held</li>
<li>radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+</li>
<li>radeonsi: don't forget to add HTILE to the buffer list for texturing</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>
</ul>
<p>Nanley Chery (3):</p>
<ul>
<li>anv/cmd_buffer: Fix arrayed depth/stencil attachments</li>
<li>anv/cmd_buffer: Fix programmed HiZ qpitch</li>
<li>anv/image: Disable HiZ for depth buffer arrays</li>
</ul>
<p>Nayan Deshmukh (1):</p>
<ul>
<li>st/va: delay calling begin_frame until we have all parameters</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: some fence cleanup</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>gallium/hud: add missing break in hud_cpufreq_graph_install()</li>
</ul>
<p>Timothy Arceri (3):</p>
<ul>
<li>nir: Turn imov/fmov of undef into undef</li>
<li>glsl: fix opt_minmax redundancy checks against baserange</li>
<li>util: fix list_is_singular()</li>
</ul>
<p>Zachary Michaels (1):</p>
<ul>
<li>radeonsi: Always leave poly_offset in a valid state</li>
</ul>
</div>
</body>
</html>

210
docs/relnotes/13.0.5.html Normal file
View File

@@ -0,0 +1,210 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.5 Release Notes / February 20, 2017</h1>
<p>
Mesa 13.0.5 is a bug fix release which fixes bugs found since the 13.0.4 release.
</p>
<p>
Mesa 13.0.5 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7e45e3812078726eabca6d9384364bf035a3c4279024ec9090dd1b19a8989926 mesa-13.0.5.tar.gz
bfcea7e2c801525a60895c8aff11aa68457ee9aa35d01a4638e1f310a3f5ef87 mesa-13.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98421">Bug 98421</a> - src/loader/loader.c:111:40: error: unknown type name drmDevicePtr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98526">Bug 98526</a> - glsl/tests/general-ir-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: const struct API_STATE has no member named linkageCount</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>
</ul>
<h2>Changes</h2>
<p>Bartosz Tomczyk (2):</p>
<ul>
<li>r600: Fix stack overflow</li>
<li>r600/sb: Fix memory leak</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: [rasterizer core] Remove dead code Clipper::ClipScalar()</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv: change base aligmment for allocated memory.</li>
<li>radv: fix cik macroModeIndex.</li>
<li>radv: adopt some init config workarounds from radeonsi.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/dri2: add image_loader_extension back into loader extensions for wayland</li>
</ul>
<p>Emil Velikov (26):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.4</li>
<li>configure.ac: list radeon in --with-vulkan-drivers help string</li>
<li>i965: automake: correctly set MKDIR_GEN</li>
<li>freedreno: automake: correctly set MKDIR_GEN</li>
<li>i965: automake: include builddir prior to srcdir</li>
<li>i915: automake: include builddir prior to srcdir</li>
<li>egl: automake: include builddir prior to srcdir</li>
<li>clover: automake: include builddir prior to srcdir</li>
<li>st/dri: automake: include builddir prior to srcdir</li>
<li>d3dadapter9: automake: include builddir prior to srcdir</li>
<li>glx: automake: include builddir prior to srcdir</li>
<li>glx/apple: automake: include builddir prior to srcdir</li>
<li>glx/windows: automake: include builddir prior to srcdir</li>
<li>loader: automake: include builddir prior to srcdir</li>
<li>mapi: automake: include builddir prior to srcdir</li>
<li>radeon, r200: automake: include builddir prior to srcdir</li>
<li>dri/swrast: automake: include builddir prior to srcdir</li>
<li>dri/osmesa: automake: include builddir prior to srcdir</li>
<li>mesa/tests: automake: include builddir prior to srcdir</li>
<li>bin/get-extra-pick-list: use git merge-base to get the branchpoint</li>
<li>bin/get-extra-pick-list: rework to use already_picked list</li>
<li>bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed</li>
<li>bin/get-pick-list.sh: limit `git grep ...' only as needed</li>
<li>bin/get-pick-list.sh: remove ancient way of nominating patches</li>
<li>bin/get-fixes-pick-list.sh: add new script</li>
<li>Update version to 13.0.5</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Avoid emitting small immediates for UBO indirect load address guards.</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>glx/glvnd: Fix GLXdispatchIndex sorting</li>
</ul>
<p>Ian Romanick (11):</p>
<ul>
<li>linker: Slight code rearrange to prevent duplication in the next commit</li>
<li>linker: Accurately track gl_uniform_block::stageref</li>
<li>glsl: Split process_block_array into two functions</li>
<li>glsl: Fix wonkey indentation left from previous commit</li>
<li>glsl: Track the linearized array index for each UBO instance array element</li>
<li>glsl: Use simpler visitor to determine which UBO and SSBO blocks are used</li>
<li>glsl: Add tracking for elements of an array-of-arrays that have been accessed</li>
<li>glsl: Add structures to track accessed elements of a single array</li>
<li>glsl: Mark a set of array elements as accessed using a list of array_deref_range</li>
<li>glsl: Walk a list of ir_dereference_array to mark array elements as accessed</li>
<li>linker: Accurately mark a uniform block instance array element as used in a stage</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>vbo: process buffer binding state changes on draw when recording</li>
<li>st/mesa: MAX_VARYING is the max supported number of patch varyings, not min</li>
<li>nvc0: disable linked tsc mode in compute launch descriptor</li>
</ul>
<p>Jason Ekstrand (11):</p>
<ul>
<li>nir/search: Use the correct bit size for integer comparisons</li>
<li>i965/blorp: Use the correct ISL format for combined depth/stencil</li>
<li>intel/blorp: Handle clearing of A4B4G4R4 on all platforms</li>
<li>isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell</li>
<li>anv: Flush render cache before STATE_BASE_ADDRESS on gen7</li>
<li>anv: Improve flushing around STATE_BASE_ADDRESS</li>
<li>vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats</li>
<li>vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes</li>
<li>vulkan/wsi: Lower the maximum image sizes</li>
<li>i965/sampler_state: Pass texObj into update_sampler_state</li>
<li>i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: don't require render target isl bit for depth/stencil surfaces</li>
<li>anv: set command buffer to NULL when allocations fail</li>
<li>anv: fix descriptor pool internal size allocation</li>
<li>spirv: handle OpUndef as part of the variable parsing pass</li>
<li>spirv: handle undefined components for OpVectorShuffle</li>
</ul>
<p>Marc-André Lureau (1):</p>
<ul>
<li>tgsi-dump: dump label if instruction has one</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: always set the TCL1_ACTION_ENA when invalidating L2</li>
<li>gallium/radeon: fix performance of buffer readbacks</li>
</ul>
<p>Topi Pohjolainen (2):</p>
<ul>
<li>i965: Make depth clear flushing more explicit</li>
<li>i965/gen6: Issue direct depth stall and flush after depth clear</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>scons: Require libdrm &gt;= 2.4.66 for DRM.</li>
<li>util: Fix Clang trivial destructor check.</li>
</ul>
</div>
</body>
</html>

287
docs/relnotes/13.0.6.html Normal file
View File

@@ -0,0 +1,287 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.6 Release Notes / March 20, 2017</h1>
<p>
Mesa 13.0.6 is a bug fix release which fixes bugs found since the 13.0.5 release.
</p>
<p>
Mesa 13.0.6 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
1076590f29103f022a2cd87e6dff6ae77072013745603d06b0410c373ab2bb1a mesa-13.0.6.tar.gz
29ef104a7fc082d352b1599bd6cb1d040be424ccd22f5e0eb7ee9b0e9acd3597 mesa-13.0.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">Bug 97102</a> - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98869">Bug 98869</a> - Electronic Super Joy graphic artefacts (regression,bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99401">Bug 99401</a> - [g33] regression: piglit.spec.!opengl 1_0.gl-1_0-beginend-coverage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: &quot;Note: Buggy applications may crash, if they do please report to vendor&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - &quot;ralloc: Make sure ralloc() allocations match malloc()'s alignment.&quot; causes seg fault in 32bit build</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (2):</p>
<ul>
<li>radv: Emit pending flushes before executing a secondary command buffer</li>
<li>radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer</li>
</ul>
<p>Bartosz Tomczyk (1):</p>
<ul>
<li>glsl: fix heap-buffer-overflow</li>
</ul>
<p>Bas Nieuwenhuizen (8):</p>
<ul>
<li>radv: Pass CMASK alignment to application.</li>
<li>radv: Pass DCC alignment to application.</li>
<li>radv: Never try to create more than max_sets descriptor sets.</li>
<li>radv: Reset emitted compute pipeline when calling secondary cmd buffer.</li>
<li>radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.</li>
<li>radv: Use correct size for availability flag.</li>
<li>radv: Disable HTILE for textures with multiple layers/levels.</li>
<li>radv: Emit cache flushes before CP DMA.</li>
</ul>
<p>Ben Crocker (3):</p>
<ul>
<li>gallivm: Improve debug output (V2)</li>
<li>gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)</li>
<li>gallivm: Reenable PPC VSX (v3)</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>egl/dri3: implement query surface hook</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: Prune empty nodes in CalculateProcessorTopology.</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>anv: fix Get*MemoryRequirements for !LLC</li>
</ul>
<p>Dave Airlie (13):</p>
<ul>
<li>radv: program a default point size.</li>
<li>radv: handle transfer_write as a dst flag.</li>
<li>radv/ac: handle nir irem opcode.</li>
<li>radv/ac: implement txs for buffer textures.</li>
<li>radv/ac: correctly size shared memory usage.</li>
<li>radv/ac: avoid the fmask path when doing txs.</li>
<li>radv: pass FMASK alignment to application</li>
<li>tgsi: fix memory leak in tgsi sanity check</li>
<li>radv: fix depth format in blit2d.</li>
<li>radv: fix txs for sampler buffers</li>
<li>radv: drop Z24 support.</li>
<li>radv: disable mip point pre clamping.</li>
<li>radv: setup llvm target data layout</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.5</li>
<li>Revert "get-pick-list.sh: Require explicit "13.0" for nominating stable patches"</li>
<li>cherry-ignore: don't pick nir_op_pack_double optimisation fix</li>
<li>i965: move brw_define.h ifndef guard to the top</li>
<li>cherry-ignore: add ANV fast clears related fixes</li>
<li>Update version to 13.0.6</li>
</ul>
<p>Fredrik Höglund (2):</p>
<ul>
<li>radv: fix the dynamic buffer index in vkCmdBindDescriptorSets</li>
<li>radv/ac: fix multiple descriptor sets with dynamic buffers</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr: Align query results allocation</li>
</ul>
<p>Grazvydas Ignotas (3):</p>
<ul>
<li>r300g: only allow byteswapped formats on big endian</li>
<li>gallium/u_queue: fix a crash with atexit handlers</li>
<li>gallium/u_queue: set num_threads correctly if not all threads start</li>
</ul>
<p>Gregory Hainaut (1):</p>
<ul>
<li>glapi: fix typo in count_scale</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Don't advertise GL_OES_read_format in core profile</li>
</ul>
<p>Ilia Mirkin (8):</p>
<ul>
<li>nvc0: increase number of ubo binding points</li>
<li>nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute</li>
<li>nvc0/ir: fix ubo max clamp, reset file index</li>
<li>gm107/ir: fix address offset bitfield for ATOMS</li>
<li>nvc0: set the render condition in the compute object</li>
<li>st/mesa: don't pass compare mode for stencil-sampled textures</li>
<li>nvc0: take extra pushbuf space into account for pushbuf_space calls</li>
<li>nvc0: increase alignment to 256 for texture buffers on fermi</li>
</ul>
<p>Jacob Lifshay (1):</p>
<ul>
<li>vulkan/wsi: Improve the DRI3 error message</li>
</ul>
<p>Jason Ekstrand (11):</p>
<ul>
<li>i965: Use a better guardband calculation.</li>
<li>intel/blorp: Swizzle clear colors on the CPU</li>
<li>i965/fs: Remove the inline pack_double_2x32 optimization</li>
<li>anv: Add an invalidate_range helper</li>
<li>anv/query: clflush the bo map on non-LLC platforms</li>
<li>genxml: Make MI_STORE_DATA_IMM more consistent</li>
<li>anv/query: Perform CmdResetQueryPool on the GPU</li>
<li>blorp/exec: Use uint32_t for copying varying data</li>
<li>intel/blorp: Explicitly flush all allocated state</li>
<li>anv: Accurately advertise dynamic descriptor limits</li>
<li>anv: Properly handle destroying NULL devices and instances</li>
</ul>
<p>Jonas Pfeil (1):</p>
<ul>
<li>ralloc: Make sure ralloc() allocations match malloc()'s alignment.</li>
</ul>
<p>Jose Maria Casanova Crespo (1):</p>
<ul>
<li>glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1</li>
</ul>
<p>Kenneth Graunke (7):</p>
<ul>
<li>i965: Fix fast depth clears for surfaces with a dimension of 16384.</li>
<li>i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.</li>
<li>i965: Fix check for negative pitch in can_do_fast_copy_blit().</li>
<li>i965: Support the force_glsl_version driconf option.</li>
<li>i965: Combine the Gen6 SF and Clip viewport atoms.</li>
<li>mesa: Do (TCS &amp;&amp; !TES) draw time validation in ES as well.</li>
<li>egl: Ensure ResetNotificationStrategy matches for shared contexts.</li>
</ul>
<p>Lionel Landwerlin (3):</p>
<ul>
<li>spirv: don't assert with location decorations on non i/o variables</li>
<li>anv: wsi: report presentation error per image request</li>
<li>i965/fs: fix uninitialized memory access</li>
</ul>
<p>Marc Di Luzio (1):</p>
<ul>
<li>glsl: correct compute shader checks for memoryBarrier functions</li>
</ul>
<p>Marek Olšák (10):</p>
<ul>
<li>st/mesa: destroy pipe_context before destroying st_context (v2)</li>
<li>radeonsi: don't invoke DCC decompression in update_all_texture_descriptors</li>
<li>radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)</li>
<li>gallium/util: remove unused u_index_modify helpers</li>
<li>gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally</li>
<li>gallium/u_queue: fix random crashes when the app calls exit()</li>
<li>st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops</li>
<li>st/mesa: set blend state for PBO readbacks</li>
<li>radeonsi: fix broken tessellation on Carrizo and Stoney</li>
<li>radeonsi: mark all bound shader buffer ranges as initialized</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>clover: Work around build failure with AltiVec.</li>
</ul>
<p>Nicolai Hähnle (12):</p>
<ul>
<li>mesa/main: fix meta caller of _mesa_ClampColor</li>
<li>radeonsi: fix texture gather on stencil textures</li>
<li>glsl: split DIV_TO_MUL_RCP into single- and double-precision flags</li>
<li>glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion</li>
<li>glx/dri3: guard in_current_context against a disappeared drawable</li>
<li>glx: guard swap-interval functions against destroyed drawables</li>
<li>dri/common: clear the loaderPrivate pointer in driDestroyDrawable</li>
<li>winsys/amdgpu: reduce max_alloc_size based on GTT limits</li>
<li>radeonsi: handle MultiDrawIndirect in si_get_draw_start_count</li>
<li>radeonsi: fix UINT/SINT clamping for 10-bit formats on &lt;= CIK</li>
<li>st/glsl_to_tgsi: avoid iterating past the head of the instruction list</li>
<li>st/mesa: inform the driver of framebuffer changes before compute dispatches</li>
</ul>
<p>Samuel Iglesias Gonsálvez (6):</p>
<ul>
<li>glsl: fix heap-use-after-free in ast_declarator_list::hir()</li>
<li>i965/fs: mark last DF uniform array element as 64 bit live one</li>
<li>i965/fs: detect different bit size accesses to uniforms to push them in proper locations</li>
<li>i965/fs: fix indirect load DF uniforms on BSW/BXT</li>
<li>i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles</li>
<li>i965/fs: emit MOV_INDIRECT with the source with the right register type</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()</li>
</ul>
</div>
</body>
</html>

View File

@@ -1121,6 +1121,7 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FORMAT_XRGB2101010 0x1009
#define __DRI_IMAGE_FORMAT_ARGB2101010 0x100a
#define __DRI_IMAGE_FORMAT_SARGB8 0x100b
#define __DRI_IMAGE_FORMAT_ARGB1555 0x100c
#define __DRI_IMAGE_USE_SHARE 0x0001
#define __DRI_IMAGE_USE_SCANOUT 0x0002
@@ -1148,6 +1149,7 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_R8 0x20203852
#define __DRI_IMAGE_FOURCC_GR88 0x38385247
#define __DRI_IMAGE_FOURCC_ARGB1555 0x35315241
#define __DRI_IMAGE_FOURCC_RGB565 0x36314752
#define __DRI_IMAGE_FOURCC_ARGB8888 0x34325241
#define __DRI_IMAGE_FOURCC_XRGB8888 0x34325258

View File

@@ -109,6 +109,10 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")
CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")
CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
@@ -134,6 +138,11 @@ CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")
CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")
CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")
CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics 505 (Broxton)")
CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics 500 (Broxton 2x6)")
CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
@@ -144,22 +153,15 @@ CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5")
CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kabylake GT2)")
CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")
CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kabylake GT2)")
CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")
CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3")
CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
CHIPSET(0x3184, glk, "Intel(R) HD Graphics (Geminilake)")
CHIPSET(0x3185, glk_2x6, "Intel(R) HD Graphics (Geminilake 2x6)")

View File

@@ -1,28 +1,56 @@
//
// File: vk_icd.h
//
/*
* Copyright (c) 2015-2016 The Khronos Group Inc.
* Copyright (c) 2015-2016 Valve Corporation
* Copyright (c) 2015-2016 LunarG, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
#ifndef VKICD_H
#define VKICD_H
#include "vk_platform.h"
#include "vulkan.h"
/*
* Loader-ICD version negotiation API
*/
#define CURRENT_LOADER_ICD_INTERFACE_VERSION 3
#define MIN_SUPPORTED_LOADER_ICD_INTERFACE_VERSION 0
typedef VkResult (VKAPI_PTR *PFN_vkNegotiateLoaderICDInterfaceVersion)(uint32_t *pVersion);
/*
* The ICD must reserve space for a pointer for the loader's dispatch
* table, at the start of <each object>.
* The ICD must initialize this variable using the SET_LOADER_MAGIC_VALUE macro.
*/
#define ICD_LOADER_MAGIC 0x01CDC0DE
#define ICD_LOADER_MAGIC 0x01CDC0DE
typedef union _VK_LOADER_DATA {
uintptr_t loaderMagic;
void *loaderData;
typedef union {
uintptr_t loaderMagic;
void *loaderData;
} VK_LOADER_DATA;
static inline void set_loader_magic_value(void* pNewObject) {
VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;
static inline void set_loader_magic_value(void *pNewObject) {
VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *)pNewObject;
loader_info->loaderMagic = ICD_LOADER_MAGIC;
}
static inline bool valid_loader_magic_value(void* pNewObject) {
const VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;
static inline bool valid_loader_magic_value(void *pNewObject) {
const VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *)pNewObject;
return (loader_info->loaderMagic & 0xffffffff) == ICD_LOADER_MAGIC;
}
@@ -30,56 +58,74 @@ static inline bool valid_loader_magic_value(void* pNewObject) {
* Windows and Linux ICDs will treat VkSurfaceKHR as a pointer to a struct that
* contains the platform-specific connection and surface information.
*/
typedef enum _VkIcdWsiPlatform {
typedef enum {
VK_ICD_WSI_PLATFORM_MIR,
VK_ICD_WSI_PLATFORM_WAYLAND,
VK_ICD_WSI_PLATFORM_WIN32,
VK_ICD_WSI_PLATFORM_XCB,
VK_ICD_WSI_PLATFORM_XLIB,
VK_ICD_WSI_PLATFORM_DISPLAY
} VkIcdWsiPlatform;
typedef struct _VkIcdSurfaceBase {
VkIcdWsiPlatform platform;
typedef struct {
VkIcdWsiPlatform platform;
} VkIcdSurfaceBase;
#ifdef VK_USE_PLATFORM_MIR_KHR
typedef struct _VkIcdSurfaceMir {
VkIcdSurfaceBase base;
MirConnection* connection;
MirSurface* mirSurface;
typedef struct {
VkIcdSurfaceBase base;
MirConnection *connection;
MirSurface *mirSurface;
} VkIcdSurfaceMir;
#endif // VK_USE_PLATFORM_MIR_KHR
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
typedef struct _VkIcdSurfaceWayland {
VkIcdSurfaceBase base;
struct wl_display* display;
struct wl_surface* surface;
typedef struct {
VkIcdSurfaceBase base;
struct wl_display *display;
struct wl_surface *surface;
} VkIcdSurfaceWayland;
#endif // VK_USE_PLATFORM_WAYLAND_KHR
#ifdef VK_USE_PLATFORM_WIN32_KHR
typedef struct _VkIcdSurfaceWin32 {
VkIcdSurfaceBase base;
HINSTANCE hinstance;
HWND hwnd;
typedef struct {
VkIcdSurfaceBase base;
HINSTANCE hinstance;
HWND hwnd;
} VkIcdSurfaceWin32;
#endif // VK_USE_PLATFORM_WIN32_KHR
#ifdef VK_USE_PLATFORM_XCB_KHR
typedef struct _VkIcdSurfaceXcb {
VkIcdSurfaceBase base;
xcb_connection_t* connection;
xcb_window_t window;
typedef struct {
VkIcdSurfaceBase base;
xcb_connection_t *connection;
xcb_window_t window;
} VkIcdSurfaceXcb;
#endif // VK_USE_PLATFORM_XCB_KHR
#ifdef VK_USE_PLATFORM_XLIB_KHR
typedef struct _VkIcdSurfaceXlib {
VkIcdSurfaceBase base;
Display* dpy;
Window window;
typedef struct {
VkIcdSurfaceBase base;
Display *dpy;
Window window;
} VkIcdSurfaceXlib;
#endif // VK_USE_PLATFORM_XLIB_KHR
#ifdef VK_USE_PLATFORM_ANDROID_KHR
typedef struct {
ANativeWindow* window;
} VkIcdSurfaceAndroid;
#endif //VK_USE_PLATFORM_ANDROID_KHR
typedef struct {
VkIcdSurfaceBase base;
VkDisplayModeKHR displayMode;
uint32_t planeIndex;
uint32_t planeStackIndex;
VkSurfaceTransformFlagBitsKHR transform;
float globalAlpha;
VkDisplayPlaneAlphaFlagBitsKHR alphaMode;
VkExtent2D imageExtent;
} VkIcdSurfaceDisplay;
#endif // VKICD_H

View File

@@ -651,7 +651,7 @@ def generate(env):
env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])
env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])
env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.38'])
env.PkgCheckModules('DRM', ['libdrm >= 2.4.66'])
if env['x11']:
env.Append(CPPPATH = env['X11_CPPPATH'])

View File

@@ -88,7 +88,11 @@ typedef int INT;
#ifndef ADDR_FASTCALL
#if defined(__GNUC__)
#define ADDR_FASTCALL __attribute__((regparm(0)))
#if defined(__i386__)
#define ADDR_FASTCALL __attribute__((regparm(0)))
#else
#define ADDR_FASTCALL
#endif
#else
#define ADDR_FASTCALL __fastcall
#endif

View File

@@ -1301,6 +1301,9 @@ static void visit_alu(struct nir_to_llvm_context *ctx, nir_alu_instr *instr)
src[1] = to_float(ctx, src[1]);
result = LLVMBuildFRem(ctx->builder, src[0], src[1], "");
break;
case nir_op_irem:
result = LLVMBuildSRem(ctx->builder, src[0], src[1], "");
break;
case nir_op_idiv:
result = LLVMBuildSDiv(ctx->builder, src[0], src[1], "");
break;
@@ -1683,7 +1686,7 @@ static LLVMValueRef radv_lower_gather4_integer(struct nir_to_llvm_context *ctx,
for (c = 0; c < 2; c++) {
half_texel[c] = LLVMBuildExtractElement(ctx->builder, size,
ctx->i32zero, "");
LLVMConstInt(ctx->i32, c, false), "");
half_texel[c] = LLVMBuildUIToFP(ctx->builder, half_texel[c], ctx->f32, "");
half_texel[c] = emit_fdiv(ctx, ctx->f32one, half_texel[c]);
half_texel[c] = LLVMBuildFMul(ctx->builder, half_texel[c],
@@ -1782,15 +1785,17 @@ static LLVMValueRef visit_vulkan_resource_index(struct nir_to_llvm_context *ctx,
unsigned desc_set = nir_intrinsic_desc_set(instr);
unsigned binding = nir_intrinsic_binding(instr);
LLVMValueRef desc_ptr = ctx->descriptor_sets[desc_set];
struct radv_descriptor_set_layout *layout = ctx->options->layout->set[desc_set].layout;
struct radv_pipeline_layout *pipeline_layout = ctx->options->layout;
struct radv_descriptor_set_layout *layout = pipeline_layout->set[desc_set].layout;
unsigned base_offset = layout->binding[binding].offset;
LLVMValueRef offset, stride;
if (layout->binding[binding].type == VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC ||
layout->binding[binding].type == VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC) {
unsigned idx = pipeline_layout->set[desc_set].dynamic_offset_start +
layout->binding[binding].dynamic_offset_offset;
desc_ptr = ctx->push_constants;
base_offset = ctx->options->layout->push_constant_size;
base_offset += 16 * layout->binding[binding].dynamic_offset_offset;
base_offset = pipeline_layout->push_constant_size + 16 * idx;
stride = LLVMConstInt(ctx->i32, 16, false);
} else
stride = LLVMConstInt(ctx->i32, layout->binding[binding].size, false);
@@ -2609,6 +2614,24 @@ static void emit_barrier(struct nir_to_llvm_context *ctx)
ctx->voidt, NULL, 0, 0);
}
static void emit_discard_if(struct nir_to_llvm_context *ctx,
nir_intrinsic_instr *instr)
{
LLVMValueRef cond;
ctx->shader_info->fs.can_discard = true;
cond = LLVMBuildICmp(ctx->builder, LLVMIntNE,
get_src(ctx, instr->src[0]),
ctx->i32zero, "");
cond = LLVMBuildSelect(ctx->builder, cond,
LLVMConstReal(ctx->f32, -1.0f),
ctx->f32zero, "");
emit_llvm_intrinsic(ctx, "llvm.AMDGPU.kill",
LLVMVoidTypeInContext(ctx->context),
&cond, 1, 0);
}
static LLVMValueRef
visit_load_local_invocation_index(struct nir_to_llvm_context *ctx)
{
@@ -2921,6 +2944,9 @@ static void visit_intrinsic(struct nir_to_llvm_context *ctx,
LLVMVoidTypeInContext(ctx->context),
NULL, 0, 0);
break;
case nir_intrinsic_discard_if:
emit_discard_if(ctx, instr);
break;
case nir_intrinsic_memory_barrier:
emit_waitcnt(ctx);
break;
@@ -3277,18 +3303,31 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
}
}
if (instr->op == nir_texop_txs && instr->sampler_dim == GLSL_SAMPLER_DIM_BUF) {
result = get_buffer_size(ctx, res_ptr, true);
goto write_result;
}
if (instr->op == nir_texop_texture_samples) {
LLVMValueRef res, samples;
LLVMValueRef res, samples, is_msaa;
res = LLVMBuildBitCast(ctx->builder, res_ptr, ctx->v8i32, "");
samples = LLVMBuildExtractElement(ctx->builder, res,
LLVMConstInt(ctx->i32, 3, false), "");
is_msaa = LLVMBuildLShr(ctx->builder, samples,
LLVMConstInt(ctx->i32, 28, false), "");
is_msaa = LLVMBuildAnd(ctx->builder, is_msaa,
LLVMConstInt(ctx->i32, 0xe, false), "");
is_msaa = LLVMBuildICmp(ctx->builder, LLVMIntEQ, is_msaa,
LLVMConstInt(ctx->i32, 0xe, false), "");
samples = LLVMBuildLShr(ctx->builder, samples,
LLVMConstInt(ctx->i32, 16, false), "");
samples = LLVMBuildAnd(ctx->builder, samples,
LLVMConstInt(ctx->i32, 0xf, false), "");
samples = LLVMBuildShl(ctx->builder, ctx->i32one,
samples, "");
samples = LLVMBuildSelect(ctx->builder, is_msaa, samples,
ctx->i32one, "");
result = samples;
goto write_result;
}
@@ -3387,7 +3426,10 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
address[count++] = sample_index;
} else if(instr->op == nir_texop_txs) {
count = 0;
address[count++] = lod;
if (lod)
address[count++] = lod;
else
address[count++] = ctx->i32zero;
}
for (chan = 0; chan < count; chan++) {
@@ -3412,7 +3454,7 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
result = build_tex_intrinsic(ctx, instr, &txf_info);
result = LLVMBuildExtractElement(ctx->builder, result, ctx->i32zero, "");
result = LLVMBuildICmp(ctx->builder, LLVMIntEQ, result, ctx->i32zero, "");
result = emit_int_cmp(ctx, LLVMIntEQ, result, ctx->i32zero);
goto write_result;
}
@@ -3430,7 +3472,8 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
* The sample index should be adjusted as follows:
* sample_index = (fmask >> (sample_index * 4)) & 0xF;
*/
if (instr->sampler_dim == GLSL_SAMPLER_DIM_MS) {
if (instr->sampler_dim == GLSL_SAMPLER_DIM_MS &&
instr->op != nir_texop_txs) {
LLVMValueRef txf_address[4];
struct ac_tex_info txf_info = { 0 };
unsigned txf_count = count;
@@ -3485,12 +3528,13 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
if (offsets && instr->op == nir_texop_txf) {
nir_const_value *const_offset =
nir_src_as_const_value(instr->src[const_src].src);
int num_offsets = instr->src[const_src].src.ssa->num_components;
assert(const_offset);
if (instr->coord_components > 2)
num_offsets = MIN2(num_offsets, instr->coord_components);
if (num_offsets > 2)
address[2] = LLVMBuildAdd(ctx->builder,
address[2], LLVMConstInt(ctx->i32, const_offset->i32[2], false), "");
if (instr->coord_components > 1)
if (num_offsets > 1)
address[1] = LLVMBuildAdd(ctx->builder,
address[1], LLVMConstInt(ctx->i32, const_offset->i32[1], false), "");
address[0] = LLVMBuildAdd(ctx->builder,
@@ -3512,6 +3556,8 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
if (instr->op == nir_texop_query_levels)
result = LLVMBuildExtractElement(ctx->builder, result, LLVMConstInt(ctx->i32, 3, false), "");
else if (instr->is_shadow && instr->op != nir_texop_txs && instr->op != nir_texop_lod && instr->op != nir_texop_tg4)
result = LLVMBuildExtractElement(ctx->builder, result, ctx->i32zero, "");
else if (instr->op == nir_texop_txs &&
instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE &&
instr->is_array) {
@@ -3520,7 +3566,8 @@ static void visit_tex(struct nir_to_llvm_context *ctx, nir_tex_instr *instr)
LLVMValueRef z = LLVMBuildExtractElement(ctx->builder, result, two, "");
z = LLVMBuildSDiv(ctx->builder, z, six, "");
result = LLVMBuildInsertElement(ctx->builder, result, z, two, "");
}
} else if (instr->dest.ssa.num_components != 4)
result = trim_vector(ctx, result, instr->dest.ssa.num_components);
write_result:
if (result) {
@@ -3910,7 +3957,7 @@ static void
handle_shader_output_decl(struct nir_to_llvm_context *ctx,
struct nir_variable *variable)
{
int idx = variable->data.location;
int idx = variable->data.location + variable->data.index;
unsigned attrib_count = glsl_count_attribute_slots(variable->type, false);
variable->data.driver_location = idx * 4;
@@ -3940,7 +3987,7 @@ handle_shader_output_decl(struct nir_to_llvm_context *ctx,
si_build_alloca_undef(ctx, ctx->f32, "");
}
}
ctx->output_mask |= ((1ull << attrib_count) - 1) << variable->data.location;
ctx->output_mask |= ((1ull << attrib_count) - 1) << idx;
}
static void
@@ -4351,12 +4398,10 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx,
for (unsigned i = 0; i < RADEON_LLVM_MAX_OUTPUTS; ++i) {
LLVMValueRef values[4];
bool last;
if (!(ctx->output_mask & (1ull << i)))
continue;
last = ctx->output_mask <= ((1ull << (i + 1)) - 1);
if (i == FRAG_RESULT_DEPTH) {
ctx->shader_info->fs.writes_z = true;
depth = to_float(ctx, LLVMBuildLoad(ctx->builder,
@@ -4366,10 +4411,14 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx,
stencil = to_float(ctx, LLVMBuildLoad(ctx->builder,
ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
} else {
bool last = false;
for (unsigned j = 0; j < 4; j++)
values[j] = to_float(ctx, LLVMBuildLoad(ctx->builder,
ctx->outputs[radeon_llvm_reg_index_soa(i, j)], ""));
if (!ctx->shader_info->fs.writes_z && !ctx->shader_info->fs.writes_stencil)
last = ctx->output_mask <= ((1ull << (i + 1)) - 1);
si_export_mrt_color(ctx, values, V_008DFC_SQ_EXP_MRT + index, last);
index++;
}
@@ -4450,6 +4499,13 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
memset(shader_info, 0, sizeof(*shader_info));
LLVMSetTarget(ctx.module, "amdgcn--");
LLVMTargetDataRef data_layout = LLVMCreateTargetDataLayout(tm);
char *data_layout_str = LLVMCopyStringRepOfTargetData(data_layout);
LLVMSetDataLayout(ctx.module, data_layout_str);
LLVMDisposeTargetData(data_layout);
LLVMDisposeMessage(data_layout_str);
setup_types(&ctx);
ctx.builder = LLVMCreateBuilderInContext(ctx.context);
@@ -4471,7 +4527,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
idx++;
}
shared_size *= 4;
shared_size *= 16;
var = LLVMAddGlobalInAddressSpace(ctx.module,
LLVMArrayType(ctx.i8, shared_size),
"compute_lds",

View File

@@ -4,3 +4,4 @@
/radv_timestamp.h
/dev_icd.json
/vk_format_table.c
/radeon_icd.*.json

View File

@@ -32,9 +32,6 @@ lib_LTLIBRARIES = libvulkan_radeon.la
# The gallium includes are for the util/u_math.h include from main/macros.h
AM_CPPFLAGS = \
$(AMDGPU_CFLAGS) \
$(VALGRIND_CFLAGS) \
$(DEFINES) \
-I$(top_srcdir)/include \
-I$(top_builddir)/src \
-I$(top_srcdir)/src \
@@ -48,7 +45,10 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gallium/include
-I$(top_srcdir)/src/gallium/include \
$(AMDGPU_CFLAGS) \
$(VALGRIND_CFLAGS) \
$(DEFINES)
AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
@@ -131,11 +131,11 @@ vk_format_table.c: vk_format_table.py \
$(PYTHON2) $(srcdir)/vk_format_table.py $(srcdir)/vk_format_layout.csv > $@
BUILT_SOURCES = $(VULKAN_GENERATED_FILES)
CLEANFILES = $(BUILT_SOURCES) dev_icd.json radv_timestamp.h
CLEANFILES = $(BUILT_SOURCES) dev_icd.json radeon_icd.@host_cpu@.json
EXTRA_DIST = \
$(top_srcdir)/include/vulkan/vk_icd.h \
dev_icd.json.in \
radeon_icd.json \
radeon_icd.json.in \
radv_entrypoints_gen.py \
vk_format_layout.csv \
vk_format_parse.py \
@@ -155,7 +155,7 @@ libvulkan_radeon_la_LDFLAGS = \
icdconfdir = @VULKAN_ICD_INSTALL_DIR@
icdconf_DATA = radeon_icd.json
icdconf_DATA = radeon_icd.@host_cpu@.json
# The following is used for development purposes, by setting VK_ICD_FILENAMES.
noinst_DATA = dev_icd.json
@@ -164,4 +164,9 @@ dev_icd.json : dev_icd.json.in
-e "s#@build_libdir@#${abs_top_builddir}/${LIB_DIR}#" \
< $(srcdir)/dev_icd.json.in > $@
radeon_icd.@host_cpu@.json : radeon_icd.json.in
$(AM_V_GEN) $(SED) \
-e "s#@install_libdir@#${libdir}#" \
< $(srcdir)/radeon_icd.json.in > $@
include $(top_srcdir)/install-lib-links.mk

View File

@@ -2,6 +2,6 @@
"file_format_version": "1.0.0",
"ICD": {
"library_path": "@build_libdir@/libvulkan_radeon.so",
"abi_versions": "1.0.3"
"api_version": "1.0.3"
}
}

View File

@@ -1,7 +0,0 @@
{
"file_format_version": "1.0.0",
"ICD": {
"library_path": "libvulkan_radeon.so",
"abi_versions": "1.0.3"
}
}

View File

@@ -0,0 +1,7 @@
{
"file_format_version": "1.0.0",
"ICD": {
"library_path": "@install_libdir@/libvulkan_radeon.so",
"api_version": "1.0.3"
}
}

View File

@@ -345,7 +345,8 @@ radv_emit_graphics_raster_state(struct radv_cmd_buffer *cmd_buffer,
raster->spi_interp_control);
radeon_set_context_reg_seq(cmd_buffer->cs, R_028A00_PA_SU_POINT_SIZE, 2);
radeon_emit(cmd_buffer->cs, 0);
unsigned tmp = (unsigned)(1.0 * 8.0);
radeon_emit(cmd_buffer->cs, S_028A00_HEIGHT(tmp) | S_028A00_WIDTH(tmp));
radeon_emit(cmd_buffer->cs, S_028A04_MIN_SIZE(radv_pack_float_12p4(0)) |
S_028A04_MAX_SIZE(radv_pack_float_12p4(8192/2))); /* R_028A04_PA_SU_POINT_MINMAX */
@@ -1393,7 +1394,7 @@ void radv_CmdBindDescriptorSets(
radv_bind_descriptor_set(cmd_buffer, set, idx);
for(unsigned j = 0; j < set->layout->dynamic_offset_count; ++j, ++dyn_idx) {
unsigned idx = j + layout->set[i].dynamic_offset_start;
unsigned idx = j + layout->set[i + firstSet].dynamic_offset_start;
uint32_t *dst = cmd_buffer->dynamic_buffers + idx * 4;
assert(dyn_idx < dynamicOffsetCount);
@@ -1652,6 +1653,9 @@ void radv_CmdExecuteCommands(
{
RADV_FROM_HANDLE(radv_cmd_buffer, primary, commandBuffer);
/* Emit pending flushes on primary prior to executing secondary */
si_emit_cache_flush(primary);
for (uint32_t i = 0; i < commandBufferCount; i++) {
RADV_FROM_HANDLE(radv_cmd_buffer, secondary, pCmdBuffers[i]);
@@ -1661,6 +1665,7 @@ void radv_CmdExecuteCommands(
/* if we execute secondary we need to re-emit out pipelines */
if (commandBufferCount) {
primary->state.emitted_pipeline = NULL;
primary->state.emitted_compute_pipeline = NULL;
primary->state.dirty |= RADV_CMD_DIRTY_PIPELINE;
primary->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_ALL;
}
@@ -2163,9 +2168,6 @@ static void radv_handle_cmask_image_transition(struct radv_cmd_buffer *cmd_buffe
radv_initialise_cmask(cmd_buffer, image, 0xffffffffu);
} else if (radv_layout_has_cmask(image, src_layout) &&
!radv_layout_has_cmask(image, dst_layout)) {
if (!cmd_buffer->device->allow_fast_clears)
return;
radv_fast_clear_flush_image_inplace(cmd_buffer, image);
}
}
@@ -2286,14 +2288,18 @@ void radv_CmdPipelineBarrier(
case VK_ACCESS_INDIRECT_COMMAND_READ_BIT:
case VK_ACCESS_INDEX_READ_BIT:
case VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT:
case VK_ACCESS_UNIFORM_READ_BIT:
flush_bits |= RADV_CMD_FLAG_INV_VMEM_L1;
break;
case VK_ACCESS_UNIFORM_READ_BIT:
flush_bits |= RADV_CMD_FLAG_INV_VMEM_L1 | RADV_CMD_FLAG_INV_SMEM_L1;
break;
case VK_ACCESS_SHADER_READ_BIT:
flush_bits |= RADV_CMD_FLAG_INV_GLOBAL_L2;
break;
case VK_ACCESS_COLOR_ATTACHMENT_READ_BIT:
case VK_ACCESS_TRANSFER_READ_BIT:
case VK_ACCESS_TRANSFER_WRITE_BIT:
case VK_ACCESS_INPUT_ATTACHMENT_READ_BIT:
flush_bits |= RADV_CMD_FLUSH_AND_INV_FRAMEBUFFER | RADV_CMD_FLAG_INV_GLOBAL_L2;
default:
break;

View File

@@ -275,12 +275,13 @@ radv_descriptor_set_create(struct radv_device *device,
uint32_t layout_size = align_u32(layout->size, 32);
set->size = layout->size;
if (!cmd_buffer) {
if (pool->current_offset + layout_size <= pool->size) {
if (pool->current_offset + layout_size <= pool->size &&
pool->allocated_sets < pool->max_sets) {
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr + pool->current_offset);
set->va = device->ws->buffer_get_va(set->bo) + pool->current_offset;
pool->current_offset += layout_size;
++pool->allocated_sets;
} else {
int entry = pool->free_list, prev_entry = -1;
uint32_t offset;
@@ -417,6 +418,7 @@ VkResult radv_CreateDescriptorPool(
pool->full_list = 0;
pool->free_nodes[max_sets - 1].next = -1;
pool->max_sets = max_sets;
pool->allocated_sets = 0;
for (int i = 0; i + 1 < max_sets; ++i)
pool->free_nodes[i].next = i + 1;
@@ -494,6 +496,7 @@ VkResult radv_ResetDescriptorPool(
radv_descriptor_set_destroy(device, pool, set, false);
}
pool->allocated_sets = 0;
pool->current_offset = 0;
pool->free_list = -1;
pool->full_list = 0;

View File

@@ -44,12 +44,6 @@
#include "util/debug.h"
struct radv_dispatch_table dtable;
struct radv_fence {
struct radeon_winsys_fence *fence;
bool submitted;
bool signalled;
};
static VkResult
radv_physical_device_init(struct radv_physical_device *device,
struct radv_instance *instance,
@@ -97,6 +91,7 @@ radv_physical_device_init(struct radv_physical_device *device,
fprintf(stderr, "WARNING: radv is not a conformant vulkan implementation, testing use only.\n");
device->name = device->rad_info.name;
close(fd);
return VK_SUCCESS;
fail:
@@ -119,13 +114,19 @@ static const VkExtensionProperties global_extensions[] = {
#ifdef VK_USE_PLATFORM_XCB_KHR
{
.extensionName = VK_KHR_XCB_SURFACE_EXTENSION_NAME,
.specVersion = 5,
.specVersion = 6,
},
#endif
#ifdef VK_USE_PLATFORM_XLIB_KHR
{
.extensionName = VK_KHR_XLIB_SURFACE_EXTENSION_NAME,
.specVersion = 6,
},
#endif
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
{
.extensionName = VK_KHR_WAYLAND_SURFACE_EXTENSION_NAME,
.specVersion = 4,
.specVersion = 5,
},
#endif
};
@@ -133,7 +134,7 @@ static const VkExtensionProperties global_extensions[] = {
static const VkExtensionProperties device_extensions[] = {
{
.extensionName = VK_KHR_SWAPCHAIN_EXTENSION_NAME,
.specVersion = 67,
.specVersion = 68,
},
};
@@ -424,7 +425,7 @@ void radv_GetPhysicalDeviceProperties(
.maxGeometryTotalOutputComponents = 1024,
.maxFragmentInputComponents = 128,
.maxFragmentOutputAttachments = 8,
.maxFragmentDualSrcAttachments = 2,
.maxFragmentDualSrcAttachments = 1,
.maxFragmentCombinedOutputResources = 8,
.maxComputeSharedMemorySize = 32768,
.maxComputeWorkGroupCount = { 65535, 65535, 65535 },
@@ -659,17 +660,15 @@ VkResult radv_EnumerateInstanceExtensionProperties(
uint32_t* pPropertyCount,
VkExtensionProperties* pProperties)
{
unsigned i;
if (pProperties == NULL) {
*pPropertyCount = ARRAY_SIZE(global_extensions);
return VK_SUCCESS;
}
for (i = 0; i < *pPropertyCount; i++)
memcpy(&pProperties[i], &global_extensions[i], sizeof(VkExtensionProperties));
*pPropertyCount = MIN2(*pPropertyCount, ARRAY_SIZE(global_extensions));
typed_memcpy(pProperties, global_extensions, *pPropertyCount);
*pPropertyCount = i;
if (i < ARRAY_SIZE(global_extensions))
if (*pPropertyCount < ARRAY_SIZE(global_extensions))
return VK_INCOMPLETE;
return VK_SUCCESS;
@@ -681,19 +680,17 @@ VkResult radv_EnumerateDeviceExtensionProperties(
uint32_t* pPropertyCount,
VkExtensionProperties* pProperties)
{
unsigned i;
if (pProperties == NULL) {
*pPropertyCount = ARRAY_SIZE(device_extensions);
return VK_SUCCESS;
}
for (i = 0; i < *pPropertyCount; i++)
memcpy(&pProperties[i], &device_extensions[i], sizeof(VkExtensionProperties));
*pPropertyCount = MIN2(*pPropertyCount, ARRAY_SIZE(device_extensions));
typed_memcpy(pProperties, device_extensions, *pPropertyCount);
*pPropertyCount = i;
if (i < ARRAY_SIZE(device_extensions))
if (*pPropertyCount < ARRAY_SIZE(device_extensions))
return VK_INCOMPLETE;
return VK_SUCCESS;
}
@@ -869,7 +866,7 @@ VkResult radv_AllocateMemory(
flags |= RADEON_FLAG_NO_CPU_ACCESS;
else
flags |= RADEON_FLAG_CPU_ACCESS;
mem->bo = device->ws->buffer_create(device->ws, alloc_size, 32768,
mem->bo = device->ws->buffer_create(device->ws, alloc_size, 65536,
domain, flags);
if (!mem->bo) {
@@ -1172,6 +1169,8 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence _fence)
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_fence, fence, _fence);
if (fence->signalled)
return VK_SUCCESS;
if (!fence->submitted)
return VK_NOT_READY;
@@ -1734,31 +1733,55 @@ radv_tex_bordercolor(VkBorderColor bcolor)
return 0;
}
static unsigned
radv_tex_aniso_filter(unsigned filter)
{
if (filter < 2)
return 0;
if (filter < 4)
return 1;
if (filter < 8)
return 2;
if (filter < 16)
return 3;
return 4;
}
static void
radv_init_sampler(struct radv_device *device,
struct radv_sampler *sampler,
const VkSamplerCreateInfo *pCreateInfo)
{
uint32_t max_aniso = 0;
uint32_t max_aniso_ratio = 0;//TODO
uint32_t max_aniso = pCreateInfo->anisotropyEnable && pCreateInfo->maxAnisotropy > 1.0 ?
(uint32_t) pCreateInfo->maxAnisotropy : 0;
uint32_t max_aniso_ratio = radv_tex_aniso_filter(max_aniso);
bool is_vi;
is_vi = (device->instance->physicalDevice.rad_info.chip_class >= VI);
if (!is_vi && max_aniso > 0) {
radv_finishme("Anisotropic filtering must be disabled manually "
"by the shader on SI-CI when BASE_LEVEL == LAST_LEVEL\n");
max_aniso = max_aniso_ratio = 0;
}
sampler->state[0] = (S_008F30_CLAMP_X(radv_tex_wrap(pCreateInfo->addressModeU)) |
S_008F30_CLAMP_Y(radv_tex_wrap(pCreateInfo->addressModeV)) |
S_008F30_CLAMP_Z(radv_tex_wrap(pCreateInfo->addressModeW)) |
S_008F30_MAX_ANISO_RATIO(max_aniso_ratio) |
S_008F30_DEPTH_COMPARE_FUNC(radv_tex_compare(pCreateInfo->compareOp)) |
S_008F30_FORCE_UNNORMALIZED(pCreateInfo->unnormalizedCoordinates ? 1 : 0) |
S_008F30_ANISO_THRESHOLD(max_aniso_ratio >> 1) |
S_008F30_ANISO_BIAS(max_aniso_ratio) |
S_008F30_DISABLE_CUBE_WRAP(0) |
S_008F30_COMPAT_MODE(is_vi));
sampler->state[1] = (S_008F34_MIN_LOD(S_FIXED(CLAMP(pCreateInfo->minLod, 0, 15), 8)) |
S_008F34_MAX_LOD(S_FIXED(CLAMP(pCreateInfo->maxLod, 0, 15), 8)));
S_008F34_MAX_LOD(S_FIXED(CLAMP(pCreateInfo->maxLod, 0, 15), 8)) |
S_008F34_PERF_MIP(max_aniso_ratio ? max_aniso_ratio + 6 : 0));
sampler->state[2] = (S_008F38_LOD_BIAS(S_FIXED(CLAMP(pCreateInfo->mipLodBias, -16, 16), 8)) |
S_008F38_XY_MAG_FILTER(radv_tex_filter(pCreateInfo->magFilter, max_aniso)) |
S_008F38_XY_MIN_FILTER(radv_tex_filter(pCreateInfo->minFilter, max_aniso)) |
S_008F38_MIP_FILTER(radv_tex_mipfilter(pCreateInfo->mipmapMode)) |
S_008F38_MIP_POINT_PRECLAMP(1) |
S_008F38_MIP_POINT_PRECLAMP(0) |
S_008F38_DISABLE_LSB_CEIL(1) |
S_008F38_FILTER_PREC_FIX(1) |
S_008F38_ANISO_OVERRIDE(is_vi));
@@ -1800,3 +1823,48 @@ void radv_DestroySampler(
return;
vk_free2(&device->alloc, pAllocator, sampler);
}
/* vk_icd.h does not declare this function, so we declare it here to
* suppress Wmissing-prototypes.
*/
PUBLIC VKAPI_ATTR VkResult VKAPI_CALL
vk_icdNegotiateLoaderICDInterfaceVersion(uint32_t *pSupportedVersion);
PUBLIC VKAPI_ATTR VkResult VKAPI_CALL
vk_icdNegotiateLoaderICDInterfaceVersion(uint32_t *pSupportedVersion)
{
/* For the full details on loader interface versioning, see
* <https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/blob/master/loader/LoaderAndLayerInterface.md>.
* What follows is a condensed summary, to help you navigate the large and
* confusing official doc.
*
* - Loader interface v0 is incompatible with later versions. We don't
* support it.
*
* - In loader interface v1:
* - The first ICD entrypoint called by the loader is
* vk_icdGetInstanceProcAddr(). The ICD must statically expose this
* entrypoint.
* - The ICD must statically expose no other Vulkan symbol unless it is
* linked with -Bsymbolic.
* - Each dispatchable Vulkan handle created by the ICD must be
* a pointer to a struct whose first member is VK_LOADER_DATA. The
* ICD must initialize VK_LOADER_DATA.loadMagic to ICD_LOADER_MAGIC.
* - The loader implements vkCreate{PLATFORM}SurfaceKHR() and
* vkDestroySurfaceKHR(). The ICD must be capable of working with
* such loader-managed surfaces.
*
* - Loader interface v2 differs from v1 in:
* - The first ICD entrypoint called by the loader is
* vk_icdNegotiateLoaderICDInterfaceVersion(). The ICD must
* statically expose this entrypoint.
*
* - Loader interface v3 differs from v2 in:
* - The ICD must implement vkCreate{PLATFORM}SurfaceKHR(),
* vkDestroySurfaceKHR(), and other API which uses VKSurfaceKHR,
* because the loader no longer does so.
*/
*pSupportedVersion = MIN2(*pSupportedVersion, 3u);
return VK_SUCCESS;
}

View File

@@ -154,6 +154,7 @@ uint32_t radv_translate_tex_dataformat(VkFormat format,
case VK_FORMAT_D16_UNORM:
return V_008F14_IMG_DATA_FORMAT_16;
case VK_FORMAT_D24_UNORM_S8_UINT:
case VK_FORMAT_X8_D24_UNORM_PACK32:
return V_008F14_IMG_DATA_FORMAT_8_24;
case VK_FORMAT_S8_UINT:
return V_008F14_IMG_DATA_FORMAT_8;
@@ -729,9 +730,6 @@ uint32_t radv_translate_dbformat(VkFormat format)
case VK_FORMAT_D16_UNORM:
case VK_FORMAT_D16_UNORM_S8_UINT:
return V_028040_Z_16;
case VK_FORMAT_X8_D24_UNORM_PACK32:
case VK_FORMAT_D24_UNORM_S8_UINT:
return V_028040_Z_24; /* deprecated on SI */
case VK_FORMAT_D32_SFLOAT:
case VK_FORMAT_D32_SFLOAT_S8_UINT:
return V_028040_Z_32_FLOAT;

View File

@@ -267,17 +267,7 @@ si_make_texture_descriptor(struct radv_device *device,
if (desc->colorspace == VK_FORMAT_COLORSPACE_ZS) {
const unsigned char swizzle_xxxx[4] = {0, 0, 0, 0};
const unsigned char swizzle_yyyy[4] = {1, 1, 1, 1};
switch (vk_format) {
case VK_FORMAT_X8_D24_UNORM_PACK32:
case VK_FORMAT_D24_UNORM_S8_UINT:
case VK_FORMAT_D32_SFLOAT_S8_UINT:
vk_format_compose_swizzles(mapping, swizzle_yyyy, swizzle);
break;
default:
vk_format_compose_swizzles(mapping, swizzle_xxxx, swizzle);
}
vk_format_compose_swizzles(mapping, swizzle_xxxx, swizzle);
} else {
vk_format_compose_swizzles(mapping, desc->swizzle, swizzle);
}
@@ -520,6 +510,7 @@ radv_image_alloc_fmask(struct radv_device *device,
image->fmask.offset = align64(image->size, image->fmask.alignment);
image->size = image->fmask.offset + image->fmask.size;
image->alignment = MAX2(image->alignment, image->fmask.alignment);
}
static void
@@ -585,6 +576,7 @@ radv_image_alloc_cmask(struct radv_device *device,
/* + 8 for storing the clear values */
image->clear_value_offset = image->cmask.offset + image->cmask.size;
image->size = image->cmask.offset + image->cmask.size + 8;
image->alignment = MAX2(image->alignment, image->cmask.alignment);
}
static void
@@ -595,6 +587,7 @@ radv_image_alloc_dcc(struct radv_device *device,
/* + 8 for storing the clear values */
image->clear_value_offset = image->dcc_offset + image->surface.dcc_size;
image->size = image->dcc_offset + image->surface.dcc_size + 8;
image->alignment = MAX2(image->alignment, image->surface.dcc_alignment);
}
static unsigned
@@ -666,6 +659,9 @@ radv_image_alloc_htile(struct radv_device *device,
if (env_var_as_boolean("RADV_HIZ_DISABLE", false))
return;
if (image->array_size > 1 || image->levels > 1)
return;
image->htile.size = radv_image_get_htile_size(device, image);
if (!image->htile.size)
@@ -775,8 +771,13 @@ radv_image_view_init(struct radv_image_view *iview,
iview->vk_format = pCreateInfo->format;
iview->aspect_mask = pCreateInfo->subresourceRange.aspectMask;
if (iview->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT)
if (iview->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT) {
is_stencil = true;
iview->vk_format = vk_format_stencil_only(iview->vk_format);
} else if (iview->aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT) {
iview->vk_format = vk_format_depth_only(iview->vk_format);
}
iview->extent = (VkExtent3D) {
.width = radv_minify(image->extent.width , range->baseMipLevel),
.height = radv_minify(image->extent.height, range->baseMipLevel),
@@ -794,7 +795,7 @@ radv_image_view_init(struct radv_image_view *iview,
si_make_texture_descriptor(device, image, false,
iview->type,
pCreateInfo->format,
iview->vk_format,
&pCreateInfo->components,
0, radv_get_levelCount(image, range) - 1,
range->baseArrayLayer,
@@ -836,29 +837,29 @@ void radv_image_set_optimal_micro_tile_mode(struct radv_device *device,
switch (micro_tile_mode) {
case 0: /* displayable */
switch (image->surface.bpe) {
case 8:
case 1:
image->surface.tiling_index[0] = 10;
break;
case 16:
case 2:
image->surface.tiling_index[0] = 11;
break;
default: /* 32, 64 */
default: /* 4, 8 */
image->surface.tiling_index[0] = 12;
break;
}
break;
case 1: /* thin */
switch (image->surface.bpe) {
case 8:
case 1:
image->surface.tiling_index[0] = 14;
break;
case 16:
case 2:
image->surface.tiling_index[0] = 15;
break;
case 32:
case 4:
image->surface.tiling_index[0] = 16;
break;
default: /* 64, 128 */
default: /* 8, 16 */
image->surface.tiling_index[0] = 17;
break;
}

View File

@@ -26,6 +26,7 @@
#include "radv_meta.h"
#include "nir/nir_builder.h"
#include "vk_format.h"
enum blit2d_dst_type {
/* We can bind this destination as a "normal" render target and render
@@ -284,8 +285,10 @@ radv_meta_blit2d_normal_dst(struct radv_cmd_buffer *cmd_buffer,
for (unsigned r = 0; r < num_rects; ++r) {
VkFormat depth_format = 0;
if (dst->aspect_mask != VK_IMAGE_ASPECT_COLOR_BIT)
depth_format = dst->image->vk_format;
if (dst->aspect_mask == VK_IMAGE_ASPECT_STENCIL_BIT)
depth_format = vk_format_stencil_only(dst->image->vk_format);
else if (dst->aspect_mask == VK_IMAGE_ASPECT_DEPTH_BIT)
depth_format = vk_format_depth_only(dst->image->vk_format);
struct blit2d_src_temps src_temps;
blit2d_bind_src(cmd_buffer, src_img, src_buf, &rects[r], &src_temps, src_type, depth_format);

View File

@@ -523,6 +523,8 @@ void radv_CmdUpdateBuffer(
assert(!(va & 3));
if (dataSize < 4096) {
si_emit_cache_flush(cmd_buffer);
cmd_buffer->device->ws->cs_add_buffer(cmd_buffer->cs, dst_buffer->bo, 8);
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, words + 4);

View File

@@ -1,6 +1,33 @@
/*
* Copyright © 2016 Red Hat.
* Copyright © 2016 Bas Nieuwenhuizen
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "radv_meta.h"
#include "nir/nir_builder.h"
/*
* Compute shader implementation of image->buffer copy.
*/
static nir_shader *
build_nir_itob_compute_shader(struct radv_device *dev)
{

View File

@@ -998,7 +998,7 @@ radv_cmd_clear_image(struct radv_cmd_buffer *cmd_buffer,
const VkImageSubresourceRange *range = &ranges[r];
for (uint32_t l = 0; l < radv_get_levelCount(image, range); ++l) {
const uint32_t layer_count = image->type == VK_IMAGE_TYPE_3D ?
radv_minify(image->extent.depth, l) :
radv_minify(image->extent.depth, range->baseMipLevel + l) :
radv_get_layerCount(image, range);
for (uint32_t s = 0; s < layer_count; ++s) {
struct radv_image_view iview;

View File

@@ -144,6 +144,7 @@ radv_optimize_nir(struct nir_shader *shader)
NIR_PASS(progress, shader, nir_opt_algebraic);
NIR_PASS(progress, shader, nir_opt_constant_folding);
NIR_PASS(progress, shader, nir_opt_undef);
NIR_PASS(progress, shader, nir_opt_conditional_discard);
} while (progress);
}
@@ -642,7 +643,8 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
const VkGraphicsPipelineCreateInfo *pCreateInfo,
uint32_t blend_enable,
uint32_t blend_need_alpha,
bool single_cb_enable)
bool single_cb_enable,
bool blend_mrt0_is_dual_src)
{
RADV_FROM_HANDLE(radv_render_pass, pass, pCreateInfo->renderPass);
struct radv_subpass *subpass = pass->subpasses + pCreateInfo->subpass;
@@ -664,6 +666,8 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
blend->cb_shader_mask = si_get_cb_shader_mask(col_format);
if (blend_mrt0_is_dual_src)
col_format |= (col_format & 0xf) << 4;
if (!col_format)
col_format |= V_028714_SPI_SHADER_32_R;
blend->spi_shader_col_format = col_format;
@@ -715,8 +719,13 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
struct radv_blend_state *blend = &pipeline->graphics.blend;
unsigned mode = V_028808_CB_NORMAL;
uint32_t blend_enable = 0, blend_need_alpha = 0;
bool blend_mrt0_is_dual_src = false;
int i;
bool single_cb_enable = false;
if (!vkblend)
return;
if (extra && extra->custom_blend_mode) {
single_cb_enable = true;
mode = extra->custom_blend_mode;
@@ -755,7 +764,9 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
}
if (is_dual_src(srcRGB) || is_dual_src(dstRGB) || is_dual_src(srcA) || is_dual_src(dstA))
radv_finishme("dual source blending");
if (i == 0)
blend_mrt0_is_dual_src = true;
if (eqRGB == VK_BLEND_OP_MIN || eqRGB == VK_BLEND_OP_MAX) {
srcRGB = VK_BLEND_FACTOR_ONE;
dstRGB = VK_BLEND_FACTOR_ONE;
@@ -797,7 +808,7 @@ radv_pipeline_init_blend_state(struct radv_pipeline *pipeline,
blend->cb_color_control |= S_028808_MODE(V_028808_CB_DISABLE);
radv_pipeline_compute_spi_color_formats(pipeline, pCreateInfo,
blend_enable, blend_need_alpha, single_cb_enable);
blend_enable, blend_need_alpha, single_cb_enable, blend_mrt0_is_dual_src);
}
static uint32_t si_translate_stencil_op(enum VkStencilOp op)
@@ -1069,18 +1080,27 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
struct radv_dynamic_state *dynamic = &pipeline->dynamic_state;
dynamic->viewport.count = pCreateInfo->pViewportState->viewportCount;
if (states & (1 << VK_DYNAMIC_STATE_VIEWPORT)) {
typed_memcpy(dynamic->viewport.viewports,
pCreateInfo->pViewportState->pViewports,
pCreateInfo->pViewportState->viewportCount);
}
/* Section 9.2 of the Vulkan 1.0.15 spec says:
*
* pViewportState is [...] NULL if the pipeline
* has rasterization disabled.
*/
if (!pCreateInfo->pRasterizationState->rasterizerDiscardEnable) {
assert(pCreateInfo->pViewportState);
dynamic->scissor.count = pCreateInfo->pViewportState->scissorCount;
if (states & (1 << VK_DYNAMIC_STATE_SCISSOR)) {
typed_memcpy(dynamic->scissor.scissors,
pCreateInfo->pViewportState->pScissors,
pCreateInfo->pViewportState->scissorCount);
dynamic->viewport.count = pCreateInfo->pViewportState->viewportCount;
if (states & (1 << VK_DYNAMIC_STATE_VIEWPORT)) {
typed_memcpy(dynamic->viewport.viewports,
pCreateInfo->pViewportState->pViewports,
pCreateInfo->pViewportState->viewportCount);
}
dynamic->scissor.count = pCreateInfo->pViewportState->scissorCount;
if (states & (1 << VK_DYNAMIC_STATE_SCISSOR)) {
typed_memcpy(dynamic->scissor.scissors,
pCreateInfo->pViewportState->pScissors,
pCreateInfo->pViewportState->scissorCount);
}
}
if (states & (1 << VK_DYNAMIC_STATE_LINE_WIDTH)) {
@@ -1098,7 +1118,21 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
pCreateInfo->pRasterizationState->depthBiasSlopeFactor;
}
if (states & (1 << VK_DYNAMIC_STATE_BLEND_CONSTANTS)) {
/* Section 9.2 of the Vulkan 1.0.15 spec says:
*
* pColorBlendState is [...] NULL if the pipeline has rasterization
* disabled or if the subpass of the render pass the pipeline is
* created against does not use any color attachments.
*/
bool uses_color_att = false;
for (unsigned i = 0; i < subpass->color_count; ++i) {
if (subpass->color_attachments[i].attachment != VK_ATTACHMENT_UNUSED) {
uses_color_att = true;
break;
}
}
if (uses_color_att && states & (1 << VK_DYNAMIC_STATE_BLEND_CONSTANTS)) {
assert(pCreateInfo->pColorBlendState);
typed_memcpy(dynamic->blend_constants,
pCreateInfo->pColorBlendState->blendConstants, 4);
@@ -1110,14 +1144,17 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
* no need to override the depthstencil defaults in
* radv_pipeline::dynamic_state when there is no depthstencil attachment.
*
* From the Vulkan spec (20 Oct 2015, git-aa308cb):
* Section 9.2 of the Vulkan 1.0.15 spec says:
*
* pDepthStencilState [...] may only be NULL if renderPass and subpass
* specify a subpass that has no depth/stencil attachment.
* pDepthStencilState is [...] NULL if the pipeline has rasterization
* disabled or if the subpass of the render pass the pipeline is created
* against does not use a depth/stencil attachment.
*/
if (subpass->depth_stencil_attachment.attachment != VK_ATTACHMENT_UNUSED) {
if (!pCreateInfo->pRasterizationState->rasterizerDiscardEnable &&
subpass->depth_stencil_attachment.attachment != VK_ATTACHMENT_UNUSED) {
assert(pCreateInfo->pDepthStencilState);
if (states & (1 << VK_DYNAMIC_STATE_DEPTH_BOUNDS)) {
assert(pCreateInfo->pDepthStencilState);
dynamic->depth_bounds.min =
pCreateInfo->pDepthStencilState->minDepthBounds;
dynamic->depth_bounds.max =
@@ -1125,7 +1162,6 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
}
if (states & (1 << VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK)) {
assert(pCreateInfo->pDepthStencilState);
dynamic->stencil_compare_mask.front =
pCreateInfo->pDepthStencilState->front.compareMask;
dynamic->stencil_compare_mask.back =
@@ -1133,7 +1169,6 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
}
if (states & (1 << VK_DYNAMIC_STATE_STENCIL_WRITE_MASK)) {
assert(pCreateInfo->pDepthStencilState);
dynamic->stencil_write_mask.front =
pCreateInfo->pDepthStencilState->front.writeMask;
dynamic->stencil_write_mask.back =
@@ -1141,7 +1176,6 @@ radv_pipeline_init_dynamic_state(struct radv_pipeline *pipeline,
}
if (states & (1 << VK_DYNAMIC_STATE_STENCIL_REFERENCE)) {
assert(pCreateInfo->pDepthStencilState);
dynamic->stencil_reference.front =
pCreateInfo->pDepthStencilState->front.reference;
dynamic->stencil_reference.back =

View File

@@ -498,6 +498,7 @@ struct radv_descriptor_pool {
int free_list;
int full_list;
uint32_t max_sets;
uint32_t allocated_sets;
struct radv_descriptor_pool_free_node free_nodes[];
};
@@ -1206,6 +1207,13 @@ void radv_initialise_cmask(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image, uint32_t value);
void radv_initialize_dcc(struct radv_cmd_buffer *cmd_buffer,
struct radv_image *image, uint32_t value);
struct radv_fence {
struct radeon_winsys_fence *fence;
bool submitted;
bool signalled;
};
#define RADV_DEFINE_HANDLE_CASTS(__radv_type, __VkType) \
\
static inline struct __radv_type * \

View File

@@ -131,6 +131,7 @@ VkResult radv_GetQueryPoolResults(
VkDeviceSize stride,
VkQueryResultFlags flags)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_query_pool, pool, queryPool);
char *data = pData;
VkResult result = VK_SUCCESS;
@@ -141,23 +142,20 @@ VkResult radv_GetQueryPoolResults(
char *src = pool->ptr + query * pool->stride;
uint32_t available;
if (flags & VK_QUERY_RESULT_WAIT_BIT) {
while(!*(volatile uint32_t*)(pool->ptr + pool->availability_offset + 4 * query))
;
}
if (!*(uint32_t*)(pool->ptr + pool->availability_offset + 4 * query) &&
!(flags & VK_QUERY_RESULT_PARTIAL_BIT)) {
if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT)
*(uint32_t*)dest = 0;
result = VK_NOT_READY;
continue;
}
available = *(uint32_t*)(pool->ptr + pool->availability_offset + 4 * query);
switch (pool->type) {
case VK_QUERY_TYPE_TIMESTAMP:
case VK_QUERY_TYPE_TIMESTAMP: {
if (flags & VK_QUERY_RESULT_WAIT_BIT) {
while(!*(volatile uint32_t*)(pool->ptr + pool->availability_offset + 4 * query))
;
}
available = *(uint32_t*)(pool->ptr + pool->availability_offset + 4 * query);
if (!available && !(flags & VK_QUERY_RESULT_PARTIAL_BIT)) {
result = VK_NOT_READY;
break;
}
if (flags & VK_QUERY_RESULT_64_BIT) {
*(uint64_t*)dest = *(uint64_t*)src;
dest += 8;
@@ -166,8 +164,32 @@ VkResult radv_GetQueryPoolResults(
dest += 4;
}
break;
}
case VK_QUERY_TYPE_OCCLUSION: {
uint64_t result = *(uint64_t*)(src + pool->stride - 16);
volatile uint64_t const *src64 = (volatile uint64_t const *)src;
uint64_t result = 0;
int db_count = get_max_db(device);
available = 1;
for (int i = 0; i < db_count; ++i) {
uint64_t start, end;
do {
start = src64[2 * i];
end = src64[2 * i + 1];
} while ((!(start & (1ull << 63)) || !(end & (1ull << 63))) && (flags & VK_QUERY_RESULT_WAIT_BIT));
if (!(start & (1ull << 63)) || !(end & (1ull << 63)))
available = 0;
else {
result += end - start;
}
}
if (!available && !(flags & VK_QUERY_RESULT_PARTIAL_BIT)) {
result = VK_NOT_READY;
break;
}
if (flags & VK_QUERY_RESULT_64_BIT) {
*(uint64_t*)dest = result;
@@ -183,8 +205,11 @@ VkResult radv_GetQueryPoolResults(
}
if (flags & VK_QUERY_RESULT_WITH_AVAILABILITY_BIT) {
*(uint32_t*)dest = available;
dest += 4;
if (flags & VK_QUERY_RESULT_64_BIT) {
*(uint64_t*)dest = available;
} else {
*(uint32_t*)dest = available;
}
}
}
@@ -357,11 +382,14 @@ void radv_CmdEndQuery(
radeon_emit(cs, va + 8);
radeon_emit(cs, (va + 8) >> 32);
radeon_emit(cs, PKT3(PKT3_OCCLUSION_QUERY, 3, 0));
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
radeon_emit(cs, va + pool->stride - 16);
radeon_emit(cs, (va + pool->stride - 16) >> 32);
/* hangs for VK_COMMAND_BUFFER_LEVEL_SECONDARY. */
if (cmd_buffer->level == VK_COMMAND_BUFFER_LEVEL_PRIMARY) {
radeon_emit(cs, PKT3(PKT3_OCCLUSION_QUERY, 3, 0));
radeon_emit(cs, va);
radeon_emit(cs, va >> 32);
radeon_emit(cs, va + pool->stride - 16);
radeon_emit(cs, (va + pool->stride - 16) >> 32);
}
break;
default:

View File

@@ -75,7 +75,7 @@ void radv_DestroySurfaceKHR(
const VkAllocationCallbacks* pAllocator)
{
RADV_FROM_HANDLE(radv_instance, instance, _instance);
RADV_FROM_HANDLE(_VkIcdSurfaceBase, surface, _surface);
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
vk_free2(&instance->alloc, pAllocator, surface);
}
@@ -87,7 +87,7 @@ VkResult radv_GetPhysicalDeviceSurfaceSupportKHR(
VkBool32* pSupported)
{
RADV_FROM_HANDLE(radv_physical_device, device, physicalDevice);
RADV_FROM_HANDLE(_VkIcdSurfaceBase, surface, _surface);
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
struct wsi_interface *iface = device->wsi_device.wsi[surface->platform];
return iface->get_support(surface, &device->wsi_device,
@@ -101,7 +101,7 @@ VkResult radv_GetPhysicalDeviceSurfaceCapabilitiesKHR(
VkSurfaceCapabilitiesKHR* pSurfaceCapabilities)
{
RADV_FROM_HANDLE(radv_physical_device, device, physicalDevice);
RADV_FROM_HANDLE(_VkIcdSurfaceBase, surface, _surface);
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
struct wsi_interface *iface = device->wsi_device.wsi[surface->platform];
return iface->get_capabilities(surface, pSurfaceCapabilities);
@@ -114,7 +114,7 @@ VkResult radv_GetPhysicalDeviceSurfaceFormatsKHR(
VkSurfaceFormatKHR* pSurfaceFormats)
{
RADV_FROM_HANDLE(radv_physical_device, device, physicalDevice);
RADV_FROM_HANDLE(_VkIcdSurfaceBase, surface, _surface);
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
struct wsi_interface *iface = device->wsi_device.wsi[surface->platform];
return iface->get_formats(surface, &device->wsi_device, pSurfaceFormatCount,
@@ -128,7 +128,7 @@ VkResult radv_GetPhysicalDeviceSurfacePresentModesKHR(
VkPresentModeKHR* pPresentModes)
{
RADV_FROM_HANDLE(radv_physical_device, device, physicalDevice);
RADV_FROM_HANDLE(_VkIcdSurfaceBase, surface, _surface);
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, _surface);
struct wsi_interface *iface = device->wsi_device.wsi[surface->platform];
return iface->get_present_modes(surface, pPresentModeCount,
@@ -249,7 +249,7 @@ VkResult radv_CreateSwapchainKHR(
VkSwapchainKHR* pSwapchain)
{
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(_VkIcdSurfaceBase, surface, pCreateInfo->surface);
ICD_FROM_HANDLE(VkIcdSurfaceBase, surface, pCreateInfo->surface);
struct wsi_interface *iface =
device->instance->physicalDevice.wsi_device.wsi[surface->platform];
struct wsi_swapchain *swapchain;
@@ -288,6 +288,9 @@ void radv_DestroySwapchainKHR(
RADV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain);
const VkAllocationCallbacks *alloc;
if (!_swapchain)
return;
if (pAllocator)
alloc = pAllocator;
else
@@ -318,13 +321,21 @@ VkResult radv_AcquireNextImageKHR(
VkSwapchainKHR _swapchain,
uint64_t timeout,
VkSemaphore semaphore,
VkFence fence,
VkFence _fence,
uint32_t* pImageIndex)
{
RADV_FROM_HANDLE(wsi_swapchain, swapchain, _swapchain);
RADV_FROM_HANDLE(radv_fence, fence, _fence);
return swapchain->acquire_next_image(swapchain, timeout, semaphore,
pImageIndex);
VkResult result = swapchain->acquire_next_image(swapchain, timeout, semaphore,
pImageIndex);
if (fence && result == VK_SUCCESS) {
fence->submitted = true;
fence->signalled = true;
}
return result;
}
VkResult radv_QueuePresentKHR(

View File

@@ -371,6 +371,15 @@ void si_init_config(struct radv_physical_device *physical_device,
radeon_set_context_reg(cs, R_028408_VGT_INDX_OFFSET, 0);
if (physical_device->rad_info.chip_class >= CIK) {
/* If this is 0, Bonaire can hang even if GS isn't being used.
* Other chips are unaffected. These are suboptimal values,
* but we don't use on-chip GS.
*/
radeon_set_context_reg(cs, R_028A44_VGT_GS_ONCHIP_CNTL,
S_028A44_ES_VERTS_PER_SUBGRP(64) |
S_028A44_GS_PRIMS_PER_SUBGRP(4));
radeon_set_sh_reg(cs, R_00B51C_SPI_SHADER_PGM_RSRC3_LS, S_00B51C_CU_EN(0xffff));
radeon_set_sh_reg(cs, R_00B41C_SPI_SHADER_PGM_RSRC3_HS, 0);
radeon_set_sh_reg(cs, R_00B31C_SPI_SHADER_PGM_RSRC3_ES, S_00B31C_CU_EN(0xffff));
radeon_set_sh_reg(cs, R_00B21C_SPI_SHADER_PGM_RSRC3_GS, S_00B21C_CU_EN(0xffff));
@@ -383,7 +392,6 @@ void si_init_config(struct radv_physical_device *physical_device,
*
* LATE_ALLOC_VS = 2 is the highest safe number.
*/
radeon_set_sh_reg(cs, R_00B51C_SPI_SHADER_PGM_RSRC3_LS, S_00B51C_CU_EN(0xffff));
radeon_set_sh_reg(cs, R_00B118_SPI_SHADER_PGM_RSRC3_VS, S_00B118_CU_EN(0xffff));
radeon_set_sh_reg(cs, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, S_00B11C_LIMIT(2));
} else {
@@ -392,7 +400,6 @@ void si_init_config(struct radv_physical_device *physical_device,
* - VS can't execute on CU0.
* - If HS writes outputs to LDS, LS can't execute on CU0.
*/
radeon_set_sh_reg(cs, R_00B51C_SPI_SHADER_PGM_RSRC3_LS, S_00B51C_CU_EN(0xfffe));
radeon_set_sh_reg(cs, R_00B118_SPI_SHADER_PGM_RSRC3_VS, S_00B118_CU_EN(0xfffe));
radeon_set_sh_reg(cs, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, S_00B11C_LIMIT(31));
}
@@ -846,6 +853,7 @@ void si_cp_dma_buffer_copy(struct radv_cmd_buffer *cmd_buffer,
uint64_t main_src_va, main_dest_va;
uint64_t skipped_size = 0, realign_size = 0;
si_emit_cache_flush(cmd_buffer);
if (cmd_buffer->device->instance->physicalDevice.rad_info.family <= CHIP_CARRIZO ||
cmd_buffer->device->instance->physicalDevice.rad_info.family == CHIP_STONEY) {
@@ -909,6 +917,8 @@ void si_cp_dma_clear_buffer(struct radv_cmd_buffer *cmd_buffer, uint64_t va,
assert(va % 4 == 0 && size % 4 == 0);
si_emit_cache_flush(cmd_buffer);
while (size) {
unsigned byte_count = MIN2(size, CP_DMA_MAX_BYTE_COUNT);
unsigned dma_flags = 0;

View File

@@ -274,6 +274,19 @@ static void radv_set_micro_tile_mode(struct radeon_surf *surf,
surf->micro_tile_mode = G_009910_MICRO_TILE_MODE(tile_mode);
}
static unsigned cik_get_macro_tile_index(struct radeon_surf *surf)
{
unsigned index, tileb;
tileb = 8 * 8 * surf->bpe;
tileb = MIN2(surf->tile_split, tileb);
for (index = 0; tileb > 64; index++)
tileb >>= 1;
assert(index < 16);
return index;
}
static int radv_amdgpu_winsys_surface_init(struct radeon_winsys *_ws,
struct radeon_surf *surf)
@@ -435,6 +448,7 @@ static int radv_amdgpu_winsys_surface_init(struct radeon_winsys *_ws,
AddrSurfInfoIn.tileIndex = 10; /* 2D displayable */
else
AddrSurfInfoIn.tileIndex = 14; /* 2D non-displayable */
AddrSurfInfoOut.macroModeIndex = cik_get_macro_tile_index(surf);
}
}

View File

@@ -62,10 +62,14 @@ glsl_tests_blob_test_LDADD = \
glsl_tests_cache_test_SOURCES = \
glsl/tests/cache_test.c
glsl_tests_cache_test_CFLAGS = \
$(PTHREAD_CFLAGS)
glsl_tests_cache_test_LDADD = \
glsl/libglsl.la
glsl/libglsl.la \
$(PTHREAD_LIBS)
glsl_tests_general_ir_test_SOURCES = \
glsl/tests/array_refcount_test.cpp \
glsl/tests/builtin_variable_test.cpp \
glsl/tests/invalidate_locations_test.cpp \
glsl/tests/general_ir_test.cpp \

View File

@@ -28,6 +28,8 @@ LIBGLSL_FILES = \
glsl/glsl_to_nir.cpp \
glsl/glsl_to_nir.h \
glsl/hir_field_selection.cpp \
glsl/ir_array_refcount.cpp \
glsl/ir_array_refcount.h \
glsl/ir_basic_block.cpp \
glsl/ir_basic_block.h \
glsl/ir_builder.cpp \
@@ -227,6 +229,7 @@ NIR_FILES = \
nir/nir_metadata.c \
nir/nir_move_vec_src_uses_to_dest.c \
nir/nir_normalize_cubemap_coords.c \
nir/nir_opt_conditional_discard.c \
nir/nir_opt_constant_folding.c \
nir/nir_opt_copy_propagate.c \
nir/nir_opt_cse.c \

View File

@@ -4330,6 +4330,8 @@ handle_tess_ctrl_shader_output_decl(struct _mesa_glsl_parse_state *state,
if (var->data.patch)
return;
var->data.tess_varying_implicit_sized_array = var->type->is_unsized_array();
validate_layout_qualifier_vertex_count(state, loc, var, num_vertices,
&state->tcs_output_size,
"tessellation control shader output");
@@ -4366,6 +4368,7 @@ handle_tess_shader_input_decl(struct _mesa_glsl_parse_state *state,
if (var->type->is_unsized_array()) {
var->type = glsl_type::get_array_instance(var->type->fields.array,
state->Const.MaxPatchVertices);
var->data.tess_varying_implicit_sized_array = true;
} else if (var->type->length != state->Const.MaxPatchVertices) {
_mesa_glsl_error(&loc, state,
"per-vertex tessellation shader input arrays must be "
@@ -5156,11 +5159,13 @@ ast_declarator_list::hir(exec_list *instructions,
* sized by an earlier input primitive layout qualifier, when
* present, as per the following table."
*/
const enum ir_variable_mode mode = (const enum ir_variable_mode)
(earlier == NULL ? var->data.mode : earlier->data.mode);
const bool implicitly_sized =
(var->data.mode == ir_var_shader_in &&
(mode == ir_var_shader_in &&
state->stage >= MESA_SHADER_TESS_CTRL &&
state->stage <= MESA_SHADER_GEOMETRY) ||
(var->data.mode == ir_var_shader_out &&
(mode == ir_var_shader_out &&
state->stage == MESA_SHADER_TESS_CTRL);
if (t->is_unsized_array() && !implicitly_sized)
@@ -7795,10 +7800,9 @@ ast_interface_block::hir(exec_list *instructions,
}
if (var->type->is_unsized_array()) {
if (var->is_in_shader_storage_block()) {
if (is_unsized_array_last_element(var)) {
var->data.from_ssbo_unsized_array = true;
}
if (var->is_in_shader_storage_block() &&
is_unsized_array_last_element(var)) {
var->data.from_ssbo_unsized_array = true;
} else {
/* From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
*
@@ -7806,6 +7810,10 @@ ast_interface_block::hir(exec_list *instructions,
* block and the size is not specified at compile-time, it is
* sized at run-time. In all other cases, arrays are sized only
* at compile-time."
*
* In desktop GLSL it is allowed to have unsized-arrays that are
* not last, as long as we can determine that they are implicitly
* sized.
*/
if (state->es_shader) {
_mesa_glsl_error(&loc, state, "unsized array `%s' "

View File

@@ -537,6 +537,12 @@ compute_shader(const _mesa_glsl_parse_state *state)
return state->stage == MESA_SHADER_COMPUTE;
}
static bool
compute_shader_supported(const _mesa_glsl_parse_state *state)
{
return state->has_compute_shader();
}
static bool
buffer_atomics_supported(const _mesa_glsl_parse_state *state)
{
@@ -1098,15 +1104,15 @@ builtin_builder::create_intrinsics()
ir_intrinsic_group_memory_barrier),
NULL);
add_function("__intrinsic_memory_barrier_atomic_counter",
_memory_barrier_intrinsic(compute_shader,
_memory_barrier_intrinsic(compute_shader_supported,
ir_intrinsic_memory_barrier_atomic_counter),
NULL);
add_function("__intrinsic_memory_barrier_buffer",
_memory_barrier_intrinsic(compute_shader,
_memory_barrier_intrinsic(compute_shader_supported,
ir_intrinsic_memory_barrier_buffer),
NULL);
add_function("__intrinsic_memory_barrier_image",
_memory_barrier_intrinsic(compute_shader,
_memory_barrier_intrinsic(compute_shader_supported,
ir_intrinsic_memory_barrier_image),
NULL);
add_function("__intrinsic_memory_barrier_shared",
@@ -2967,15 +2973,15 @@ builtin_builder::create_builtins()
NULL);
add_function("memoryBarrierAtomicCounter",
_memory_barrier("__intrinsic_memory_barrier_atomic_counter",
compute_shader),
compute_shader_supported),
NULL);
add_function("memoryBarrierBuffer",
_memory_barrier("__intrinsic_memory_barrier_buffer",
compute_shader),
compute_shader_supported),
NULL);
add_function("memoryBarrierImage",
_memory_barrier("__intrinsic_memory_barrier_image",
compute_shader),
compute_shader_supported),
NULL);
add_function("memoryBarrierShared",
_memory_barrier("__intrinsic_memory_barrier_shared",
@@ -3563,9 +3569,17 @@ builtin_builder::_tanh(const glsl_type *type)
ir_variable *x = in_var(type, "x");
MAKE_SIG(type, v130, 1, x);
/* Clamp x to [-10, +10] to avoid precision problems.
* When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
* zero in the computation e^x + e^(-x). The same happens in the other
* direction when x < -10.
*/
ir_variable *t = body.make_temp(type, "tmp");
body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
/* (e^x - e^(-x)) / (e^x + e^(-x)) */
body.emit(ret(div(sub(exp(x), exp(neg(x))),
add(exp(x), exp(neg(x))))));
body.emit(ret(div(sub(exp(t), exp(neg(t))),
add(exp(t), exp(neg(t))))));
return sig;
}

View File

@@ -612,19 +612,18 @@ cache_put(struct program_cache *cache,
p_atomic_add(cache->size, size);
done:
if (fd_final != -1)
close(fd_final);
/* This close finally releases the flock, (now that the final dile
* has been renamed into place and the size has been added).
*/
close(fd);
fd = -1;
done:
if (fd != -1)
close(fd);
if (filename_tmp)
ralloc_free(filename_tmp);
if (filename)
ralloc_free(filename);
if (fd != -1)
close(fd);
}
void *

View File

@@ -176,7 +176,7 @@ add_builtin_define(glcpp_parser_t *parser, const char *name, int value);
* (such as the <HASH> and <DEFINE> start conditions in the lexer). */
%token DEFINED ELIF_EXPANDED HASH_TOKEN DEFINE_TOKEN FUNC_IDENTIFIER OBJ_IDENTIFIER ELIF ELSE ENDIF ERROR_TOKEN IF IFDEF IFNDEF LINE PRAGMA UNDEF VERSION_TOKEN GARBAGE IDENTIFIER IF_EXPANDED INTEGER INTEGER_STRING LINE_EXPANDED NEWLINE OTHER PLACEHOLDER SPACE PLUS_PLUS MINUS_MINUS
%token PASTE
%type <ival> INTEGER operator SPACE integer_constant
%type <ival> INTEGER operator SPACE integer_constant version_constant
%type <expression_value> expression
%type <str> IDENTIFIER FUNC_IDENTIFIER OBJ_IDENTIFIER INTEGER_STRING OTHER ERROR_TOKEN PRAGMA
%type <string_list> identifier_list
@@ -424,14 +424,14 @@ control_line_success:
| HASH_TOKEN ENDIF {
_glcpp_parser_skip_stack_pop (parser, & @1);
} NEWLINE
| HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE {
if (parser->version != 0) {
| HASH_TOKEN VERSION_TOKEN version_constant NEWLINE {
if (parser->version_set) {
glcpp_error(& @1, parser, "#version must appear on the first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, NULL, true);
}
| HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE {
if (parser->version != 0) {
| HASH_TOKEN VERSION_TOKEN version_constant IDENTIFIER NEWLINE {
if (parser->version_set) {
glcpp_error(& @1, parser, "#version must appear on the first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, $4, true);
@@ -470,6 +470,17 @@ integer_constant:
$$ = $1;
}
version_constant:
INTEGER_STRING {
/* Both octal and hexadecimal constants begin with 0. */
if ($1[0] == '0' && $1[1] != '\0') {
glcpp_error(&@1, parser, "invalid #version \"%s\" (not a decimal constant)", $1);
$$ = 0;
} else {
$$ = strtoll($1, NULL, 10);
}
}
expression:
integer_constant {
$$.value = $1;
@@ -1376,6 +1387,7 @@ glcpp_parser_create(glcpp_extension_iterator extensions, void *state, gl_api api
parser->state = state;
parser->api = api;
parser->version = 0;
parser->version_set = false;
parser->has_new_line_number = 0;
parser->new_line_number = 1;
@@ -2318,10 +2330,11 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
const char *es_identifier,
bool explicitly_set)
{
if (parser->version != 0)
if (parser->version_set)
return;
parser->version = version;
parser->version_set = true;
add_builtin_define (parser, "__VERSION__", version);

View File

@@ -207,6 +207,15 @@ struct glcpp_parser {
void *state;
gl_api api;
unsigned version;
/**
* Has the #version been set?
*
* A separate flag is used because any possible sentinel value in
* \c ::version could also be set by a #version line.
*/
bool version_set;
bool has_new_line_number;
int new_line_number;
bool has_new_source_number;

View File

@@ -253,6 +253,10 @@ HASH ^{SPC}#{SPC}
yylval->n = strtol(yytext, NULL, 10);
return INTCONSTANT;
}
<PP>0 {
yylval->n = 0;
return INTCONSTANT;
}
<PP>\n { BEGIN 0; yylineno++; yycolumn = 0; return EOL; }
<PP>. { return yytext[0]; }

View File

@@ -126,7 +126,7 @@ void glsl_symbol_table::pop_scope()
bool glsl_symbol_table::name_declared_this_scope(const char *name)
{
return _mesa_symbol_table_symbol_scope(table, -1, name) == 0;
return _mesa_symbol_table_symbol_scope(table, name) == 0;
}
bool glsl_symbol_table::add_variable(ir_variable *v)
@@ -152,7 +152,7 @@ bool glsl_symbol_table::add_variable(ir_variable *v)
symbol_table_entry *entry = new(mem_ctx) symbol_table_entry(v);
if (existing != NULL)
entry->f = existing->f;
int added = _mesa_symbol_table_add_symbol(table, -1, v->name, entry);
int added = _mesa_symbol_table_add_symbol(table, v->name, entry);
assert(added == 0);
(void)added;
return true;
@@ -162,13 +162,13 @@ bool glsl_symbol_table::add_variable(ir_variable *v)
/* 1.20+ rules: */
symbol_table_entry *entry = new(mem_ctx) symbol_table_entry(v);
return _mesa_symbol_table_add_symbol(table, -1, v->name, entry) == 0;
return _mesa_symbol_table_add_symbol(table, v->name, entry) == 0;
}
bool glsl_symbol_table::add_type(const char *name, const glsl_type *t)
{
symbol_table_entry *entry = new(mem_ctx) symbol_table_entry(t);
return _mesa_symbol_table_add_symbol(table, -1, name, entry) == 0;
return _mesa_symbol_table_add_symbol(table, name, entry) == 0;
}
bool glsl_symbol_table::add_interface(const char *name, const glsl_type *i,
@@ -180,7 +180,7 @@ bool glsl_symbol_table::add_interface(const char *name, const glsl_type *i,
symbol_table_entry *entry =
new(mem_ctx) symbol_table_entry(i, mode);
bool add_interface_symbol_result =
_mesa_symbol_table_add_symbol(table, -1, name, entry) == 0;
_mesa_symbol_table_add_symbol(table, name, entry) == 0;
assert(add_interface_symbol_result);
return add_interface_symbol_result;
} else {
@@ -199,7 +199,7 @@ bool glsl_symbol_table::add_function(ir_function *f)
}
}
symbol_table_entry *entry = new(mem_ctx) symbol_table_entry(f);
return _mesa_symbol_table_add_symbol(table, -1, f->name, entry) == 0;
return _mesa_symbol_table_add_symbol(table, f->name, entry) == 0;
}
bool glsl_symbol_table::add_default_precision_qualifier(const char *type_name,
@@ -213,13 +213,16 @@ bool glsl_symbol_table::add_default_precision_qualifier(const char *type_name,
symbol_table_entry *entry =
new(mem_ctx) symbol_table_entry(default_specifier);
return _mesa_symbol_table_add_symbol(table, -1, name, entry) == 0;
if (!get_entry(name))
return _mesa_symbol_table_add_symbol(table, name, entry) == 0;
return _mesa_symbol_table_replace_symbol(table, name, entry) == 0;
}
void glsl_symbol_table::add_global_function(ir_function *f)
{
symbol_table_entry *entry = new(mem_ctx) symbol_table_entry(f);
int added = _mesa_symbol_table_add_global_symbol(table, -1, f->name, entry);
int added = _mesa_symbol_table_add_global_symbol(table, f->name, entry);
assert(added == 0);
(void)added;
}
@@ -261,7 +264,7 @@ int glsl_symbol_table::get_default_precision_qualifier(const char *type_name)
symbol_table_entry *glsl_symbol_table::get_entry(const char *name)
{
return (symbol_table_entry *)
_mesa_symbol_table_find_symbol(table, -1, name);
_mesa_symbol_table_find_symbol(table, name);
}
void

View File

@@ -832,6 +832,12 @@ public:
unsigned implicit_sized_array:1;
/**
* Is this a non-patch TCS output / TES input array that was implicitly
* sized to gl_MaxPatchVertices?
*/
unsigned tess_varying_implicit_sized_array:1;
/**
* Whether this is a fragment shader output implicitly initialized with
* the previous contents of the specified render target at the

View File

@@ -0,0 +1,254 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/**
* \file ir_array_refcount.cpp
*
* Provides a visitor which produces a list of variables referenced.
*/
#include "ir.h"
#include "ir_visitor.h"
#include "ir_array_refcount.h"
#include "compiler/glsl_types.h"
#include "util/hash_table.h"
ir_array_refcount_visitor::ir_array_refcount_visitor()
: last_array_deref(0), derefs(0), num_derefs(0), derefs_size(0)
{
this->mem_ctx = ralloc_context(NULL);
this->ht = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
_mesa_key_pointer_equal);
}
static void
free_entry(struct hash_entry *entry)
{
ir_array_refcount_entry *ivre = (ir_array_refcount_entry *) entry->data;
delete ivre;
}
ir_array_refcount_visitor::~ir_array_refcount_visitor()
{
ralloc_free(this->mem_ctx);
_mesa_hash_table_destroy(this->ht, free_entry);
}
ir_array_refcount_entry::ir_array_refcount_entry(ir_variable *var)
: var(var), is_referenced(false)
{
num_bits = MAX2(1, var->type->arrays_of_arrays_size());
bits = new BITSET_WORD[BITSET_WORDS(num_bits)];
memset(bits, 0, BITSET_WORDS(num_bits) * sizeof(bits[0]));
/* Count the "depth" of the arrays-of-arrays. */
array_depth = 0;
for (const glsl_type *type = var->type;
type->is_array();
type = type->fields.array) {
array_depth++;
}
}
ir_array_refcount_entry::~ir_array_refcount_entry()
{
delete [] bits;
}
void
ir_array_refcount_entry::mark_array_elements_referenced(const array_deref_range *dr,
unsigned count)
{
if (count != array_depth)
return;
mark_array_elements_referenced(dr, count, 1, 0);
}
void
ir_array_refcount_entry::mark_array_elements_referenced(const array_deref_range *dr,
unsigned count,
unsigned scale,
unsigned linearized_index)
{
/* Walk through the list of array dereferences in least- to
* most-significant order. Along the way, accumulate the current
* linearized offset and the scale factor for each array-of-.
*/
for (unsigned i = 0; i < count; i++) {
if (dr[i].index < dr[i].size) {
linearized_index += dr[i].index * scale;
scale *= dr[i].size;
} else {
/* For each element in the current array, update the count and
* offset, then recurse to process the remaining arrays.
*
* There is some inefficency here if the last element in the
* array_deref_range list specifies the entire array. In that case,
* the loop will make recursive calls with count == 0. In the call,
* all that will happen is the bit will be set.
*/
for (unsigned j = 0; j < dr[i].size; j++) {
mark_array_elements_referenced(&dr[i + 1],
count - (i + 1),
scale * dr[i].size,
linearized_index + (j * scale));
}
return;
}
}
BITSET_SET(bits, linearized_index);
}
ir_array_refcount_entry *
ir_array_refcount_visitor::get_variable_entry(ir_variable *var)
{
assert(var);
struct hash_entry *e = _mesa_hash_table_search(this->ht, var);
if (e)
return (ir_array_refcount_entry *)e->data;
ir_array_refcount_entry *entry = new ir_array_refcount_entry(var);
_mesa_hash_table_insert(this->ht, var, entry);
return entry;
}
array_deref_range *
ir_array_refcount_visitor::get_array_deref()
{
if ((num_derefs + 1) * sizeof(array_deref_range) > derefs_size) {
void *ptr = reralloc_size(mem_ctx, derefs, derefs_size + 4096);
if (ptr == NULL)
return NULL;
derefs_size += 4096;
derefs = (array_deref_range *)ptr;
}
array_deref_range *d = &derefs[num_derefs];
num_derefs++;
return d;
}
ir_visitor_status
ir_array_refcount_visitor::visit_enter(ir_dereference_array *ir)
{
/* It could also be a vector or a matrix. Individual elements of vectors
* are natrices are not tracked, so bail.
*/
if (!ir->array->type->is_array())
return visit_continue;
/* If this array dereference is a child of an array dereference that was
* already visited, just continue on. Otherwise, for an arrays-of-arrays
* dereference like x[1][2][3][4], we'd process the [1][2][3][4] sequence,
* the [1][2][3] sequence, the [1][2] sequence, and the [1] sequence. This
* ensures that we only process the full sequence.
*/
if (last_array_deref && last_array_deref->array == ir) {
last_array_deref = ir;
return visit_continue;
}
last_array_deref = ir;
num_derefs = 0;
ir_rvalue *rv = ir;
while (rv->ir_type == ir_type_dereference_array) {
ir_dereference_array *const deref = rv->as_dereference_array();
assert(deref != NULL);
assert(deref->array->type->is_array());
ir_rvalue *const array = deref->array;
const ir_constant *const idx = deref->array_index->as_constant();
array_deref_range *const dr = get_array_deref();
dr->size = array->type->array_size();
if (idx != NULL) {
dr->index = idx->get_int_component(0);
} else {
/* An unsized array can occur at the end of an SSBO. We can't track
* accesses to such an array, so bail.
*/
if (array->type->array_size() == 0)
return visit_continue;
dr->index = dr->size;
}
rv = array;
}
ir_dereference_variable *const var_deref = rv->as_dereference_variable();
/* If the array being dereferenced is not a variable, bail. At the very
* least, ir_constant and ir_dereference_record are possible.
*/
if (var_deref == NULL)
return visit_continue;
ir_array_refcount_entry *const entry =
this->get_variable_entry(var_deref->var);
if (entry == NULL)
return visit_stop;
entry->mark_array_elements_referenced(derefs, num_derefs);
return visit_continue;
}
ir_visitor_status
ir_array_refcount_visitor::visit(ir_dereference_variable *ir)
{
ir_variable *const var = ir->variable_referenced();
ir_array_refcount_entry *entry = this->get_variable_entry(var);
entry->is_referenced = true;
return visit_continue;
}
ir_visitor_status
ir_array_refcount_visitor::visit_enter(ir_function_signature *ir)
{
/* We don't want to descend into the function parameters and
* dead-code eliminate them, so just accept the body here.
*/
visit_list_elements(this, &ir->body);
return visit_continue_with_parent;
}

View File

@@ -0,0 +1,183 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
/**
* \file ir_array_refcount.h
*
* Provides a visitor which produces a list of variables referenced.
*/
#include "ir.h"
#include "ir_visitor.h"
#include "compiler/glsl_types.h"
#include "util/bitset.h"
/**
* Describes an access of an array element or an access of the whole array
*/
struct array_deref_range {
/**
* Index that was accessed.
*
* All valid array indices are less than the size of the array. If index
* is equal to the size of the array, this means the entire array has been
* accessed (e.g., due to use of a non-constant index).
*/
unsigned index;
/** Size of the array. Used for offset calculations. */
unsigned size;
};
class ir_array_refcount_entry
{
public:
ir_array_refcount_entry(ir_variable *var);
~ir_array_refcount_entry();
ir_variable *var; /* The key: the variable's pointer. */
/** Has the variable been referenced? */
bool is_referenced;
/**
* Mark a set of array elements as accessed.
*
* If every \c array_deref_range is for a single index, only a single
* element will be marked. If any \c array_deref_range is for an entire
* array-of-, then multiple elements will be marked.
*
* Items in the \c array_deref_range list appear in least- to
* most-significant order. This is the \b opposite order the indices
* appear in the GLSL shader text. An array access like
*
* x = y[1][i][3];
*
* would appear as
*
* { { 3, n }, { m, m }, { 1, p } }
*
* where n, m, and p are the sizes of the arrays-of-arrays.
*
* The set of marked array elements can later be queried by
* \c ::is_linearized_index_referenced.
*
* \param dr List of array_deref_range elements to be processed.
* \param count Number of array_deref_range elements to be processed.
*/
void mark_array_elements_referenced(const array_deref_range *dr,
unsigned count);
/** Has a linearized array index been referenced? */
bool is_linearized_index_referenced(unsigned linearized_index) const
{
assert(bits != 0);
assert(linearized_index <= num_bits);
return BITSET_TEST(bits, linearized_index);
}
private:
/** Set of bit-flags to note which array elements have been accessed. */
BITSET_WORD *bits;
/**
* Total number of bits referenced by \c bits.
*
* Also the total number of array(s-of-arrays) elements of \c var.
*/
unsigned num_bits;
/** Count of nested arrays in the type. */
unsigned array_depth;
/**
* Recursive part of the public mark_array_elements_referenced method.
*
* The recursion occurs when an entire array-of- is accessed. See the
* implementation for more details.
*
* \param dr List of array_deref_range elements to be
* processed.
* \param count Number of array_deref_range elements to be
* processed.
* \param scale Current offset scale.
* \param linearized_index Current accumulated linearized array index.
*/
void mark_array_elements_referenced(const array_deref_range *dr,
unsigned count,
unsigned scale,
unsigned linearized_index);
friend class array_refcount_test;
};
class ir_array_refcount_visitor : public ir_hierarchical_visitor {
public:
ir_array_refcount_visitor(void);
~ir_array_refcount_visitor(void);
virtual ir_visitor_status visit(ir_dereference_variable *);
virtual ir_visitor_status visit_enter(ir_function_signature *);
virtual ir_visitor_status visit_enter(ir_dereference_array *);
/**
* Find variable in the hash table, and insert it if not present
*/
ir_array_refcount_entry *get_variable_entry(ir_variable *var);
/**
* Hash table mapping ir_variable to ir_array_refcount_entry.
*/
struct hash_table *ht;
void *mem_ctx;
private:
/** Get an array_deref_range element from private tracking. */
array_deref_range *get_array_deref();
/**
* Last ir_dereference_array that was visited
*
* Used to prevent some redundant calculations.
*
* \sa ::visit_enter(ir_dereference_array *)
*/
ir_dereference_array *last_array_deref;
/**
* \name array_deref_range tracking
*/
/*@{*/
/** Currently allocated block of derefs. */
array_deref_range *derefs;
/** Number of derefs used in current processing. */
unsigned num_derefs;
/** Size of the derefs buffer in bytes. */
unsigned derefs_size;
/*@}*/
};

View File

@@ -30,7 +30,7 @@
/* Operations for lower_instructions() */
#define SUB_TO_ADD_NEG 0x01
#define DIV_TO_MUL_RCP 0x02
#define FDIV_TO_MUL_RCP 0x02
#define EXP_TO_EXP2 0x04
#define POW_TO_EXP2 0x08
#define LOG_TO_LOG2 0x10
@@ -49,6 +49,8 @@
#define FIND_LSB_TO_FLOAT_CAST 0x20000
#define FIND_MSB_TO_FLOAT_CAST 0x40000
#define IMUL_HIGH_TO_MUL 0x80000
#define DDIV_TO_MUL_RCP 0x100000
#define DIV_TO_MUL_RCP (FDIV_TO_MUL_RCP | DDIV_TO_MUL_RCP)
/**
* \see class lower_packing_builtins_visitor

View File

@@ -130,14 +130,14 @@ ir_print_visitor::unique_name(ir_variable *var)
/* If there's no conflict, just use the original name */
const char* name = NULL;
if (_mesa_symbol_table_find_symbol(this->symbols, -1, var->name) == NULL) {
if (_mesa_symbol_table_find_symbol(this->symbols, var->name) == NULL) {
name = var->name;
} else {
static unsigned i = 1;
name = ralloc_asprintf(this->mem_ctx, "%s@%u", var->name, ++i);
}
_mesa_hash_table_insert(this->printable_names, var, (void *) name);
_mesa_symbol_table_add_symbol(this->symbols, -1, name, var);
_mesa_symbol_table_add_symbol(this->symbols, name, var);
return name;
}

View File

@@ -214,67 +214,98 @@ struct block {
bool has_instance_name;
};
static void process_block_array_leaf(char **name, gl_uniform_block *blocks,
ubo_visitor *parcel,
gl_uniform_buffer_variable *variables,
const struct link_uniform_block_active *const b,
unsigned *block_index,
unsigned *binding_offset,
unsigned linearized_index,
struct gl_context *ctx,
struct gl_shader_program *prog);
/**
*
* \param first_index Value of \c block_index for the first element of the
* array.
*/
static void
process_block_array(struct uniform_block_array_elements *ub_array, char **name,
size_t name_length, gl_uniform_block *blocks,
ubo_visitor *parcel, gl_uniform_buffer_variable *variables,
const struct link_uniform_block_active *const b,
unsigned *block_index, unsigned *binding_offset,
struct gl_context *ctx, struct gl_shader_program *prog)
struct gl_context *ctx, struct gl_shader_program *prog,
unsigned first_index)
{
if (ub_array) {
for (unsigned j = 0; j < ub_array->num_array_elements; j++) {
size_t new_length = name_length;
for (unsigned j = 0; j < ub_array->num_array_elements; j++) {
size_t new_length = name_length;
/* Append the subscript to the current variable name */
ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]",
ub_array->array_elements[j]);
/* Append the subscript to the current variable name */
ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]",
ub_array->array_elements[j]);
if (ub_array->array) {
process_block_array(ub_array->array, name, new_length, blocks,
parcel, variables, b, block_index,
binding_offset, ctx, prog);
binding_offset, ctx, prog, first_index);
} else {
process_block_array_leaf(name, blocks,
parcel, variables, b, block_index,
binding_offset, *block_index - first_index,
ctx, prog);
}
} else {
unsigned i = *block_index;
const glsl_type *type = b->type->without_array();
blocks[i].Name = ralloc_strdup(blocks, *name);
blocks[i].Uniforms = &variables[(*parcel).index];
/* The GL_ARB_shading_language_420pack spec says:
*
* "If the binding identifier is used with a uniform block
* instanced as an array then the first element of the array
* takes the specified block binding and each subsequent
* element takes the next consecutive uniform block binding
* point."
*/
blocks[i].Binding = (b->has_binding) ? b->binding + *binding_offset : 0;
blocks[i].UniformBufferSize = 0;
blocks[i]._Packing = gl_uniform_block_packing(type->interface_packing);
parcel->process(type, blocks[i].Name);
blocks[i].UniformBufferSize = parcel->buffer_size;
/* Check SSBO size is lower than maximum supported size for SSBO */
if (b->is_shader_storage &&
parcel->buffer_size > ctx->Const.MaxShaderStorageBlockSize) {
linker_error(prog, "shader storage block `%s' has size %d, "
"which is larger than than the maximum allowed (%d)",
b->type->name,
parcel->buffer_size,
ctx->Const.MaxShaderStorageBlockSize);
}
blocks[i].NumUniforms =
(unsigned)(ptrdiff_t)(&variables[parcel->index] - blocks[i].Uniforms);
*block_index = *block_index + 1;
*binding_offset = *binding_offset + 1;
}
}
static void
process_block_array_leaf(char **name,
gl_uniform_block *blocks,
ubo_visitor *parcel, gl_uniform_buffer_variable *variables,
const struct link_uniform_block_active *const b,
unsigned *block_index, unsigned *binding_offset,
unsigned linearized_index,
struct gl_context *ctx, struct gl_shader_program *prog)
{
unsigned i = *block_index;
const glsl_type *type = b->type->without_array();
blocks[i].Name = ralloc_strdup(blocks, *name);
blocks[i].Uniforms = &variables[(*parcel).index];
/* The GL_ARB_shading_language_420pack spec says:
*
* "If the binding identifier is used with a uniform block instanced as
* an array then the first element of the array takes the specified
* block binding and each subsequent element takes the next consecutive
* uniform block binding point."
*/
blocks[i].Binding = (b->has_binding) ? b->binding + *binding_offset : 0;
blocks[i].UniformBufferSize = 0;
blocks[i]._Packing = gl_uniform_block_packing(type->interface_packing);
blocks[i].linearized_array_index = linearized_index;
parcel->process(type, blocks[i].Name);
blocks[i].UniformBufferSize = parcel->buffer_size;
/* Check SSBO size is lower than maximum supported size for SSBO */
if (b->is_shader_storage &&
parcel->buffer_size > ctx->Const.MaxShaderStorageBlockSize) {
linker_error(prog, "shader storage block `%s' has size %d, "
"which is larger than than the maximum allowed (%d)",
b->type->name,
parcel->buffer_size,
ctx->Const.MaxShaderStorageBlockSize);
}
blocks[i].NumUniforms =
(unsigned)(ptrdiff_t)(&variables[parcel->index] - blocks[i].Uniforms);
*block_index = *block_index + 1;
*binding_offset = *binding_offset + 1;
}
/* This function resizes the array types of the block so that later we can use
* this new size to correctly calculate the offest for indirect indexing.
*/
@@ -351,7 +382,8 @@ create_buffer_blocks(void *mem_ctx, struct gl_context *ctx,
assert(b->has_instance_name);
process_block_array(b->array, &name, name_length, blocks, &parcel,
variables, b, &i, &binding_offset, ctx, prog);
variables, b, &i, &binding_offset, ctx, prog,
i);
ralloc_free(name);
} else {
blocks[i].Name = ralloc_strdup(blocks, block_type->name);

View File

@@ -28,6 +28,7 @@
#include "glsl_symbol_table.h"
#include "program.h"
#include "util/string_to_uint_map.h"
#include "ir_array_refcount.h"
/**
* \file link_uniforms.cpp
@@ -544,7 +545,7 @@ private:
const char *str_end;
while((str_start = strchr(name_copy, '[')) &&
(str_end = strchr(name_copy, ']'))) {
memmove(str_start, str_end + 1, 1 + strlen(str_end));
memmove(str_start, str_end + 1, 1 + strlen(str_end + 1));
}
unsigned index = 0;
@@ -633,6 +634,8 @@ private:
uniform->opaque[shader_type].index = this->next_subroutine;
uniform->opaque[shader_type].active = true;
prog->_LinkedShaders[shader_type]->NumSubroutineUniforms++;
/* Increment the subroutine index by 1 for non-arrays and by the
* number of array elements for arrays.
*/
@@ -880,6 +883,15 @@ public:
unsigned shader_shadow_samplers;
};
static bool
variable_is_referenced(ir_array_refcount_visitor &v, ir_variable *var)
{
ir_array_refcount_entry *const entry = v.get_variable_entry(var);
return entry->is_referenced;
}
/**
* Walks the IR and update the references to uniform blocks in the
* ir_variables to point at linked shader's list (previously, they
@@ -887,8 +899,13 @@ public:
* shaders).
*/
static void
link_update_uniform_buffer_variables(struct gl_linked_shader *shader)
link_update_uniform_buffer_variables(struct gl_linked_shader *shader,
unsigned stage)
{
ir_array_refcount_visitor v;
v.run(shader->ir);
foreach_in_list(ir_instruction, node, shader->ir) {
ir_variable *const var = node->as_variable();
@@ -898,7 +915,48 @@ link_update_uniform_buffer_variables(struct gl_linked_shader *shader)
assert(var->data.mode == ir_var_uniform ||
var->data.mode == ir_var_shader_storage);
unsigned num_blocks = var->data.mode == ir_var_uniform ?
shader->NumUniformBlocks : shader->NumShaderStorageBlocks;
struct gl_uniform_block **blks = var->data.mode == ir_var_uniform ?
shader->UniformBlocks : shader->ShaderStorageBlocks;
if (var->is_interface_instance()) {
const ir_array_refcount_entry *const entry = v.get_variable_entry(var);
if (entry->is_referenced) {
/* Since this is an interface instance, the instance type will be
* same as the array-stripped variable type. If the variable type
* is an array, then the block names will be suffixed with [0]
* through [n-1]. Unlike for non-interface instances, there will
* not be structure types here, so the only name sentinel that we
* have to worry about is [.
*/
assert(var->type->without_array() == var->get_interface_type());
const char sentinel = var->type->is_array() ? '[' : '\0';
const ptrdiff_t len = strlen(var->get_interface_type()->name);
for (unsigned i = 0; i < num_blocks; i++) {
const char *const begin = blks[i]->Name;
const char *const end = strchr(begin, sentinel);
if (end == NULL)
continue;
if (len != (end - begin))
continue;
/* Even when a match is found, do not "break" here. This could
* be an array of instances, and all elements of the array need
* to be marked as referenced.
*/
if (strncmp(begin, var->get_interface_type()->name, len) == 0 &&
(!var->type->is_array() ||
entry->is_linearized_index_referenced(blks[i]->linearized_array_index))) {
blks[i]->stageref |= 1U << stage;
}
}
}
var->data.location = 0;
continue;
}
@@ -913,11 +971,6 @@ link_update_uniform_buffer_variables(struct gl_linked_shader *shader)
sentinel = '[';
}
unsigned num_blocks = var->data.mode == ir_var_uniform ?
shader->NumUniformBlocks : shader->NumShaderStorageBlocks;
struct gl_uniform_block **blks = var->data.mode == ir_var_uniform ?
shader->UniformBlocks : shader->ShaderStorageBlocks;
const unsigned l = strlen(var->name);
for (unsigned i = 0; i < num_blocks; i++) {
for (unsigned j = 0; j < blks[i]->NumUniforms; j++) {
@@ -931,14 +984,17 @@ link_update_uniform_buffer_variables(struct gl_linked_shader *shader)
if ((ptrdiff_t) l != (end - begin))
continue;
if (strncmp(var->name, begin, l) == 0) {
found = true;
var->data.location = j;
break;
}
} else if (!strcmp(var->name, blks[i]->Uniforms[j].Name)) {
found = true;
found = strncmp(var->name, begin, l) == 0;
} else {
found = strcmp(var->name, blks[i]->Uniforms[j].Name) == 0;
}
if (found) {
var->data.location = j;
if (variable_is_referenced(v, var))
blks[i]->stageref |= 1U << stage;
break;
}
}
@@ -1260,7 +1316,7 @@ link_assign_uniform_locations(struct gl_shader_program *prog,
memset(sh->SamplerUnits, 0, sizeof(sh->SamplerUnits));
memset(sh->ImageUnits, 0, sizeof(sh->ImageUnits));
link_update_uniform_buffer_variables(sh);
link_update_uniform_buffer_variables(sh, i);
/* Reset various per-shader target counts.
*/

View File

@@ -181,7 +181,43 @@ private:
};
class array_resize_visitor : public ir_hierarchical_visitor {
/**
* A visitor helper that provides methods for updating the types of
* ir_dereferences. Classes that update variable types (say, updating
* array sizes) will want to use this so that dereference types stay in sync.
*/
class deref_type_updater : public ir_hierarchical_visitor {
public:
virtual ir_visitor_status visit(ir_dereference_variable *ir)
{
ir->type = ir->var->type;
return visit_continue;
}
virtual ir_visitor_status visit_leave(ir_dereference_array *ir)
{
const glsl_type *const vt = ir->array->type;
if (vt->is_array())
ir->type = vt->fields.array;
return visit_continue;
}
virtual ir_visitor_status visit_leave(ir_dereference_record *ir)
{
for (unsigned i = 0; i < ir->record->type->length; i++) {
const struct glsl_struct_field *field =
&ir->record->type->fields.structure[i];
if (strcmp(field->name, ir->field) == 0) {
ir->type = field->type;
break;
}
}
return visit_continue;
}
};
class array_resize_visitor : public deref_type_updater {
public:
unsigned num_vertices;
gl_shader_program *prog;
@@ -240,24 +276,6 @@ public:
return visit_continue;
}
/* Dereferences of input variables need to be updated so that their type
* matches the newly assigned type of the variable they are accessing. */
virtual ir_visitor_status visit(ir_dereference_variable *ir)
{
ir->type = ir->var->type;
return visit_continue;
}
/* Dereferences of 2D input arrays need to be updated so that their type
* matches the newly assigned type of the array they are accessing. */
virtual ir_visitor_status visit_leave(ir_dereference_array *ir)
{
const glsl_type *const vt = ir->array->type;
if (vt->is_array())
ir->type = vt->fields.array;
return visit_continue;
}
};
/**
@@ -1165,11 +1183,10 @@ interstage_cross_validate_uniform_blocks(struct gl_shader_program *prog,
if (stage_index != -1) {
struct gl_linked_shader *sh = prog->_LinkedShaders[i];
blks[j].stageref |= (1 << i);
struct gl_uniform_block **sh_blks = validate_ssbo ?
sh->ShaderStorageBlocks : sh->UniformBlocks;
blks[j].stageref |= sh_blks[stage_index]->stageref;
sh_blks[stage_index] = &blks[j];
}
}
@@ -1353,7 +1370,7 @@ move_non_declarations(exec_list *instructions, exec_node *last,
* it inside that function leads to compiler warnings with some versions of
* gcc.
*/
class array_sizing_visitor : public ir_hierarchical_visitor {
class array_sizing_visitor : public deref_type_updater {
public:
array_sizing_visitor()
: mem_ctx(ralloc_context(NULL)),
@@ -2273,6 +2290,8 @@ update_array_sizes(struct gl_shader_program *prog)
if (prog->_LinkedShaders[i] == NULL)
continue;
bool types_were_updated = false;
foreach_in_list(ir_instruction, node, prog->_LinkedShaders[i]->ir) {
ir_variable *const var = node->as_variable();
@@ -2328,11 +2347,15 @@ update_array_sizes(struct gl_shader_program *prog)
var->type = glsl_type::get_array_instance(var->type->fields.array,
size + 1);
/* FINISHME: We should update the types of array
* dereferences of this variable now.
*/
types_were_updated = true;
}
}
/* Update the types of dereferences in case we changed any. */
if (types_were_updated) {
deref_type_updater v;
v.run(prog->_LinkedShaders[i]->ir);
}
}
}
@@ -3094,7 +3117,6 @@ link_calculate_subroutine_compat(struct gl_shader_program *prog)
if (!uni)
continue;
sh->NumSubroutineUniforms++;
count = 0;
if (sh->NumSubroutineFunctions == 0) {
linker_error(prog, "subroutine uniform %s defined but no valid functions found\n", uni->type->name);
@@ -3574,6 +3596,7 @@ static gl_shader_variable *
create_shader_variable(struct gl_shader_program *shProg,
const ir_variable *in,
const char *name, const glsl_type *type,
const glsl_type *interface_type,
bool use_implicit_location, int location,
const glsl_type *outermost_struct_type)
{
@@ -3631,7 +3654,7 @@ create_shader_variable(struct gl_shader_program *shProg,
out->type = type;
out->outermost_struct_type = outermost_struct_type;
out->interface_type = in->get_interface_type();
out->interface_type = interface_type;
out->component = in->data.location_frac;
out->index = in->data.index;
out->patch = in->data.patch;
@@ -3643,8 +3666,21 @@ create_shader_variable(struct gl_shader_program *shProg,
return out;
}
static const glsl_type *
resize_to_max_patch_vertices(const struct gl_context *ctx,
const glsl_type *type)
{
if (!type)
return NULL;
return glsl_type::get_array_instance(type->fields.array,
ctx->Const.MaxPatchVertices);
}
static bool
add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
add_shader_variable(const struct gl_context *ctx,
struct gl_shader_program *shProg,
struct set *resource_set,
unsigned stage_mask,
GLenum programInterface, ir_variable *var,
const char *name, const glsl_type *type,
@@ -3673,7 +3709,7 @@ add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
for (unsigned i = 0; i < type->length; i++) {
const struct glsl_struct_field *field = &type->fields.structure[i];
char *field_name = ralloc_asprintf(shProg, "%s.%s", name, field->name);
if (!add_shader_variable(shProg, resource_set,
if (!add_shader_variable(ctx, shProg, resource_set,
stage_mask, programInterface,
var, field_name, field->type,
use_implicit_location, field_location,
@@ -3687,6 +3723,29 @@ add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
}
default: {
const glsl_type *interface_type = var->get_interface_type();
/* Unsized (non-patch) TCS output/TES input arrays are implicitly
* sized to gl_MaxPatchVertices. Internally, we shrink them to a
* smaller size.
*
* This can cause trouble with SSO programs. Since the TCS declares
* the number of output vertices, we can always shrink TCS output
* arrays. However, the TES might not be linked with a TCS, in
* which case it won't know the size of the patch. In other words,
* the TCS and TES may disagree on the (smaller) array sizes. This
* can result in the resource names differing across stages, causing
* SSO validation failures and other cascading issues.
*
* Expanding the array size to the full gl_MaxPatchVertices fixes
* these issues. It's also what program interface queries expect,
* as that is the official size of the array.
*/
if (var->data.tess_varying_implicit_sized_array) {
type = resize_to_max_patch_vertices(ctx, type);
interface_type = resize_to_max_patch_vertices(ctx, interface_type);
}
/* Issue #16 of the ARB_program_interface_query spec says:
*
* "* If a variable is a member of an interface block without an
@@ -3699,8 +3758,7 @@ add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
*/
const char *prefixed_name = (var->data.from_named_ifc_block &&
!is_gl_identifier(var->name))
? ralloc_asprintf(shProg, "%s.%s", var->get_interface_type()->name,
name)
? ralloc_asprintf(shProg, "%s.%s", interface_type->name, name)
: name;
/* The ARB_program_interface_query spec says:
@@ -3711,6 +3769,7 @@ add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
*/
gl_shader_variable *sha_v =
create_shader_variable(shProg, var, prefixed_name, type,
interface_type,
use_implicit_location, location,
outermost_struct_type);
if (!sha_v)
@@ -3723,7 +3782,8 @@ add_shader_variable(struct gl_shader_program *shProg, struct set *resource_set,
}
static bool
add_interface_variables(struct gl_shader_program *shProg,
add_interface_variables(const struct gl_context *ctx,
struct gl_shader_program *shProg,
struct set *resource_set,
unsigned stage, GLenum programInterface)
{
@@ -3774,7 +3834,7 @@ add_interface_variables(struct gl_shader_program *shProg,
(stage == MESA_SHADER_VERTEX && var->data.mode == ir_var_shader_in) ||
(stage == MESA_SHADER_FRAGMENT && var->data.mode == ir_var_shader_out);
if (!add_shader_variable(shProg, resource_set,
if (!add_shader_variable(ctx, shProg, resource_set,
1 << stage, programInterface,
var, var->name, var->type, vs_input_or_fs_output,
var->data.location - loc_bias))
@@ -3784,7 +3844,9 @@ add_interface_variables(struct gl_shader_program *shProg,
}
static bool
add_packed_varyings(struct gl_shader_program *shProg, struct set *resource_set,
add_packed_varyings(const struct gl_context *ctx,
struct gl_shader_program *shProg,
struct set *resource_set,
int stage, GLenum type)
{
struct gl_linked_shader *sh = shProg->_LinkedShaders[stage];
@@ -3810,7 +3872,7 @@ add_packed_varyings(struct gl_shader_program *shProg, struct set *resource_set,
if (type == iface) {
const int stage_mask =
build_stageref(shProg, var->name, var->data.mode);
if (!add_shader_variable(shProg, resource_set,
if (!add_shader_variable(ctx, shProg, resource_set,
stage_mask,
iface, var, var->name, var->type, false,
var->data.location - VARYING_SLOT_VAR0))
@@ -3822,7 +3884,9 @@ add_packed_varyings(struct gl_shader_program *shProg, struct set *resource_set,
}
static bool
add_fragdata_arrays(struct gl_shader_program *shProg, struct set *resource_set)
add_fragdata_arrays(const struct gl_context *ctx,
struct gl_shader_program *shProg,
struct set *resource_set)
{
struct gl_linked_shader *sh = shProg->_LinkedShaders[MESA_SHADER_FRAGMENT];
@@ -3834,7 +3898,7 @@ add_fragdata_arrays(struct gl_shader_program *shProg, struct set *resource_set)
if (var) {
assert(var->data.mode == ir_var_shader_out);
if (!add_shader_variable(shProg, resource_set,
if (!add_shader_variable(ctx, shProg, resource_set,
1 << MESA_SHADER_FRAGMENT,
GL_PROGRAM_OUTPUT, var, var->name, var->type,
true, var->data.location - FRAG_RESULT_DATA0))
@@ -4093,24 +4157,24 @@ build_program_resource_list(struct gl_context *ctx,
/* Program interface needs to expose varyings in case of SSO. */
if (shProg->SeparateShader) {
if (!add_packed_varyings(shProg, resource_set,
if (!add_packed_varyings(ctx, shProg, resource_set,
input_stage, GL_PROGRAM_INPUT))
return;
if (!add_packed_varyings(shProg, resource_set,
if (!add_packed_varyings(ctx, shProg, resource_set,
output_stage, GL_PROGRAM_OUTPUT))
return;
}
if (!add_fragdata_arrays(shProg, resource_set))
if (!add_fragdata_arrays(ctx, shProg, resource_set))
return;
/* Add inputs and outputs to the resource list. */
if (!add_interface_variables(shProg, resource_set,
if (!add_interface_variables(ctx, shProg, resource_set,
input_stage, GL_PROGRAM_INPUT))
return;
if (!add_interface_variables(shProg, resource_set,
if (!add_interface_variables(ctx, shProg, resource_set,
output_stage, GL_PROGRAM_OUTPUT))
return;
@@ -4743,14 +4807,6 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
"type of shader\n");
}
for (unsigned int i = 0; i < MESA_SHADER_STAGES; i++) {
if (prog->_LinkedShaders[i] != NULL) {
_mesa_delete_linked_shader(ctx, prog->_LinkedShaders[i]);
}
prog->_LinkedShaders[i] = NULL;
}
/* Link all shaders for a particular stage and validate the result.
*/
for (int stage = 0; stage < MESA_SHADER_STAGES; stage++) {

View File

@@ -308,12 +308,18 @@ calc_blend_result(ir_factory f,
f.emit(assign(dst_alpha, swizzle_w(fb)));
f.emit(if_tree(equal(dst_alpha, imm1(0)),
assign(dst_rgb, imm3(0)),
assign(dst_rgb, div(swizzle_xyz(fb), dst_alpha))));
assign(dst_rgb, csel(equal(swizzle_xyz(fb),
swizzle(fb, SWIZZLE_WWWW, 3)),
imm3(1),
div(swizzle_xyz(fb), dst_alpha)))));
f.emit(assign(src_alpha, swizzle_w(src)));
f.emit(if_tree(equal(src_alpha, imm1(0)),
assign(src_rgb, imm3(0)),
assign(src_rgb, div(swizzle_xyz(src), src_alpha))));
assign(src_rgb, csel(equal(swizzle_xyz(src),
swizzle(src, SWIZZLE_WWWW, 3)),
imm3(1),
div(swizzle_xyz(src), src_alpha)))));
ir_variable *factor = f.make_temp(glsl_type::vec3_type, "__blend_factor");

View File

@@ -54,8 +54,8 @@
* want to recognize add(op0, neg(op1)) or the other way around to
* produce a subtract anyway.
*
* DIV_TO_MUL_RCP and INT_DIV_TO_MUL_RCP:
* --------------------------------------
* FDIV_TO_MUL_RCP, DDIV_TO_MUL_RCP, and INT_DIV_TO_MUL_RCP:
* ---------------------------------------------------------
* Breaks an ir_binop_div expression down to op0 * (rcp(op1)).
*
* Many GPUs don't have a divide instruction (945 and 965 included),
@@ -63,9 +63,11 @@
* reciprocal. By breaking the operation down, constant reciprocals
* can get constant folded.
*
* DIV_TO_MUL_RCP only lowers floating point division; INT_DIV_TO_MUL_RCP
* handles the integer case, converting to and from floating point so that
* RCP is possible.
* FDIV_TO_MUL_RCP only lowers single-precision floating point division;
* DDIV_TO_MUL_RCP only lowers double-precision floating point division.
* DIV_TO_MUL_RCP is a convenience macro that sets both flags.
* INT_DIV_TO_MUL_RCP handles the integer case, converting to and from floating
* point so that RCP is possible.
*
* EXP_TO_EXP2 and LOG_TO_LOG2:
* ----------------------------
@@ -326,7 +328,8 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
/* Don't generate new IR that would need to be lowered in an additional
* pass.
*/
if (lowering(DIV_TO_MUL_RCP) && (ir->type->is_float() || ir->type->is_double()))
if ((lowering(FDIV_TO_MUL_RCP) && ir->type->is_float()) ||
(lowering(DDIV_TO_MUL_RCP) && ir->type->is_double()))
div_to_mul_rcp(div_expr);
ir_expression *const floor_expr =
@@ -1588,8 +1591,8 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
case ir_binop_div:
if (ir->operands[1]->type->is_integer() && lowering(INT_DIV_TO_MUL_RCP))
int_div_to_mul_rcp(ir);
else if ((ir->operands[1]->type->is_float() ||
ir->operands[1]->type->is_double()) && lowering(DIV_TO_MUL_RCP))
else if ((ir->operands[1]->type->is_float() && lowering(FDIV_TO_MUL_RCP)) ||
(ir->operands[1]->type->is_double() && lowering(DDIV_TO_MUL_RCP)))
div_to_mul_rcp(ir);
break;

View File

@@ -193,6 +193,8 @@ flatten_named_interface_blocks_declarations::run(exec_list *instructions)
new_var->data.patch = iface_t->fields.structure[i].patch;
new_var->data.stream = var->data.stream;
new_var->data.how_declared = var->data.how_declared;
new_var->data.tess_varying_implicit_sized_array =
var->data.tess_varying_implicit_sized_array;
new_var->data.from_named_ifc_block = 1;
new_var->init_interface_type(var->type);

View File

@@ -157,7 +157,6 @@ ir_visitor_status
output_read_remover::visit_leave(ir_emit_vertex *ir)
{
hash_table_call_foreach(replacements, emit_return_copy, ir);
_mesa_hash_table_clear(replacements, NULL);
return visit_continue;
}

View File

@@ -107,7 +107,6 @@ public:
struct gl_linked_shader *shader;
bool clamp_block_indices;
struct gl_uniform_buffer_variable *ubo_var;
const struct glsl_struct_field *struct_field;
ir_variable *variable;
ir_rvalue *uniform_block;
@@ -308,8 +307,11 @@ lower_ubo_reference_visitor::setup_for_load_or_store(void *mem_ctx,
this->uniform_block = index;
}
this->ubo_var = var->is_interface_instance()
? &blocks[i]->Uniforms[0] : &blocks[i]->Uniforms[var->data.location];
if (var->is_interface_instance()) {
*const_offset = 0;
} else {
*const_offset = blocks[i]->Uniforms[var->data.location].Offset;
}
break;
}
@@ -317,8 +319,6 @@ lower_ubo_reference_visitor::setup_for_load_or_store(void *mem_ctx,
assert(this->uniform_block);
*const_offset = ubo_var->Offset;
this->struct_field = NULL;
setup_buffer_access(mem_ctx, deref, offset, const_offset, row_major,
matrix_columns, &this->struct_field, packing);

View File

@@ -128,7 +128,7 @@ ir_call::generate_inline(ir_instruction *next_ir)
parameters[i] = NULL;
} else {
parameters[i] = sig_param->clone(ctx, ht);
parameters[i]->data.mode = ir_var_auto;
parameters[i]->data.mode = ir_var_temporary;
/* Remove the read-only decoration because we're going to write
* directly to this variable. If the cloned variable is left

View File

@@ -355,7 +355,7 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, minmax_range baserange)
*/
if (!is_redundant && limits[i].low && baserange.high) {
cr = compare_components(limits[i].low, baserange.high);
if (cr >= EQUAL && cr != MIXED)
if (cr > EQUAL && cr != MIXED)
is_redundant = true;
}
} else {
@@ -373,7 +373,7 @@ ir_minmax_visitor::prune_expression(ir_expression *expr, minmax_range baserange)
*/
if (!is_redundant && limits[i].high && baserange.low) {
cr = compare_components(limits[i].high, baserange.low);
if (cr <= EQUAL)
if (cr < EQUAL)
is_redundant = true;
}
}

View File

@@ -421,7 +421,7 @@ standalone_compile_shader(const struct standalone_options *_options,
}
if ((status == EXIT_SUCCESS) && options->do_link) {
_mesa_clear_shader_program_data(whole_program);
_mesa_clear_shader_program_data(ctx, whole_program);
link_shaders(ctx, whole_program);
status = (whole_program->LinkStatus) ? EXIT_SUCCESS : EXIT_FAILURE;

View File

@@ -123,8 +123,16 @@ _mesa_delete_linked_shader(struct gl_context *ctx,
}
void
_mesa_clear_shader_program_data(struct gl_shader_program *shProg)
_mesa_clear_shader_program_data(struct gl_context *ctx,
struct gl_shader_program *shProg)
{
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
if (shProg->_LinkedShaders[i] != NULL) {
_mesa_delete_linked_shader(ctx, shProg->_LinkedShaders[i]);
shProg->_LinkedShaders[i] = NULL;
}
}
shProg->NumUniformStorage = 0;
shProg->UniformStorage = NULL;
shProg->NumUniformRemapTable = 0;

View File

@@ -56,7 +56,8 @@ _mesa_delete_linked_shader(struct gl_context *ctx,
struct gl_linked_shader *sh);
extern "C" void
_mesa_clear_shader_program_data(struct gl_shader_program *);
_mesa_clear_shader_program_data(struct gl_context *ctx,
struct gl_shader_program *);
extern "C" void
_mesa_shader_debug(struct gl_context *ctx, GLenum type, GLuint *id,

View File

@@ -0,0 +1,717 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*/
#include <gtest/gtest.h>
#include "ir.h"
#include "ir_array_refcount.h"
#include "ir_builder.h"
#include "util/hash_table.h"
using namespace ir_builder;
class array_refcount_test : public ::testing::Test {
public:
virtual void SetUp();
virtual void TearDown();
exec_list instructions;
ir_factory *body;
void *mem_ctx;
/**
* glsl_type for a vec4[3][4][5].
*
* The exceptionally verbose name is picked because it matches the syntax
* of http://cdecl.org/.
*/
const glsl_type *array_3_of_array_4_of_array_5_of_vec4;
/**
* glsl_type for a int[3].
*
* The exceptionally verbose name is picked because it matches the syntax
* of http://cdecl.org/.
*/
const glsl_type *array_3_of_int;
/**
* Wrapper to access private member "bits" of ir_array_refcount_entry
*
* The test class is a friend to ir_array_refcount_entry, but the
* individual tests are not part of the class. Since the friendliness of
* the test class does not extend to the tests, provide a wrapper.
*/
const BITSET_WORD *get_bits(const ir_array_refcount_entry &entry)
{
return entry.bits;
}
/**
* Wrapper to access private member "num_bits" of ir_array_refcount_entry
*
* The test class is a friend to ir_array_refcount_entry, but the
* individual tests are not part of the class. Since the friendliness of
* the test class does not extend to the tests, provide a wrapper.
*/
unsigned get_num_bits(const ir_array_refcount_entry &entry)
{
return entry.num_bits;
}
/**
* Wrapper to access private member "array_depth" of ir_array_refcount_entry
*
* The test class is a friend to ir_array_refcount_entry, but the
* individual tests are not part of the class. Since the friendliness of
* the test class does not extend to the tests, provide a wrapper.
*/
unsigned get_array_depth(const ir_array_refcount_entry &entry)
{
return entry.array_depth;
}
};
void
array_refcount_test::SetUp()
{
mem_ctx = ralloc_context(NULL);
instructions.make_empty();
body = new ir_factory(&instructions, mem_ctx);
/* The type of vec4 x[3][4][5]; */
const glsl_type *const array_5_of_vec4 =
glsl_type::get_array_instance(glsl_type::vec4_type, 5);
const glsl_type *const array_4_of_array_5_of_vec4 =
glsl_type::get_array_instance(array_5_of_vec4, 4);
array_3_of_array_4_of_array_5_of_vec4 =
glsl_type::get_array_instance(array_4_of_array_5_of_vec4, 3);
array_3_of_int = glsl_type::get_array_instance(glsl_type::int_type, 3);
}
void
array_refcount_test::TearDown()
{
delete body;
body = NULL;
ralloc_free(mem_ctx);
mem_ctx = NULL;
}
static operand
deref_array(operand array, operand index)
{
void *mem_ctx = ralloc_parent(array.val);
ir_rvalue *val = new(mem_ctx) ir_dereference_array(array.val, index.val);
return operand(val);
}
static operand
deref_struct(operand s, const char *field)
{
void *mem_ctx = ralloc_parent(s.val);
ir_rvalue *val = new(mem_ctx) ir_dereference_record(s.val, field);
return operand(val);
}
/**
* Verify that only the specified set of ir_variables exists in the hash table
*/
static void
validate_variables_in_hash_table(struct hash_table *ht,
unsigned count,
...)
{
ir_variable **vars = new ir_variable *[count];
va_list args;
/* Make a copy of the list of expected ir_variables. The copied list can
* be modified during the checking.
*/
va_start(args, count);
for (unsigned i = 0; i < count; i++)
vars[i] = va_arg(args, ir_variable *);
va_end(args);
struct hash_entry *entry;
hash_table_foreach(ht, entry) {
const ir_instruction *const ir = (ir_instruction *) entry->key;
const ir_variable *const v = ir->as_variable();
if (v == NULL) {
ADD_FAILURE() << "Invalid junk in hash table: ir_type = "
<< ir->ir_type << ", address = "
<< (void *) ir;
continue;
}
unsigned i;
for (i = 0; i < count; i++) {
if (vars[i] == NULL)
continue;
if (vars[i] == v)
break;
}
if (i == count) {
ADD_FAILURE() << "Invalid variable in hash table: \""
<< v->name << "\"";
} else {
/* As each variable is encountered, remove it from the set. Don't
* bother compacting the set because we don't care about
* performance here.
*/
vars[i] = NULL;
}
}
/* Check that there's nothing left in the set. */
for (unsigned i = 0; i < count; i++) {
if (vars[i] != NULL) {
ADD_FAILURE() << "Variable was not in the hash table: \""
<< vars[i]->name << "\"";
}
}
delete [] vars;
}
TEST_F(array_refcount_test, ir_array_refcount_entry_initial_state_for_scalar)
{
ir_variable *const var =
new(mem_ctx) ir_variable(glsl_type::int_type, "a", ir_var_auto);
ir_array_refcount_entry entry(var);
ASSERT_NE((void *)0, get_bits(entry));
EXPECT_FALSE(entry.is_referenced);
EXPECT_EQ(1, get_num_bits(entry));
EXPECT_EQ(0, get_array_depth(entry));
EXPECT_FALSE(entry.is_linearized_index_referenced(0));
}
TEST_F(array_refcount_test, ir_array_refcount_entry_initial_state_for_vector)
{
ir_variable *const var =
new(mem_ctx) ir_variable(glsl_type::vec4_type, "a", ir_var_auto);
ir_array_refcount_entry entry(var);
ASSERT_NE((void *)0, get_bits(entry));
EXPECT_FALSE(entry.is_referenced);
EXPECT_EQ(1, get_num_bits(entry));
EXPECT_EQ(0, get_array_depth(entry));
EXPECT_FALSE(entry.is_linearized_index_referenced(0));
}
TEST_F(array_refcount_test, ir_array_refcount_entry_initial_state_for_matrix)
{
ir_variable *const var =
new(mem_ctx) ir_variable(glsl_type::mat4_type, "a", ir_var_auto);
ir_array_refcount_entry entry(var);
ASSERT_NE((void *)0, get_bits(entry));
EXPECT_FALSE(entry.is_referenced);
EXPECT_EQ(1, get_num_bits(entry));
EXPECT_EQ(0, get_array_depth(entry));
EXPECT_FALSE(entry.is_linearized_index_referenced(0));
}
TEST_F(array_refcount_test, ir_array_refcount_entry_initial_state_for_array)
{
ir_variable *const var =
new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"a",
ir_var_auto);
const unsigned total_elements = var->type->arrays_of_arrays_size();
ir_array_refcount_entry entry(var);
ASSERT_NE((void *)0, get_bits(entry));
EXPECT_FALSE(entry.is_referenced);
EXPECT_EQ(total_elements, get_num_bits(entry));
EXPECT_EQ(3, get_array_depth(entry));
for (unsigned i = 0; i < total_elements; i++)
EXPECT_FALSE(entry.is_linearized_index_referenced(i)) << "index = " << i;
}
TEST_F(array_refcount_test, mark_array_elements_referenced_simple)
{
ir_variable *const var =
new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"a",
ir_var_auto);
const unsigned total_elements = var->type->arrays_of_arrays_size();
ir_array_refcount_entry entry(var);
static const array_deref_range dr[] = {
{ 0, 5 }, { 1, 4 }, { 2, 3 }
};
const unsigned accessed_element = 0 + (1 * 5) + (2 * 4 * 5);
entry.mark_array_elements_referenced(dr, 3);
for (unsigned i = 0; i < total_elements; i++)
EXPECT_EQ(i == accessed_element, entry.is_linearized_index_referenced(i));
}
TEST_F(array_refcount_test, mark_array_elements_referenced_whole_first_array)
{
ir_variable *const var =
new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"a",
ir_var_auto);
ir_array_refcount_entry entry(var);
static const array_deref_range dr[] = {
{ 0, 5 }, { 1, 4 }, { 3, 3 }
};
entry.mark_array_elements_referenced(dr, 3);
for (unsigned i = 0; i < 3; i++) {
for (unsigned j = 0; j < 4; j++) {
for (unsigned k = 0; k < 5; k++) {
const bool accessed = (j == 1) && (k == 0);
const unsigned linearized_index = k + (j * 5) + (i * 4 * 5);
EXPECT_EQ(accessed,
entry.is_linearized_index_referenced(linearized_index));
}
}
}
}
TEST_F(array_refcount_test, mark_array_elements_referenced_whole_second_array)
{
ir_variable *const var =
new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"a",
ir_var_auto);
ir_array_refcount_entry entry(var);
static const array_deref_range dr[] = {
{ 0, 5 }, { 4, 4 }, { 1, 3 }
};
entry.mark_array_elements_referenced(dr, 3);
for (unsigned i = 0; i < 3; i++) {
for (unsigned j = 0; j < 4; j++) {
for (unsigned k = 0; k < 5; k++) {
const bool accessed = (i == 1) && (k == 0);
const unsigned linearized_index = k + (j * 5) + (i * 4 * 5);
EXPECT_EQ(accessed,
entry.is_linearized_index_referenced(linearized_index));
}
}
}
}
TEST_F(array_refcount_test, mark_array_elements_referenced_whole_third_array)
{
ir_variable *const var =
new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"a",
ir_var_auto);
ir_array_refcount_entry entry(var);
static const array_deref_range dr[] = {
{ 5, 5 }, { 2, 4 }, { 1, 3 }
};
entry.mark_array_elements_referenced(dr, 3);
for (unsigned i = 0; i < 3; i++) {
for (unsigned j = 0; j < 4; j++) {
for (unsigned k = 0; k < 5; k++) {
const bool accessed = (i == 1) && (j == 2);
const unsigned linearized_index = k + (j * 5) + (i * 4 * 5);
EXPECT_EQ(accessed,
entry.is_linearized_index_referenced(linearized_index));
}
}
}
}
TEST_F(array_refcount_test, mark_array_elements_referenced_whole_first_and_third_arrays)
{
ir_variable *const var =
new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"a",
ir_var_auto);
ir_array_refcount_entry entry(var);
static const array_deref_range dr[] = {
{ 5, 5 }, { 3, 4 }, { 3, 3 }
};
entry.mark_array_elements_referenced(dr, 3);
for (unsigned i = 0; i < 3; i++) {
for (unsigned j = 0; j < 4; j++) {
for (unsigned k = 0; k < 5; k++) {
const bool accessed = (j == 3);
const unsigned linearized_index = k + (j * 5) + (i * 4 * 5);
EXPECT_EQ(accessed,
entry.is_linearized_index_referenced(linearized_index));
}
}
}
}
TEST_F(array_refcount_test, do_not_process_vector_indexing)
{
/* Vectors and matrices can also be indexed in much the same manner as
* arrays. The visitor should not try to track per-element accesses to
* these types.
*/
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::float_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(glsl_type::int_type,
"b",
ir_var_auto);
ir_variable *var_c = new(mem_ctx) ir_variable(glsl_type::vec4_type,
"c",
ir_var_auto);
body->emit(assign(var_a, deref_array(var_c, var_b)));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
ir_array_refcount_entry *entry_a = v.get_variable_entry(var_a);
ir_array_refcount_entry *entry_b = v.get_variable_entry(var_b);
ir_array_refcount_entry *entry_c = v.get_variable_entry(var_c);
EXPECT_TRUE(entry_a->is_referenced);
EXPECT_TRUE(entry_b->is_referenced);
EXPECT_TRUE(entry_c->is_referenced);
/* As validated by previous tests, for non-array types, num_bits is 1. */
ASSERT_EQ(1, get_num_bits(*entry_c));
EXPECT_FALSE(entry_c->is_linearized_index_referenced(0));
}
TEST_F(array_refcount_test, do_not_process_matrix_indexing)
{
/* Vectors and matrices can also be indexed in much the same manner as
* arrays. The visitor should not try to track per-element accesses to
* these types.
*/
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::vec4_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(glsl_type::int_type,
"b",
ir_var_auto);
ir_variable *var_c = new(mem_ctx) ir_variable(glsl_type::mat4_type,
"c",
ir_var_auto);
body->emit(assign(var_a, deref_array(var_c, var_b)));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
ir_array_refcount_entry *entry_a = v.get_variable_entry(var_a);
ir_array_refcount_entry *entry_b = v.get_variable_entry(var_b);
ir_array_refcount_entry *entry_c = v.get_variable_entry(var_c);
EXPECT_TRUE(entry_a->is_referenced);
EXPECT_TRUE(entry_b->is_referenced);
EXPECT_TRUE(entry_c->is_referenced);
/* As validated by previous tests, for non-array types, num_bits is 1. */
ASSERT_EQ(1, get_num_bits(*entry_c));
EXPECT_FALSE(entry_c->is_linearized_index_referenced(0));
}
TEST_F(array_refcount_test, do_not_process_array_inside_structure)
{
/* Structures can contain arrays. The visitor should not try to track
* per-element accesses to arrays contained inside structures.
*/
const glsl_struct_field fields[] = {
glsl_struct_field(array_3_of_int, "i"),
};
const glsl_type *const record_of_array_3_of_int =
glsl_type::get_record_instance(fields, ARRAY_SIZE(fields), "S");
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::int_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(record_of_array_3_of_int,
"b",
ir_var_auto);
/* a = b.i[2] */
body->emit(assign(var_a,
deref_array(
deref_struct(var_b, "i"),
body->constant(int(2)))));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
ir_array_refcount_entry *entry_a = v.get_variable_entry(var_a);
ir_array_refcount_entry *entry_b = v.get_variable_entry(var_b);
EXPECT_TRUE(entry_a->is_referenced);
EXPECT_TRUE(entry_b->is_referenced);
ASSERT_EQ(1, get_num_bits(*entry_b));
EXPECT_FALSE(entry_b->is_linearized_index_referenced(0));
validate_variables_in_hash_table(v.ht, 2, var_a, var_b);
}
TEST_F(array_refcount_test, visit_simple_indexing)
{
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::vec4_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"b",
ir_var_auto);
/* a = b[2][1][0] */
body->emit(assign(var_a,
deref_array(
deref_array(
deref_array(var_b, body->constant(int(2))),
body->constant(int(1))),
body->constant(int(0)))));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
const unsigned accessed_element = 0 + (1 * 5) + (2 * 4 * 5);
ir_array_refcount_entry *entry_b = v.get_variable_entry(var_b);
const unsigned total_elements = var_b->type->arrays_of_arrays_size();
for (unsigned i = 0; i < total_elements; i++)
EXPECT_EQ(i == accessed_element, entry_b->is_linearized_index_referenced(i)) <<
"i = " << i;
validate_variables_in_hash_table(v.ht, 2, var_a, var_b);
}
TEST_F(array_refcount_test, visit_whole_second_array_indexing)
{
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::vec4_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"b",
ir_var_auto);
ir_variable *var_i = new(mem_ctx) ir_variable(glsl_type::int_type,
"i",
ir_var_auto);
/* a = b[2][i][1] */
body->emit(assign(var_a,
deref_array(
deref_array(
deref_array(var_b, body->constant(int(2))),
var_i),
body->constant(int(1)))));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
ir_array_refcount_entry *const entry_b = v.get_variable_entry(var_b);
for (unsigned i = 0; i < 3; i++) {
for (unsigned j = 0; j < 4; j++) {
for (unsigned k = 0; k < 5; k++) {
const bool accessed = (i == 2) && (k == 1);
const unsigned linearized_index = k + (j * 5) + (i * 4 * 5);
EXPECT_EQ(accessed,
entry_b->is_linearized_index_referenced(linearized_index)) <<
"i = " << i;
}
}
}
validate_variables_in_hash_table(v.ht, 3, var_a, var_b, var_i);
}
TEST_F(array_refcount_test, visit_array_indexing_an_array)
{
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::vec4_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(array_3_of_array_4_of_array_5_of_vec4,
"b",
ir_var_auto);
ir_variable *var_c = new(mem_ctx) ir_variable(array_3_of_int,
"c",
ir_var_auto);
ir_variable *var_i = new(mem_ctx) ir_variable(glsl_type::int_type,
"i",
ir_var_auto);
/* a = b[2][3][c[i]] */
body->emit(assign(var_a,
deref_array(
deref_array(
deref_array(var_b, body->constant(int(2))),
body->constant(int(3))),
deref_array(var_c, var_i))));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
ir_array_refcount_entry *const entry_b = v.get_variable_entry(var_b);
for (unsigned i = 0; i < 3; i++) {
for (unsigned j = 0; j < 4; j++) {
for (unsigned k = 0; k < 5; k++) {
const bool accessed = (i == 2) && (j == 3);
const unsigned linearized_index = k + (j * 5) + (i * 4 * 5);
EXPECT_EQ(accessed,
entry_b->is_linearized_index_referenced(linearized_index)) <<
"array b[" << i << "][" << j << "][" << k << "], " <<
"linear index = " << linearized_index;
}
}
}
ir_array_refcount_entry *const entry_c = v.get_variable_entry(var_c);
for (unsigned i = 0; i < var_c->type->array_size(); i++) {
EXPECT_EQ(true, entry_c->is_linearized_index_referenced(i)) <<
"array c, i = " << i;
}
validate_variables_in_hash_table(v.ht, 4, var_a, var_b, var_c, var_i);
}
TEST_F(array_refcount_test, visit_array_indexing_with_itself)
{
const glsl_type *const array_2_of_array_3_of_int =
glsl_type::get_array_instance(array_3_of_int, 2);
const glsl_type *const array_2_of_array_2_of_array_3_of_int =
glsl_type::get_array_instance(array_2_of_array_3_of_int, 2);
ir_variable *var_a = new(mem_ctx) ir_variable(glsl_type::int_type,
"a",
ir_var_auto);
ir_variable *var_b = new(mem_ctx) ir_variable(array_2_of_array_2_of_array_3_of_int,
"b",
ir_var_auto);
/* Given GLSL code:
*
* int b[2][2][3];
* a = b[ b[0][0][0] ][ b[ b[0][1][0] ][ b[1][0][0] ][1] ][2]
*
* b[0][0][0], b[0][1][0], and b[1][0][0] are trivially accessed.
*
* b[*][*][1] and b[*][*][2] are accessed.
*
* Only b[1][1][0] is not accessed.
*/
operand b000 = deref_array(
deref_array(
deref_array(var_b, body->constant(int(0))),
body->constant(int(0))),
body->constant(int(0)));
operand b010 = deref_array(
deref_array(
deref_array(var_b, body->constant(int(0))),
body->constant(int(1))),
body->constant(int(0)));
operand b100 = deref_array(
deref_array(
deref_array(var_b, body->constant(int(1))),
body->constant(int(0))),
body->constant(int(0)));
operand b_b010_b100_1 = deref_array(
deref_array(
deref_array(var_b, b010),
b100),
body->constant(int(1)));
body->emit(assign(var_a,
deref_array(
deref_array(
deref_array(var_b, b000),
b_b010_b100_1),
body->constant(int(2)))));
ir_array_refcount_visitor v;
visit_list_elements(&v, &instructions);
ir_array_refcount_entry *const entry_b = v.get_variable_entry(var_b);
for (unsigned i = 0; i < 2; i++) {
for (unsigned j = 0; j < 2; j++) {
for (unsigned k = 0; k < 3; k++) {
const bool accessed = !(i == 1 && j == 1 && k == 0);
const unsigned linearized_index = k + (j * 3) + (i * 2 * 3);
EXPECT_EQ(accessed,
entry_b->is_linearized_index_referenced(linearized_index)) <<
"array b[" << i << "][" << j << "][" << k << "], " <<
"linear index = " << linearized_index;
}
}
}
validate_variables_in_hash_table(v.ht, 2, var_a, var_b);
}

View File

@@ -2625,6 +2625,8 @@ bool nir_opt_remove_phis(nir_shader *shader);
bool nir_opt_undef(nir_shader *shader);
bool nir_opt_conditional_discard(nir_shader *shader);
void nir_sweep(nir_shader *shader);
nir_intrinsic_op nir_intrinsic_from_system_value(gl_system_value val);

View File

@@ -272,6 +272,26 @@ lower_interp_var_at_offset(lower_wpos_ytransform_state *state,
flip_y)));
}
static void
lower_load_sample_pos(lower_wpos_ytransform_state *state,
nir_intrinsic_instr *intr)
{
nir_builder *b = &state->b;
b->cursor = nir_after_instr(&intr->instr);
nir_ssa_def *pos = &intr->dest.ssa;
nir_ssa_def *scale = nir_channel(b, get_transform(state), 0);
nir_ssa_def *neg_scale = nir_channel(b, get_transform(state), 2);
/* Either y or 1-y for scale equal to 1 or -1 respectively. */
nir_ssa_def *flipped_y =
nir_fadd(b, nir_fmax(b, neg_scale, nir_imm_float(b, 0.0)),
nir_fmul(b, nir_channel(b, pos, 1), scale));
nir_ssa_def *flipped_pos = nir_vec2(b, nir_channel(b, pos, 0), flipped_y);
nir_ssa_def_rewrite_uses_after(&intr->dest.ssa, nir_src_for_ssa(flipped_pos),
flipped_pos->parent_instr);
}
static void
lower_wpos_ytransform_block(lower_wpos_ytransform_state *state, nir_block *block)
{
@@ -287,6 +307,10 @@ lower_wpos_ytransform_block(lower_wpos_ytransform_state *state, nir_block *block
/* gl_FragCoord should not have array/struct deref's: */
assert(dvar->deref.child == NULL);
lower_fragcoord(state, intr);
} else if (var->data.mode == nir_var_system_value &&
var->data.location == SYSTEM_VALUE_SAMPLE_POS) {
assert(dvar->deref.child == NULL);
lower_load_sample_pos(state, intr);
}
} else if (intr->intrinsic == nir_intrinsic_interp_var_at_offset) {
lower_interp_var_at_offset(state, intr);

View File

@@ -0,0 +1,125 @@
/*
* Copyright © 2016 Red Hat
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "nir.h"
#include "nir_builder.h"
/** @file nir_opt_conditional_discard.c
*
* Handles optimization of lowering if (cond) discard to discard_if(cond).
*/
static bool
nir_opt_conditional_discard_block(nir_block *block, void *mem_ctx)
{
nir_builder bld;
if (nir_cf_node_is_first(&block->cf_node))
return false;
nir_cf_node *prev_node = nir_cf_node_prev(&block->cf_node);
if (prev_node->type != nir_cf_node_if)
return false;
nir_if *if_stmt = nir_cf_node_as_if(prev_node);
nir_block *then_block = nir_if_first_then_block(if_stmt);
nir_block *else_block = nir_if_first_else_block(if_stmt);
/* check there is only one else block and it is empty */
if (nir_if_last_else_block(if_stmt) != else_block)
return false;
if (!exec_list_is_empty(&else_block->instr_list))
return false;
/* check there is only one then block and it has only one instruction in it */
if (nir_if_last_then_block(if_stmt) != then_block)
return false;
if (exec_list_is_empty(&then_block->instr_list))
return false;
if (exec_list_length(&then_block->instr_list) > 1)
return false;
/*
* make sure no subsequent phi nodes point at this if.
*/
nir_block *after = nir_cf_node_as_block(nir_cf_node_next(&if_stmt->cf_node));
nir_foreach_instr_safe(instr, after) {
if (instr->type != nir_instr_type_phi)
break;
nir_phi_instr *phi = nir_instr_as_phi(instr);
nir_foreach_phi_src(phi_src, phi) {
if (phi_src->pred == then_block ||
phi_src->pred == else_block)
return false;
}
}
/* Get the first instruction in the then block and confirm it is
* a discard or a discard_if
*/
nir_instr *instr = nir_block_first_instr(then_block);
if (instr->type != nir_instr_type_intrinsic)
return false;
nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
if (intrin->intrinsic != nir_intrinsic_discard &&
intrin->intrinsic != nir_intrinsic_discard_if)
return false;
nir_src cond;
nir_builder_init(&bld, mem_ctx);
bld.cursor = nir_before_cf_node(prev_node);
if (intrin->intrinsic == nir_intrinsic_discard)
cond = if_stmt->condition;
else
cond = nir_src_for_ssa(nir_iand(&bld,
nir_ssa_for_src(&bld, if_stmt->condition, 1),
nir_ssa_for_src(&bld, intrin->src[0], 1)));
nir_intrinsic_instr *discard_if =
nir_intrinsic_instr_create(mem_ctx, nir_intrinsic_discard_if);
nir_src_copy(&discard_if->src[0], &cond, discard_if);
nir_instr_insert_before_cf(prev_node, &discard_if->instr);
nir_instr_remove(&intrin->instr);
nir_cf_node_remove(&if_stmt->cf_node);
return true;
}
bool
nir_opt_conditional_discard(nir_shader *shader)
{
bool progress = false;
nir_foreach_function(function, shader) {
if (function->impl) {
void *mem_ctx = ralloc_parent(function->impl);
nir_foreach_block_safe(block, function->impl) {
progress |= nir_opt_conditional_discard_block(block, mem_ctx);
}
}
}
return progress;
}

View File

@@ -86,17 +86,15 @@ opt_undef_vecN(nir_builder *b, nir_alu_instr *alu)
assert(alu->dest.dest.is_ssa);
unsigned num_components = nir_op_infos[alu->op].num_inputs;
for (unsigned i = 0; i < num_components; i++) {
for (unsigned i = 0; i < nir_op_infos[alu->op].num_inputs; i++) {
if (!alu->src[i].src.is_ssa ||
alu->src[i].src.ssa->parent_instr->type != nir_instr_type_ssa_undef)
return false;
}
b->cursor = nir_before_instr(&alu->instr);
nir_ssa_def *undef =
nir_ssa_undef(b, num_components, nir_dest_bit_size(alu->dest.dest));
nir_ssa_def *undef = nir_ssa_undef(b, alu->dest.dest.ssa.num_components,
nir_dest_bit_size(alu->dest.dest));
nir_ssa_def_rewrite_uses(&alu->dest.dest.ssa, nir_src_for_ssa(undef));
return true;

View File

@@ -98,6 +98,16 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
{
uint8_t new_swizzle[4];
/* Searching only works on SSA values because, if it's not SSA, we can't
* know if the value changed between one instance of that value in the
* expression and another. Also, the replace operation will place reads of
* that value right before the last instruction in the expression we're
* replacing so those reads will happen after the original reads and may
* not be valid if they're register reads.
*/
if (!instr->src[src].src.is_ssa)
return false;
/* If the source is an explicitly sized source, then we need to reset
* both the number of components and the swizzle.
*/
@@ -116,9 +126,6 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
switch (value->type) {
case nir_search_value_expression:
if (!instr->src[src].src.is_ssa)
return false;
if (instr->src[src].src.ssa->parent_instr->type != nir_instr_type_alu)
return false;
@@ -131,8 +138,7 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
assert(var->variable < NIR_SEARCH_MAX_VARIABLES);
if (state->variables_seen & (1 << var->variable)) {
if (!nir_srcs_equal(state->variables[var->variable].src,
instr->src[src].src))
if (state->variables[var->variable].src.ssa != instr->src[src].src.ssa)
return false;
assert(!instr->src[src].abs && !instr->src[src].negate);
@@ -204,43 +210,27 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
return true;
case nir_type_int:
for (unsigned i = 0; i < num_components; ++i) {
int64_t val;
switch (load->def.bit_size) {
case 32:
val = load->value.i32[new_swizzle[i]];
break;
case 64:
val = load->value.i64[new_swizzle[i]];
break;
default:
unreachable("unknown bit size");
}
if (val != const_val->data.i)
return false;
}
return true;
case nir_type_uint:
case nir_type_bool32:
for (unsigned i = 0; i < num_components; ++i) {
uint64_t val;
switch (load->def.bit_size) {
case 32:
val = load->value.u32[new_swizzle[i]];
break;
case 64:
val = load->value.u64[new_swizzle[i]];
break;
default:
unreachable("unknown bit size");
switch (load->def.bit_size) {
case 32:
for (unsigned i = 0; i < num_components; ++i) {
if (load->value.u32[new_swizzle[i]] !=
(uint32_t)const_val->data.u)
return false;
}
return true;
if (val != const_val->data.u)
return false;
case 64:
for (unsigned i = 0; i < num_components; ++i) {
if (load->value.u64[new_swizzle[i]] != const_val->data.u)
return false;
}
return true;
default:
unreachable("unknown bit size");
}
return true;
default:
unreachable("Invalid alu source type");

View File

@@ -1055,16 +1055,30 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode,
SpvOp opcode = get_specialization(b, val, w[3]);
switch (opcode) {
case SpvOpVectorShuffle: {
struct vtn_value *v0 = vtn_value(b, w[4], vtn_value_type_constant);
struct vtn_value *v1 = vtn_value(b, w[5], vtn_value_type_constant);
unsigned len0 = glsl_get_vector_elements(v0->const_type);
unsigned len1 = glsl_get_vector_elements(v1->const_type);
struct vtn_value *v0 = &b->values[w[4]];
struct vtn_value *v1 = &b->values[w[5]];
assert(v0->value_type == vtn_value_type_constant ||
v0->value_type == vtn_value_type_undef);
assert(v1->value_type == vtn_value_type_constant ||
v1->value_type == vtn_value_type_undef);
unsigned len0 = v0->value_type == vtn_value_type_constant ?
glsl_get_vector_elements(v0->const_type) :
glsl_get_vector_elements(v0->type->type);
unsigned len1 = v1->value_type == vtn_value_type_constant ?
glsl_get_vector_elements(v1->const_type) :
glsl_get_vector_elements(v1->type->type);
uint32_t u[8];
for (unsigned i = 0; i < len0; i++)
u[i] = v0->constant->value.u[i];
for (unsigned i = 0; i < len1; i++)
u[len0 + i] = v1->constant->value.u[i];
if (v0->value_type == vtn_value_type_constant) {
for (unsigned i = 0; i < len0; i++)
u[i] = v0->constant->value.u[i];
}
if (v1->value_type == vtn_value_type_constant) {
for (unsigned i = 0; i < len1; i++)
u[len0 + i] = v1->constant->value.u[i];
}
for (unsigned i = 0; i < count - 6; i++) {
uint32_t comp = w[i + 6];
@@ -2707,6 +2721,7 @@ vtn_handle_variable_or_type_instruction(struct vtn_builder *b, SpvOp opcode,
vtn_handle_constant(b, opcode, w, count);
break;
case SpvOpUndef:
case SpvOpVariable:
vtn_handle_variables(b, opcode, w, count);
break;

View File

@@ -527,12 +527,13 @@ vtn_handle_phi_second_pass(struct vtn_builder *b, SpvOp opcode,
nir_variable *phi_var = phi_entry->data;
for (unsigned i = 3; i < count; i += 2) {
struct vtn_ssa_value *src = vtn_ssa_value(b, w[i]);
struct vtn_block *pred =
vtn_value(b, w[i + 1], vtn_value_type_block)->block;
b->nb.cursor = nir_after_instr(&pred->end_nop->instr);
struct vtn_ssa_value *src = vtn_ssa_value(b, w[i]);
vtn_local_store(b, src, nir_deref_var_create(b, phi_var));
}

View File

@@ -565,16 +565,21 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint,
build_exp(nb, nir_fneg(nb, src[0]))));
return;
case GLSLstd450Tanh:
/* (0.5 * (e^x - e^(-x))) / (0.5 * (e^x + e^(-x))) */
val->ssa->def =
nir_fdiv(nb, nir_fmul(nb, nir_imm_float(nb, 0.5f),
nir_fsub(nb, build_exp(nb, src[0]),
build_exp(nb, nir_fneg(nb, src[0])))),
nir_fmul(nb, nir_imm_float(nb, 0.5f),
nir_fadd(nb, build_exp(nb, src[0]),
build_exp(nb, nir_fneg(nb, src[0])))));
case GLSLstd450Tanh: {
/* tanh(x) := (0.5 * (e^x - e^(-x))) / (0.5 * (e^x + e^(-x)))
*
* With a little algebra this reduces to (e^2x - 1) / (e^2x + 1)
*
* We clamp x to (-inf, +10] to avoid precision problems. When x > 10,
* e^2x is so much larger than 1.0 that 1.0 gets flushed to zero in the
* computation e^2x +/- 1 so it can be ignored.
*/
nir_ssa_def *x = nir_fmin(nb, src[0], nir_imm_float(nb, 10));
nir_ssa_def *exp2x = build_exp(nb, nir_fmul(nb, x, nir_imm_float(nb, 2)));
val->ssa->def = nir_fdiv(nb, nir_fsub(nb, exp2x, nir_imm_float(nb, 1)),
nir_fadd(nb, exp2x, nir_imm_float(nb, 1)));
return;
}
case GLSLstd450Asinh:
val->ssa->def = nir_fmul(nb, nir_fsign(nb, src[0]),

View File

@@ -805,8 +805,12 @@ vtn_get_builtin_location(struct vtn_builder *b,
set_mode_system_value(mode);
break;
case SpvBuiltInPrimitiveId:
*location = VARYING_SLOT_PRIMITIVE_ID;
*mode = nir_var_shader_out;
if (*mode == nir_var_shader_out) {
*location = VARYING_SLOT_PRIMITIVE_ID;
} else {
*location = SYSTEM_VALUE_PRIMITIVE_ID;
set_mode_system_value(mode);
}
break;
case SpvBuiltInInvocationId:
*location = SYSTEM_VALUE_INVOCATION_ID;
@@ -1054,7 +1058,8 @@ var_decoration_cb(struct vtn_builder *b, struct vtn_value *val, int member,
is_vertex_input = false;
location += VARYING_SLOT_VAR0;
} else {
unreachable("Location must be on input or output variable");
vtn_warn("Location must be on input or output variable");
return;
}
if (vtn_var->var) {
@@ -1154,6 +1159,12 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
const uint32_t *w, unsigned count)
{
switch (opcode) {
case SpvOpUndef: {
struct vtn_value *val = vtn_push_value(b, w[2], vtn_value_type_undef);
val->type = vtn_value(b, w[1], vtn_value_type_type)->type;
break;
}
case SpvOpVariable: {
struct vtn_variable *var = rzalloc(b, struct vtn_variable);
var->type = vtn_value(b, w[1], vtn_value_type_type)->type;

View File

@@ -95,8 +95,8 @@ AM_CFLAGS += \
-I$(top_srcdir)/src/egl/drivers/dri2 \
-I$(top_srcdir)/src/gbm/backends/dri \
-I$(top_srcdir)/src/egl/wayland/wayland-egl \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-I$(top_builddir)/src/egl/wayland/wayland-drm \
-I$(top_srcdir)/src/egl/wayland/wayland-drm \
-DDEFAULT_DRIVER_DIR=\"$(DRI_DRIVER_SEARCH_DIR)\" \
-D_EGL_BUILT_IN_DRIVER_DRI2

View File

@@ -241,6 +241,15 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
return NULL;
break;
case __DRI_ATTRIB_MAX_PBUFFER_WIDTH:
_eglSetConfigKey(&base, EGL_MAX_PBUFFER_WIDTH,
_EGL_MAX_PBUFFER_WIDTH);
break;
case __DRI_ATTRIB_MAX_PBUFFER_HEIGHT:
_eglSetConfigKey(&base, EGL_MAX_PBUFFER_HEIGHT,
_EGL_MAX_PBUFFER_HEIGHT);
break;
default:
key = dri2_to_egl_attribute_map[attrib];
if (key != 0)
@@ -1077,6 +1086,20 @@ dri2_create_context(_EGLDriver *drv, _EGLDisplay *disp, _EGLConfig *conf,
if (!_eglInitContext(&dri2_ctx->base, disp, conf, attrib_list))
goto cleanup;
/* The EGL_EXT_create_context_robustness spec says:
*
* "Add to the eglCreateContext context creation errors: [...]
*
* * If the reset notification behavior of <share_context> and the
* newly created context are different then an EGL_BAD_MATCH error is
* generated."
*/
if (share_list && share_list->ResetNotificationStrategy !=
dri2_ctx->base.ResetNotificationStrategy) {
_eglError(EGL_BAD_MATCH, "eglCreateContext");
goto cleanup;
}
switch (dri2_ctx->base.ClientAPI) {
case EGL_OPENGL_ES_API:
switch (dri2_ctx->base.ClientMajorVersion) {

View File

@@ -80,8 +80,6 @@
#include "eglimage.h"
#include "eglsync.h"
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
struct wl_buffer;
struct dri2_egl_driver

View File

@@ -66,7 +66,8 @@ dri2_fallback_swap_buffers_with_damage(_EGLDriver *drv, _EGLDisplay *dpy,
_EGLSurface *surf,
const EGLint *rects, EGLint n_rects)
{
return EGL_FALSE;
struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
return dri2_dpy->vtbl->swap_buffers(drv, dpy, surf);
}
static inline EGLBoolean

View File

@@ -766,8 +766,6 @@ droid_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
EGL_NATIVE_VISUAL_TYPE, 0,
EGL_FRAMEBUFFER_TARGET_ANDROID, EGL_TRUE,
EGL_RECORDABLE_ANDROID, EGL_TRUE,
EGL_MAX_PBUFFER_WIDTH, _EGL_MAX_PBUFFER_WIDTH,
EGL_MAX_PBUFFER_HEIGHT, _EGL_MAX_PBUFFER_HEIGHT,
EGL_NONE
};
unsigned int format_count[ARRAY_SIZE(visuals)] = { 0 };

View File

@@ -118,6 +118,13 @@ resize_callback(struct wl_egl_window *wl_win, void *data)
(*dri2_dpy->flush->invalidate)(dri2_surf->dri_drawable);
}
static void
destroy_window_callback(void *data)
{
struct dri2_egl_surface *dri2_surf = data;
dri2_surf->wl_win = NULL;
}
/**
* Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface().
*/
@@ -159,6 +166,7 @@ dri2_wl_create_surface(_EGLDriver *drv, _EGLDisplay *disp,
dri2_surf->wl_win->private = dri2_surf;
dri2_surf->wl_win->resize_callback = resize_callback;
dri2_surf->wl_win->destroy_window_callback = destroy_window_callback;
dri2_surf->base.Width = -1;
dri2_surf->base.Height = -1;
@@ -254,8 +262,11 @@ dri2_wl_destroy_surface(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf)
if (dri2_surf->throttle_callback)
wl_callback_destroy(dri2_surf->throttle_callback);
dri2_surf->wl_win->private = NULL;
dri2_surf->wl_win->resize_callback = NULL;
if (dri2_surf->wl_win) {
dri2_surf->wl_win->private = NULL;
dri2_surf->wl_win->resize_callback = NULL;
dri2_surf->wl_win->destroy_window_callback = NULL;
}
free(surf);
@@ -1070,6 +1081,7 @@ static struct dri2_egl_display_vtbl dri2_wl_display_vtbl = {
static const __DRIextension *dri2_loader_extensions[] = {
&dri2_loader_extension.base,
&image_loader_extension.base,
&image_lookup_extension.base,
&use_invalidate.base,
NULL,
@@ -1272,6 +1284,8 @@ dri2_initialize_wayland_drm(_EGLDriver *drv, _EGLDisplay *disp)
cleanup_registry:
wl_registry_destroy(dri2_dpy->wl_registry);
wl_event_queue_destroy(dri2_dpy->wl_queue);
if (disp->PlatformDisplay == NULL)
wl_display_disconnect(dri2_dpy->wl_dpy);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;
@@ -1731,6 +1745,8 @@ dri2_wl_swrast_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
dri2_surf->format = WL_SHM_FORMAT_ARGB8888;
dri2_surf->wl_win = window;
dri2_surf->wl_win->private = dri2_surf;
dri2_surf->wl_win->destroy_window_callback = destroy_window_callback;
dri2_surf->base.Width = -1;
dri2_surf->base.Height = -1;
@@ -1913,6 +1929,8 @@ dri2_initialize_wayland_swrast(_EGLDriver *drv, _EGLDisplay *disp)
cleanup_registry:
wl_registry_destroy(dri2_dpy->wl_registry);
wl_event_queue_destroy(dri2_dpy->wl_queue);
if (disp->PlatformDisplay == NULL)
wl_display_disconnect(dri2_dpy->wl_dpy);
cleanup_dpy:
free(dri2_dpy);
disp->DriverData = NULL;

View File

@@ -438,6 +438,25 @@ dri3_query_buffer_age(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
return loader_dri3_query_buffer_age(&dri3_surf->loader_drawable);
}
static EGLBoolean
dri3_query_surface(_EGLDriver *drv, _EGLDisplay *dpy,
_EGLSurface *surf, EGLint attribute,
EGLint *value)
{
struct dri3_egl_surface *dri3_surf = dri3_egl_surface(surf);
switch (attribute) {
case EGL_WIDTH:
case EGL_HEIGHT:
loader_dri3_update_drawable_geometry(&dri3_surf->loader_drawable);
break;
default:
break;
}
return _eglQuerySurface(drv, dpy, surf, attribute, value);
}
static __DRIdrawable *
dri3_get_dri_drawable(_EGLSurface *surf)
{
@@ -460,6 +479,7 @@ struct dri2_egl_display_vtbl dri3_x11_display_vtbl = {
.post_sub_buffer = dri2_fallback_post_sub_buffer,
.copy_buffers = dri3_copy_buffers,
.query_buffer_age = dri3_query_buffer_age,
.query_surface = dri3_query_surface,
.create_wayland_buffer_from_image = dri2_fallback_create_wayland_buffer_from_image,
.get_sync_values = dri3_get_sync_values,
.get_dri_drawable = dri3_get_dri_drawable,

View File

@@ -734,7 +734,9 @@ eglCreateContext(EGLDisplay dpy, EGLConfig config, EGLContext share_list,
_EGL_CHECK_DISPLAY(disp, EGL_NO_CONTEXT, drv);
if (!config && !disp->Extensions.KHR_no_config_context)
if (config != EGL_NO_CONFIG_KHR)
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_CONTEXT, drv);
else if (!disp->Extensions.KHR_no_config_context)
RETURN_EGL_ERROR(disp, EGL_BAD_CONFIG, EGL_NO_CONTEXT);
if (!share && share_list != EGL_NO_CONTEXT)
@@ -847,7 +849,7 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
#ifdef HAVE_SURFACELESS_PLATFORM
if (disp->Platform == _EGL_PLATFORM_SURFACELESS) {
if (disp && disp->Platform == _EGL_PLATFORM_SURFACELESS) {
/* From the EGL_MESA_platform_surfaceless spec (v1):
*
* eglCreatePlatformWindowSurface fails when called with a <display>
@@ -866,6 +868,9 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_SURFACE, drv);
if ((conf->SurfaceType & EGL_WINDOW_BIT) == 0)
RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
surf = drv->API.CreateWindowSurface(drv, disp, conf, native_window,
attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
@@ -968,7 +973,7 @@ _eglCreatePixmapSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
EGLSurface ret;
#if HAVE_SURFACELESS_PLATFORM
if (disp->Platform == _EGL_PLATFORM_SURFACELESS) {
if (disp && disp->Platform == _EGL_PLATFORM_SURFACELESS) {
/* From the EGL_MESA_platform_surfaceless spec (v1):
*
* [Like eglCreatePlatformWindowSurface,] eglCreatePlatformPixmapSurface
@@ -984,6 +989,10 @@ _eglCreatePixmapSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
#endif
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_SURFACE, drv);
if ((conf->SurfaceType & EGL_PIXMAP_BIT) == 0)
RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
surf = drv->API.CreatePixmapSurface(drv, disp, conf, native_pixmap,
attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
@@ -1054,6 +1063,9 @@ eglCreatePbufferSurface(EGLDisplay dpy, EGLConfig config,
_EGL_FUNC_START(disp, EGL_OBJECT_DISPLAY_KHR, NULL, EGL_NO_SURFACE);
_EGL_CHECK_CONFIG(disp, conf, EGL_NO_SURFACE, drv);
if ((conf->SurfaceType & EGL_PBUFFER_BIT) == 0)
RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SURFACE);
surf = drv->API.CreatePbufferSurface(drv, disp, conf, attrib_list);
ret = (surf) ? _eglLinkSurface(surf) : EGL_NO_SURFACE;
@@ -2382,7 +2394,7 @@ _eglLockDisplayInterop(EGLDisplay dpy, EGLContext context,
return MESA_GLINTEROP_SUCCESS;
}
int
PUBLIC int
MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_device_info *out)
{
@@ -2404,7 +2416,7 @@ MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
return ret;
}
int
PUBLIC int
MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out)

View File

@@ -184,19 +184,33 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
break;
}
/* The EGL_KHR_create_context_spec says:
*
* "If the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR bit is set in
* EGL_CONTEXT_FLAGS_KHR, then a context supporting <robust buffer
* access> will be created. Robust buffer access is defined in the
* GL_ARB_robustness extension specification, and the resulting
* context must also support either the GL_ARB_robustness
* extension, or a version of OpenGL incorporating equivalent
* functionality. This bit is supported for OpenGL contexts.
*/
if ((val & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR) &&
(api != EGL_OPENGL_API ||
!dpy->Extensions.EXT_create_context_robustness)) {
api != EGL_OPENGL_API) {
/* The EGL_KHR_create_context spec says:
*
* 10) Which error should be generated if robust buffer access
* or reset notifications are requested under OpenGL ES?
*
* As per Issue 6, this extension does not support creating
* robust contexts for OpenGL ES. This is only supported via
* the EGL_EXT_create_context_robustness extension.
*
* Attempting to use this extension to create robust OpenGL
* ES context will generate an EGL_BAD_ATTRIBUTE error. This
* specific error is generated because this extension does
* not define the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR
* and EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY_KHR
* bits for OpenGL ES contexts. Thus, use of these bits fall
* under condition described by: "If an attribute is
* specified that is not meaningful for the client API
* type.." in the above specification.
*
* The spec requires that we emit the error even if the display
* supports EGL_EXT_create_context_robustness. To create a robust
* GLES context, the *attribute*
* EGL_CONTEXT_OPENGL_ROBUST_ACCESS_EXT must be used, not the
* *flag* EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR.
*/
err = EGL_BAD_ATTRIBUTE;
break;
}

View File

@@ -34,6 +34,8 @@
#ifndef EGLDEFINES_INCLUDED
#define EGLDEFINES_INCLUDED
#include "util/macros.h"
#ifdef __cplusplus
extern "C" {
#endif
@@ -48,9 +50,6 @@ extern "C" {
#define _EGL_VENDOR_STRING "Mesa Project"
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define MIN2(A, B) (((A) < (B)) ? (A) : (B))
#ifdef __cplusplus
}
#endif

View File

@@ -262,9 +262,13 @@ _eglInitSurface(_EGLSurface *surf, _EGLDisplay *dpy, EGLint type,
{
const char *func;
EGLint renderBuffer = EGL_BACK_BUFFER;
EGLint swapBehavior = EGL_BUFFER_PRESERVED;
EGLint swapBehavior = EGL_BUFFER_DESTROYED;
EGLint err;
/* Swap behavior can be preserved only if config supports this. */
if (conf->SurfaceType & EGL_SWAP_BEHAVIOR_PRESERVED_BIT)
swapBehavior = EGL_BUFFER_PRESERVED;
switch (type) {
case EGL_WINDOW_BIT:
func = "eglCreateWindowSurface";

View File

@@ -188,7 +188,9 @@ cso_insert_state(struct cso_cache *sc,
void *state)
{
struct cso_hash *hash = _cso_hash_for_type(sc, type);
sanitize_hash(sc, hash, type, sc->max_size);
if (type != CSO_SAMPLER)
sanitize_hash(sc, hash, type, sc->max_size);
return cso_hash_insert(hash, hash_key, state);
}

View File

@@ -1275,7 +1275,6 @@ cso_restore_fragment_samplers(struct cso_context *ctx)
{
struct sampler_info *info = &ctx->samplers[PIPE_SHADER_FRAGMENT];
info->nr_samplers = ctx->nr_fragment_samplers_saved;
memcpy(info->samplers, ctx->fragment_samplers_saved,
sizeof(info->samplers));
cso_single_sampler_done(ctx, PIPE_SHADER_FRAGMENT);

View File

@@ -606,7 +606,10 @@ gallivm_compile_module(struct gallivm_state *gallivm)
util_snprintf(filename, sizeof(filename), "ir_%s.bc", gallivm->module_name);
LLVMWriteBitcodeToFile(gallivm->module, filename);
debug_printf("%s written\n", filename);
debug_printf("Invoke as \"llc -o - %s\"\n", filename);
debug_printf("Invoke as \"llc %s%s -o - %s\"\n",
(HAVE_LLVM >= 0x0305) ? "[-mcpu=<-mcpu option] " : "",
"[-mattr=<-mattr option(s)>]",
filename);
}
if (USE_MCJIT) {

View File

@@ -98,6 +98,7 @@
#include "util/u_cpu_detect.h"
#include "lp_bld_misc.h"
#include "lp_bld_debug.h"
namespace {
@@ -596,7 +597,8 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
#if defined(PIPE_ARCH_PPC)
MAttrs.push_back(util_cpu_caps.has_altivec ? "+altivec" : "-altivec");
#if HAVE_LLVM >= 0x0304
#if (HAVE_LLVM >= 0x0304)
#if (HAVE_LLVM <= 0x0307) || (HAVE_LLVM == 0x0308 && MESA_LLVM_VERSION_PATCH == 0)
/*
* Make sure VSX instructions are disabled
* See LLVM bug https://llvm.org/bugs/show_bug.cgi?id=25503#c7
@@ -604,11 +606,32 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
if (util_cpu_caps.has_altivec) {
MAttrs.push_back("-vsx");
}
#else
/*
* However, bug 25503 is fixed, by the same fix that fixed
* bug 26775, in versions of LLVM later than 3.8 (starting with 3.8.1):
* Make sure VSX instructions are ENABLED
* See LLVM bug https://llvm.org/bugs/show_bug.cgi?id=26775
*/
if (util_cpu_caps.has_altivec) {
MAttrs.push_back("+vsx");
}
#endif
#endif
#endif
builder.setMAttrs(MAttrs);
if (gallivm_debug & (GALLIVM_DEBUG_IR | GALLIVM_DEBUG_ASM | GALLIVM_DEBUG_DUMP_BC)) {
int n = MAttrs.size();
if (n > 0) {
debug_printf("llc -mattr option(s): ");
for (int i = 0; i < n; i++)
debug_printf("%s%s", MAttrs[i].c_str(), (i < n - 1) ? "," : "");
debug_printf("\n");
}
}
#if HAVE_LLVM >= 0x0305
StringRef MCPU = llvm::sys::getHostCPUName();
/*
@@ -623,7 +646,23 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
* when not using MCJIT so no instructions are generated which the old JIT
* can't handle. Not entirely sure if we really need to do anything yet.
*/
#if defined(PIPE_ARCH_LITTLE_ENDIAN) && defined(PIPE_ARCH_PPC_64)
/*
* Versions of LLVM prior to 4.0 lacked a table entry for "POWER8NVL",
* resulting in (big-endian) "generic" being returned on
* little-endian Power8NVL systems. The result was that code that
* attempted to load the least significant 32 bits of a 64-bit quantity
* from memory loaded the wrong half. This resulted in failures in some
* Piglit tests, e.g.
* .../arb_gpu_shader_fp64/execution/conversion/frag-conversion-explicit-double-uint
*/
if (MCPU == "generic")
MCPU = "pwr8";
#endif
builder.setMCPU(MCPU);
if (gallivm_debug & (GALLIVM_DEBUG_IR | GALLIVM_DEBUG_ASM | GALLIVM_DEBUG_DUMP_BC)) {
debug_printf("llc -mcpu option: %s\n", MCPU.str().c_str());
}
#endif
ShaderMemoryManager *MM = NULL;

View File

@@ -36,6 +36,7 @@
#include "hud/hud_private.h"
#include "util/list.h"
#include "os/os_time.h"
#include "os/os_thread.h"
#include "util/u_memory.h"
#include <stdio.h>
#include <unistd.h>
@@ -61,6 +62,7 @@ struct cpufreq_info
static int gcpufreq_count = 0;
static struct list_head gcpufreq_list;
pipe_static_mutex(gcpufreq_mutex);
static struct cpufreq_info *
find_cfi_by_index(int cpu_index, int mode)
@@ -112,14 +114,6 @@ query_cfi_load(struct hud_graph *gr)
}
}
static void
free_query_data(void *p)
{
struct cpufreq_info *cfi = (struct cpufreq_info *)p;
list_del(&cfi->list);
FREE(cfi);
}
/**
* Create and initialize a new object for a specific CPU.
* \param pane parent context.
@@ -155,6 +149,7 @@ hud_cpufreq_graph_install(struct hud_pane *pane, int cpu_index,
break;
case CPUFREQ_MAXIMUM:
snprintf(gr->name, sizeof(gr->name), "%s-Max", cfi->name);
break;
default:
return;
}
@@ -162,11 +157,6 @@ hud_cpufreq_graph_install(struct hud_pane *pane, int cpu_index,
gr->query_data = cfi;
gr->query_new_value = query_cfi_load;
/* Don't use free() as our callback as that messes up Gallium's
* memory debugger. Use simple free_query_data() wrapper.
*/
gr->free_query_data = free_query_data;
hud_pane_add_graph(pane, gr);
hud_pane_set_max_value(pane, 3000000 /* 3 GHz */);
}
@@ -199,16 +189,21 @@ hud_get_num_cpufreq(bool displayhelp)
int cpu_index;
/* Return the number of CPU metrics we support. */
if (gcpufreq_count)
pipe_mutex_lock(gcpufreq_mutex);
if (gcpufreq_count) {
pipe_mutex_unlock(gcpufreq_mutex);
return gcpufreq_count;
}
/* Scan /sys/devices.../cpu, for every object type we support, create
* and persist an object to represent its different metrics.
*/
list_inithead(&gcpufreq_list);
DIR *dir = opendir("/sys/devices/system/cpu");
if (!dir)
if (!dir) {
pipe_mutex_unlock(gcpufreq_mutex);
return 0;
}
while ((dp = readdir(dir)) != NULL) {
@@ -238,6 +233,7 @@ hud_get_num_cpufreq(bool displayhelp)
snprintf(fn, sizeof(fn), "%s/cpufreq/scaling_max_freq", basename);
add_object(dp->d_name, fn, CPUFREQ_MAXIMUM, cpu_index);
}
closedir(dir);
if (displayhelp) {
list_for_each_entry(struct cpufreq_info, cfi, &gcpufreq_list, list) {
@@ -251,6 +247,7 @@ hud_get_num_cpufreq(bool displayhelp)
}
}
pipe_mutex_unlock(gcpufreq_mutex);
return gcpufreq_count;
}

View File

@@ -35,6 +35,7 @@
#include "hud/hud_private.h"
#include "util/list.h"
#include "os/os_time.h"
#include "os/os_thread.h"
#include "util/u_memory.h"
#include <stdio.h>
#include <unistd.h>
@@ -81,6 +82,7 @@ struct diskstat_info
*/
static int gdiskstat_count = 0;
static struct list_head gdiskstat_list;
pipe_static_mutex(gdiskstat_mutex);
static struct diskstat_info *
find_dsi_by_name(const char *n, int mode)
@@ -162,14 +164,6 @@ query_dsi_load(struct hud_graph *gr)
}
}
static void
free_query_data(void *p)
{
struct diskstat_info *nic = (struct diskstat_info *) p;
list_del(&nic->list);
FREE(nic);
}
/**
* Create and initialize a new object for a specific block I/O device.
* \param pane parent context.
@@ -208,11 +202,6 @@ hud_diskstat_graph_install(struct hud_pane *pane, const char *dev_name,
gr->query_data = dsi;
gr->query_new_value = query_dsi_load;
/* Don't use free() as our callback as that messes up Gallium's
* memory debugger. Use simple free_query_data() wrapper.
*/
gr->free_query_data = free_query_data;
hud_pane_add_graph(pane, gr);
hud_pane_set_max_value(pane, 100);
}
@@ -257,16 +246,21 @@ hud_get_num_disks(bool displayhelp)
char name[64];
/* Return the number of block devices and partitions. */
if (gdiskstat_count)
pipe_mutex_lock(gdiskstat_mutex);
if (gdiskstat_count) {
pipe_mutex_unlock(gdiskstat_mutex);
return gdiskstat_count;
}
/* Scan /sys/block, for every object type we support, create and
* persist an object to represent its different statistics.
*/
list_inithead(&gdiskstat_list);
DIR *dir = opendir("/sys/block/");
if (!dir)
if (!dir) {
pipe_mutex_unlock(gdiskstat_mutex);
return 0;
}
while ((dp = readdir(dir)) != NULL) {
@@ -290,8 +284,11 @@ hud_get_num_disks(bool displayhelp)
/* Add any partitions */
struct dirent *dpart;
DIR *pdir = opendir(basename);
if (!pdir)
if (!pdir) {
pipe_mutex_unlock(gdiskstat_mutex);
closedir(dir);
return 0;
}
while ((dpart = readdir(pdir)) != NULL) {
/* Avoid 'lo' and '..' and '.' */
@@ -311,6 +308,7 @@ hud_get_num_disks(bool displayhelp)
add_object_part(basename, dpart->d_name, DISKSTAT_WR);
}
}
closedir(dir);
if (displayhelp) {
list_for_each_entry(struct diskstat_info, dsi, &gdiskstat_list, list) {
@@ -322,6 +320,7 @@ hud_get_num_disks(bool displayhelp)
puts(line);
}
}
pipe_mutex_unlock(gdiskstat_mutex);
return gdiskstat_count;
}

View File

@@ -35,6 +35,7 @@
#include "hud/hud_private.h"
#include "util/list.h"
#include "os/os_time.h"
#include "os/os_thread.h"
#include "util/u_memory.h"
#include <stdio.h>
#include <unistd.h>
@@ -66,6 +67,7 @@ struct nic_info
*/
static int gnic_count = 0;
static struct list_head gnic_list;
pipe_static_mutex(gnic_mutex);
static struct nic_info *
find_nic_by_name(const char *n, int mode)
@@ -234,14 +236,6 @@ query_nic_load(struct hud_graph *gr)
}
}
static void
free_query_data(void *p)
{
struct nic_info *nic = (struct nic_info *) p;
list_del(&nic->list);
FREE(nic);
}
/**
* Create and initialize a new object for a specific network interface dev.
* \param pane parent context.
@@ -284,11 +278,6 @@ hud_nic_graph_install(struct hud_pane *pane, const char *nic_name,
gr->query_data = nic;
gr->query_new_value = query_nic_load;
/* Don't use free() as our callback as that messes up Gallium's
* memory debugger. Use simple free_query_data() wrapper.
*/
gr->free_query_data = free_query_data;
hud_pane_add_graph(pane, gr);
hud_pane_set_max_value(pane, 100);
}
@@ -342,16 +331,21 @@ hud_get_num_nics(bool displayhelp)
char name[64];
/* Return the number if network interfaces. */
if (gnic_count)
pipe_mutex_lock(gnic_mutex);
if (gnic_count) {
pipe_mutex_unlock(gnic_mutex);
return gnic_count;
}
/* Scan /sys/block, for every object type we support, create and
* persist an object to represent its different statistics.
*/
list_inithead(&gnic_list);
DIR *dir = opendir("/sys/class/net/");
if (!dir)
if (!dir) {
pipe_mutex_unlock(gnic_mutex);
return 0;
}
while ((dp = readdir(dir)) != NULL) {
@@ -412,6 +406,7 @@ hud_get_num_nics(bool displayhelp)
}
}
closedir(dir);
list_for_each_entry(struct nic_info, nic, &gnic_list, list) {
char line[64];
@@ -424,6 +419,7 @@ hud_get_num_nics(bool displayhelp)
}
pipe_mutex_unlock(gnic_mutex);
return gnic_count;
}

View File

@@ -32,6 +32,7 @@
#include "hud/hud_private.h"
#include "util/list.h"
#include "os/os_time.h"
#include "os/os_thread.h"
#include "util/u_memory.h"
#include <stdio.h>
#include <unistd.h>
@@ -49,6 +50,7 @@
*/
static int gsensors_temp_count = 0;
static struct list_head gsensors_temp_list;
pipe_static_mutex(gsensor_temp_mutex);
struct sensors_temp_info
{
@@ -189,17 +191,6 @@ query_sti_load(struct hud_graph *gr)
}
}
static void
free_query_data(void *p)
{
struct sensors_temp_info *sti = (struct sensors_temp_info *) p;
list_del(&sti->list);
if (sti->chip)
sensors_free_chip_name(sti->chip);
FREE(sti);
sensors_cleanup();
}
/**
* Create and initialize a new object for a specific sensor interface dev.
* \param pane parent context.
@@ -237,11 +228,6 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const char *dev_name,
gr->query_data = sti;
gr->query_new_value = query_sti_load;
/* Don't use free() as our callback as that messes up Gallium's
* memory debugger. Use simple free_query_data() wrapper.
*/
gr->free_query_data = free_query_data;
hud_pane_add_graph(pane, gr);
switch (sti->mode) {
case SENSORS_TEMP_CURRENT:
@@ -338,12 +324,17 @@ int
hud_get_num_sensors(bool displayhelp)
{
/* Return the number of sensors detected. */
if (gsensors_temp_count)
pipe_mutex_lock(gsensor_temp_mutex);
if (gsensors_temp_count) {
pipe_mutex_unlock(gsensor_temp_mutex);
return gsensors_temp_count;
}
int ret = sensors_init(NULL);
if (ret)
if (ret) {
pipe_mutex_unlock(gsensor_temp_mutex);
return 0;
}
list_inithead(&gsensors_temp_list);
@@ -377,6 +368,7 @@ hud_get_num_sensors(bool displayhelp)
}
}
pipe_mutex_unlock(gsensor_temp_mutex);
return gsensors_temp_count;
}

View File

@@ -672,17 +672,19 @@ iter_instruction(
}
}
switch (inst->Instruction.Opcode) {
case TGSI_OPCODE_IF:
case TGSI_OPCODE_UIF:
case TGSI_OPCODE_ELSE:
case TGSI_OPCODE_BGNLOOP:
case TGSI_OPCODE_ENDLOOP:
case TGSI_OPCODE_CAL:
case TGSI_OPCODE_BGNSUB:
TXT( " :" );
UID( inst->Label.Label );
break;
if (inst->Instruction.Label) {
switch (inst->Instruction.Opcode) {
case TGSI_OPCODE_IF:
case TGSI_OPCODE_UIF:
case TGSI_OPCODE_ELSE:
case TGSI_OPCODE_BGNLOOP:
case TGSI_OPCODE_ENDLOOP:
case TGSI_OPCODE_CAL:
case TGSI_OPCODE_BGNSUB:
TXT( " :" );
UID( inst->Label.Label );
break;
}
}
/* update indentation */

View File

@@ -485,6 +485,7 @@ tgsi_opcode_infer_src_type( uint opcode )
case TGSI_OPCODE_UMUL_HI:
case TGSI_OPCODE_UP2H:
case TGSI_OPCODE_U2I64:
case TGSI_OPCODE_MEMBAR:
return TGSI_TYPE_UNSIGNED;
case TGSI_OPCODE_IMUL_HI:
case TGSI_OPCODE_I2F:

Some files were not shown because too many files have changed in this diff Show More