Compare commits

...

6241 Commits

Author SHA1 Message Date
Emil Velikov
13953f012d docs: use correct year for the 12.0.6 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 02:05:20 +00:00
Emil Velikov
36e3f2542d docs: add sha256 checksums for 12.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 02:02:48 +00:00
Emil Velikov
555885a0bf docs: add release notes for 12.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:32:02 +00:00
Emil Velikov
ab62405953 Update version to 12.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 01:28:48 +00:00
Kenneth Graunke
806de4a224 i965: Properly flush in hsw_pause_transform_feedback().
Fixes a number of transform feedback tests when run with Linux 4.8,
which allows us to use the MI_LOAD_REGISTER_REG command, at which point
we started using this new broken path.

ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.*
and Piglit's arb_transform_feedback2/draw-auto are both fixed by this
patch, for example.

Thanks to Chris Wilson for catching this mistake!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 2138347a45)
2017-01-20 22:47:28 +00:00
Chad Versace
2b87bb9b90 anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0
The spec implicitly allows the incoming count to be 0. From the Vulkan
1.0.38 spec, Section 4.1 Physical Devices:

    If the value referenced by pQueueFamilyPropertyCount is not 0 [then
    do stuff].

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d6545f2345)
2017-01-20 22:40:51 +00:00
Emil Velikov
689ca381b5 egl/wayland: use the destroy_window_callback for swrast
As described in commit 690ead4a13 ("egl/wayland-egl: Fix for segfault
in dri2_wl_destroy_surface.") if we attempt to destroy a EGL surface
attached to already destroyed Wayland window we'll get a segfault.

v2: set the correct callback alongside the window->private. (Dan)

Cc: Daniel Stone <daniels@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit bfd6314350)
2017-01-20 22:21:58 +00:00
Emil Velikov
cc2894d376 automake: use shared llvm libs for make distcheck
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 23dcce0c03)
2017-01-13 05:27:03 +00:00
Chad Versace
febf22ff55 i965/mt: Disable HiZ when sharing depth buffer externally (v2)
intel_miptree_make_shareable() discarded and disabled CCS. Fix it so
that it discards and disables HiZ too.

Fixes dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer
on Skylake.

v2: Actually do what the commit message says. Discard the HiZ buffer.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Nanley Chery <nanley.g.chery@intel.com
Cc: Haixia Shi <hshi@chromium.org>
(cherry picked from commit 42011be1e2)
[Emil Velikov: patch is a backport by Chad of above commit]
2017-01-13 05:27:02 +00:00
Chad Versace
3c7b53bba3 i965/mt: Disable aux surfaces after making miptree shareable
The entire goal of intel_miptree_make_shareable() is to permanently
disable the miptree's aux surfaces. So set
intel_mipmap_tree:disable_aux_buffers after the function's done with
discarding down the aux surfaces.

References: https://bugs.freedesktop.org/show_bug.cgi?id=98329
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Nanley Chery <nanley.g.chery@intel.com
Cc: Haixia Shi <hshi@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1c8be049be)
2017-01-13 05:27:02 +00:00
Emil Velikov
c880deef41 get-typod-pick-list.sh: add new script
Typos do happen as people nominate patches for stable. This script aims
to catch most of those.

Due to the subtle nature of things, one has to pay special attention to
the output, similar to get-extra-pick-list.sh.

At the moment only the following is handled:
 grep -i "CC:.*mesa-dev"

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f0bdd13fdb)
2017-01-13 05:27:02 +00:00
Ilia Mirkin
09973d9a99 nouveau: take extra push space into account for pushbuf_space calls
Ever since a long time ago when I messed around with fences, I ensure
that after a PUSH_SPACE call there is enough space to write a fence out
into the pushbuf.

However the PUSH_SPACE macro is not all-knowing, and so sometimes we
have to invoke nouveau_pushbuf_space manually with the relocs/pushes
args set. If we don't take the extra allocation from PUSH_SPACE into
account, then we will end up accidentally flushing when the code was not
expecting a flush. This can lead to various runtime and rendering
failures.

The amount of extra allocation isn't that important - it has to be at
least 8 based on the current nouveau_winsys.h setting, but even more
won't hurt. I just rounded up to powers of 2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
(cherry picked from commit eb60a89bc3)
2017-01-13 05:27:02 +00:00
Kenneth Graunke
36a54c27fd spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.
vtn_ssa_value() can produce variable loads, and the cursor might
be after a return statement, causing nir_builder assert failures
about not inserting instructions after a jump.

This fixes:
dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_if
dEQP-VK.spirv_assembly.instruction.graphics.barrier.in_switch

Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 203c128781)
2017-01-13 05:27:02 +00:00
Fredrik Höglund
57708155d2 dri3: Fix MakeCurrent without a default framebuffer
In OpenGL 3.0 and later it is legal to make a context current without
a default framebuffer.

This has been broken since DRI3 support was introduced.

Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b6670157d7)
2017-01-13 05:27:02 +00:00
Michel Dänzer
cf18ee4fcc cso: Don't restore nr_samplers in cso_restore_fragment_samplers
If info->nr_samplers > ctx->nr_fragment_samplers_saved, the assignment
would prevent cso_single_sampler_done from unbinding the no longer used
samplers from the driver, which could result in use-after-free. This is
probably unlikely to happen in practice though.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 3d661a12be)
2017-01-13 05:27:02 +00:00
Jason Ekstrand
76816e70a9 anv/descriptor_set: Write the state offset in the surface state free list.
When Kristian reworked descriptor set allocation, somehow he forgot to
actually store the offset in the free list.  Somehow, this completely
missed CTS testing until now... This fixes all 2744 of the new
'dEQP-VK.texture.filtering.* tests in the latest CTS.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 37537b7d86)
2017-01-13 05:27:02 +00:00
Jason Ekstrand
c0934035a5 anv/device: Implicitly unmap memory objects in FreeMemory
From the Vulkan spec version 1.0.32 docs for vkFreeMemory:

   "If a memory object is mapped at the time it is freed, it is implicitly
   unmapped."

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit b1217eada9)
2017-01-13 05:27:02 +00:00
Jason Ekstrand
08a9f69a8b anv/device: Return the right error for failed maps
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 920f34a2d9)
2017-01-13 05:27:02 +00:00
Jason Ekstrand
d780f89966 spirv/nir: Add support for ImageQuerySamples
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 9e05e51cff)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
a02edabb67 spirv/nir: Handle texture projectors
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 71202352c8)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
70bb67febc nir/spirv: Refactor coordinate handling in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 36c31b8fa2)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
9126479017 spirv/nir: Refactor type handling in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit b820c8b78c)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
f76da483a2 spirv/nir: Move opcode selection higher up in handle_texture
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 561be50a1a)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
4ac5633618 anv/image: Assert that the image format is actually supported
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit c8da91aa24)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
32d7a060fa spirv/nir: Don't increment coord_components for array lod queries
For lod query instructions, we really don't care whether or not the sampler
is an array type because that doesn't factor into the LOD.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 34a39e91ba)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
eb96145c74 i965: Get rid of the do_lower_unnormalized_offsets pass
We can do this in NIR now.  No need to keep a GLSL pass lying around for
it.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 67b7d876e4)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
ddd048bbf5 i965/nir: Enable NIR lowering of txf and rect offsets
This fixes the following piglit tests on gen6+:

tex-miplevel-selection textureProjGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRect
tex-miplevel-selection textureGradOffset 2DRectShadow
tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4
tex-miplevel-selection textureProjGradOffset 2DRectShadow

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 9f32721f86)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
236ecd3c4e nir/lower_tex: Add support for lowering coordinate offsets
On i965, we can't support coordinate offsets for texelFetch or rectangle
textures.  Previously, we were doing this with a GLSL pass but we need to
do it in NIR if we want those workarounds for SPIR-V.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit d9156efc52)
2017-01-13 05:27:01 +00:00
Jason Ekstrand
81e78ee65c nir/lower_tex: Add some helpers for working with tex sources
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 843fc8f3e7)
2017-01-13 05:27:00 +00:00
Jason Ekstrand
6ebb536800 nir: Add a helper for determining the type of a texture source
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 09135cd55a)
2017-01-13 05:27:00 +00:00
Jason Ekstrand
89a8fd71af anv/pipeline: Set binding_table.gather_texture_start
This should get texture gather working on gen8+ and mostly working on gen7.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 3c0077a6ec)
2017-01-13 05:27:00 +00:00
Jason Ekstrand
8a293e6a0c spirv/nir: Properly handle gather components
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 95e9d58bdb)
2017-01-13 05:27:00 +00:00
Jason Ekstrand
231ace7eec spirv/nir: Add support for shadow samplers that return vec4
While SPIR-V technically doesn't support "old style" shadow, the
shadow-compare gather instruction does return a vec4 so we need to be able
to set the old_style_shadow bit in NIR.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 7c7acf53b2)
2017-01-13 05:27:00 +00:00
Jason Ekstrand
c07386e2c8 spirv/nir: Fix some texture opcode asserts
We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an
explicit-LOD texturing instruction.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
(cherry picked from commit 2ddefd03b7)
2017-01-13 05:27:00 +00:00
Nicolai Hähnle
bb4195ca26 radeonsi: enable WQM in PS prolog when needed
WQM is needed when the PS prolog computes a VGPR that is consumed by a shader
with (implicit or explicit) derivatives.

Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be
effective (otherwise it's just a no-op).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
Cc: 12.0 <mesa-dev@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b42bc90b6a)
2017-01-13 05:27:00 +00:00
Matt Turner
0386f956b3 i965/fs: Reject copy propagation into SEL if not min/max.
We shouldn't ever see a SEL with conditional mod other than GE (for max)
or L (for min), but we might see one with predication and no conditional
mod.

total instructions in shared programs: 8241806 -> 8241902 (0.00%)
instructions in affected programs: 13284 -> 13380 (0.72%)
HURT: 62

total cycles in shared programs: 84165104 -> 84166244 (0.00%)
cycles in affected programs: 75364 -> 76504 (1.51%)
helped: 10
HURT: 34

Fixes generated code in at least Sanctum 2, Borderlands 2, Goat
Simulator, XCOM: Enemy Unknown, and Shogun 2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92234
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7bed52bb5f)
2017-01-13 05:27:00 +00:00
Matt Turner
eb9127d224 i965/fs: Add unit tests for copy propagation pass.
Pretty basic, but it's a start.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 091a8a04ad)
[Emil Velikov: s/gen_device_info/brw_device_info/, nir_shader_create()
has only three arguments]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-13 05:25:00 +00:00
Matt Turner
d37d8d81d5 i965/fs: Rename opt_copy_propagate -> opt_copy_propagation.
Matches the vec4 backend, cmod propagation, and saturate propagation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6014da50ec)
[Emil Velikov: resolve trivial conflicts - don't rename instances which
do not exist]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp
2016-12-16 22:35:24 +00:00
Marek Olšák
630c41e2aa gallium/radeon: fix the draw-calls HUD query
reported by kisak on irc,
it only applies to stable, not master

Fix separated/backported from commit 4140afd04b ("gallium/radeon: add
driver queries for compute/dma call stats and spills")

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
2016-12-16 18:52:25 +00:00
Marek Olšák
d278c15a17 radeonsi: disable the constant engine (CE) on Carrizo and Stoney
It must be disabled until the kernel bug is fixed, and then we'll enable CE
based on the DRM version.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 31f988a9d6)
2016-12-16 18:52:25 +00:00
Marek Olšák
ce56dfca9a radeonsi: disable CE on SI + AMDGPU
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 49c798e902)
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-16 18:52:25 +00:00
Marek Olšák
3197612a1a radeonsi: fix incorrect FMASK checking in bind_sampler_states
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 38d4859b94)
2016-12-16 18:52:25 +00:00
Marek Olšák
6d919a6fc6 radeonsi: always restore sampler states when unbinding sampler views
Cc: 13.0 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3a2aa9cba)
2016-12-16 18:52:24 +00:00
Marek Olšák
f71c3734ce cso: don't release sampler states that are bound
This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and
probably many other games too.

cso_cache deletes sampler states when the cache size is too big and doesn't
check which sampler states are bound, causing use-after-free in drivers.
Because of that, radeonsi uploaded garbage sampler states and the hardware
went bananas. Other drivers may have experienced similar issues.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6dc96de303)
2016-12-16 18:40:42 +00:00
Emil Velikov
6b1c3c3aa0 docs: add sha256 checksums for 12.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 15:38:13 +00:00
Emil Velikov
01579a9d00 docs: add release notes for 12.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 15:31:47 +00:00
Emil Velikov
cd9a116558 Update version to 12.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 15:25:21 +00:00
Marek Olšák
4a5cce8bd5 radeonsi: silence runtime warnings with LLVM 3.9
Such as:
Warning: LLVM emitted unknown config register: 0x4

This is a non-intrusive back port of commit 0f7a6ea5e7.
2016-12-05 13:15:35 +00:00
Marek Olšák
b4c28b1755 radeonsi: disable RB+ blend optimizations for dual source blending
This fixes dual source blending on Stoney. The fix was copied from Vulkan.
The problem was discovered during internal testing.

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5e5573b1bf)
2016-12-05 13:13:11 +00:00
Marek Olšák
4f71f93878 radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending
copied from Vulkan

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ff50c44a5f)
2016-12-05 13:12:21 +00:00
Marek Olšák
a9e5a98c19 radeonsi: always set all blend registers
better safe than sorry

Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 87b208a54e)

Conflicts:
	src/gallium/drivers/radeonsi/si_state.c
2016-12-05 13:11:05 +00:00
Nanley Chery
c1cb184488 mesa/fbobject: Update CubeMapFace when reusing textures
Framebuffer attachments can be specified through FramebufferTexture*
calls. Upon specifying a depth (or stencil) framebuffer attachment that
internally reuses a texture, the cube map face of the new attachment
would not be updated (defaulting to TEXTURE_CUBE_MAP_POSITIVE_X).
Fix this issue by actually updating the CubeMapFace field.

This bug manifested itself in BindFramebuffer calls performed on
framebuffers whose stencil attachments internally reused a depth
texture.  When binding a framebuffer, we walk through the framebuffer's
attachments and update each one's corresponding gl_renderbuffer. Since
the framebuffer's depth and stencil attachments may share a
gl_renderbuffer and the walk visits the stencil attachment after
the depth attachment, the uninitialized CubeMapFace forced rendering
to TEXTURE_CUBE_MAP_POSITIVE_X.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77662
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 63318d34ac)
2016-12-02 21:23:10 +00:00
Marek Olšák
e3ef7da79c gallium/radeon: add support for sharing textures with DCC between processes
v2: use a function for calculating WORD1 of bo metadata

[Lyude]
On Fedora 24 and 25, I ended up noticing some rather nasty graphical
glitches on my desktop (using an R9 380 w/ amdgpu, Mesa version 12.0.4)
while I was in Wayland where the content of windows was garbled, as seen
here:

https://people.freedesktop.org/~lyudess/archive/11-30-2017/amdgpu-fix-example.png

After doing some reverse bisecting with Mesa v13, I ended up tracking
down the fix to this patch, which seems to fix the problem entirely on
all of the systems I've tested.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Lyude <lyude@redhat.com>
CC: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 095803a37a)
2016-12-02 19:57:55 +00:00
Matt Turner
9666f75b1b anv: Replace "abi_versions" with correct "api_version".
git history shows "abi_versions" was used from the outset.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98415
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07755237d3)
2016-12-02 19:57:55 +00:00
Marek Olšák
0afbb9d052 radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit b425b57d1e)
2016-12-02 19:57:55 +00:00
Marek Olšák
bd114e6be6 radeonsi: fix a crash in imageSize for cubemap arrays
Sometimes it was f32, other times it was i32. Now it's always i32.

This fixes:
GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 3e756f09d4)
2016-12-02 19:57:55 +00:00
Marek Olšák
29bac28a04 radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader
This fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation
.gl_PatchVerticesIn

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 03708deed2)
2016-12-02 19:57:55 +00:00
Marek Olšák
31aa3c014b gallium/radeon: set VPORT_ZMIN/MAX registers correctly
Calculate depth ranges from viewport states and
pipe_rasterizer_state::clip_halfz.

The evergreend.h change is required to silence a warning.

This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 687c4be9cf)
2016-12-02 19:57:55 +00:00
Marek Olšák
b65a812d60 gallium/radeon: unify viewport emission code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8b0507672e)
2016-12-02 19:57:54 +00:00
Haixia Shi
5dd6e23ad8 mesa: change state query return value for RGB565
The GL_BGR and GL_UNSIGNED_SHORT_5_6_5_REV are not defined anywhere in
OpenGL ES 3.2 (or earlier) specification, and there are no known extensions
in the Khronos registry that would add these enums as valid responses for
glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE) and
glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT) queries.

Note that this patch does not change the bit layout returned by the query. As
defined by the GL spec, the bit layout of GL_RGB + GL_UNSIGNED_SHORT_5_6_5 and
GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV are identical.

TEST=dEQP-GLES3.functional.state_query.integers.*

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Change-Id: I81bbc8ccdc7e125edaeae443baf6fa8fdefcc6b6
(cherry picked from commit 8c56ff643b)
2016-12-02 19:57:54 +00:00
Adam Jackson
422b584c00 glx/glvnd: Fix dispatch function names and indices
As this array was not actually sorted, FindGLXFunction's binary search
would only sometimes work.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 8bca8d89ef)
2016-12-02 19:57:54 +00:00
Adam Jackson
b1bced0d1f glx/glvnd: Don't modify the dummy slot in the dispatch table
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit deb0eb1660)
2016-12-02 19:57:54 +00:00
Steinar H. Gunderson
9baee818b6 Fix races during _mesa_HashWalk().
There is currently no protection against walking a hash (using
_mesa_HashWalk()) and modifying it at the same time, for instance by inserting
or deleting elements. This leads to segfaults in multithreaded code if e.g.
someone calls glTexImage2D (which may have to walk the list of FBOs) while
another thread is calling glDeleteFramebuffers on another thread with the two
contexts sharing lists.

The reason for this is that _mesa_HashWalk() doesn't actually take the mutex
that normally protects the hash; it takes an entirely different mutex.
Thus, walks are only protected against other walks, and there is also no
outer lock taking this. There is an old comment saying that this is to fix
problems with deadlock if the callback needs to take a mutex; we solve this
by changing the mutex to be recursive.

A demonstration Helgrind hit from a real application:

==13412== Possible data race during write of size 8 at 0x3498C6A8 by thread #1
==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400
==13412==    at 0x1F040C99: _mesa_hash_table_remove (hash_table.c:395)
==13412==    by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350)
==13412==    by 0x1EE98174: _mesa_HashRemove (hash.c:365)
==13412==    by 0x1EE2372D: _mesa_DeleteFramebuffers (fbobject.c:2669)
==13412==    by 0x6105AA4: movit::ResourcePool::cleanup_unlinked_fbos(void*) (resource_pool.cpp:473)
==13412==    by 0x610615B: movit::ResourcePool::release_fbo(unsigned int) (resource_pool.cpp:442)
[...]
==13412== This conflicts with a previous read of size 8 by thread #20
==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318
==13412==    at 0x1F040CD9: _mesa_hash_table_next_entry (hash_table.c:415)
==13412==    by 0x1EE982A8: _mesa_HashWalk (hash.c:426)
==13412==    by 0x1EED6DFD: _mesa_update_fbo_texture.part.33 (teximage.c:2683)
==13412==    by 0x1EED9410: _mesa_update_fbo_texture (teximage.c:3043)
==13412==    by 0x1EED9410: teximage (teximage.c:3073)
==13412==    by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105)
==13412==    by 0x166A68: operator() (mixer.cpp:454)

There are many more interactions than just these two possible.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Steinar H. Gunderson <steinar+mesa@gunderson.no>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 2e2562cabb)
2016-12-02 19:57:54 +00:00
Jason Ekstrand
68dd6ad433 anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4
This fixes hangs in Dota2

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a6c3d0f92b)
2016-12-02 19:57:54 +00:00
Jason Ekstrand
6bcdb0611f anv/cmd_buffer: Take a command buffer instead of a batch in two helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1e3e347fd5)
2016-12-02 19:57:54 +00:00
Emil Velikov
0703bab2cd cherry-ignore: add reverted LLVM_LIBDIR patch
The patch was reverted shortly after it was merged.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-02 19:57:54 +00:00
Anuj Phogat
a7b662633e i965: Fix GPU hang related to multiple render targets and alpha testing
This patch should have been the part of commit e592f7df.
In a situation when there are multiple render targets with alpha testing
enabled, if fragment shader doesn't write to draw buffer zero, it causes
the GPU hang on SKL. No GPU hang is seen on HSW. Simulator gives a
warning for all gen6+ h/w:
"Illegal render target write message length 0xa expected 0xc"

This patch fixes the GPU hang as well as the simulator warning with
new piglit test fbo-mrt-alphatest-no-buffer-zero-write:
https://patchwork.freedesktop.org/patch/118212

No regressions in Jenkins CI system.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit b9df2251c1)
2016-12-02 19:57:54 +00:00
Marek Olšák
faa684802f radeonsi: fix an assertion failure in si_decompress_sampler_color_textures
This fixes a crash in Deus Ex: Mankind Divided. Release builds were
unaffected, so it's not too serious.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 00baaa4752)
2016-12-02 19:57:54 +00:00
Jason Ekstrand
9a844035c0 i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
The address immediate field is only 9 bits and, since the value is in
bytes, the highest GRF we can point to with it is g15.  This makes it
pretty close to useless for MOV_INDIRECT.  There were already piles of
restrictions preventing us from using it prior to Broadwell, so let's get
rid of the gen8+ code path entirely.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2a4a86862c)
2016-12-02 19:57:54 +00:00
Tim Rowley
5f4284fd36 swr: [rasterizer] add support for llvm-3.9
v2: use signed compare, remove unneeded vmask

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit f810907669)
2016-12-02 19:57:54 +00:00
Tim Rowley
a4cd90283a swr: [rasterizer jitter] fix llvm-3.7 compile
d3d97f8 broke llvm-3.7, which has a mismatched API for
setDataLayout/getDataLayout.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
(cherry picked from commit ae4f2c849a)
2016-12-02 19:57:53 +00:00
Tim Rowley
0934f29c50 swr: [rasterizer jitter] cleanup supporting different llvm versions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit d3d97f8395)
2016-12-02 19:57:53 +00:00
Kenneth Graunke
e6bc5248aa intel: Fix pixel shader scratch space allocation on Gen9+ platforms.
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

This is a conservative backport of commit aaee3daa90.
It only increases the scratch amount, unlike the original commit which
decreases it on Skylake GT1-3 to avoid overallocating.

Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-11-30 07:01:47 -08:00
Marek Olšák
352902218e gallium/radeon: make sure HTILE address is aligned properly
This should fix random GPU hangs on Hawaii and Fiji.
It's already been fixed in 13.0 and later.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-23 18:58:24 +01:00
Marek Olšák
6e77fbc8d7 gallium/radeon: fix behavior of GLSL findLSB(0)
This is for 12.0 and older. A different commit fixes 13.0 and newer.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
2016-11-11 22:41:45 +01:00
Emil Velikov
7b9d7257b2 docs: add sha256 checksums for 12.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-11 01:55:08 +00:00
Emil Velikov
3776e97f9d docs: add release notes for 12.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-11 01:53:32 +00:00
Jonathan Gray
20370d4f1b mesa: automake: include mesa_glinterop.h in distfile
Add mesa_glinterop.h to the list of headers that will get included
in the distfile as it is required to build Mesa itself.

Corrects a regression introduced in a89faa2022.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 23392abf50)
2016-11-10 21:57:37 +00:00
Emil Velikov
72539c5e38 Update version to 12.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-10 21:03:41 +00:00
Dave Airlie
e7c1408870 Revert "st/vdpau: use linear layout for output surfaces"
This reverts commit d180de3532.

This is a radeon specific hack that causes problems on nouveau
when combined with the SHARED flag later. If radeonsi needs a fix
for this, please fix it in the driver.

[chk]
Using linear surfaces for this makes sense because tilling isn't
beneficial and the surfaces can potentially be shared with other GPUs
using the VDPAU OpenGL interop.

[airlied]
I think we need a flag that isn't SHARED/LINEAR that is more
SHARED_OTHER_GPU.

[mareko]
Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate
into linear ?

[mareko]
My only concern is decoding performance. If the decoder works in 64x1
blocks, tiling will hurt. That's the theory. I don't know how the
decoder works.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)
(cherry picked from commit d0d5f7600c)
2016-11-08 20:45:03 +00:00
Marek Olšák
422e4da25c glx: make interop ABI visible again
This was broken when the GLAPI use was removed from mesa_glinterop.h.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 64c2593a5c)
2016-11-08 20:45:03 +00:00
Marek Olšák
1040360e9f egl: make interop ABI visible again
This was broken when the GLAPI use was removed from mesa_glinterop.h.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee39d4456e)
2016-11-08 20:45:03 +00:00
Marek Olšák
ad7d21bc3a egl: use util/macros.h
I need the definition of PUBLIC.

Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bf51b45313)
[Emil Velikov: Keep the MIN2 macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 20:44:27 +00:00
Jason Ekstrand
3d4a219dd8 intel/blorp: Rework our usage of ralloc when compiling shaders
Previously, we were creating the shader with a NULL ralloc context and then
trusting in blorp_compile_fs to clean it up.  The only problem was that
blorp_compile_fs didn't clean up its context properly so we were leaking.
When I went to fix that, I realized that it couldn't because it has to
return the shader binary which is allocated off of that context and used by
the caller.  The solution is to make blorp_compile_fs take a ralloc
context, allocate the nir_shaders directly off that context, and clean it
all up in whatever function creates the shader and calls blorp_compile_fs.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43dadb6edd)
[Emil Velikov: attribute src/intel/blorp file movement]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/blorp/blorp.c
	src/intel/blorp/blorp_clear.c
	src/intel/blorp/blorp_priv.h
	src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
2016-11-08 16:23:23 +00:00
Samuel Pitoiset
76a77249ed nvc0/ir: fix emission of IMAD with NEG modifiers
The emitter tried to emit sub instead of subr when src0 has
actually a NEG modifier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 84e946380b)
2016-11-08 16:23:23 +00:00
Tapani Pälli
c1f138149e egl: set preserved behavior for surface only if config supports it
Otherwise we can end up with mismatching behavior between config and
surface when client queries surface attributes. As example, configs
for DRI3 do not support preserved behavior but here we were setting
preserved behavior for pixmap and pbuffer.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 2035930966)
2016-11-08 16:23:22 +00:00
Emil Velikov
3a27b813b4 cherry-ignore: add ClientWaitSync fixes
Patches (and extension overall) depends on gallium API and driver work
which hasn't landed in branch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:22 +00:00
Marek Olšák
ac3abe534b winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures
Maybe this is why SDMA has been broken for many amdgpu users?

SDMA is the only block which is used with imported textures and relies
on this variable. DB also uses it, but it doesn't get imported textures,
so it's unaffected.

I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma
is changed to use imported textures.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 6ec3b2a4b1)
[Emil Velikov: resolve trivial conflicts - SI support does not exist in
branch]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
2016-11-08 16:23:22 +00:00
Marek Olšák
20008a9fb8 gallium/radeon: make sure the address of separate CMASK is aligned properly
This should fix random GPU hangs on Hawaii and Fiji.

Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit dce05b3423)
2016-11-08 16:23:22 +00:00
Samuel Pitoiset
79a1cc2364 nvc0: use correct bufctx when invalidating CP textures
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b2712c367)
2016-11-08 16:23:22 +00:00
Tapani Pälli
cac5e31b0f mesa: fix error handling in DrawBuffers
Patch rearranges error checking so that enum checking provided via
destmask happens before other checks. It needs to be done in this
order because other error checks do not work properly if there were
invalid enums passed.

Patch also refines one existing check and it's documentation to match
GLES 3.0 spec (also in later specs). This was somewhat mysteriously
referring to desktop GL but had a check for gles3.

Fixes following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers

no CI regressions observed.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98134
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a1652a059e)
2016-11-08 16:23:22 +00:00
Tapani Pälli
979e4b9c3f egl: add check that eglCreateContext gets a valid config
Fixes following dEQP test:

   dEQP-EGL.functional.negative_api.create_context

v2: don't break EGL_KHR_no_config_context (Eric Engestrom)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5876f3c85a)
[Emil Velikov: drop EGL_NO_CONFIG_KHR, use MESA_configless_context]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/egl/main/eglapi.c
2016-11-08 16:23:22 +00:00
Emil Velikov
71d0b5f7c7 cherry-ignore: add N/A EGL revert
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:22 +00:00
Tapani Pälli
8fbad64732 egl/dri2: set max values for pbuffer width and height
While these max values were previously fixed for pbuffer creation, this
change makes also eglGetConfigAttrib() return correct values.

Fixes following dEQP tests:

   dEQP-EGL.functional.create_surface.pbuffer.rgb888_no_depth_no_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgb888_depth_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgba8888_no_depth_no_stencil
   dEQP-EGL.functional.create_surface.pbuffer.rgba8888_depth_stencil

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98326
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b91e1e38e8)
2016-11-08 16:23:21 +00:00
Axel Davy
a106d8c872 st/nine: Fix locking CubeTexture surfaces.
Only one face of Cubetextures was locked when in DEFAULT Pool.
Fixes:
https://github.com/iXit/Mesa-3D/issues/129

CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit eed605a473)
2016-11-08 16:23:21 +00:00
Axel Davy
5abbe84671 st/nine: Fix mistake in Volume9 UnlockBox
In the format fallback path,
the height was used instead of the depth.

CC: "12.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
(cherry picked from commit fe7bb46134)
2016-11-08 16:23:21 +00:00
Jonathan Gray
7133f0054d mapi: automake: set VISIBILITY_CFLAGS for shared glapi
shared glapi was previously built without setting CFLAGS for
AM_CFLAGS and VISIBILITY_CFLAGS.

This resulted in symbols being exported that shouldn't be.

The x86 and sparc assembly versions of the dispatch table partially
mitigated this by using .hidden.  Otherwise shared_dispatch_stub_*
were being exported.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "11.2 12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 907ace5798)
2016-11-08 16:23:21 +00:00
Stencel, Joanna
b5cb4f5980 egl/wayland: add missing destroy_window callback
The original patch by Joanna added the function pointer and callback yet
things got only partially applied - the infra was added, but the
implementation was missing.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 690ead4a13 ("egl/wayland-egl: Fix for segfault in
dri2_wl_destroy_surface.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 2e0ab61e29)
2016-11-08 16:23:21 +00:00
Emil Velikov
d2be28c2bd automake: don't forget to pick wglext.h in the tarball
Earlier commit reworked the header install rules, to ensure that the
correct ones are installed only as needed.

By doing so it dropped a wildcard which was effectively including the
wglext.h header in the tarball.

Add the header to the top-level noinst_HEADERS, since the it is not
meant to be installed (autoconf is not used on Windows plaforms).

Fixes: a89faa2022 ("autoconf: Make header install distinct for various
APIs (v2)")
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

(cherry picked from commit 3511a86111)
2016-11-08 16:23:21 +00:00
Emil Velikov
cd2db885bf get-pick-list.sh: Require explicit "12.0" for nominating stable patches
A nomination unadorned with a specific version is now interpreted as
being aimed at the 12.0 branch, which was recently opened.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:21 +00:00
Nicolai Hähnle
648f012459 radeonsi: fix 64-bit loads from LDS
Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among
others.

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 4a2dbfff05)
2016-11-08 16:23:21 +00:00
Nicolai Hähnle
f8f9d7528a st/mesa: only set primitive_restart when the restart index is in range
Even when enabled, primitive restart has no effect when the restart index
is larger than the representable values in the index buffer.

Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert
for radeonsi VI.

v2: add an explanatory comment

Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
(cherry picked from commit bfa50f88ce)
2016-11-08 16:23:20 +00:00
Nicolai Hähnle
3a030e886d st/glsl_to_tgsi: fix block copies of arrays of doubles
Set the type of the left-hand side to the same as the right-hand side,
so that when the base type is double, the writemask of the MOV instruction
is properly fixed up.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ca592af880)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
4d478aad50 nv50/ir: process texture offset sources as regular sources
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.

This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd45d758ff)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
e8ae2da8a0 nv50,nvc0: avoid reading out of bounds when getting bogus so info
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 313fba5ee1)
2016-11-08 16:23:20 +00:00
Jonathan Gray
34cb65716e genxml: add generated headers to EXTRA_DIST
Building the Mesa 12.0.3 distfile failed on a system without python
as generated files were not included in the distfile.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 41754f743f)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
4ddcb9cb22 gm107/ir: fix bit offset of tex lod setting for indirect texturing
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8c78fdb328)
2016-11-08 16:23:20 +00:00
Ilia Mirkin
aeabbc1e1d gm107/ir: fix texturing with indirect samplers
The indirect handle has to come right after the coordinates, so if there
was a sample/bias/depth compare/offset, everything would end up being
shifted by one argument position.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ecea2f69ef)
2016-11-08 16:23:20 +00:00
Kenneth Graunke
9cecb50bf2 i965: Fix gl_InvocationID in dual object GS where invocations == 1.
dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations
draws using a geometry shader that specifies

   layout(points, invocations = 1) in;

and then uses gl_InvocationID.  According to the Haswell PRM, the
"GS Instance ID 0" (and 1) thread payload fields are undefined in
dual object mode:

   "If 'dispatch mode' is DUAL_OBJECT this field is not valid."

But there's no point in using them - if there's only one invocation,
the ID will be 0.  So just load a constant.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 9f677d6541)
2016-11-08 16:23:20 +00:00
Nicolai Hähnle
c403863348 st/glsl_to_tgsi: fix atomic counter addressing
When more than one atomic counter buffer is in use, UniformStorage[n].opaque
is set up to contain indices that are contiguous across all used buffers.

This appears to be used by i965 via NIR, but for TGSI we do not treat atomic
counter buffers as opaque, so using the data in the opaque array is incorrect.

Fixes GL45-CTS.compute_shader.resource-atomic-counter.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1dd99a15a4)
2016-11-08 16:23:19 +00:00
Nicolai Hähnle
7c7973606f radeonsi: fix indirect loads of 64 bit constants
This fixes GL45-CTS.compute_shader.fp64-case3.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 51f9b38ce8)
2016-11-08 16:23:19 +00:00
Chad Versace
341889d6ca egl: Don't advertise unsupported platform extensions
Mesa's set of supported platform extensions depends on the autoconf
option --with-egl-platforms=foo,bar,baz. If --with-egl-platforms lacks
foo, then eglGetPlatformDisplay(EGL_PLATFORM_FOO, ...) unconditonally
fails.

So, if --with-egl-platforms lacks foo, then remove
EGL_VENDOR_platform_foo from the EGL client extension string.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c177ef9d47)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/egl/main/eglglobals.c
2016-11-08 16:23:19 +00:00
Emil Velikov
866aee0264 egl/x11: don't crash if dri2_dpy->conn is NULL
The dri3 version of commits 60e9c35b3a and 6de9a03bed.

While using xcb_connect() guarantees that we always get a non NULL
return value, XGetXCBConnection() does/can not.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit b10c05d4ff)
2016-11-08 16:23:19 +00:00
Emil Velikov
32a469b8ed isl/gen6: correctly check msaa layout samples count
Samples == 1 is a valid value, so returning false is plain wrong.
Seeming copy/paste typo introduced since day 1.

Fixes: afdadec77f ("isl: Implement isl_surf_init() for gen4-gen9")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit 84f9ef1de4)
2016-11-08 16:23:19 +00:00
Vinson Lee
eb9236e275 Revert "mesa_glinterop: remove inclusion of GLX header"
This reverts commit 8472045b16.

Conflicts:

	include/GL/mesa_glinterop.h

This patch fixes this build error with GCC 4.4.

  Compiling src/glx/dri_common_interop.c ...
In file included from src/glx/dri_common_interop.c:33:
include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’
include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here

Fixes: 8472045b16 ("mesa_glinterop: remove inclusion of GLX header")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit c10dcb2ce8)

Squashed with commit:

mesa_glinterop: allow building without X and related headers

This commit effectively reverts c10dcb2ce8
and fixes the typedef redefinition which inspired it.

In order to prevent requiring X packages at build time earlier commit
forward declared the required X/GLX typedefs. Since that approach
introduced typedef redefinition (a C11 feature) it was reverted.

To avoid the redefinition while _not_ mandating X and related headers
forward declare the structs and use those through the header.

As anyone uses the mesa interop header they ensure that the X (or others
in terms of EGL) headers are included, which ensures that everything is
resolved within the compilation unit.

Cc: Vinson Lee <vlee@freedesktop.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Fixes: c10dcb2ce8 ("Revert "mesa_glinterop: remove inclusion of GLX
header"")
Fixes: 8472045b16 ("mesa_glinterop: remove inclusion of GLX header")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

(cherry picked from commit c85b34ffd0)
2016-11-08 16:23:19 +00:00
Mario Kleiner
50afa72f3c glx: Perform check for valid fbconfig against proper X-Screen.
Commit cf804b4455
('glx: fix crash with bad fbconfig') introduced a check
in glXCreateNewContext() if the given config is a valid
fbconfig.

Unfortunately the check always checks the given config against
the fbconfigs of the DefaultScreen(dpy), instead of the
actual X-Screen specified in the config config->screen.

This leads to failure whenever a GL context is created
on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of
a multi-x-screen setup, where the default screen is
typically 0.

Fix this by using config->screen instead of DefaultScreen(dpy).

Tested to fix context creation failure on a dual-x-screen setup.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0c94ed0987)
2016-11-08 16:23:19 +00:00
Dave Airlie
80409971c0 anv/wsi: fix apps that acquire multiple images up front
This fix was found in the radv codebase when running dota2,
no idea if anyone has reported it on anv, but the same problem
occurs.

Once an image is acquired we need to mark it busy.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8980ac0411)

Squashed with commit

anv: fix the wayland wsi busy flag setting

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit a3834ebaf9)
2016-11-08 16:23:19 +00:00
Dave Airlie
920150f28a anv: initialise and increment send_sbc
At least set this to not be uninitialised memory.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dfe74fd1a9)
2016-11-08 16:23:18 +00:00
Marek Olšák
12bdcc105c radeonsi: disable ReZ
This is a serious performance fix. Discovered by luck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit e12c1cab5d)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
1cb2a483ba st/mesa: fix vertex elements setup for doubles
Whether one or two slots are taken up by one API array depends on the
vertex shader, not on how the array is configured. When an array is
set up with fewer components than the shader expects, the high components
are undefined.

Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d413fbb159)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
79e84bafc1 st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1d7685e52c)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
5181624675 st/glsl_to_tgsi: simplify translate_tex_offset
This fixes a bug with offsets from uniforms which seems to have only been
noticed as a crash in piglit's
arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag
on radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b234e37765)
2016-11-08 16:23:18 +00:00
Ilia Mirkin
d18d830744 nvc0/ir: fix textureGather with a single offset
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.

Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a48a343c29)
2016-11-08 16:23:18 +00:00
Ilia Mirkin
bef0cc6287 nv50/ir: copy over value's register id when resolving merge of a phi
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 300b5ad023)
2016-11-08 16:23:18 +00:00
Nicolai Hähnle
f0f5bd9607 st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations
This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.

A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like

    dvec2 d;
    float x = (float)d.y;

which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 63193b9cde)
2016-11-08 16:23:18 +00:00
Axel Davy
ca135ebd76 st/nine: Fix the calculation of the number of vs inputs
Fixes hangs on radeonsi, and assert on llvmpipe.

Signed-off-by: Axel Davy <axel.davy@ens.fr>

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d2fd296648)
2016-11-08 16:23:17 +00:00
Axel Davy
ec2751f967 gallium/util: Really allow aliasing of dst for u_box_union_*
Gallium nine relies on aliasing to work with this function.
Without this patch, dirty region tracking was incorrect, which
could lead to incorrect textures or vertex buffers.
Fixes several game bugs with nine.
Fixes https://github.com/iXit/Mesa-3D/issues/234

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2290eac84e)
2016-11-08 16:23:17 +00:00
Ilia Mirkin
24a16d0a9d nvc0/ir: fix overwriting of value backing non-constant gather offset
Normally the value is an immediate, which is moved to some temporary, so
there's no problem. In the case of a non-constant offset (as allowed by
ARB_gpu_shader5), we have to take care to copy it first before using it
to build up the bits.

This fixes a compilation error observed in F1 2015.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 5239bd5920)
2016-11-08 16:23:17 +00:00
Eric Anholt
9c5de2546c gallium: Fix install-gallium-links.mk on non-bash /bin/sh
Debian uses dash by default, which doesn't do '+='.  Fixes servo's
osmesa-based headless testing system, which was looking for libOSMesa in
the lib/ directory.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ec9ed1c4d8)
2016-11-08 16:23:17 +00:00
Martin Peres
17f59da9db loader/dri3: import prime buffers in the currently-bound screen
This tries to mirrors the codepath taken by DRI2 in IntelSetTexBuffer2()
and fixes many applications when using DRI3:
 - Totem with libva on hw-accelerated decoding
 - obs-studio, using Window Capture (Xcomposite) as a Source
 - gstreamer with VAAPI

v2:
 - introduce get_dri_screen() in the dri3 loader's vtable (krh)

Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Tested-by: Ionut Biru <biru.ionut@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
(cherry picked from commit a599b1c203)
2016-11-08 16:23:17 +00:00
Martin Peres
0ae4d909ea loader/dri3: add get_dri_screen() to the vtable
This allows querying the current active screen from the
loader's common code.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
(cherry picked from commit 0247e5ee3e)
2016-11-08 16:23:17 +00:00
Chuck Atkins
2fedb106bb autoconf: Make header install distinct for various APIs (v2)
This fixes a problem where GL headers would only get installed if
glx was enabled.  So if osmesa was enabled but not glx, then the
GL headers required by osmesa would be missing from the install.

v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a89faa2022)
2016-11-08 16:23:17 +00:00
Chad Versace
c5de4cbf32 i965/sync: Fix uninitalized usage and leak of mutex
We locked an unitialized mutex in the callstack
    glClientWaitSync
    intel_gl_client_wait_sync
    brw_fence_client_wait_sync
because we forgot to initialize it in intel_gl_fence_sync.
(The EGLSync codepath didn't have this bug. It initialized the mutex in
intel_dri_create_sync).

We also forgot to tear down (mtx_destroy) the mutex when destroying
the sync object.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ce1d67c2e5)
2016-11-08 16:23:17 +00:00
Marek Olšák
e7b274c552 radeonsi: fix texture border colors for compute shaders
There are VM faults without this.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit cc4a19c4ad)
2016-11-08 16:23:16 +00:00
Marek Olšák
63c1a5d391 radeonsi: fix interpolateAt opcodes for .zw components
Not returning garbage in .zw seems pretty important.

This fixes:
GL45-CTS.shader_multisample_interpolation.render.interpolate_at_*_check.*

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1b37e5541c)
2016-11-08 16:23:16 +00:00
Kenneth Graunke
d16be6898b i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bab1c05634)
2016-11-08 16:23:16 +00:00
Kenneth Graunke
4b0512ade4 i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.
CACHE_NEW_CS_PROG hasn't existed in quite a long time...the old
comment was there, but not the actual bit.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ce6c80ebbb)
2016-11-08 16:23:16 +00:00
Emil Velikov
39344de587 cherry-ignore: add update_renderbuffer_read_surfaces()
The function (and underlying work) is not in branch. Former introduced
with 786108e7b2.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:16 +00:00
Kenneth Graunke
915683485f i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.
3DSTATE_PS doesn't need this.  3DSTATE_PS_EXTRA however does,
for brw_color_buffer_write_enabled().

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0047d600af)
2016-11-08 16:23:16 +00:00
Kenneth Graunke
bc04c92aef i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 28e1538be7)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/gen6_clip_state.c
2016-11-08 16:23:16 +00:00
Kenneth Graunke
5fb22e1258 i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.
Needed for user clip plane enables.  Broken since this code was
introduced.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 78df96256b)
2016-11-08 16:23:16 +00:00
Chad Versace
b5de58d7e1 egl: Fix truncation error in _eglParseSyncAttribList64
The function stores EGLAttrib values in EGLint variables. On 64-bit
systems, this truncated the values.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 69adb9a778)
2016-11-08 16:23:16 +00:00
Emil Velikov
082ea77cdf cherry-ignore: add EGL_KHR_debug fix
The extension landed after the branchpoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:15 +00:00
Nicolai Hähnle
a5c0b8784a gallium/radeon: cleanup and fix branch emits
Some of the existing code is needlessly complicated. The basic principle
should be: control-flow opcodes emit branches to properly terminate the
current block, _unless_ the current block already has a terminator (which
happens if and only if there was a BRK or CONT).

This also fixes a bug where multiple terminators were created in a block.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6f87d7a146)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
2016-11-08 16:23:15 +00:00
James Legg
020550e099 radeonsi: Fix primitive restart when index changes
If primitive restart is enabled for two consecutive draws which use
different primitive restart indices, then the first draw's primitive
restart index was incorrectly used for the second draw.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e33f31d61f)
2016-11-08 16:23:15 +00:00
Matt Whitlock
e7491c3bbd gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 42ed8a6c9c)
2016-11-08 16:23:15 +00:00
Matt Whitlock
39c0535646 st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit ac6064f918)
2016-11-08 16:23:15 +00:00
Matt Whitlock
7c27d56535 st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 0c060f691c)
2016-11-08 16:23:15 +00:00
Matt Whitlock
ea3e778bff gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 5d0069eca2)
2016-11-08 16:23:15 +00:00
Matt Whitlock
d82738fbd9 egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit c8fd7d060d)
2016-11-08 16:23:15 +00:00
Jason Ekstrand
4e1298be04 nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks
Previously, we were saving off the last nir_block in a vtn_block before
moving on so that we could find the nir_block again when it came time to
handle phi sources.  Unfortunately, NIR's control flow modification code is
inconsistent when it comes to how it splits blocks so the block pointer we
saved off may point to a block somewhere else in the shader by the time we
get around to handling phi sources.  In order to get around this, we insert
a nop instruction and use that as the logical end of our block.  Since the
control flow manipulation code respects instructions, the nop will keeps
its place like any other instruction and we can easily find the end of our
block when we need it.

This fixes a bug triggered by a couple of vkQuake shaders.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6ffbfc760d)
2016-11-08 16:23:14 +00:00
Jason Ekstrand
17429a22a6 nir: Add a nop intrinsic
This intrinsic has no destination, no sources, no variables, and can be
eliminated.  In other words, it does nothing and will always get deleted by
dead code elimination.  However, it does provide a quick-and-easy way to
temporarily tag a particular location in a NIR shader.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7697b4b98b)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/nir/nir_intrinsics.h
2016-11-08 16:23:14 +00:00
Tapani Pälli
d8d7de3b29 egl: stop claiming support for pbuffer + msaa
This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
and same crash in many dEQP EGL tests.

I also found that some Qt example did a workaround because of this
crash: https://bugreports.qt.io/browse/QTBUG-47509

v2: Ian pointed out that v1 removed support for all multisample
    configs, including window ones. This one removes pbuffer bit
    when adding configs, now only pbuffer+msaa gets rejected and
    window+msaa continues to work. Fixed also comment (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 4d6d55deef)
2016-11-08 16:23:14 +00:00
Jason Ekstrand
5e4aeeb8ec nir/spirv/cfg: Detect switch_break after loop_break/continue
While the current CFG code is valid in the case where a switch break also
happens to be a loop continue, it's a bit suboptimal.  Since hardware is
capable of handling the continue as a direct jump, it's better to use a
continue instruction when we can than to bother with all of the nasty
switch break lowering.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ef3c5ac7fb)
2016-11-08 16:23:14 +00:00
Jason Ekstrand
12d09e24f8 nir/spirv/cfg: Handle switches whose break block is a loop continue
It is possible that the break block of a switch is actually the continue of
the loop containing the switch.  In this case, we need to identify the
break block as a continue and break out of current level of CFG handling.
If we don't, the continue portion of the loop will get handled twice, once
by following after the break and a second time by the loop handling code
handling it explicitly.

This fixes 6 of the new Vulkan CTS tests:
 - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order*
 - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order*

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4d02faede5)
2016-11-08 16:23:14 +00:00
Eric Anholt
cd88ea6d82 travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 78ab62b1e9)
2016-11-08 16:23:14 +00:00
Eric Anholt
c004014db4 travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 084678ccbb)
2016-11-08 16:23:14 +00:00
Eric Anholt
6931dab9b4 travis: Update to the Ubuntu Trusty image.
This will hopefully fix wget from x.org (no real reason explained in
Travis CI bug reports), and may also mean that we can enable LLVM driver
builds.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit 80a872f3f0)
2016-11-08 16:23:14 +00:00
Eric Anholt
a8010d31ce travis: Parse configure.ac to pick an updated LIBDRM_VERSION.
Travis has been broken a couple of times by configure.ac updates.  To make
it useful, auto-update the version necessary.

This could potentially be used for other dependencies, too, but those get
bumped less frequently.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
(cherry picked from commit ecbc76cf6e)
2016-11-08 16:23:13 +00:00
Ian Romanick
0909e54c3c glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept
At this point in the code, s must be visit_continue.  If the child
returned visit_stop, visit_stop is the only correct thing to return.

Found by inspection.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ea6ed2379d)
2016-11-08 16:23:13 +00:00
Tim Rowley
6cd0438840 configure.ac: add llvm inteljitevents component if enabled
Needed to successfully link llvmpipe or swr when using shared llvm libs
built with inteljitevents enabled.

v2: Make adding inteljitevents component global rather than just
llvmpipe/swr, since libgallium will have a symbol dependency.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bacdd9ef4c)
2016-11-08 16:23:13 +00:00
Nicholas Bishop
f228c90f80 st/dri: check pipe_screen->resource_get_handle() return value
Change dri2_query_image to check the return value of resource_get_handle
and return GL_FALSE if an error occurs.

For reference this is an example callstack that should propagate the
error back to the user:

    i915_drm_buffer_get_handle
    i915_texture_get_handle
    u_resource_get_handle_vtbl
    dri2_query_image
    gbm_dri_bo_get_fd
    gbm_bo_get_fd

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nicholas Bishop <nbishop@neverware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
[Emil Velikov: Split from larger patch, polish coding style, cc stable]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit aa560e8e63)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/state_trackers/dri/dri2.c
2016-11-08 16:23:13 +00:00
Nicholas Bishop
29320aa06a gbm: return appropriate error when queryImage() fails
Change gbm_dri_bo_get_fd to check the return value of queryImage and
return -1 (an invalid file descriptor) if an error occurs.

Update the comment for gbm_bo_get_fd to return -1, since (apart from the
above) we've already return -1 on error.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nicholas Bishop <nbishop@neverware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
[Emil Velikov: Split from larger patch, polish coding style, cc stable]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 2d05ba2ca0)
2016-11-08 16:23:13 +00:00
Emil Velikov
48f001b836 cherry-ignore: add vaapi encode fix
The encode codepaths landed after the branch point.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:13 +00:00
Brian Paul
0823c98a65 st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()
Some demos, like Heaven, were creating and destroying a large number
of sampler views because of a swizzle issue.

Basically, we compute the sampler view's swizzle by examining the
texture format, user swizzle, depth mode, etc.  Later, during validation
we recompute that swizzle (in case something like depth mode changes)
and see if it matches the view's swizzle.

In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle
returned SWIZZLE_XYZW but the u_sampler_view_default_template() function
was setting the sampler view's swizzle to SWIZZLE_XY01.  This mismatch
caused the validation step to always "fail" so we'd destroy the old
sampler view and create a new one.

By removing the conditional, the sampler view's swizzle and the computed
texture swizzle match and validation "passes".  When creating a new sampler
view, we always want to use the texture swizzle which we just computed.

Fixes VMware issue 1733389.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 1cdc232e1a)
2016-11-08 16:23:13 +00:00
Samuel Pitoiset
a287f820b8 gk110/ir: fix wrong emission of OP_NOT
This should emit src0 instead of src1.
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d8b4f5fcca)
2016-11-08 16:23:13 +00:00
Samuel Pitoiset
725ef1fbba nvc0/ir: fix subops for IMAD
Offset was wrong, it's at bit 8, not 4. Also, uses subr instead
of sub when src2 has neg. Similar to GK110 now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 50baaf6bc6)
2016-11-08 16:23:13 +00:00
Marek Olšák
a4a2378408 mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc
This fixes 66 CTS tests on st/mesa.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d58a3906cb)
2016-11-08 16:23:12 +00:00
Emil Velikov
cc34777cec cherry-ignore: add non-applicable i965 commit
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:12 +00:00
Hans de Goede
cd614f31b8 pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms
Make pipe_loader_sw_probe_kms take ownership of the passed in fd,
like pipe_loader_drm_probe_fd does.

The only caller is dri_kms_init_screen which passes in a dupped fd,
just like dri2_init_screen passes in a dupped fd to
pipe_loader_drm_probe_fd.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 459cc94507)

Squashed with commit:

pipe_loader_sw: Don't invoke Unix close() on Windows.

Trivial.

(cherry picked from commit c6d17701c8)
2016-11-08 16:23:12 +00:00
Vedran Miletić
67c99c1245 clover: Fix build against clang SVN >= r273191
setLangDefaults() now requires PreprocessorOptions as an argument.

Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 82e0bbd01a)
Nominated-by: Andreas Boll <andreas.boll.dev@gmail.com>
Nominated-by: Timo Aaltonen <tjaalton@ubuntu.com>
2016-11-08 16:23:12 +00:00
Kenneth Graunke
6a72af2aeb mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.
This is supposed to be exposed with the GL_KHR_robustness extension,
which we support on ES 2.0 and later.  On desktop GL, it's also exposed
by GL_ARB_robustness, which is supported by all drivers ("dummy_true").
so we also allow desktop GL.

Fixes:
- ES32-CTS.robust.robustness.noResetNotification
- ES32-CTS.robust.robustness.loseContextOnReset

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3bcdc2e3db)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/main/get.c
	src/mesa/main/get_hash_params.py
2016-11-08 16:23:12 +00:00
Kenneth Graunke
e591b0b206 nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().
This is mandatory.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit e6eed3533e)
2016-11-08 16:23:12 +00:00
Brendan King
96aa7ca98c configure.ac: fix the name of the Wayland Scanner pc file
The Wayland Scanner pkg-config file is called wayland-scanner.pc.

Fixes: 153539bd9d ("configure: rework wayland_scanner
       handling (fix make distcheck)")

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 95f3e5861c)
2016-11-08 16:23:12 +00:00
Marek Olšák
b1c5719d7b radeonsi: fix FP64 UBO loads with indirect uniform block indexing
No known tests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 15a127bc2c)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
        src/gallium/drivers/radeonsi/si_shader.c
2016-11-08 16:23:12 +00:00
Ilia Mirkin
ec1f6700b6 st/mesa: fix is_scissor_enabled when X/Y are negative
Similar to commit 49c24d8a24 ("i965: fix noop_scissor range issue on
width/height") - take the X/Y into account to determine whether the
scissor covers the whole area or not.

Fixes the recently-added gl-1.0-scissor-depth-clear-negative-xy piglit
test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 742832434a)
2016-11-08 16:23:11 +00:00
Julien Isorce
b2495c2202 st/va: also honors interlaced preference when providing a video format
This fixes a crash when using the prefered video format with vaapisink
on Nvidia hardwares.
Also caught by the following assert:
  nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed.

TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! vaapisink

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Tested-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
Tested-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit bf901a2f8c)
2016-11-08 16:23:11 +00:00
Chuanbo Weng
7ad97bc307 gbm: fix potential NULL deref of mapImage/unmapImage.
The mapImage/unmapImage functions of DRIimage extension can be NULL,
so we should add additional check for them.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9a1eb54237)
2016-11-08 16:23:11 +00:00
Ilia Mirkin
106f2dc8a7 gm107/ir: AL2P writes to a predicate register
We have to force it to write to predicate 7 (aka PT) in order for it not
to mess up another predicate. Unclear what would be returned in the
predicate, perhaps an error code for out-of-bounds requests. Blob
doesn't seem to check it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a22aee5ad1)
2016-11-08 16:23:11 +00:00
Marek Olšák
a3c232db2f radeonsi: take compute shader and dispatch indirect memory usage into account
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit e62caf576e)

Squashed with commit:

radeonsi: flush TC L2 before using a compute indirect buffer

There is no known test for this.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 08bcbfdc07)
2016-11-08 16:23:11 +00:00
Max Staudt
f75a108434 r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering
On the RSxxx chip series, HW TCL is missing and r300_emit_vs_state()
is never called.

However, if R300_VAP_CNTL is never set, the hardware (at least the
RS690 I tested this on) comes up with rendering artifacts, and
parts that are uploaded before this "fix" remain broken in VRAM.
This causes artifacts as in fdo#69076 ("triangle flickering").

It seems like this setup needs to happen at least once after power on
for 3D rendering to work properly. In the DDX with EXA, this happens in
RADEON_SWITCH_TO_3D() when processing an XRENDER Composite or an
Xv request. So playing back a video or starting a GTK+2 application
fixes 3D rendering for the rest of the session. However, this auto-fix
doesn't happen when EXA is not used, such as with GLAMOR or Wayland.

This patch ensures the register is configured even in absence of
the DDX's EXA module.

The register setting is taken from:
  xf86-video-ati  --  RADEONInit3DEngineInternal()
  mesa/src/mesa/drivers/dri/r300  --  r300EmitClearState()

Tested on RS690.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Max Staudt <mstaudt@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 02675622b0)
2016-11-08 16:23:11 +00:00
Jason Ekstrand
258e651e0f nir/spirv: Refactor variable deocration handling
Previously, we dind't apply variable decorations to the members of a split
structure variable.  This doesn't quite work, unfortunately, because things
such as the "flat" qualifier may get applied to an entire structure instead
of propagated to the members.  This fixes 9 of the new CTS tests in the
dEQP-VK.glsl.linkage.varying.struct.* group.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a00bd7bc27)
2016-11-08 16:23:11 +00:00
Jason Ekstrand
662a7c627b nir/spirv: Break variable decoration handling into a helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f5505730d3)
2016-11-08 16:23:11 +00:00
Ilia Mirkin
ed8e99761d nir: fix definition of pack_uvec2_to_uint
Found by inspection. Untested beyond compilation. This also matches the
logic used in nir_lower_alu_to_scalar.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8c8874eafb)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
98c9fcf259 mesa/formatquery: limit ES target support, fix core context support
First off, as late as ES 3.2, GetInternalformat only supports
RENDERBUFFER and 2DMS(_ARRAY) targets.

Secondly, the _mesa_has_ext helpers are very accurate... a little too
accurate, some might say. If we only show an extension in compat
profiles because core profiles have the functionality guaranteed, they
will return false. Fix these to either check for a core profile
explicitly, or to a different-but-identical extension available in core
profile.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matteo Bruni <matteo.mystral@gmail.com>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c42acd93d4)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
5eabc81d50 main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer
Add a separate extension check for that format. Prevents glTexImage from
trying to find a matching format, which fails on drivers without support
for this format.

Fixes: sized-texture-format-channels (on a3xx)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 36347c8d6f)
2016-11-08 16:23:10 +00:00
Jason Ekstrand
2fbce4c9e1 nir/spirv: Use the correct sources for CompareExchange on images
The CompareExchange operation has two "Memory Semantics" parameters instead
of one so the real arguments start at w[7] instead of w[6].

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f2a10937d8)
2016-11-08 16:23:10 +00:00
Jason Ekstrand
ab6126bb1d nir/spirv: Swap the argument order for AtomicCompareExchange
SPIR-V has the two arguments in the opposite order from GLSL.  NIR uses the
GLSL order so we had them backwards.

Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ead7bef6b)
2016-11-08 16:23:10 +00:00
Marek Olšák
f5de7da4e1 radeonsi: fix cubemaps viewed as 2D
This fixes: GL43-CTS.texture_view.view_sampling

v2: fix a typo, merge both if statements

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit a4fa215058)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
05ec6a7c03 a3xx: use window scissor to simulate viewport xy clip
Unfortunately a3xx does not have a separate disable for depth clipping,
so when depth clamp is enabled, we disable the whole 3d clipper logic.
This in turn also gets rid of the xy clip that it would normally do.
When we detect this would happen, instead we integrate the viewport into
the window scissor. This may have slightly different behavior around
wide points, but it's unlikely that anything depends on this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ca313e00b6)
[Emil Velikov: s|batch->||g]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/freedreno/a3xx/fd3_emit.c
2016-11-08 16:23:10 +00:00
Ilia Mirkin
dee992caa1 a3xx: make use of software clipping when hw can't handle it
The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs,
or a clip vertex, or clip distances are in use, then we must use the
fallback discard-based clipping from the frag shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 83d7230fd5)
2016-11-08 16:23:10 +00:00
Ilia Mirkin
68620d14d4 a3xx: make sure to actually clamp depth as requested
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit dac72234c7)
[Emil Velikov: s|batch->||g]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-11-08 16:23:10 +00:00
Ilia Mirkin
42a890891c nv30: set usage to staging so that the buffer is allocated in GART
The code a few lines below expects to migrate the bo in question to
VRAM. Since we're filling the initial data via CPU, it's more efficient
to create the temporary buffer in GART. There is no "push" method
implemented, otherwise we'd use that instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6118bcab4e)
2016-10-31 17:39:42 +00:00
Michel Dänzer
d3d33918c7 loader/dri3: Overhaul dri3_update_num_back
Always use 3 buffers when flipping. With only 2 buffers, we have to wait
for a flip to complete (which takes non-0 time even with asynchronous
flips) before we can start working on the next frame. We were previously
only using 2 buffers for flipping if the X server supports asynchronous
flips, even when we're not using asynchronous flips. This could result
in bad performance (the referenced bug report is an extreme case, where
the inter-frame stalls were preventing the GPU from reaching its maximum
clocks).

I couldn't measure any performance boost using 4 buffers with flipping.
Performance actually seemed to go down slightly, but that might have
been just noise.

Without flipping, a single back buffer is enough for swap interval 0,
but we need to use 2 back buffers when the swap interval is non-0,
otherwise we have to wait for the swap interval to pass before we can
start working on the next frame. This condition was previously reversed.

Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1e3218bc5b)

Squashed with commit:

loader/dri3: Always use at least two back buffers

This can make a significant difference for performance with some extreme
test cases such as vblank_mode=0 glxgears.

Fixes: 1e3218bc5b ("loader/dri3: Overhaul dri3_update_num_back")
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97549
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit dc3bb5db8c)
2016-10-31 17:39:32 +00:00
Emil Velikov
09460b8cf7 docs: add sha256 checksums for 12.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-15 11:29:24 +01:00
Emil Velikov
d79b2e7bf3 docs: add release notes for 12.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-15 10:19:26 +01:00
Emil Velikov
e487048f8c Update version to 12.0.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-15 10:15:57 +01:00
Emil Velikov
71b47b9cfe Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"
This reverts commit be0344f630.

The commit depends on 48e9ecc47f ("Revert "i965/miptree: Set
logical_depth0 == 6 for cube maps") which was reverted earlier.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97781
2016-09-14 12:14:01 +01:00
Jose Fonseca
bde8f418bd appveyor: Update winflexbison download URL.
This particular version got moved into a `old_versions` subdirectory.
2016-09-14 11:21:04 +01:00
Emil Velikov
614fb93a6d docs: add sha256 checksums for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-05 16:03:06 +01:00
Emil Velikov
2fc6a31f10 docs: add release notes for 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-05 12:14:11 +01:00
Emil Velikov
63001e7ddf Update version to 12.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-05 12:09:24 +01:00
Emil Velikov
7757de1ebf glx/glvnd: list the strcmp arguments in correct order
Currently, due to the inverse order, strcmp will produce negative result
when the needle is towards the start of the haystack. Thus on the next
iteration(s) we'll end up further towards the end and eventually fail to
locate the entry.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 62b224d428)
2016-09-05 11:59:25 +01:00
Ilia Mirkin
8e9b6161eb gk110/ir: fix quadop dall emission
We recently starting to always emit the NDV (== dall) bit for quadops.
However it was folded into the wrong code word.

Fixes: e0a067ed48 (nv50/ir: always emit the NDV bit for OP_QUADOP)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 61e978524a)
2016-09-05 11:37:18 +01:00
Ilia Mirkin
7c96b11fd6 a4xx: make sure to actually clamp depth as requested
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org

(cherry-picked from 89f00f749f)
[imirkin: adjust ctx->batch to just ctx]
2016-09-05 11:37:07 +01:00
Emil Velikov
49e84b8f18 Revert "i965/miptree: Set logical_depth0 == 6 for cube maps"
This reverts commit 48e9ecc47f.

The commit regressed several piglit tests on SNB/ILK hardware.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97567
2016-09-05 11:37:06 +01:00
Jose Fonseca
463d9ea0dc appveyor: Force Visual Studio 2013 image.
It seems the default build image is now Visual Studio 2015, and Visual
Studio 2013 is not installed.
2016-09-01 22:10:08 +01:00
Jose Fonseca
53e8701c7b appveyor: Install pywin32 extensions.
AppVeyor build images seem to have been upgraded to Python 2.7.12, but
no longer have pywin32 pre-installed.
2016-09-01 22:10:08 +01:00
Ilia Mirkin
0fa0e2a505 nv30: only bail on color/depth bpp mismatch when surfaces are swizzled
The actual restriction is a little weaker than I originally thought. See
https://bugs.freedesktop.org/show_bug.cgi?id=92306#c17 for the
suggestion. This also explain why things weren't *always* failing
before, only sometimes. We will allocate a non-swizzled depth buffer for
NPOT winsys buffer sizes, which they almost always are.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8caf2cb0c0)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
9a8d605398 anv: Rework pipeline caching
The original pipeline cache the Kristian wrote was based on a now-false
premise that the shaders can be stored in the pipeline cache.  The Vulkan
1.0 spec explicitly states that the pipeline cache object is transiant and
you are allowed to delete it after using it to create a pipeline with no
ill effects.  As nice as Kristian's design was, it doesn't jive with the
expectation provided by the Vulkan spec.

The new pipeline cache uses reference-counted anv_shader_bin objects that
are backed by a large state pool.  The cache itself is just a hash table
mapping keys hashes to anv_shader_bin objects.  This has the added
advantage of removing one more hand-rolled hash table from mesa.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
(cherry picked from commit 10f9901bce)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
17d40ca82b anv/pipeline: Add support for caching the push constant map
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit ffcef720b7)
[Emil Velikov: dependency for the next patch]
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-01 11:39:47 +01:00
Jason Ekstrand
a8e4b59cfd anv: Add a struct for storing a compiled shader
This new anv_shader_bin struct stores the compiled kernel (as an anv_state)
as well as all of the metadata that is generated at shader compile time.
The struct is very similar to the old cache_entry struct except that it
is reference counted and stores the actual pipeline_bind_map.  Similarly to
cache_entry, much of the actual data is floating-size and stored after the
main struct.  Unlike cache_entry, which was storred in GPU-accessable
memory, the storage for anv_shader_bin kernels comes from a state pool.
The struct itself is reference-counted so that it can be used by multiple
pipelines at a time without fear of allocation issues.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
(cherry picked from commit 6899718470)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
2566315063 anv: Add pipeline_has_stage guards a few places
All of these worked before because they were depending on prog_data to be
null.  Soon, we won't be able to depend on a nice prog_data pointer and
it's nice to be more explicit anyway.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 13c09fdd0c)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
b529a77d79 anv: Remove unused fields from anv_pipeline_bind_map
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b259d86ad6)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
d159ca4fa2 anv/pipeline: Properly handle OOM during shader compilation
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d5945bec12)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
1ce414accf anv/allocator: Correctly set the number of buckets
The range from ANV_MIN_STATE_SIZE_LOG2 to ANV_MAX_STATE_SIZE_LOG2 should
be inclusive and we have asserts that ensure that you never try to allocate
a state larger than (1 << ANV_MAX_STATE_SIZE_LOG2).  However, without
adding 1 to the difference, we allocate 1 too few bucckts and so, even
though we have an assert, anything landing in the last bucket will fail to
allocate properly..

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0f5c496e3)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
e12b7486b3 anv/pipeline: Fix bind maps for fragment output arrays
Found by inspection.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4200c2266e)
2016-09-01 11:39:47 +01:00
Jason Ekstrand
1d0c79b13b anv/descriptor_set: memset anv_descriptor_set_layout
We hash this data structure so we can't afford to have uninitialized data
even if it is just structure padding.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d316cec1c1)
2016-09-01 11:39:46 +01:00
Samuel Pitoiset
04f04ab6a6 nv50/ir: always emit the NDV bit for OP_QUADOP
This silences a divergent error found with F1 2015.

Basically, the NDV bit has to be set when a FSWZ instruction is
inside divergent code, but it's not needed otherwise. The correct
fix should be to set it only in divergent code situations.

GM107 emitter already sets that bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e0a067ed48)
2016-09-01 11:39:46 +01:00
Emil Velikov
5af16ddf84 i915: Check return value of screen->image.loader->getBuffers
Ported from the i965 commit e7ab358e81.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 5de640a518)
2016-09-01 11:39:46 +01:00
Ilia Mirkin
178c34c535 nouveau: always enable at least one RC
Experimentally, this is required for glxgears and others to display the
proper colors. This is also what the code used to do before the
referenced commit.

Fixes: c703658b39 (mesa: Drop _EnabledUnits.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 357d8261f1)
2016-09-01 11:39:46 +01:00
Brian Paul
7c583adfb5 mesa: fix format conversion bug in get_tex_rgba_uncompressed()
We need to set the need_convert flag with each loop iteration, not
just when the rgba pointer is null.

Bug reported by Markus Müller <mueller@imfusion.de> on mesa-users list.
Fixes new piglit arb_texture_float-get-tex3d test.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit b9b88516f8)
2016-09-01 11:39:46 +01:00
Ilia Mirkin
f70585e56a main: add missing EXTRA_END in OES_sample_variables get check
Fixes: 3002296cb6 (mesa: add GL_OES_shader_multisample_interpolation support)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05b37e20de)
2016-09-01 11:39:46 +01:00
Jason Ekstrand
2d48468e58 isl: Allow multisampled array textures
This probably isn't the only thing that needs to be done to get
multisampled array textures working in Vulkan but I think this is all that
ISL really needs and it does fix 8 of the new CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
(cherry picked from commit fb89551047)
2016-09-01 11:39:46 +01:00
Ian Romanick
ab0183172f glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10
All the GLSL 4.x keywords were added to the list of reserved keywords
in GLSL ES 3.10.  As far as I can tell, these are the only ones that
were missed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c879dbc4e4)
2016-09-01 11:39:46 +01:00
Miklós Máté
1d4c887020 vbo: set draw_id
Fixes conditional jump depending on uninitialized value
in si_state_draw.c:593

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit b9ac72b511)
2016-09-01 11:39:46 +01:00
Chad Versace
a0e81225bd i965: Respect miptree offsets in intel_readpixels_tiled_memcpy()
Respect intel_miptree_slice::x_offset,y_offset and
intel_mipmap_tree::offset. All three may be non-zero when glReadPixels
is called on an EGLImage created from the non-base slice of a miptree.

Patch 2/2 that fixes test
'dEQP-EGL.functional.image.create.gles2_cubemap_*'.

Reported-by: Haixia Shi <hshi@chromium.org>
Diagnosed-by: Haixia Shi <hshi@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Change-Id: I4b397b27e55a743a7094d29fb0a6a4b6b34352b0
(cherry picked from commit 5b03975889)
2016-09-01 11:39:46 +01:00
Chad Versace
6898eb5859 i965: Fix miptree layout for EGLImage-based renderbuffers
When glEGLImageTargetRenderbufferStorageOES() was given an EGLImage
created from the non-base slice of a miptree,
intel_image_target_renderbuffer_storage() forgot to apply the intra-tile
offsets __DRIimage::tile_x,tile_y to the miptree layout.

This patch fixes the problem with a quick hack suitable for
cherry-picking. A proper fix requires more thorough plumbing in
intel_miptree_create_layout() and brw_tex_layout().

Patch 1/2 that fixes test
'dEQP-EGL.functional.image.create.gles2_cubemap_*'.

Reported-by: Haixia Shi <hshi@chromium.org>
Diagnosed-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Change-Id: I8a64b0048a1ee9e714ebb3f33fffd8334036450b
(cherry picked from commit c82f99e883)
2016-09-01 11:39:46 +01:00
Matt Turner
9aa4e400d2 nir: Walk blocks in source code order in lower_vars_to_ssa.
Prior to this commit rename_variables_block() is recursively called,
performing a depth-first traversal of the control flow graph. The
function uses a non-trivial amount of stack space for local variables,
which puts us in danger of smashing the stack, given a sufficiently deep
dominance tree.

XCOM: Enemy Within contains a shader with such a dominance tree (1574
nir_blocks in total, depth of at least 143).

Jason tells me that he believes that any walk over the nir_blocks that
respects dominance is sufficient (a DFS might have been necessary prior
to the introduction of nir_phi_builder).

In fact, the introduction of nir_phi_builder made the problem worse:
rename_variables_block(), walks to the bottom of the dominance tree
before calling nir_phi_builder_value_get_block_def() which walks back to
the top of the dominance tree...

In any case, this patch ensures we avoid that problem as well.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit e53130cc27)
2016-09-01 11:39:46 +01:00
Brian Paul
b061b2e3eb swrast: fix incorrectly positioned putImage() in swrast driver
Some front buffer rendering was in the wrong position.  This included
scissored clears, glDrawPixels and glCopyPixels.  The problem was the
y coordinate passed to putImage() didn't match the y coordinate passed
to getImage().

We fix this by setting xrb->map_y to the inverted coordinate in
swrast_map_renderbuffer() which is used later by the putImage() call.
Also pass xrb->map_y to getImage() to be symmetric.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97426
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 2a2dc416b6)
2016-09-01 11:39:45 +01:00
Marek Olšák
69acfb7c94 radeonsi: disable SDMA texture copying on Carrizo
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 3ff0b67e1b)
2016-09-01 11:39:45 +01:00
Jason Ekstrand
544a92ad49 anv: Include the pipeline layout in the shader hash
The pipeline layout affects shader compilation because it is what
determines binding table locations as well as whether or not a particular
buffer has dynamic offsets.  Since this affects the generated shader, it
needs to be in the hash.  This fixes a bunch of CTS tests now that the CTS
is using a pipeline cache.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2301705dee)
2016-09-01 11:39:45 +01:00
Samuel Pitoiset
9fced1aa53 nvc0: invalidate textures/samplers on GK104+
Like Fermi, textures and samplers are aliased between 3D and compute,
especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate
these resources when switching between the two pipelines.

This fixes a GPU hang with Elemental (and most likely with other UE4 demos).

Tested on GK107 and GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a227b0a4f1)
2016-09-01 11:39:45 +01:00
Marek Olšák
5ad09f744c radeonsi: fix VM faults due NULL internal const buffers on CIK
They are harmless, but the interrupts do decrease performance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2c13abb491)
2016-09-01 11:39:45 +01:00
Nicolai Hähnle
6001e37b8e radeonsi: add si_set_rw_buffer to be used for internal descriptors
So that callers outside of si_descriptors.c need to worry less about the
details of descriptor handling.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ba4a2840c7)
2016-09-01 11:39:45 +01:00
Tomasz Figa
e45e500d4b gallium/winsys/kms: Look up the GEM handle after importing a prime FD
drmPrimeHandleToFD() will return the same GEM handle every time the same
buffer is imported, even from a different prime FD. Since GEM handles
are not reference counted, we need to make sure that each GEM handle is
referenced only by one display target struct, by looking it up in
kms_sw->bo_list first and bumping the refcount of the found dt on hit
and falling back to creating a new dt only on miss.

v2: Split into separate function.
    Use helper function for lookup.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 577f85e2bb)
2016-09-01 11:39:45 +01:00
Tomasz Figa
4ec509533e gallium/winsys/kms: Move display target handle lookup to separate function
As a preparation to use the lookup in more than once place, move the
code that looks up given KMS/GEM handle to a separate function. This
change should not introduce any functional changes.

v2: Split into separate patch.
    Move lookup code into separate function.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0465c72d46)
2016-09-01 11:39:45 +01:00
Tomasz Figa
731e6575e6 gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)
Currently kms_sw_displaytarget_add_from_prime() allocates the struct and
fills in only some of the fields, resulting in a half-baked struct that
needs to be further completed by the caller. To make this a bit more
consistent, pass width, height and stride to this function and fill in
everything there, so that caller can take the returned struct as is.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e71b78ebf9)
2016-09-01 11:39:45 +01:00
Tomasz Figa
ab157ffd86 gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)
Currently the code creates a display target struct with refcount field
initialized to 1 and then the caller again increments it, leading to
a leaked reference. Let's remove the unnecessary increment.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0aa6a818ef)
2016-09-01 11:39:45 +01:00
Ilia Mirkin
61a6d84679 nv50/ir: make sure cfg iterator always hits all blocks
In some very specially-crafted cases, we could attempt to visit a node
that has already been visited, and then run out of bb's to visit, while
there were still cross blocks on the list. Make sure that those get
moved over in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 092f994a03)
2016-09-01 11:39:45 +01:00
Eric Anholt
fabc5c2783 vc4: Fix leak of the bo_handles table.
(cherry picked from commit 9f95690959)
2016-09-01 11:39:45 +01:00
Rob Herring
ec68600280 vc4: add hash table look-up for exported dmabufs
It is necessary to reuse existing BOs when dmabufs are imported. There
are 2 cases that need to be handled. dmabufs can be created/exported and
imported by the same process and can be imported multiple times.
Copying other drivers, add a hash table to track exported BOs so the
BOs get reused.

v2: Whitespace fixup (by anholt)

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 9ace2c1355)
2016-09-01 11:39:45 +01:00
Eric Anholt
838c1cbde4 vc4: Fix a leak of the src[] array of VPM reads in optimization.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0671d67de)
2016-09-01 11:39:44 +01:00
Eric Anholt
965ceef596 vc4: Disable early Z with computed depth.
We don't tell the hardware whether we're computing depth, so we need
to manage early Z state manually.  Fixes piglit early-z.

(cherry picked from commit ce8504d196)
2016-09-01 11:39:44 +01:00
Eric Anholt
0f097f28eb vc4: Close our screen's fd on screen close.
We're passed in a freshly dup()ed fd on screen create, so we should close
it on exit.  Debugged by Hugh Cole-Baker.

(cherry picked from commit c65a00eaff)
2016-09-01 11:39:44 +01:00
Rob Herring
7c583f85e1 vc4: fix vc4_resource_from_handle() stride calculation
The expected stride calculation is completely wrong. It should
ultimately be multiplying cpp and width rather than dividing. The width
also needs to be aligned to the tiling width first before converting to
stride bytes.

The whole stride check here is possibly pointless. Any buffers which
were allocated outside of vc4 may have strides with larger alignment
requirements.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 067c5b10b6)
2016-09-01 11:39:44 +01:00
Matt Turner
cdfd7c7b72 mesa: Use AC_HEADER_MAJOR to include correct header for major().
Gentoo has been smoke testing an upcoming change to glibc.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=580392
(cherry picked from commit 20553e4a2d)
2016-09-01 11:39:44 +01:00
Stencel, Joanna
e671f40e79 egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.
Segfault occurs when destroying EGL surface attached to already destroyed
Wayland window. The fix is to set to NULL the pointer of surface's
native window when wl_egl_destroy_window() is called.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Stencel, Joanna <joanna.stencel@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 690ead4a13)
2016-09-01 11:39:44 +01:00
Jason Ekstrand
42fe245370 anv/clear: Clear E5B9G9R9 images as R32_UINT
We can't actually clear these images normally because we can't render to
them.  Instead, we have to manually unpack the rgb9e5 color value on the
CPU and clear it as R32_UINT.  We still have a bit of work to do to clear
non-power-of-two images, but this should get all of the power-of-two clears
working on at least Haswell.  This fixes three of the new Vulkan CTS tests
in the dEQP-VK.api.image_clearing.clear_color_image.* group.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7bdccd104b)
[Emil Velikov: rgb9e5 header is renamed in master
s/format_rgb9e5.h/u_format_rgb9e5.h/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-01 11:39:44 +01:00
Jason Ekstrand
015b920fb4 anv/clear: Make cmd_clear_image take an actual VkClearValue
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit afa7ca0f77)
2016-09-01 11:39:44 +01:00
Jason Ekstrand
32a0038a02 anv/blit2d: Add support for RGB destinations
This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS
tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf3cf2ecfc)
2016-09-01 11:39:44 +01:00
Jason Ekstrand
346f9e5a85 anv/blit2d: Add a format parameter to bind_dst and create_iview
Signed-off-by: Jasosn Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 16ddda8452)
[Emil Velikov: don't attribute if using ISL_TILING_W. patches that
attribute and require the ISL_TILING_W handling aren't in 12.0]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_meta_blit2d.c
2016-09-01 11:39:44 +01:00
Dave Airlie
7fe8cad4e4 st/glsl_to_tgsi: fix st_src_reg_for_double constant.
This needs to set the src swizzle so it doesn't access the .zw
members ever when we are just emitting a 0 constant here.

This fixes:
vert-conversion-explicit-dvec3-bvec3.shader_test
and a bunch of other fp64 tests on softpipe and radeonsi.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 26187f3890)
2016-09-01 11:39:44 +01:00
Daniel Scharrer
6847a37363 mesa: Fix fixed function spot lighting on newer hardware (again)
This was first fixed in commit b3f9c5c and then broken again in commit
fe2d2c7, which removed the abs modifier from input registers.

v2: Don't change the size of struct ureg.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
(cherry picked from commit 16ef7ab5c1)
2016-09-01 11:39:44 +01:00
Matt Turner
bf61f2e5c3 i965/vec4: Ignore swizzle of VGRF for use by var_range_end().
var_range_end(v, n) loops over the n components of variable number v and
finds the maximum value, giving the last use of any component of v.
Therefore it expects v to correspond to the variable associated with the
.x channel of the VGRF.

var_from_reg() however returns the variable for the first channel of the
VGRF, post-swizzle.

So, if the last register had a swizzle with y, z, or w in the swizzle
component, we would read out of bounds. For any other register, we would
read liveness information from the next register.

The fix is to convert the src_reg to a dst_reg in order to call the
dst_reg version of var_from_reg() that doesn't consider the swizzle.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e7c376adfd)
2016-09-01 11:39:43 +01:00
Emil Velikov
d2f0a2925f cherry-ignore: temporary(?) drop "a4xx: make sure to actually clamp depth"
The commit depends a 700+ patch introducing fd_batch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-09-01 11:39:43 +01:00
Ilia Mirkin
9e71069d8f a4xx: only disable depth clipping, not all clipping, when requested
The previous bit disables the whole clipper, including the regular
viewport-related clipping that would go on. The two new bits disable
near and far clipping (separately, as verified with the
depth-clamp-range piglit).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cd8e30452f)
2016-09-01 11:39:43 +01:00
Ilia Mirkin
8d1029fb7b vbo: add basevertex when looking up elements for vbo splitting
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97351
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 659dc10d32)
2016-09-01 11:39:43 +01:00
Emil Velikov
16751e0be4 isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility
v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d61d259518)
[Emil Velikov: drop not applicable gen4-6 hunks]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/isl/Makefile.am
2016-09-01 11:39:43 +01:00
mil Velikov
52f094859a anv: remove dummy VK_DEBUG_MARKER_EXT entry points
The vkCmdDbgMarker{Begin,End} symbols are exported, yet the json does no
advertise that the driver supports the extension. Furthermore the
functions are empty stubs.

Remove those until we get a proper implementation and json notation.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ebd5dc8826)
2016-09-01 11:39:43 +01:00
Emil Velikov
21411adc27 anv: do not export the Vulkan API
With version 1 of the Loader interface there is an internal/private symbol
(vk_icdGetInstanceProcAddr) which is used to retrieve all the API from the
Vulkan entrypoints from the ICD. Implying that exposing the Vulkan API is not
recommended.

Version 2 goes a step further explicitly forbiding the ICD from exposing Vulkan
symbols (and adding a negotiation API)

As a reference:
 - Nvidia 367.35
Missing negotiation API - version 1.
Exposes only vk_icdGetInstanceProcAddr.

 - AMD 16.30.3.306809
Have negotiation API - version 2,
Exposes vk_icdGetInstanceProcAddr.
Exposes a couple of Vulkan entry points - seems to be in violation with the spec.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 49394e8d77)
2016-09-01 11:39:43 +01:00
Emil Velikov
e609ec2c1a anv: automake: build with -Bsymbolic
Explicitly suggested in the Loader interface version 2 section, but it's good
idea either way. It essentially, ensures that our symbols are not interposed.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1cdb6ca40b)
2016-09-01 11:39:43 +01:00
Emil Velikov
610857acea anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility
Hide the internal symbols and annotate the vk_icdGetInstanceProcAddr as public
since the loader needs it (since v1 of the loader interface).

v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 40e4fff563)
2016-09-01 11:39:43 +01:00
Emil Velikov
de1f9ce703 anv: remove internal 'validate' layer
Presently the layer has only a single entry point. As mentioned by Jason the
function does not validate anything that isn't checked elsewhere, thus we can
drop the whole thing.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b0d56f2f4f)
2016-09-01 11:39:43 +01:00
Kenneth Graunke
951b508e44 i965: Fix barrier count shift in scalar TCS backend.
The "Barrier Count" field goes in 14:9 of m0.2.  The vec4 backend
correctly shifts by 9, but the scalar backend only shifted by 8.

It's not like this changed - I think I just made a typo when writing
the original scalar TCS backend code.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit d14dd727f4)
2016-09-01 11:39:43 +01:00
Kenneth Graunke
47b72990fe i965: Fix execution size of scalar TCS barrier setup code.
Previously, the scalar TCS backend was generating:

mov(8)   g17<1>UD     0x00000000UD    { align1 WE_all 1Q compacted };
and(8)   g17.2<1>UD   g0.2<0,1,0>UD   0x0001e000UD  { align1 WE_all 1Q };
shl(8)   g17.2<1>UD   g17.2<8,8,1>UD  0x0000000bUD  { align1 WE_all 1Q };
or(8)    g17.2<1>UD   g17.2<8,8,1>UD  0x00008200UD  { align1 WE_all 1Q };
send(8)  null<1>UW    g17<8,8,1>UD
         gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };

This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal
region.  Not to mention it clobbers 8 channels of data when we only
wanted to touch m0.2.

Instead, we want:

mov(8)   g17<1>UD     0x00000000UD    { align1 WE_all 1Q compacted };
and(1)   g17.2<1>UD   g0.2<0,1,0>UD   0x0001e000UD  { align1 WE_all };
shl(1)   g17.2<1>UD   g17.2<0,1,0>UD  0x0000000bUD  { align1 WE_all };
or(1)    g17.2<1>UD   g17.2<0,1,0>UD  0x00008200UD  { align1 WE_all };
send(8)  null<1>UW    g17<8,8,1>UD
         gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };

Using component() accomplishes this.

Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers.
barrier_guarded_read_write_calls on Skylake.  Probably fixes other
barrier issues on Gen8+.

v2: Use a group(1, 0) builder so inst->exec_size is set correctly
    (thanks to Francisco Jerez for catching that it was incorrect).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 159f037755)
2016-09-01 11:39:42 +01:00
Kenneth Graunke
9cf5eb292b i965: Implement the WaPreventHSTessLevelsInterference workaround.
Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases:
- vertex_spacing
- tessellation_shader_point_mode.points_verification
- tessellation_shader_quads_tessellation.inner_tessellation_level_rounding

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 9e778837ff)
[Emil Velikov: attribute for the lack of gl_linked_shader struct.]
[Namely: s/tes->info./shader_prog->/;s/gl_linked_shader/gl_shader/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_tcs.c
2016-09-01 11:39:42 +01:00
Kenneth Graunke
6ea7a82b5e nir/builder: Add bany_inequal and bany helpers.
The first simply picks the bany_inequal[234] opcodes based on the SSA
def's number of components.  The latter implicitly compares with zero
to achieve the same semantics of GLSL's any().

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit d8971128ac)
2016-09-01 11:39:42 +01:00
Kenneth Graunke
ab441496ca mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.
GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers:

        0.0,                      if E == 0 and M == 0,
        2^-14 * (M / 32),         if E == 0 and M != 0,
        2^(E-15) * (1 + M/32),    if 0 < E < 31,
        INF,                      if E == 31 and M == 0, or
        NaN,                      if E == 31 and M != 0,

In the second case (E == 0 and M != 0), we were multiplying the mantissa
by 2^-20, when we should have been multiplying by 2^-19 (which is
2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32).

The previous section defines the formula for 11-bit numbers, which is:

        2^-14 * (M / 64),         if E == 0 and M != 0,

In other words, we had accidentally copy and pasted the 11-bit code
to the 10-bit case, and neglected to change the exponent.

Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles
when run with surface dimensions of 1536x1152 or 1920x1080.

Cc: mesa-stable@lists.freedesktop.org
References: https://code.google.com/p/chrome-os-partner/issues/detail?id=56244
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
(cherry picked from commit 01e99cba04)
2016-09-01 11:39:42 +01:00
Michel Dänzer
d1142926c2 glx: Don't use current context in __glXSendError
There's no guarantee that there is one, and we don't need one anyway.

Fixes piglit tests:

glx@glx-fbconfig-bad
glx@glx_ext_import_context@import context, multi process
glx@glx_ext_import_context@import context, single process

Fixes: 2e3f067458 ("glx: fix error code when there is no context bound")
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4ac640e3d2)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
8b76a3744c nv50/ir: fix bb positions after exit instructions
It's fairly rare that the BB layout puts BBs after the exit block, which
is likely the reason these issues lingered for so long.

This fixes a fraction of issues with the giant pixmark piano shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e988999791)
2016-09-01 11:39:42 +01:00
Dave Airlie
d279dec359 anv: fix writemask on blit fragment shader.
I'm not sure if anything even uses this, but I found this on radv, so
just fix it on anv for consistency.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c2f2252037)
2016-09-01 11:39:42 +01:00
Bernard Kilarski
c04ee8c110 glx: fix error code when there is no context bound
v2: change all related NULL checks to check against dummyContext
v3: really check for dummyContext *only* when ctx was from
    __glXGetCurrentContext
v4: cover more checks, add dummyBuffer, dummyVtable (Emil)

Signed-off-by: Bernard Kilarski <bernard.r.kilarski@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e3f067458)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
07df4bf0c8 nv50,nvc0: fix depth range when halfz is enabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5c1ccd8053)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
32c009b116 gallium/util: add helper to compute zmin/zmax for a viewport state
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c85b7f0e87)
2016-09-01 11:39:42 +01:00
Ilia Mirkin
ea29312a79 vbo: allow DrawElementsBaseVertex in display lists
Looks like it was missed originally. The multi version is there already.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97331
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 68b64f32e8)
2016-09-01 11:39:42 +01:00
Kenneth Graunke
bc40bc5527 glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.
Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant"
to be specified on both inputs and outputs, and match when linking.

New languages only allow outputs to be qualified as "invariant"
and remove the "invariant must match" restriction when linking
varyings (because no input can have that qualifier).

Commit 426a50e208 introduced the new
behavior for ES 3.00.  It also removed the "must match" restriction
for ES 1.00 shaders, which I believe is incorrect.  This patch adds
that back, as well as making 4.30+ follow the new rules.

Thanks to Qiankun Miao for noticing this discrepancy.

Fixes a WebGL 2.0 conformance test when run in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit f9f462936a)
2016-09-01 11:39:42 +01:00
Ian Romanick
0c25b3e4b0 glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders
Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:

   It is an error to undefine or to redefine a built-in (pre-defined)
   macro name.

The GLSL ES 1.00 spec does not contain this text.

Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:

   #define and #undef functionality are defined as is standard for C++
   preprocessors for macro definitions both with and without macro
   parameters.

At least as far as I can tell GCC allow '#undef __FILE__'.  Furthermore,
there are desktop OpenGL conformance tests that expect '#undef
__VERSION__' and '#undef GL_core_profile' to work.

Fixes:

    GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 50b49d242d)

Squashed with commit

glcpp: Update tests for new #undef of built-in macro rules.

Ian recently changed the preprocessor to allow this in most GLSL
versions, but not GLSL ES 3.00+.  This patch converts the existing
test that expects a failure to a #version 300 es shader, and adds
a #version 110 shader to make sure that it's allowed.

Fixes 'make check'.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 1f47f78fc3)
2016-09-01 11:39:29 +01:00
Ian Romanick
e1698fa455 glcpp: Track the actual version instead of just the version_resolved flag
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit eda6349346)
[Emil Velikov: resolve trivial conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/glcpp/glcpp-parse.y
2016-09-01 10:06:24 +01:00
Jason Ekstrand
3dca5c8eb1 i965/vec4: Make opt_vector_float reset at the top of each block
The pass isn't really control-flow aware and you can get into case where it
tries to combine instructions from different blocks.  This can actually
lead to an assertion failure when removing unneeded instructions if part of
the vector is set in one block and part in another.  This prevents
regressions in the next commit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4c3a6b07e2)
2016-09-01 10:06:24 +01:00
Marek Olšák
659d9f189c radeonsi: only set dual source blending for MRT0
This is the proper fix for Overlord and Witcher 2 hangs.

The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel
shader while MRT1 is disabled in CB_TARGET_MASK (does this generate
unflushable pixel quads? I don't know), and another app (e.g. Glamor)
must enable dual source blending in both MRT0 and MRT1. The hw gets
confused, which leads to corruption and hangs.

Cc: 12.0 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 947e0614d0)
2016-09-01 10:06:23 +01:00
Nicolai Hähnle
1959b57310 radeonsi: flush TC L2 cache for indirect draw data
This fixes a bug when indirect draw data is generated by transform
feedback.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2852dedaa0)
2016-09-01 10:06:23 +01:00
Kenneth Graunke
fce2e3b493 glsl: Fix location bias for patch variables.
We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0.

Since "patch" only applies to inputs and outputs, we can just handle
this once outside the switch statement, rather than replicating the
check twice and complicating the earlier conditions.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 398428f406)
2016-09-01 10:06:23 +01:00
Kenneth Graunke
236172997c glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].
These are lowered to gl_TessLevel{Outer,Inner}MESA.  We need them to
appear in the program resource list with their original names and types.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 1556f16e46)
2016-09-01 10:06:23 +01:00
Kenneth Graunke
cd009c46be glsl: Delete bogus ir_set_program_inouts assert.
This assertion is bogus.  Varying structs, and arrays of structs, are
allowed by GLSL, and we can see them here.  While we currently don't
have any partial-variable support for those, simply returning false
and marking the entire thing as used is certainly legitimate.

I believe this is often swept under the rug by varying packing,
but that's disabled in certain tessellation situations.

Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 4a49851da1)
2016-09-01 10:06:23 +01:00
Nanley Chery
f20168723f anv/gen7_pipeline: Set PixelShaderKillPixel for discards
According to the IVB PRM Vol2 P1, this bit must be set if a pixel shader
contains a discard instruction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97207
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c495c18b24)
2016-09-01 10:06:23 +01:00
Jan Ziak
6980f686d0 loader: fix memory leak in loader_dri3_open
Found via "valgrind --leak-check=full glxgears".

Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com>
Acked-by: Boyan Ding <boyan.j.ding@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit fd32868590)
2016-09-01 10:06:23 +01:00
Marek Olšák
9e22182223 gallium/util: fix align64
it cut off the upper 32 bits

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
(cherry picked from commit 6db93cd167)
2016-09-01 10:06:23 +01:00
Nicolas Boichat
e82567c02b egl/dri2: Add reference count for dri2_egl_display
android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls
eglTerminate, followed by eglReleaseThread. A similar case is
observed in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=69622,
where the test calls eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL).

With the current code, dri2_dpy structure is freed on eglTerminate
call, so the display is not initialized when eglReleaseThread calls
MakeCurrent with NULL parameters, to unbind the context, which
causes a a segfault in drv->API.MakeCurrent (dri2_make_current),
either in glFlush or in a latter call.

eglTerminate specifies that "If contexts or surfaces associated
with display is current to any thread, they are not released until
they are no longer current as a result of eglMakeCurrent."

However, to properly free the current context/surface (i.e., call
glFlush, unbindContext, driDestroyContext), we still need the
display vtbl (and possibly an active dri dpy connection). Therefore,
we add some reference counter to dri2_egl_display, to make sure
the structure is kept allocated as long as it is required.

One drawback of this is that eglInitialize may not completely reinitialize
the display (if eglTerminate was called with a current context), however,
this seems to meet the EGL spec quite well, and does not permanently
leak any context/display even for incorrectly written apps.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9ee683f877)

Squashed with commit

egl/dri2: dri2_make_current: Release previous context's display

eglMakeCurrent can also be used to change the active display. In that
case, we need to decrement ref_count of the previous display (possibly
destroying it), and increment it on the next display.

Also, old_dsurf/old_rsurf cannot be non-NULL if old_ctx is NULL, so
we only need to test if old_ctx is non-NULL.

v2: Save the old display before destroying the context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97214
Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Alexandr Zelinsky <mexahotabop@w1l.ru>
Tested-by: Alexandr Zelinsky <mexahotabop@w1l.ru>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
(cherry picked from commit 78e3cea419)

Squashed with commit

egl/dri2: dri2_initialize: Do not reference-count TestOnly display

In the case where dri2_initialize is called with a TestOnly display,
the display is not actually initialized, so dri2_egl_display always
fails, and we cannot do any reference counting.

Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible
with LIBGL_ALWAYS_SOFTWARE=1).

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4f3f8bb59d)
2016-09-01 10:05:40 +01:00
Nicolas Boichat
6503a05f35 egl/android: Set dpy->DriverData to NULL on error
Avoid use-after-free on error.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c0580f6a38)
2016-09-01 10:05:40 +01:00
Nicolas Boichat
447501914a egl/drm: Set disp->DriverData to NULL on error
Avoid use-after-free on error.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a9e8fb7397)
2016-09-01 10:05:40 +01:00
Nicolas Boichat
f8a5d340e8 egl/surfaceless: Set disp->DriverData to NULL on error
Avoid use-after-free on error.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0e67d86540)
2016-09-01 10:05:39 +01:00
Nicolas Boichat
6ec0c92b3c egl/wayland: Set disp->DriverData to NULL on error
Avoid use-after-free, fix spec@egl_khr_fence_sync@conformance.

Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Tested-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 48fd952f28)
2016-09-01 10:05:39 +01:00
Jan Ziak
fbde508c18 egl/x11: avoid using freed memory if dri2 init fails
Found with valgrind:

==4841== Invalid read of size 4
==4841==    at 0x56BDC80: dri2_initialize (egl_dri2.c:783)
==4841==    by 0x56BAFE5: _eglMatchAndInitialize (egldriver.c:261)
==4841==    by 0x56BB15E: _eglMatchDriver (egldriver.c:295)
==4841==    by 0x56B58C9: eglInitialize (eglapi.c:480)
==4841==    by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x402E59: main
==4841==  Address 0x6a05824 is 148 bytes inside a block of size 480 free'd
==4841==    at 0x4C2B680: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4841==    by 0x56C2AAE: dri2_initialize_x11_swrast (platform_x11.c:1233)
==4841==    by 0x56C2AAE: dri2_initialize_x11 (platform_x11.c:1493)
==4841==    by 0x56BDCEB: dri2_initialize (egl_dri2.c:805)
==4841==    by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261)
==4841==    by 0x56BB0C9: _eglMatchDriver (egldriver.c:292)
==4841==    by 0x56B58C9: eglInitialize (eglapi.c:480)
==4841==    by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x402E59: main
==4841==  Block was alloc'd at
==4841==    at 0x4C2A868: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4841==    by 0x56C2A47: dri2_initialize_x11_swrast (platform_x11.c:1171)
==4841==    by 0x56C2A47: dri2_initialize_x11 (platform_x11.c:1493)
==4841==    by 0x56BDCEB: dri2_initialize (egl_dri2.c:805)
==4841==    by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261)
==4841==    by 0x56BB0C9: _eglMatchDriver (egldriver.c:292)
==4841==    by 0x56B58C9: eglInitialize (eglapi.c:480)
==4841==    by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2)
==4841==    by 0x402E59: main

Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com>
Fixes: 9ee683f877 (egl/dri2: Add reference count for dri2_egl_display)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 769ac1ec78)
2016-09-01 10:05:39 +01:00
Nicolai Hähnle
178b889823 glsl: fix optimization of discard nested multiple levels
The order of optimizations can lead to the conditional discard optimization
being applied twice to the same discard statement. In this case, we must
ensure that both conditions are applied.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 21556d86fc)
[Emil Velikov: s/get_head_raw()/head/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/compiler/glsl/opt_conditional_discard.cpp
2016-07-28 17:05:28 +01:00
Nicolai Hähnle
7208d82dfb st_glsl_to_tgsi: only skip over slots of an input array that are present
When an application declares varying arrays but does not actually do any
indirect indexing, some array indices may end up unused in the consuming
shader, so the number of input slots that correspond to the array ends
up less than the array_size.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 185b0c15ab)
2016-07-28 14:51:32 +01:00
Jason Ekstrand
be0344f630 i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations
intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no
longer need to be multiplying by 6.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d76690f17)
2016-07-28 14:50:32 +01:00
Nicolai Hähnle
6156d8d93e radeonsi: ensure sample locations are set for line and polygon smoothing
Since commit d938b8c, the sample locations are no longer set unconditionally,
so we need to set the atom to dirty on all chips, not just Polaris.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3d69357da9)
2016-07-28 14:49:12 +01:00
Nicolai Hähnle
3237c07e98 radeonsi: fix Polaris MSAA regression
The regression was introduced by commit d938b8c. The problem here is that in
order to use the small primitive filter, we need to explicitly set the sample
locations to 0. But the DB doesn't properly process the change of sample
locations without a flush, and so we can end up with incorrect Z values.

Instead of doing a flush, just disable the small primitive filter when MSAA
is force-disabled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f755da0f2f)
2016-07-28 14:47:49 +01:00
Kenneth Graunke
5f09454e34 mesa: Don't call GenerateMipmap if Width or Height == 0.
One of the WebGL 2.0 conformance tests is trying to call
glGenerateMipmaps with a width and height of 0.  With the meta
implementation, this generates a "framebuffer attachment incomplete"
status, and falls back to the CPU path, calling MapTextureImage.

Except that there's no actual texture to map, and we assert fail.

There's no work to do in this case.  The test expects it to succeed,
so just return early with no error and avoid hassling the driver.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit f80bea2d80)
2016-07-28 14:46:50 +01:00
Jason Ekstrand
1951df7812 anv/pipeline: Set up point coord enables
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b33bccb519)
2016-07-28 14:45:48 +01:00
Kenneth Graunke
2f8cd4a8c3 mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.
The GL_EXT_texture_format_BGRA8888 extension specification defines a
GL_BGRA_EXT unsized internal format (which is a little odd - usually
BGRA is a pixel transfer format).  The extension is written against
the ES 1.0 specification, so it's a little hard to map, but I believe
it's effectively adding it to the table used here, so we should allow
it here as well.

Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true),
so we don't need to check if it's enabled here.

This fixes mipmap generation in Skia and ChromeOS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reported-by: Stéphane Marchesin <marcheu@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit cb70773129)
2016-07-28 14:44:45 +01:00
Kenneth Graunke
2f80fb368b i965: Fix shared atomic intrinsics to pay attention to base.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 76e161056a)
2016-07-28 14:43:35 +01:00
Kenneth Graunke
e1c20919a8 nir: Add a base const_index to shared atomic intrinsics.
Commit 52e75dcb8c made nir_lower_io
start using nir_intrinsic_set_base instead of writing const_index[0]
directly.  However, those intrinsics apparently don't /have/ a base,
so this caused assert failures.

However, the old code was happily setting non-existent const_index
fields, so it was pretty bogus too.

Jason pointed out that load_shared and store_shared have a base,
and that the i965 driver uses that field.  So presumably atomics
should have one as well, so that loads/stores/atomics all refer
to variables with consistent addressing.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit cf6f2d3ce7)
2016-07-28 14:40:24 +01:00
Kenneth Graunke
a94056e2c7 i965: Include VUE handles for GS with invocations > 1.
We always resort to the pull model for instanced GS inputs.  So, we'd
better include the VUE handles, or else we can't actually pull anything.

Ian reports that on his branch with OES_geometry_shader enabled,
this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests::

- instanced.draw_2_instances_geometry_2_invocations
- instanced.draw_2_instances_geometry_8_invocations
- instanced.draw_4_instances_geometry_2_invocations
- instanced.draw_4_instances_geometry_8_invocations
- instanced.draw_8_instances_geometry_2_invocations
- instanced.draw_8_instances_geometry_8_invocations
- instanced.geometry_2_invocations
- instanced.geometry_32_invocations
- instanced.geometry_8_invocations
- instanced.geometry_max_invocations
- instanced.geometry_output_different_2_invocations
- instanced.geometry_output_different_32_invocations
- instanced.geometry_output_different_8_invocations
- instanced.geometry_output_different_max_invocations
- instanced.invocation_output_vary_by_attribute
- instanced.invocation_output_vary_by_texture
- instanced.invocation_output_vary_by_uniform
- query.primitives_generated_instanced

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 2db357e4c3)
2016-07-28 14:21:29 +01:00
Chuck Atkins
d70f97784b swr: Refactor checks for compiler feature flags
Encapsulate the test for which flags are needed to get a compiler to
support certain features.  Along with this, give various options to try
for AVX and AVX2 support.  Ideally we want to use specific instruction
set feature flags, like -mavx2 for instance instead of -march=haswell,
but the flags required for certain compilers are different.  This
allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c
while the Intel compiler which doesn't support those flags can fall
back to using -march=core-avx2.

This addresses a bug where the Intel compiler will silently ignore the
AVX2 instruction feature flags and then potentially fail to build.

v2: Pass preprocessor-check argument as true-state instead of
    false-state for clarity.
v3: Reduce AVX2 define test to just __AVX2__.  Additional defines suchas
    __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined
    w.r.t thier availability.
v4: Fix C++11 flags being added globally and add more logic to
    swr_require_cxx_feature_flags

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Tim Rowley <timothy.o.rowley@Intel.com>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit c1bf6692be)
2016-07-27 15:15:34 +01:00
Tim Rowley
cd10b86026 swr: switch from overriding -march to selecting features
Acked-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit 5a64549f54)
2016-07-27 15:14:57 +01:00
Marek Olšák
3b4c74963a winsys/amdgpu: disallow DCC with mipmaps
It has never been implemented. master will get a different fix.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96381

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
2016-07-27 14:37:01 +01:00
Samuel Pitoiset
faa432c0b6 nvc0: upload sample locations on GM20x
This fixes a bunch of multisample piglit tests on GM206, like
bin/arb_texture_multisample-texelfetch 2 -auto -fbo

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit e7b2ce5fd8)
[Emil Velikov: resolve conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
2016-07-27 14:21:10 +01:00
Rob Herring
271eee2464 Android: add missing u_math.h include path for libmesa_isl
Commit 87d062a940 ("i965: Fix shared local memory size for Gen9+.")
added u_math.h include which broke the Android build:

In file included from external/mesa3d/src/intel/isl/isl_storage_image.c:25:
In file included from external/mesa3d/src/mesa/drivers/dri/i965/brw_compiler.h:29:
external/mesa3d/src/mesa/main/macros.h:35:10: fatal error: 'util/u_math.h' file not found
         ^

Add the missing include paths for libmesa_isl.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Kenneth Garunke <kenneth@whitecape.org>
Nominated-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 789ed13284)
2016-07-27 14:06:27 +01:00
Matt Turner
fd2312e745 mapi: Massage code to allow clang to compile.
According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code
was violating the spec, resulting in it failing to compile.

Cc: mesa-stable@lists.freedesktop.org
Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5ec140c17b)

Squashed with commit:

mapi: fix typo in macro name

Fixes: 5ec140c17b ("mapi: Massage code to allow clang to compile.")
Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 4da9f7e7ce)
2016-07-27 11:07:53 +01:00
Jason Ekstrand
67032af87a nir/inline: Constant-initialize local variables in the callee if needed
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9d503aea06)
2016-07-21 16:08:59 +01:00
Jason Ekstrand
f9964dd2c6 nir: Add a nir_deref_foreach_leaf helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dc9f2436c3)
2016-07-21 16:06:08 +01:00
Kenneth Graunke
1c9412bb1a anv: Properly call gen75_emit_state_base_address on Haswell.
This should fix MOCS values.  Caught by Coverity.

CID: 1364155

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit e614062e54)
2016-07-21 16:05:12 +01:00
Kenneth Graunke
726f47e495 genxml: Rename "API Rendering Disable" to "Rendering Disable".
Gen7/7.5 call it "Rendering Disable" while Gen8/9 prefix it with "API".

Pick one for consistency, and so we can share code between generations.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 87660579f5)
2016-07-21 16:04:04 +01:00
Kenneth Graunke
1dd0c22ab0 anv: Unify 3DSTATE_CLIP code across generations.
The bulk of this is the same.  There are just a couple fields that only
exist on one generation or another, and we can easily handle those with
an #ifdef.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bfd9942cdc)
2016-07-21 16:03:03 +01:00
Kenneth Graunke
a00081a9e8 anv: Enable early culling on Gen7.
We set the cull mode, but forgot the enable bit.  Gen8 uses this.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 44502afd82)
2016-07-21 16:02:06 +01:00
Kenneth Graunke
626f21051b anv: Fix near plane clipping on Gen7/7.5.
The Gen7/7.5 clip code used APIMODE_OGL, while the Gen8+ clip code used
APIMODE_D3D.  The meaning hasn't changed, so one of these must be wrong.

It appears that the hardware documentation is completely wrong.  It
claims that the "API Mode" bit means:

   0h    APIMODE_OGL    NEAR_VP boundary == 0.0 (NDC)
   1h    APIMODE_D3D    NEAR_VP boundary == -1.0 (NDC)

However, DirectX typically uses 0.0 for the near plane, while unextended
OpenGL uses -1.0.  i965's gen6_clip_state.c uses APIMODE_D3D for the
GL_ZERO_TO_ONE case, so I believe the meanings are backwards from what
the documentation says.

Section 23.2 ("Primitive Clipping") of the Vulkan 1.0.21 specification
contains the following equations:

   -w_c <= x_c <= w_c
   -w_c <= y_c <= w_c
      0 <= z_c <= w_c

This means that Vulkan follows D3D semantics.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0d77f08042)
2016-07-21 16:01:05 +01:00
Kenneth Graunke
629a7b32e0 genxml: Add APIMODE_D3D missing enum values and improve consistency.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6b67270262)
2016-07-21 15:59:45 +01:00
Kenneth Graunke
749e4cb96b genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values.
Gen6-7.5 use CLIPMODE_REJECT_ALL, while Gen8+ just used REJECT_ALL.
Being consistent will let me unify code, and I prefer having the prefix.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c31cf532af)
2016-07-21 15:58:37 +01:00
Jason Ekstrand
48e9ecc47f i965/miptree: Set logical_depth0 == 6 for cube maps
This matches what we do for cube maps where logical_depth0 is in number of
face-layers rather than number of cubes.  This does mean that we will
temporarily be setting the surface bounds too loose for cube map textures
but we are already setting them too loose for cube arrays and we will be
fixing that in the next commit anyway.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e19b7f7f1b)
2016-07-21 15:57:39 +01:00
Jason Ekstrand
57c1d0ea07 i965/miptree: Enforce that height == 1 for 1-D array textures
The GL API and mesa internals do this differently than we do.  In GL, there
is no depth parameter for 1-D arrays and height is used.  In the i965
miptree code we do the sane thing and make height == 1 and use depth for
number of slices.  This makes for a mismatch every time we create a 1-D
array texture from GL.  Instead of actually solving this problem, we just
said "1-D is hard, let's make sure it works no matter which way we pass the
parameters" and called it a day.

This commit fixes the one GL -> i965 transition point where we weren't
already handling 1-D array textures to do the right thing and then replaces
the magic fixup code with an assert that you're doing the right thing.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d4d505d0b0)
2016-07-21 14:32:33 +01:00
Stefan Dirsch
fb8b548ac1 Avoid overflow in 'last' variable of FindGLXFunction(...)
This 'last' variable used in FindGLXFunction(...) may become negative,
but has been defined as unsigned int resulting in an overflow,
finally resulting in a segfault when accessing _glXDispatchTableStrings[...].
Fixed this by definining it as signed int. 'first' variable also needs to be
defined as signed int. Otherwise condition for while loop fails due to C
implicitly converting signed to unsigned values before comparison.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Stefan Dirsch <sndirsch@suse.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 27ef7bfd6c)
2016-07-21 14:31:27 +01:00
Tomasz Figa
92799eee93 egl/android: Stop leaking DRI images
Current implementation of the DRI image loader does not free the images
created in get_back_bo() and so leaks memory. Moreover, it creates a new
image every time the DRI driver queries for buffers, even if the backing
native buffer has not changed. leaking memory again.

This patch adds missing call to destroyImage() in droid_enqueue_buffer()
and a check if image is already created to get_back_bo() to fix the
above.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9e1248d075)
2016-07-21 14:29:40 +01:00
Tomasz Figa
b717b49286 egl/android: Check return value of dri2_get_dri_config()
It might return NULL if specific config variant is unsupported.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 94282b6dd0)
2016-07-21 14:29:40 +01:00
Emil Velikov
bb0c49bf76 i965: store reference to the context within struct brw_fence (v2)
As the spec allows for {server,client}_wait_sync to be called without
currently bound context, while our implementation requires context
pointer.

v2: Add a mutex and acquire it for the duration of
    brw_fence_client_wait() and brw_fence_is_completed() as suggested
    by Chad.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
(cherry picked from commit 4f48674d51)
2016-07-21 14:29:40 +01:00
Nicolas Boichat
de695014eb egl/dri2: dri2_make_current: Set EGL error if bindContext fails
Without this, if a configuration is, say, available only on GLES2/3, but
not on GLES1, and is rejected by the dri module's bindContext call,
eglMakeCurrent fails with error "EGL_SUCCESS".

In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP
dEQP-EGL.functional.surfaceless_context expect.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9bebef4034)
2016-07-21 14:29:40 +01:00
Tomasz Figa
67e04622d8 egl/android: Remove unused variables
There are some unused variables left after previous clean-ups triggering
compiler warnings. Let's remove them.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ccda100a5a)
2016-07-21 14:29:40 +01:00
Haixia Shi
fde7fc1ab2 platform_android: prevent deadlock in droid_swap_buffers
To avoid blocking other EGL calls, release the display mutex before
we enqueue buffer to android frameworks and re-acquire the mutex
upon return.

v2: moved lock/unlock inside droid_window_enqueue_buffer().

TEST=verify pinch zoom in Photos app no longer causes hangs

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1ea233c6f3)
2016-07-21 14:29:40 +01:00
Tomasz Figa
abaf0e9817 gallium/dri: Add shared glapi to LIBADD on Android
An earlier patch fixed the problem for classic drivers, however Gallium
was still left broken. This patch applies the same workaround to
Gallium, when compiled for Android. Following is a quote from the
original patch:

0cbc90c57c mesa: dri: Add shared glapi to LIBADD on Android

/system/vendor/lib/dri/*_dri.so actually depend on libglapi: without
this, loading the so file fails with:
cannot locate symbol "__emutls_v._glapi_tls_Context"

On non-Android (non-bionic) platform, EGL uses the following
workflow, which works fine:
  dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL);
  dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL);

However, bionic does not respect the RTLD_GLOBAL flag, and the dri
library cannot find symbols in libglapi.so, so we need to link
to libglapi.so explicitly. Android.mk already does this.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 70a28afb29)
2016-07-21 14:01:15 +01:00
Emil Velikov
485c6d231e mesa: scons: list builddir before srcdir
Analogous to previous commit.

Note: scons always uses OOT builds, while the in-tree generated files
could be created either manually or by the autoconf build.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 1c7c0d77ac)
[Emil Velikov: remove not-applicable "Dir('main')" entry]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/SConscript
2016-07-21 13:59:25 +01:00
Emil Velikov
1f3ce210cd mesa: automake: list builddir before srcdir
In the case of building in out-of-tree fashion, while having generated
in-tree sources, the latter [likely stale] files will be used.

Flip the order to prevent that.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit eafa82e20e)
2016-07-21 13:53:16 +01:00
Samuel Pitoiset
f123f574fa gm107/ir: make use of ADD32I for all immediates
ADD only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9c63224540)
2016-07-21 13:51:55 +01:00
Samuel Pitoiset
bbb0587c78 gm107/ir: add missing NEG modifier for IADD32I
Like FADD32I, the NEG modifier of src0 is at position 56.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0904a2ba97)
2016-07-21 13:50:33 +01:00
Andreas Boll
fb4ab871a8 configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too
The help string wasn't updated in cbc37f7.

Fixes: cbc37f7 ("anv: install the intel_icd.json to ${datarootdir} by
default")

Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d66cb7c84f)
2016-07-21 13:49:34 +01:00
Ilia Mirkin
53bb4e0354 nv50,nvc0: srgb rendering is only available for rgba/bgra
Mark both L8_SRGB and L8A8_SRGB as non-renderable (the latter already
didn't have the bind flags). This makes the state tracker pick a
different format when rendering is required, or mark the fb as
incomplete. This fixes:

  bin/getteximage-formats init-by-clear-and-render -auto -fbo
  bin/getteximage-formats init-by-rendering -auto -fbo

which previously ran into srgb-encoding differences.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ed9dd3bcd9)
2016-07-21 13:48:16 +01:00
Leo Liu
aeb3ca9754 vl/dri3: fix a memory leak from front buffer
Inspired by fix for mem leak of vdpau interop, resource_from_handle
set texture reference count, that need to be decreased and released,
recall there is a similar case for DRI3, that is with VA-API glx
extension, there is temporary TFP(texture from pixmap), we target it
through dma-buf. leak happens when without count down the reference.

Checked and found with mpv vo=opengl case, there only one static TFP,
the leak happens once, but for totem player using gstreamer VA-API glx,
the dynamic TFP for each frame, so leak quite a bit.

This fixes mem leak for mpv and totem.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 134d6e4e4f)
2016-07-21 13:46:55 +01:00
Jason Ekstrand
b1b601fc7c anv: Handle VK_WHOLE_SIZE properly for buffer views
The old calculation, which used view->offset, encorporated buffer->offset
into the size calculation where it doesn't belong.  This meant that, if
buffer->offset > buffer->size, you would always get a negative size.  This
fixes 170 dEQP-VK.renderpass.attachment.* Vulkan CTS tests on Haswell.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 593731ea3c)
[Emil Velikov: s|bpb / 8|bs|g]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/intel/vulkan/anv_image.c
2016-07-21 13:46:02 +01:00
Jason Ekstrand
7441632753 anv: Add an align_down_npot_u32 helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 827405f072)
2016-07-21 12:31:28 +01:00
Jason Ekstrand
56d6f64206 anv: Enable independentBlend on gen7
We can totally do it, we were just only setting up one BLEND_STATE and, now
that the code is unified with gen8, we should be handling it correctly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f124f4a394)
2016-07-21 12:30:32 +01:00
Jason Ekstrand
f194c84b37 anv/pipeline: Unify blend state setup between gen7 and gen8
This fixes all 674 broken dEQP-VK.pipeline.blend Vulkan CTS tests on
Haswell.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2e7b2e653)
2016-07-21 12:29:34 +01:00
Jason Ekstrand
cb9a2a4b85 genxml: Make gen6-7 blending look more like gen8
This renames BLEND_STATE to BLEND_STATE_ENTRY and adds an new struct
BLEND_STATE which is just an array of 8 BLEND_STATE_ENTRYs.  This will make
it much easier to write gen-agnostic blend handling code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit aaa202ebe7)
2016-07-21 12:28:20 +01:00
Brian Paul
9798eb14da mesa: use _mesa_clear_texture_image() in clear_texture_fields()
This avoids a failed assert(img->_BaseFormat != -1) in
init_teximage_fields_ms() because the internalFormat argument is GL_NONE.
This was hit when using glTexStorage() to do a proxy texture test.

Fixes a failure with the updated Piglit tex3d-maxsize test.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit e477d92c94)
2016-07-21 12:27:27 +01:00
Nanley Chery
60e41ca10a isl: Fix isl_tiling_is_any_y()
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 00caba4152)
2016-07-21 12:25:54 +01:00
Nanley Chery
826117a1c4 anv/device: Fix max buffer range limits
Set limits that are consistent with ISL's assertions in
isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType
mapping in anv_isl_format_for_descriptor_type().

Fixes the following new crucible tests:
* stress.limits.buffer-update.range.uniform
* stress.limits.buffer-update.range.storage

These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a5748cb920)
2016-07-21 12:24:55 +01:00
Nanley Chery
f334b6deaa isl: Fix assert on raw buffer surface state size
See inline PRM reference.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 028f6d8317)
2016-07-21 12:23:58 +01:00
Nanley Chery
30c6bff143 anv/descriptor_set: Fix binding partly undefined descriptor sets
Section 13.2.3. of the Vulkan spec requires that implementations be able to
bind sparsely-defined Descriptor Sets without any errors or exceptions.

When binding a descriptor set that contains a dynamic buffer binding/descriptor,
the driver attempts to dereference the descriptor's buffer_view field if it is
non-NULL. It currently segfaults on undefined descriptors as this field is never
zero-initialized. Zero undefined descriptors to avoid segfaulting. This
solution was suggested by Jason Ekstrand.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit fd16e64321)
2016-07-21 12:23:00 +01:00
Brian Paul
474b169c1f svga: handle mismatched number of samplers, sampler views
in svga_init_shader_key_common().  Since the CSO module only tracks
sampler views for fragment shaders, the number of samplers and sampler
views can be mismatched for other types of shaders.  This situation
triggered an assertion in Chrome with maps.google.com

This patch adds defensive code to handle that situation.

Fixes VMware bug 1694027
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>

(cherry picked from commit 50a669de4e)
2016-07-21 12:21:57 +01:00
Leo Liu
6deeccf5aa st/omx/enc: check uninitialized list from task release
The uninitialized list should be checked and returned.

Thank Julien for the notification and suggested fix.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9d10e79c8)
2016-07-21 12:20:56 +01:00
Jason Ekstrand
bf11931c95 glsl/types: Use _mesa_hash_data for hashing function types
This is way better than the stupid string approach especially since you
could overflow the string.  Again, I thought I had something better at one
point but it obviously got lost.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b919100d61)
2016-07-21 12:19:38 +01:00
Jason Ekstrand
17e1b016fc glsl/types: Fix function type comparison function
It was returning true if the function types have different lengths rather
than false.  This was new with the SPIR-V to NIR pass and I thought I'd
fixed it a while ago but it may have gotten lost in rebasing somewhere.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 11ac1c4dbb)
2016-07-21 12:18:19 +01:00
Christian König
b0d1395480 st/mesa: fix reference counting bug in st_vdpau
Otherwise we leak the resources created for the DMA-buf descriptors.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com>
Ack-by: Tom St Denis <tom.stdenis@amd.com>

(cherry picked from commit 9ce52baf7f)
2016-07-21 12:17:21 +01:00
Jason Ekstrand
ad09c8142f anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9e99282a6)
2016-07-21 12:16:14 +01:00
Tim Rowley
2e010ab1cc Revert "gallium: Force blend color to 16-byte alignment"
This reverts commit d8d6091a84.

Heap allocations may be only 8-byte aligned on 32-bit system, and so having
members with 16-byte alignment (such as in the case where pipe_blend_color is
embedded in radeonsi's si_context) is undefined behavior which indeed causes
crashes when compiled with gcc -O3.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Acked-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit 29f53d7937)
2016-07-21 12:08:18 +01:00
Jason Ekstrand
0aae486a8b nir/spirv: Don't multiply the push constant block size by 4
I have no idea why we were multiplying by 4 before.  The offsets we get
from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to
do any adjustment whatsoever.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 49476576dd)
2016-07-21 12:07:01 +01:00
Marek Olšák
605063953d radeonsi: add a workaround for a compute VGPR-usage LLVM bug
v2: use abort(), describe which LLVM version is affected

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit d227dbe272)
2016-07-21 12:05:48 +01:00
Marek Olšák
cbdbf67f1c glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast
This bug is uncovered by glsl/lower_if_to_cond_assign.
I don't know if it can be reproduced in any other way.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit ead7736821)
2016-07-21 12:04:46 +01:00
Ilia Mirkin
a2aec66444 mesa: set _NEW_BUFFERS when updating texture bound to current buffers
When a glTexImage call updates the parameters of a currently bound
framebuffer, we might miss out on revalidating whether it is complete.
Make sure to set _NEW_BUFFERS which will trigger the revalidation in
that case.

Also while we're at it, fix the fb parameter passed in to the eventual
RenderTexture call.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94148
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
(cherry picked from commit da7223ebdc)
2016-07-21 12:03:44 +01:00
Ilia Mirkin
1ed8237b1a st/mesa: return appropriate mesa format for ETC texture formats
Even when the backend driver does not support ETC formats, we handle the
decoding into an uncompressed backing texture. However as far as core
mesa is concerned, it's an ETC texture and we should return the relevant
ETC mesa format. This condition can get hit when using glTexStorage to
create the texture object.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 00d4315d37)
2016-07-21 12:02:43 +01:00
Ilia Mirkin
e7de53fefd mesa: etc2 online compression is unsupported, don't attempt it
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ee3cdde04)
2016-07-21 12:00:55 +01:00
Samuel Pitoiset
76a2950c1e nvc0: fix the driver cb size when draw parameters are used
The size of the driver constant buffer for each stage should be 2048
and not 512 because it has been increased recently for buffers/images.
While we are at it, do the same change for indirect draws.

This fixes all ARB_shader_draw_parameters tests on GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 31a615677b)
2016-07-21 11:59:33 +01:00
Samuel Pitoiset
d6c387933d nvc0/ir: fix images indirect access on Fermi
This fixes the following piglits:

arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index
arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 19d0450b27)
2016-07-21 11:57:57 +01:00
Nicolai Hähnle
60eabe9ad3 radeonsi: explicitly choose center locations for 1xAA on Polaris
Unlike SC, the small primitive filter does not automatically use center
locations in 1xAA mode, so this is needed to avoid artifacts caused by
the small primitive filter discarding triangles that it shouldn't.

As a side effect of how the effective number of samples is now calculated,
this patch also avoids submitting the sample locations for line/poly smoothing
when they're not really needed.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d938b8c0bf)
2016-07-21 11:56:25 +01:00
Francisco Jerez
04584d5835 i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.
This hardware race condition has caused problems several times already
(see "i965: Fix cache pollution race during L3 partitioning set-up.",
"i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and
"i965: intel_texture_barrier reimplemented").  The problem is that
whenever we attempt to both flush and invalidate multiple caches with
a single pipe control command the flush and invalidation happen in
reverse order, so the contents flushed from the R/W caches aren't
guaranteed to become visible from the invalidated caches after the
PIPE_CONTROL command completes execution if some concurrent rendering
workload happened to pollute any of the invalidated R/O caches in the
short window of time between the invalidation and flush.

This makes sure that brw_emit_pipe_control_flush() has the effect
expected by most callers of making the contents flushed from any R/W
caches visible from the invalidated R/O caches.

Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 37b901003b)
2016-07-21 11:54:41 +01:00
Francisco Jerez
2035ac24ce i965: Make room in the batch epilogue for three more pipe controls.
Review carefully, it sucks to have to keep track of the number of
command packet dwords emitted in the batch epilogue manually.  The
MI_REPORT_PERF_COUNT_BATCH_DWORDS calculation was obviously wrong.

Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0bd3a121c6)
2016-07-21 11:52:56 +01:00
Francisco Jerez
72d287e347 i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.
There were two places in the driver doing a pipe control VF cache
flush, one of them was missing this workaround, move it down into
brw_emit_pipe_control_flush to make sure we don't miss it again.

Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit a10879f48c)
2016-07-21 11:51:53 +01:00
Ian Romanick
3a35da7e8a glsl: Pack integer and double varyings as flat even if interpolation mode is none
v2: Also update varying_matches::compute_packing_class().  Suggested by
Timothy Arceri.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3119871bd9)
2016-07-21 11:48:58 +01:00
Ian Romanick
bc68532a06 mesa: Strip arrayness from interface block names in some IO validation
Outputs from the vertex shader need to be able to match
per-vertex-arrayed inputs of later stages.  Acomplish this by stripping
one level of arrayness from the names and types of outputs going to a
per-vertex-arrayed stage.

v2: Add missing checks for TESS_EVAL->GEOMETRY.  Noticed by Timothy
Arceri.

v3: Use a slightly simpler stage check suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 73a6a4ce49)
2016-07-21 11:30:46 +01:00
Emil Velikov
edfc17a19a docs: add sha256 checksums for 12.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-09 00:02:13 +01:00
Emil Velikov
04277f058d docs: add release notes for 12.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:58 +01:00
Emil Velikov
a2770e55a2 Update version to 12.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:11 +01:00
Emil Velikov
a705f82a56 radeon: reference the correct cdw/max_dw
With commit f41f78cda1 ("radeonsi: drop the DRAW_PREAMBLE packet on
Polaris") we failed to attribute that the separate current/prev
radeon_winsys_cs_chunk(s) are not applicable/available in branch.

The latter of which introduced with commit 89ba076de4 ("radeon/winsys:
introduce radeon_winsys_cs_chunk").

Just drop "current." from the respective places to get things up and
running again.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96864
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:11 +01:00
Emil Velikov
3a146a789c docs: add sha256 checksums for 12.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 23:48:11 +01:00
Emil Velikov
8b06176f31 docs: Update 12.0.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 14:45:24 +01:00
Emil Velikov
ab8938817f Update version to 12.0.0(final)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-07-08 14:43:18 +01:00
Neha Bhende
88a095962f svga: Fix failures caused in fedora 24
SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands
are not presents in fedora 24 kernel module. Because of this
reason application like supertuxkart are not running.

v2: Add few comments and code modifications suggested by Brian P.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit 7988513ac3)
2016-07-08 14:29:58 +01:00
Ilia Mirkin
450f076482 glsl: don't try to lower non-gl builtins as if they were gl_FragData
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.

This is a small part of the problem pointed out by the below bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a37e46323c)
2016-07-08 14:29:50 +01:00
Emil Velikov
168fdc6a07 bugzilla_mesa.sh: Drop "Bug " from sed command
After a recent Bugzilla update the word is no longer in the title. Thus
the script ended up producing bogus HTML.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f35f8464ec)
2016-07-07 16:12:34 +01:00
Akihiko Odaki
5193fe9f4f mesa: don't install GLX files if GLX is not built
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki.4i@stu.hosei.ac.jp>
[Emil Velikov: Drop guards around dri_interface.h, add stable tag]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

(cherry picked from commit 42968424fb)
2016-07-07 16:12:34 +01:00
Mathias Fröhlich
4a3d510b5b osmesa: Export OSMesaCreateContextAttribs.
Since the function is exported like any other
public api function and put in the header
as if you could link against it, export it also
from shared objects.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 13affe0d3f)
2016-07-07 16:12:34 +01:00
Rob Clark
63b7c6ffc8 glsl: add driconf to zero-init unintialized vars
Some games are sloppy.. perhaps because it is defined behavior for DX or
perhaps because nv blob driver defaults things to zero.

So add driconf param to force uninitialized variables to default to zero.

This issue was observed with rust, from steam store.  But has surfaced
elsewhere in the past.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f78a6b1ce3)
2016-07-07 16:12:33 +01:00
Rob Clark
3276443935 i965: don't drop const initializers in vector splitting
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 01ccb0d91e)
2016-07-07 16:12:33 +01:00
Rob Clark
8a700c562c freedreno: fix crash on smaller gpus and higher resolutions
Devices with smaller GMEM size need more tiles.  On db410c at 2048x1152,
glmark2 shadow needed ~330 tiles for fullscreen.  Lets bump it up to
512.  (Maybe with MRT you could end up needing more, but at that point
things are probably going to be painfully slow.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 7295428e41)
2016-07-07 16:12:33 +01:00
Emil Velikov
dcc4df858a anv: vulkan: remove the anv_device.$(OBJEXT) rule
Atm the actual rule will expand to foo.o which is used for static
libraries only.

Thus the automake manual recommendation [to use OBJEXT] won't help us,
since since we're working with a shared library.

Thus let's 'demote' the file and add it back to BUILT_SOURCES. This will
manage all the complexity for us, at the (existing expense) of working
only with the all, check and install targets.

The crazy (why the issue was hard to spot):
If the dependencies (.deps/*.Plo) are already created one can alter the
anv_device.$(OBJEXT) line and/or nuke it all together. That won't lead
to any warnings/issues, even though the Makefile is regenerated.

Moral of the story:
Always rm -rf top_builddir or don't resolve the dependencies manually
and use BUILT_SOURCES.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96825
Fixes: d7a604c3f7a ("anv: use cache uuid based on the build timestamp.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit 9618e2a24c)
2016-07-07 16:12:33 +01:00
Emil Velikov
dbb4c3c7c8 anv: install the intel_icd.json to ${datarootdir} by default
As mentioned by the spec (and used by Archlinux and Debian) default to
${datarootdir} as opposed to ${sysconfdir} for the default location.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit cbc37f72e3)
2016-07-07 16:12:33 +01:00
Emil Velikov
7af5c2834c swr: automake: don't ship LLVM version specific generated sources
Otherwise things will fail to build, if the builder is using another
version of LLVM.

v2: annotate all the dependencies of builder_gen.h
v3: clean the generated files as needed
v4: comment cleanups (Tim)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (v2)
Reported-by: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 744d0d8f3b)
2016-07-07 16:12:33 +01:00
Emil Velikov
0e5e20dca0 automake: don't mandate git_sha1.h/MESA_GIT_SHA1
It has proven subtle to get it right both from the build side POV (see
commit list below) and builders due to their varying workflows.

Furthermore it does not fully fulfil the reason why it was enforced -
to detect uniqueness between different builds, in order to distinguish
and invalidate Vulkan/GL caches.

With that having a much better solution (previous commit) we can drop
this solution.

This effectively reverts the following commits:
359d9dfec3 ("mesa: automake: add directory prefix for git_sha1.h")
2c424e00c3 ("mesa: automake: ensure that git_sha1.h.tmp has the right
attributes")
b7f7ec7843 ("mesa: automake: distclean git_sha1.h when building OOT")
8229fe68b5 ("automake: get in-tree `make distclean' working again.")

Cc: Timo Aaltonen <tjaalton@debian.org>
Cc: Haixia Shi <hshi@chromium.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
(cherry picked from commit 22e9357028)
2016-07-07 16:12:33 +01:00
Emil Velikov
cc2c350416 anv: use cache uuid based on the build timestamp.
Do not rely on the git sha1:
 - its current truncated form makes it less unique
 - it does not attribute for local (Vulkand or otherwise) changes

Use a timestamp produced at the time of build. It's perfectly unique,
unless someone explicitly thinkers with their system clock. Even then
chances of producing the exact same one are very small, if not zero.

v2: Remove .tmp rule. Its not needed since we want for the header to be
regenerated on each time we call make (Eric).

v3:
 - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel)
 - Replace the generated header with a define, to prevent needless
builds on consecutive `make' and/or `make install' calls. (Dave)

v4:
 - Keep the timestamp generation at make time. (Jason)

v5:
 - Ensure that file is regenerated on incremental builds.

Cc: Michel Dänzer <michel@daenzer.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit addb099ce8)
2016-07-07 16:12:33 +01:00
Emil Velikov
66fe2be1f5 clover: conditionally use MESA_GIT_SHA1
Considering how hard/annoying it was for many peoples' workflow to
properly generate the macro, it will be demoted to conditionally
available with follow-up commits.

v2: Kill off gracious blank line (Vedran).

Cc: mesa-stable@lists.freedesktop.org
Cc: Vedran Miletić <vedran@miletic.net>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Vedran Miletić <vedran@miletic.net>
(cherry picked from commit f98530b739)
2016-07-07 16:12:33 +01:00
Dave Airlie
568ba49673 Revert "st/glsl_to_tgsi: don't increase immediate index by 1."
This reverts commit 27d456cc87.

DOH, what seems right and what is right with fp64 are always
two different things.

This regressed:
spec@arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-mixed-shader
on radeonsi

Reported-by: Michel Dänzer <michel@daenzer.net>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cb728df967)
2016-07-07 16:12:33 +01:00
Samuel Pitoiset
134523aa7d nvc0/ir: reset the base offset for indirect images accesses
In presence of an indirect image access, the base offset should be
zeroed because the stride will be computed twice. This is a pretty
rare situation but it can happen when tex.r > 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3b9fff3c3)
2016-07-07 16:12:33 +01:00
Samuel Pitoiset
9f364ed35e gm107/ir: fix sign bit emission for FADD32I
When emitting OP_SUB, the sign bit for FADD and FADD32I is not
at the same position. It's at position 45 for FADD but 51 for FADD32I.

This fixes the following piglit test:
tests/spec/arb_fragment_program/fdo30337b.shader_test

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb828b7b18)
2016-07-07 16:12:32 +01:00
Lionel Landwerlin
d0e1d6b1c8 anv/wsi: create swapchain images using specified image usage
The image usage specified by the caller of vkCreateSwapchainKHR should be
passed onto the internal image creation. Otherwise the driver might later
crash when the user tries to use the image as a combined sampler even though
the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT.

Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be
expected even if the swapchain is created without any flag.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dbbc4fb4cc)
2016-07-07 16:12:32 +01:00
Dave Airlie
7d40db8cdb st/glsl_to_tgsi: don't increase immediate index by 1.
Immediates are stored into a separate table, and are
consolidated, so if we get an immediate we don't need
to offset it as the index it has is correct.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 27d456cc87)
2016-07-07 16:12:32 +01:00
Nicolai Hähnle
250e13e585 st/mesa: check the texture image level in st_texture_match_image
Otherwise, 1x1 images of arbitrarily high level are accepted.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 07cc838b10)
2016-07-07 16:12:32 +01:00
Nicolai Hähnle
5469fc9800 st/mesa: an incomplete texture may have a zero-size first image
Fixes a regression introduced by commit 42624ea83 which triggered
an assertion in
dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0

While stImage must have a non-zero size as verified by the caller, we also
look at the size of the base image in an attempt to make a better guess at
the level0 size (this is important when the base image size is odd). However,
the base image may have a zero size even when it exists.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 0ba053b34c)
2016-07-07 16:12:32 +01:00
Chuck Atkins
951be8a50c gallium: Force blend color to 16-byte alignment
This aligns the 4-element color float array to 16 byte boundaries.  This
should allow compiler vectorizers to generate better optimizations.
Also fixes broken vectorization generated by Intel compiler.

v2: Fixed indentation and added a lengthy comment explaining the
    reason for the alignment.

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d8d6091a84)
2016-07-07 16:12:32 +01:00
Emil Velikov
b08e5a1940 Revert "swr: Refactor checks for compiler feature flags"
This reverts commit a380199e3968462da8291e8dda25888f19e86783.
2016-07-07 16:12:32 +01:00
Chuck Atkins
2ad47d912e swr: Refactor checks for compiler feature flags
Encapsulate the test for which flags are needed to get a compiler to
support certain features.  Along with this, give various options to try
for AVX and AVX2 support.  Ideally we want to use specific instruction
set feature flags, like -mavx2 for instance instead of -march=haswell,
but the flags required for certain compilers are different.  This
allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c
while the Intel compiler which doesn't support those flags can fall
back to using -march=core-avx2.

This addresses a bug where the Intel compiler will silently ignore the
AVX2 instruction feature flags and then potentially fail to build.

v2: Pass preprocessor-check argument as true-state instead of
    false-state for clarity.
v3: Reduce AVX2 define test to just __AVX2__.  Additional defines suchas
    __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined
    w.r.t thier availability.
v4: Fix C++11 flags being added globally and add more logic to
    swr_require_cxx_feature_flags

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Tested-by: Tim Rowley <timothy.o.rowley@Intel.com>
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit c1bf6692be)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-07-07 16:12:32 +01:00
Ian Romanick
bb819a9e21 mapi: Export all GLES 3.1 functions in libGLESv2.so
Khronos recommends that the GLES 3.1 library also be called libGLESv2.
It also requires that functions be statically linkable from that
library.

NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension
since at least Mesa 10.5, so applications targeting Linux should use
eglGetProcAddress to avoid problems running binaries on systems with
older, non-GLES 3.1 libGLESv2 libraries.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Reported-by: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Acked-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 5921f372c8)
2016-07-07 16:12:32 +01:00
sonjiang
0076e14f53 radeon/uvd: fix a h265 context size bug
Fixes a h265 video corruption bug which caused by uvd fw interface changes.

Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit b928ff6f62)
2016-07-07 16:12:32 +01:00
sonjiang
930425df1e radeon/uvd: separate uvd context buffer from DPB
Adapt driver for Polairs uvd firmware interface changes.

Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 5c80354a23)
2016-07-07 16:12:32 +01:00
sonjiang
700c1412e7 radeon: uvd add uvd fw version for amdgpu
Because Polaris uvd fw interface changes, the driver need to check fw version
to apply right interface. This change is to add uvd fw version.

Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 28f85eab49)
[Emil Velikov: resolve trivial s/bool/boolean/ conflicts]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeon/radeon_winsys.h
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
1eaba7b5b3 gm107/ir: make sure that flagsDef is set when emitting setcond
Rely on the existence of a second destination when emitting a setcond
flag is dangerous, because this doesn't mean that the flag has been
correctly set. Instead rely on flagsDef like what emitX() does
for flagsSrc.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cc97b6a34a)
2016-07-07 16:12:31 +01:00
Marek Olšák
684e555aaa radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris
This was missing.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c1dbc563f4)
2016-07-07 16:12:31 +01:00
Kenneth Graunke
dcc6dde5f9 i965: Make emit_urb_writes() not produce an EOT message for GS.
emit_urb_writes() contains code to emit an EOT write with no actual
data when there are no output varyings.  This makes sense for the VS
and TES stages, where it's called once at the end of the program.

However, in the geometry shader stage, emit_urb_writes() is called once
for every EmitVertex().  We explicitly emit a URB write with EOT set at
the end of the shader, separately from this path.  So we'd better not
terminate the thread.  This could get us into trouble for shaders which
do EmitVertex() with no varyings followed by SSBO/image/atomic writes.

It also caused us to emit multiple sends with EOT set, which apparently
confuses the register allocator into not using g112-g127 for all but
the first one.  This caused EU validation failures in OglGSCloth
shaders in shader-db.  (The actual application was fine, but shader-db
thinks there are no outputs because it doesn't understand transform
feedback.)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 7e7e501acf)
2016-07-07 16:12:31 +01:00
Kenneth Graunke
6d47c3eb4e glsl: Ignore ir_texture in lower_const_arrays_to_uniforms.
The only part of an ir_texture which can be an array is the
offsets array in textureGatherOffsets() calls.  We don't want
to lower those, because they're required to remain constants.

Fixes textureGatherOffsets with Gallium drivers such as llvmpipe,
which commit ef78df8d3b regressed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit a36a73a7b8)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
7f0984ca51 gm107/ir: add missing setcond flags for LOP variants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7b9b096775)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
94bfef8a71 gm107/ir: make use of LOP32I for all immediates
LOP only allows to emit 19-bits immediates.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 83a4f28dc2)
2016-07-07 16:12:31 +01:00
Dave Airlie
2bd400d953 virgl: reduce some limits for now
These need to be passed from the host in caps structure if they
are larger, this fixes a bunch of tests on Intel hw, that I'd
put the limits too high for.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c7cc264ca9)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
d4ab5bcad5 gm107/ir: make use of MOV32I for all immediates
MOV only allows to emit 19-bits immediates. This is similar to the
previous fix I did for IMUL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c7fa3c92f8)
2016-07-07 16:12:31 +01:00
Jordan Justen
b82362f52c i965: Use miptree to decide format on multi-plane images for gen < 7
This wasn't handled correctly for multi-plane images on gen < 7 in
727a9b2493.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96674
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 367cf3a2e3)
2016-07-07 16:12:31 +01:00
Samuel Pitoiset
2fc24cfd8c gm107/ir: make use of IMUL32I for all immediates
IMUL only allows to emit 19-bits immediates. This is similar to
d30768025a which fixed the same thing
for the GK110 emitter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b84c97587b)
2016-07-07 16:12:30 +01:00
Jordan Justen
507d19c44f i965: Skip update_texture_surface when the plane doesn't exist
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96607
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit 727a9b2493)
2016-07-07 16:12:30 +01:00
Kenneth Graunke
a6a246e17c i965: Set fs_inst::base_mrf = -1 by default.
On MRF platforms, we need to set base_mrf to the first MRF value we'd
like to use for the message.  On send-from-GRF platforms, we set it to
-1 to indicate that the operation doesn't use MRFs.

As MRF platforms are becoming increasingly a thing of the past, we've
forgotten to bother with this.  It makes more sense to set it to -1 by
default, so we don't have to think about it for new code.

I searched the code for every instance of 'mlen =' in brw_fs*cpp, and
it appears that all MRF-based messages correctly program a base_mrf.

Forgetting to set base_mrf = -1 can confuse the register allocator,
causing it to think we have a large fake-MRF region.  This ends up
moving the send-with-EOT registers earlier, sometimes even out of
the g112-g127 range, which is illegal.  For example, this fixes
illegal sends in Piglit's arb_gpu_shader_fp64-layout-std430-fp64-shader,
which had SSBO messages with mlen > 0 but base_mrf == 0.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 3e04e3758e)
2016-07-07 16:12:30 +01:00
Marek Olšák
a51a9d7ba3 radeonsi: fix fractional odd tessellation spacing for Polaris
ported from Vulkan (and no source explains why this is needed)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 28d0d0c5b4)
2016-07-07 16:12:30 +01:00
Marek Olšák
4f784775a7 radeonsi: fix a compute shader hang with big threadgroups on SI & CI
ported from Vulkan

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 1e8adb0ee4)
[Emil Velikov: resolve trivial conflict in si_launch_grid()]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/gallium/drivers/radeonsi/si_compute.c
2016-07-07 16:12:30 +01:00
Ilia Mirkin
48fe283158 nvc0: when mapping directly, provide accurate xfer info + start
We were ignoring the incoming box parameters, and were providing totally
bogus stride/layer stride, and other bits, for when a non-full-surface
map was requested.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b433cb51e5)
2016-06-24 21:33:24 +01:00
Nicolai Hähnle
f41f78cda1 radeonsi: drop the DRAW_PREAMBLE packet on Polaris
It will be removed from the firmware for the Polaris.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0da890e62c)
2016-06-24 21:31:45 +01:00
Nicolai Hähnle
197e2eaea8 radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris
The non-MULTI variants will be removed in Polaris firmware.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2aa0485902)
2016-06-24 21:30:37 +01:00
Jordan Justen
eadccf8c67 i965: Preserve the internal format of the dri image
Since the OpenGLES API is strict about the internal format matching
the for many operations, we need to preserve it.

See _mesa_es3_error_check_format_and_type in
src/mesa/main/glformats.c.

Fixes ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96351
Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit c36a363a2d)
2016-06-24 21:29:47 +01:00
Kenneth Graunke
1fc705366c i965: Implement rasterizer discard via SOL unless required for queries.
We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED
query, which involves passing all primitives to the clipper.  When
rasterizer discard is enabled, we program the clipper in REJECT_ALL
mode, rather than using the SOL stage's "Rendering Disable" feature.

See commit f09b91f782 for an explanation
of why we implement GL_PRIMITIVES_GENERATED this way.

Apparently the SOL stage's "Rendering Disable" feature is a lot faster
than having the clipper reject all primitives.  It's safe to use when
no GL_PRIMITIVES_GENERATED query is active, as we don't care about
CL_INVOCATION_COUNT incrementing.

This patch makes us use SO_RENDERING_DISABLE when no query is active,
but continues falling back to the clipper in REJECT_ALL mode when the
queries are enabled.  It brings back the perf_debug for the clipper
case (which I removed in commit 1f9445ff57, thinking it wasn't useful).

Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10)
on my Broadwell GT2 laptop.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b0629e6894)
2016-06-24 21:28:59 +01:00
Kenneth Graunke
91de94a119 i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms.
They're basically the same.  Let's avoid the code duplication.

v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught
    by Jason Ekstrand).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4db98f8beb)
2016-06-24 21:27:28 +01:00
Kenneth Graunke
d7ea3eada7 glsl: Don't constant propagate arrays.
Constant propagation on arrays doesn't make a lot of sense.  If the
array is only accessed with constant indexes, then opt_array_splitting
would split it up.  Otherwise, we have variable indexing.  If there's
multiple accesses, then constant propagation would end up replicating
the data.

The lower_const_arrays_to_uniforms pass creates uniforms for each
ir_constant with array type that it encounters.  This means that it
creates redundant uniforms for each copy of the constant, which means
uploading too much data.  It can even mean exceeding the maximum number
of uniform components, causing link failures.

We could try and teach the pass to de-duplicate the data by hashing
constants, but it makes more sense to avoid duplicating it in the first
place.  We should promote constant arrays to uniforms, then propagate
the uniform access.

Fixes the TressFX shaders from Tomb Raider, which exceeded the maximum
number of uniform components by a huge margin and failed to link.

On Broadwell:

total instructions in shared programs: 9067702 -> 9068202 (0.01%)
instructions in affected programs: 10335 -> 10835 (4.84%)
helped: 10 (Hoard, Shadow of Mordor, Amnesia: The Dark Descent)
HURT: 20 (Natural Selection 2)

loops in affected programs: 4 -> 0

The hurt programs appear to no longer have a constarray uniform, as
all constants were successfully propagated.  Apparently before this
patch, we successfully unrolled a loop containing array access, but
only after promoting constant arrays to uniforms.  With this patch,
we unroll it first, so all array access is direct, and the array
is split up, and individual constants are propagated.  This seems
better.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit fb857b5eea)
2016-06-24 21:26:39 +01:00
Kenneth Graunke
0af4f5c1ba glsl: Make lower_const_arrays_to_uniforms work directly on constants.
There's really no point in looking at ir_dereference_array of a
constant.  It also misses cases like:

  (assign () (var_ref tmp) (constant (array ...) ...))

No changes in shader-db, but keeps it working after the next commit.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit ef78df8d3b)
2016-06-24 21:25:50 +01:00
Kenneth Graunke
335193107a i965: Copy propagate before doing variable index lowering.
The scalar backend currently doesn't support variable indexing on
temporary arrays, but it does support it on uniform arrays, and
some stages support it for input arrays.  Make sure these are
propagated through before exploding indirects into piles of
if-ladders unnecessarily.

On Broadwell, no instruction count change in shader-db.

total cycles in shared programs: 80675652 -> 80674928 (-0.00%)
cycles in affected programs: 649972 -> 649248 (-0.11%)
helped: 386
HURT: 165

This will help avoid code quality regressions in a future commit.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit f7741c5211)
2016-06-24 21:25:00 +01:00
Kenneth Graunke
32bb867118 glsl: Propagate invariant/precise after lowering const arrays.
The new uniform may need precise as well.

Fixes copy propagation of constant array uniforms in Tomb Raider shaders.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 586f4a42e7)
2016-06-24 21:24:09 +01:00
Kenneth Graunke
60b5ba557e glsl: Split arrays even in the presence of whole-array copies.
Previously, we failed to split constant arrays.  Code such as

   int[2] numbers = int[](1, 2);

would generates a whole-array assignment:

  (assign () (var_ref numbers)
             (constant (array int 4) (constant int 1) (constant int 2)))

opt_array_splitting generally tried to visit ir_dereference_array nodes,
and avoid recursing into the inner ir_dereference_variable.  So if it
ever saw a ir_dereference_variable, it assumed this was a whole-array
read and bailed.  However, in the above case, there's no array deref,
and we can totally handle it - we just have to "unroll" the assignment,
creating assignments for each element.

This was mitigated by the fact that we constant propagate whole arrays,
so a dereference of a single component would usually get the desired
single value anyway.  However, I plan to stop doing that shortly;
early experiments with disabling constant propagation of arrays
revealed this shortcoming.

This patch causes some arrays in Gl32GSCloth's geometry shaders to be
split, which allows other optimizations to eliminate unused GS inputs.
The VS then doesn't have to write them, which eliminates the entire VS
(5 -> 2 instructions).  It still renders correctly.

No other change in shader-db.

v2: Drop !AOA check and improve a comment (feedback from Tim Arceri).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit c264fdbc07)
2016-06-24 21:23:20 +01:00
Kenneth Graunke
9013f56bb7 glsl: Make constant propagation's folder not propagate into an LHS.
opt_constant_propagation.cpp contains constant folding code which can
actually do constant propagation in some cases.  It was happily
propagating constants into the left-hand-side of assignments.

For example,

   (assign () (var_ref temp) (constant ...))

would brilliantly be turned into:

   (assign () (constant ...) (constant ....))

This is a bigger hammer than necessary - it prevents propagation
into the left-hand-side altogether.  We could certainly do better
someday.  Notably, the constant propagation pass itself already
takes this approach - it's just the constant propagation pass's
built-in constant folding code (which actually propagates, too)
that was broken.

No change in shader-db, but prevents regressions after future commits.
It seems plausible that this could be hit today, but I haven't seen it
happen.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit acf5444044)
2016-06-24 21:22:31 +01:00
Ardinartsev Nikita
133d0f0882 i965: Avoid division by zero.
Fixes regression introduced by af5ca43f26

Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
(cherry picked from commit 01c89ccc5d)
2016-06-24 21:21:34 +01:00
Tim Rowley
1e8fb90f19 swr: push/pop DEBUG macro around llvm includes
llvm redefines DEBUG; adding push/pop prevents a undefined reference
to debug_refcnt_state in llvm-3.7+.

v2: add undef DEBUG

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
(cherry picked from commit 9ca741c645)
2016-06-24 21:20:28 +01:00
Jose Fonseca
6c1911effb include: Require MSVC 2013 Update 4.
Earlier MSVC 2013 releases have troubles compiling some of our C99 code,
so make sure we have Update 4 to avoid confusion.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 805dbdf06d)
2016-06-24 20:58:23 +01:00
Jason Ekstrand
aefcbf41ef anv: Use different BOs for different scratch sizes and stages
This solves a race condition where we can end up having different stages
stomp on each other because they're all trying to scratch in the same BO
but they have different views of its layout.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c2f2c8e407)
2016-06-24 20:57:17 +01:00
Jason Ekstrand
892cbc202c genxml: Make ScratchSpaceBasePointer an address instead of an offset
While we're here, we also fixup MEDIA_VFE_STATE and rename the field in
3DSTATE_VS on gen6-7.5 to be consistent with the others.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 45c0f60999)
2016-06-24 20:56:10 +01:00
Jason Ekstrand
7a4641cdbe anv: Add an allocator for scratch buffers
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 966bed17c1)
2016-06-24 20:55:06 +01:00
Jason Ekstrand
13d82b7690 genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE on gen7
The pack header generation scripts can't handle the case where you have
two addresses in the same dword; they just take whatever is the last one.
This meant that the MCS address wasn't properly getting handled.  Since we
don't care about append counters, we can just re-arrange the XML for now.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 89ded099f8)
2016-06-24 20:53:35 +01:00
Jason Ekstrand
94cd7425e8 anv,isl: Lower storage image formats in anv
ISL was being a bit too clever for its own good and lowering the format for
us.  This is all well and good *if* we always want to lower it.  However,
the GL driver selectively lowers the format depending on whether the
surface is write-only or not.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d82322eb18)
2016-06-24 20:52:38 +01:00
Jason Ekstrand
40a9ffbbca isl/state: Allow for full 31-bit buffer texture sizes
Ivy Bridge and above can handle up to 2^31 elements for RAW buffer
surfaces.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 97f12773b8)
2016-06-24 20:51:48 +01:00
Jason Ekstrand
02bf08e124 isl/state: Don't use designated initializers for buffer surface state
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb64e666ba)
2016-06-24 20:50:53 +01:00
Jason Ekstrand
feaa68e38a isl/state: Add assertions for buffer surface restrictions
Acked-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4061fde66e)
2016-06-24 20:49:57 +01:00
Jason Ekstrand
72cc8544a8 isl/state: Don't set SurfacePitch for gen9 1-D textures
This field is ignored by the hardware in this case and, on very large 1-D
textures, it can end up being larger than the maximum allowed value.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ce24097abe)
2016-06-24 20:49:05 +01:00
Jason Ekstrand
fcefb53c37 isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7
This matches better what happens on gen8 where the "Tiled Surface" and
"Tile Walke" bits are combined into a single two-bit value.  This is also
more consistent with what the GL driver does.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f47e23a8b6)
2016-06-24 20:48:09 +01:00
Jason Ekstrand
913e9e14f0 isl/state: Emit no-op mip tail setup on SKL
This hasn't ever been a problem in the past but it is recommended by the
hardware docs.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 96706bad5f)
2016-06-24 20:47:21 +01:00
Jason Ekstrand
a49f97fae3 isl/state: Only set cube face enables if usage includes CUBE_BIT
It seems safe to set it all the time, but this reduces the diff between
the way i965 does it and what ISL does.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 14d7c16e50)
2016-06-24 20:46:31 +01:00
Jason Ekstrand
672872051d isl/state: Use the layout for computing qpitch rather than dimensions
For depth/stencil 1-D textures on SKL, we want them layed out in the old
format that has been used since gen4.  In order for the surface state
fill-out code to handle, this it needs to distinguish based on layout
rather than just dimensionality.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d24e9cfa1)
2016-06-24 20:45:39 +01:00
Jason Ekstrand
667beb92a9 isl/state: Set the IntegerSurfaceFormat bit on Haswell
This fixes 688 Vulkan CTS tests on Haswell.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6a43204afa)
2016-06-24 20:44:46 +01:00
Jason Ekstrand
262282c1bf isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 324103da75)
2016-06-24 20:43:55 +01:00
Jason Ekstrand
350ae65585 isl/state: Don't set RenderTargetViewExtent for texture surfaces
The docs specify that this only matters for render targets and surfaces
used with typed dataport messages.  On some platforms (gen4-6) the Depth
field has more bits than RenderTargetViewExtent so we can have textures
with more levels than we can render to.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 215282c9f4)
2016-06-24 20:43:01 +01:00
Jason Ekstrand
415869c5c9 isl/state: Set SurfaceArray based on the surface dimension
According to the PRM, you can't set SurfaceArray for 3D or buffer textures.
There doesn't seem to be a good reason not to set it when we can.  On the
other hand, if we don't set it we can end up getting strange results for
1-layer array textures such as textureSize() returning the wrong results.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bb326f7b01)
2016-06-24 20:42:07 +01:00
Jason Ekstrand
6a3f08be3a isl/state: Don't force-disable L2 bypass for everything
We already set the bit in the few cases where it's required by the docs so
there's no need to set it all the time.  This has no noticable perf impact
for Dota 2 on Vulkan with the time demo I have.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d050ffbce9)
2016-06-24 20:41:15 +01:00
Jason Ekstrand
0315650532 isl/state: Refactor the setup of clear colors
This commit switches clear colors to use #if's instead of a C if.  This
lets us properly handle SNB where the clear color field doesn't exist.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 87f0ffa646)
2016-06-24 20:40:19 +01:00
Jason Ekstrand
9259e0f990 isl/state: Refactor the per-gen isl_to_gen_h/valign tables
This moves the #if's around so that halign and valign have different sets
of #if conditions.  This also prepares us for SNB because isl_to_gen_halign
is not defined at all on gen6.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 62a5e6e031)
2016-06-24 20:39:23 +01:00
Jason Ekstrand
dbc94da586 isl/state: Return an extent3d from the halign/valign helper
Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b1b0d6fb54)
2016-06-24 20:38:27 +01:00
Jason Ekstrand
1dd276aa7c isl/state: Put pitch calculations together
This is purely cosmetic, but it makes things look a bit more readable.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a60ae9e10a)
2016-06-24 20:37:27 +01:00
Jason Ekstrand
652161bdc8 isl/state: Put all dimension setup together and towards the top
This is purely cosmetic, but it makes things look a bit more readable.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 70c8afc0c8)
2016-06-24 20:36:22 +01:00
Jason Ekstrand
29b24d75eb isl/state: Put surface format setup at the top
This is purely cosmetic, but it makes things look a bit more readable.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e66e70ef47)
2016-06-24 20:35:22 +01:00
Jason Ekstrand
8b3333d1df isl/state: Remove some unused fields
They're already zero-initialized and we have no plans of doing anything
more interesting with them.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 39baea551f)
2016-06-24 20:34:30 +01:00
Jason Ekstrand
bf59ce8869 isl/state: Don't use designated initializers for the surface state
While designated initializers are nice, they also force us to put some
things in the initializer and some things later.  Surface state setup is
complicated enough that this really hurts readability in the long run.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit caf2af4181)
2016-06-24 20:33:35 +01:00
Jason Ekstrand
0a7671a309 genxml/gen8,9: Prefix the multisample format enum with MSFMT
This is what gen7 does and it's nice to have a prefix

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit de1d194856)
2016-06-24 20:32:34 +01:00
Jason Ekstrand
69234ef45e i965/gen4: Subtract 1 from buffer sizes
The PRM states that the values put in Width, Height, and Depth should be
various bits from the value size - 1.  We seem to have done this wrong
more-or-less from the start.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2a1cc94d27)
2016-06-24 20:31:42 +01:00
Jason Ekstrand
2681454102 i965/fs: Use a default Y coordinate of 0 for TXF on gen9+
Previously, we were incrementing length but not actually putting anything
in the Y coordinate.  This meant that 1-D TXF operations had a garbage
array index.  If the surface is emitted as 1-D non-array, the coordinate
gets discarded and it works fine.  If it happens to be bound as an array
surface, it may count as an out-of-bounds array access and you get zero.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0195299c86)
2016-06-24 20:30:48 +01:00
Jason Ekstrand
e9fd680fde i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1436238b75)
2016-06-24 20:29:56 +01:00
Jason Ekstrand
6a6947d89a i965/blorp/gen8: Use the correct max level and layer in emit_surface_states
We were adding in the base which is wrong because the values given in the
miptree are relative to zero and not the base layer/level.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 620f81d2ed)
2016-06-24 20:29:03 +01:00
Jason Ekstrand
1673dec65c i965: Drop the maximum 3D texture size to 512 on Sandy Bridge
The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be
set to the depth of a 3-D texture when rendering.  Unfortunatley, that
field is only 9 bits on Sandy Bridge and prior so we can't actually bind
a 3-D texturing for rendering if it has depth > 512.  On Ivy Bridge, this
field was bumpped to 11 bits so we can go all the way up to 2048.  On Iron
Lake and prior, we don't support layered rendering and we use OffsetX/Y
hacks to render to particular layers so 2048 is ok there too.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6ba88bce64)
2016-06-24 20:28:06 +01:00
Jason Ekstrand
af12f81147 i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctly
This is basically a direct translation of what we do for gen7.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0f9cd74aab)
2016-06-24 20:27:12 +01:00
Jason Ekstrand
d9219b5b79 i965/gen4: Pull texture formats from the texture object not the miptree
This makes texture views sort-of work.  It doesn't add full texture view
support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats
piglit test on Iron Lake.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ee39d3ba91)
2016-06-24 20:26:14 +01:00
Ilia Mirkin
abfed13bf4 glsl: only match gl_FragData and not gl_SecondaryFragDataEXT
There's special logic around finding gl_FragData. It latches onto any
array with FRAG_RESULT_DATA0. However gl_SecondaryFragDataEXT[], added
by GL_EXT_blend_func_extended, fits those parameters as well. The real
frag data array should have index 0 though, so we can use that to
distinguish them.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96617
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 36ed1b695e)
2016-06-24 20:25:10 +01:00
Ilia Mirkin
8ac0a713f7 nv50,nvc0: fix start_instance in manual push path
The start instance is applied as an offset into the buffer directly,
ignoring the divisor, not as an instance id offset that respects the
divisor.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1f4bca798d)
2016-06-24 20:23:49 +01:00
Ilia Mirkin
f7af3868f7 translate: fix start_instance parameter in sse version
The generic version gets this right already, but this was using an
incorrect formula in SSE.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 5b0d64886d)
2016-06-24 20:22:16 +01:00
Jason Ekstrand
15d06d4d61 anv/cmd: Dirty descriptor sets when a new pipeline is bound
Ever since c2581a9375, the binding table layout has depended on the
pipeline.  This means that whenever we change pipelines we also need to
re-emit binding tables for the new layout.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 35b53c8d47)
2016-06-24 20:21:18 +01:00
Jason Ekstrand
6fd7d618f4 anv/cmd: Move emit_descriptor_pointers to genX_cmd_buffer.c
It's tiny and fully generic so there's really no reason for it to be in a
gen7-specific file.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2bfe0c3374)
2016-06-24 20:20:21 +01:00
Jason Ekstrand
045d6bc023 anv/cmd: Move flush_descriptor_sets to anv_cmd_buffer.c
There's no good reason for recompiling it

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9df4d6bb36)
2016-06-24 20:19:11 +01:00
Jason Ekstrand
b2fe134064 spirv: Use the system value version of gl_FrontFace
SPIR-V treats it as an input but NIR wants the system value.  This
shouldn't have been too much of a surprise given that we have to do the
same conversion in the GLSL IR to NIR pass.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 295e03c980)
2016-06-24 20:03:46 +01:00
Kenneth Graunke
2e8129ddf8 i965: Reorganize prog_data->total_scratch code a bit.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 40013c5033)
2016-06-24 18:17:50 +01:00
Emil Velikov
5e0b11cb6d Update version to 12.0.0-rc4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-21 13:32:04 +01:00
Nicolai Hähnle
6306930c3f st/mesa: flush bitmap cache before CopyImageSubData
Found by inspection.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit f9ddd52317)
2016-06-21 11:53:55 +01:00
Nicolai Hähnle
76377387c2 st/mesa: flush bitmap cache before texture functions
As far as I can tell, a sequence of glBitmap followed by texture functions
that refer to a texture bound as the framebuffer is well within what should
be allowed.

Found by inspection.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e7fff3cfe1)
2016-06-21 11:52:36 +01:00
Nicolai Hähnle
6775b169cd st/mesa: flush bitmap cache before compute dispatch
In the unlikely case that a program uses glBitmap to render to a framebuffer
whose texture is bound in a compute shader.

Found by inspection.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c542b7e43d)
2016-06-21 11:51:20 +01:00
Kenneth Graunke
a0235eb0f7 i965: Fix multiplication of immediates on Cherryview/Broxton.
Cherryview and Broxton don't support DW x DW multiplication.  We have
piles of code to handle this, but apparently weren't retyping in the
immediate case.

For example,
tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes
makes the simulator angry about instructions such as:

   mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D

Just retype to W or UW.  It should be safe on all platforms.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cd89c834a8)
2016-06-21 11:49:55 +01:00
Jason Ekstrand
09a098bdeb anv: Add proper support for depth clamping
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit eb6764c4a7)
2016-06-21 11:48:39 +01:00
Jason Ekstrand
f3c8dde2e4 anv/cmd_buffer: Split emit_viewport in two
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8a46b505cb)
2016-06-21 11:47:20 +01:00
Jason Ekstrand
3fddb9fd46 anv/cmd_buffer: Set depth/stencil extent based on the image
It used to be based on the framebuffer which isn't quite right.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 20e95a746d)
2016-06-21 11:46:03 +01:00
Jason Ekstrand
f614a1f4d8 anv/cmd_buffer: Don't crash if push constants are provided for missing stages
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b65f2e4163)
2016-06-21 11:44:48 +01:00
Jason Ekstrand
f4bc7218d5 anv/pipeline: Do invariance propagation on SPIR-V shaders
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e6c2fe4519)
2016-06-21 11:43:29 +01:00
Jason Ekstrand
77f241bd37 nir/alu_to_scalar: Respect the exact ALU operation qualifier
Just setting builder->exact isn't sufficient because that only applies to
instructions that are built with the builder but instructions created
manually and only inserted using the builder are left alone.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bec07b7292)
2016-06-21 11:41:49 +01:00
Jason Ekstrand
deedb368de nir: Add a pass for propagating invariant decorations
This pass is similar to propagate_invariance in the GLSL compiler.  The
real "output" of this pass is that any algebraic operations which are
eventually consumed by an invariant variable get marked as "exact".

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 202751fbb7)
2016-06-21 11:37:37 +01:00
Jason Ekstrand
bac23b13eb nir/algebraic: Remove imprecise flog2 optimizations
While mathematically correct, these two optimizations result in an
expression with substantially lower precision than the original.  For any
positive finite floating-point value, log2(x) is well-defined and finite.
More precisely, it is in the range [-150, 150] so any sum of logarithms
log2(a) + log2(b) is also well-defined and finite as long as a and b are
both positive and finite.  However, if a and b are either very small or
very large, their product may get flushed to infinity or zero causing
log2(a * b) to be nowhere close to log2(a) + log2(b).

This imprecision was causing incorrect rendering in Talos Principal because
part of its HDR rendering process involves doing 8 texture operations,
clamping the result to [0, 65000], taking a dot-product with a constant,
and then taking the log2.  This is done 6 or 8 times and summed to produce
the final result which is written to a red texture.  In cases where you
have a region of the screen that is very dark, it can end up getting a
result value of -inf which is not what is intended.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96425
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68e308d853)
2016-06-21 11:36:08 +01:00
Nicolai Hähnle
b03b256e92 radeonsi: fix calculation of valid RB mask per SE
The old calculation treated too many RBs as disabled.

Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit c95175581e)
2016-06-21 11:34:38 +01:00
Nicolai Hähnle
52ae654569 radeonsi: raise SI_PM4_MAX_DW
The old limit, introduced in commit afa752d3f0,
was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 6c2e636982)
2016-06-21 11:33:00 +01:00
Roland Scheidegger
f675339b22 gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer neither)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).

This addresses https://llvm.org/bugs/show_bug.cgi?id=28176

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Aaron Watry <awatry@gmail.com>
(cherry picked from commit b0cf99165a)
2016-06-21 11:31:08 +01:00
Ilia Mirkin
cdbcd315b3 nvc0: don't make use of push hint if there are no non-const user vbos
This makes the check match up what we do on nv50 as well - there's no
point in switching over the push path if everything's in managed
buffers. This can happen when a shader uses a vertex without an enabled
array - we end up passing it a constant attribute.

This also has the effect of "fixing" some flickering in Talos. I have no
idea why. I've stared at the push logic forwards, backwards, and
sideways. By always forcing the push path (which is slow), the
flickering also goes away, but other rendering is still wrong
(specifically draw 383068 as identified in the bug). However by not
switching over to the push path, draw 383068 is correct.

Note that other flickering remains in Talos, like the red/green
walls/floors. This takes care of the shadow flickering though.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 154c0a42a2)
2016-06-21 11:29:27 +01:00
Ilia Mirkin
7f1a4dc740 gk104/ir: fix tex use generation to be more careful about eliding uses
If we have a loop, instructions before the tex might be added as tex
uses, and those may in fact dominate all other uses of the tex results.
This however doesn't mean that we don't need a texbar after the tex.
Only check if uses dominate each other they are dominated by the tex.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565
Fixes: 7752bbc44 (gk104/ir: simplify and fool-proof texbar algorithm)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1804aa0b80)
2016-06-21 11:27:50 +01:00
Samuel Iglesias Gonsálvez
97440cc2ed i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT
From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region
Restrictions, page 844:

  "When source or destination datatype is 64b or operation is integer DWord
   multiply, indirect addressing must not be used."

v2:
- Fix it for Broxton too.

v3:
- Simplify code by using subscript() and not creating a new num_components
variable (Kenneth).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bdab572a86)
2016-06-17 14:41:16 +01:00
Iago Toral Quiroga
3265becac3 i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT
From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine,
Register Region Restrictions:

   "When source or destination is 64b (...), regioning in Align1
    must follow these rules:

    1. Source and destination horizontal stride must be aligned to
       the same qword.
    (...)"

v2:
- Fix it for Broxton too.

v3:
- Remove inst->regs_written change as it is not necessary (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0177dbb6c2)
2016-06-17 14:40:12 +01:00
Ian Romanick
033279c961 mesa: If validation fails in a debug context just emit a debug message
There are quite a few pipelines that desktop applications (including a
bunch of piglit test) can expect to have run but don't meet the GLES
requirements.  Instead of failing validation, just emit a debug message.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 6bec55a780)
2016-06-17 14:39:16 +01:00
Ian Romanick
6572273631 glsl: Always strip arrayness in precision_qualifier_allowed
Previously some callers of precision_qualifier_allowed would strip the
arrayness from the type and some would not.  As a result, some places
would not notice that float[6], for example, needed a precision
qualifier.

Fixes the new piglit test no-default-float-array-precision.frag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 9c87282041)
2016-06-17 14:38:08 +01:00
Kenneth Graunke
dab4a6001b i965: Use a uniform for gl_PatchVerticesIn in the TCS on Gen8+.
We still need to recompile the passthrough shader when this value
changes, as it also affects the output vertex count.  But otherwise,
we can eliminate recompiles on Gen8+.

We probably want to do this for Gen7 as well, but that requires
rewriting the input release code to use a loop, which is a trade-off
I'd need to consider in more detail.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit c319512e16)
2016-06-17 14:37:06 +01:00
Kenneth Graunke
286ed3aff0 glsl: Optionally lower TCS gl_PatchVerticesIn to a uniform.
i965 has no special hardware for this, so the best way to implement
this is to pass it in via a uniform.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 2b867264d2)
2016-06-17 14:28:44 +01:00
Kenneth Graunke
baa6ef4ed0 i965: Use a uniform for gl_PatchVerticesIn in the TES.
Fixes three GL44-CTS.tessellation_shader subtests:
- max_patch_vertices
- single.max_patch_vertices
- tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn

These use gl_PatchVerticesIn in the TES, but don't link against a
TCS (which would allow the linker to lower it to a constant).  We had
no handling for the system value in the backend, so it would just
assert fail.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1bc194cd64)
2016-06-17 14:27:42 +01:00
Kenneth Graunke
b7e91a0421 glsl: Optionally lower TES gl_PatchVerticesIn to a uniform.
i965 has no special hardware for this, so we need to pass this value in
as a uniform (unless the TES is linked against a TCS, in which case the
linker can just replace this with a constant).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 0be2105137)
2016-06-17 14:19:26 +01:00
Nicolai Hähnle
05c5ed47d1 mesa/main: fix integer overflows in _mesa_image_offset
Found using -fsanitize=undefined.

Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6510e07345)
2016-06-17 14:18:27 +01:00
Kenneth Graunke
a9647850d1 mesa: Pass gl_constant_value union into _mesa_fetch_state().
We've had some trouble in the past with copying integers around via
float pointers, as the C compiler sometimes uses x87 floating point
registers to load values on 32-bit systems.  Passing the
gl_constant_value union should be safer.

To avoid churn, this patch creates a "GLfloat *value" variable so
existing uses can stay the same.

Not observed to fix anything, but I was in the area adding more integer
state vars, and thought it'd be wise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 8b408972ff)
2016-06-17 14:01:23 +01:00
Emil Velikov
7d41c8aa25 Update version to 12.0.0-rc3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-15 09:29:14 +01:00
Nicolai Hähnle
575f9eaa2d radeonsi: mark buffer texture range valid for shader images
When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.

This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels,
reported by Michel Dänzer.

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit a64c7cd2ba)

Back-ported from commit a64c7cd2ba:
- include util/u_format.h
- code was extracted to si_set_shader_image in master, move it back

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
--
 src/gallium/drivers/radeonsi/si_descriptors.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
2016-06-15 09:29:14 +01:00
Ilia Mirkin
792a5ee425 nv50/ir: record number of threads in a compute shader
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 27a51ff9b4)
2016-06-15 09:29:14 +01:00
Ilia Mirkin
59841f5466 nvc0/ir: limit max number of regs based on availability in SM
This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1f895caba0)
2016-06-15 09:29:14 +01:00
Tomasz Figa
966ee94558 i965: Check return value of screen->image.loader->getBuffers (v2)
The images struct is an uninitialized local variable on the stack. If the
callback returns 0, the struct might not have been updated and so should
be considered uninitialized. Currently the code ignores the return value,
which (depending on stack contents) might end up in reading a non-zero
value from images.image_mask and dereferencing further fields.

Another solution would be to initialize image_mask with 0, but checking
the return value seems more sensible and it is what Gallium is doing.

v2: fix typos in commit message,
    fix indentation,
    remove unnecessary parentheses and pointer dereference to keep line
    length reasonable.

Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e7ab358e81)
2016-06-15 09:29:14 +01:00
Dylan Baker
8ed5204182 isl: Replace bash generator with python generator
This replaces the current bash generator with a python based generator
using mako. It's quite fast and works with both python 2.7 and python
3.5, and should work with 3.3+ and maybe even 3.2.

It produces an almost identical file except for a minor layout changes,
and the addition of a "generated file, do not edit" warning.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5a87bc7181)
2016-06-15 09:29:14 +01:00
Bas Nieuwenhuizen
28294573c7 radeonsi: Reinitialize all descriptors in CE preamble.
This fixes a problem with the CE preamble and restoring only stuff in the
preamble when needed.

To illustrate suppose we have two graphics IB's 1 and 2, which  are submitted in
that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we
have a context switch at the start of IB 1, but not between IB 1 and IB 2.

The old code put the CE RAM loads in the preamble of IB 2. As the preamble of
IB 1 does not have the loads and the preamble of IB 2 does not get executed, the
old values are not load into CE RAM.

Fix this by always restoring the entire CE RAM.

v2: - Just load all descriptor set buffers instead of load and store the entire
      CE RAM.
    - Leave the ce_ram_dirty tracking in place for the non-preamble case.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Note: This commit differs from the one in master - 54f755fa0f
("radeonsi: Reinitialize all descriptors in CE preamble.")
2016-06-15 09:29:13 +01:00
Emil Velikov
7bed792ebb cherry-ignore: drop the "i965 bring back INTEL_PRECISE_TRIG"
The commit that removes it isn't in branch, thus there's nothing to do
here.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-15 09:29:13 +01:00
Samuel Iglesias Gonsálvez
7d5cdb7675 i965: Defeat the register stride checker in pull uniform messages.
Pulling DF uniforms from pull constant buffer generates messages like:
    send(4)         g12<1>DF        g12<0,1,0>F
         sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1

which produces GPU hangs in Cherryview/Braswell:

    "For 64-bit Align1 operation or multiplication of dwords in CHV,
     source horizontal stride must be aligned to qword."

This seems to be documented in the Cherryview PRM, Volume 7, Page 843:

    "When source or destination datatype is 64b or operation is integer
     DWord multiply, regioning in Align1 must follow these rules:

     1. Source and Destination horizontal stride must be aligned to the
        same qword."

We should set the destination type to UD, D, or F so that
the register stride checker doesn't notice.  The destination type of
send messages is basically irrelevant anyway.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a0ed8503b7)
2016-06-15 09:29:13 +01:00
Kenneth Graunke
465be91421 i965: Defeat the register stride checker in URB reads.
Pulling DF inputs from the URB generates messages like:

   send(8)         g23<1>DF        g1<8,8,1>UD
                   urb 3 SIMD8 read mlen 1 rlen 2      { align1 1Q };

which makes the simulator angry:

   "For 64-bit Align1 operation or multiplication of dwords in CHV,
    source horizontal stride must be aligned to qword."

This seems to be documented in the Cherryview PRM, Volume 7, Page 823:

   "When source or destination datatype is 64b or operation is integer
    DWord multiply, regioning in Align1 must follow these rules:

    1. Source and Destination horizontal stride must be aligned to the
       same qword."

Setting the source horizontal stride to QWord is insane, as it's the
message header containing 8 URB handles in a single 32-bit DWord.
Instead, we should whack the destination type to UD, D, or F so that
the register stride checker doesn't notice.  The destination type of
send messages is basically irrelevant anyway.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit ed3ba651f6)
2016-06-15 09:29:13 +01:00
Kenneth Graunke
4a6fecdf69 i965: Fix issues with number of VS URB entries on Cherryview/Broxton.
Cherryview/Broxton annoyingly have a minimum number of VS URB entries
of 34, which is not a multiple of 8.  When the VS size is less than 9,
the number of VS entries has to be a multiple of 8.

Notably, BLORP programmed the minimum number of VS URB entries (34), with
a size of 1 (less than 9), which is invalid.

It seemed like this could be a problem in the regular URB code as well,
so I went ahead and updated that to be safe.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9f37df06da)
2016-06-15 09:29:13 +01:00
Timothy Arceri
883a1b3bd2 glsl: make sure UBO arrays are sized in ES
This check was removed in 5b2675093e add it back in.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
https://bugs.freedesktop.org/show_bug.cgi?id=96349
(cherry picked from commit b010fa8567)
2016-06-15 09:29:13 +01:00
Vedran Miletić
a71e0fd8cd clover: Update OpenCL version string to match OpenGL
Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For
both, always append git version suffix from git_sha1.h.

v5: move semicolon to same line as MESA_GIT_SHA1.
v4: drop #ifdef guards.
v3: add missing include.
v2: change CL_DEVICE_VERSION as well.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(cherry picked from commit 4825264f75)

Squashed with commit

clover: Include generated sources in AM_CPPFLAGS

git_sha1.c is generated in $(top_builddir)/src.

Fixes out-of-tree builds since 4825264f75.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit fafe026dbe)
2016-06-15 09:29:13 +01:00
Francisco Jerez
547b5d2daa i965/fs: Fix regs_written for SIMD-lowered instructions some more.
ISTR having suggested this during review of the recent FP64 changes to
the SIMD lowering pass, but it doesn't look like it was taken into
account in the end.  Using the fs_reg::component_size helper instead
of this open-coded variant makes sure that the stride is taken into
account correctly.  Fixes at least the following piglit tests with
spilling forced on (since otherwise regs_written would be calculated
incorrectly and the spilling code would be rather confused about how
much data needs to be spilled):

 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
 spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit bd9f972651)
2016-06-15 09:29:13 +01:00
Francisco Jerez
7154fa614b i965: Fix cross-primitive scratch corruption when changing the per-thread allocation.
I haven't found any mention of this in the hardware docs, but
experimentally what seems to be going on is that when the per-thread
scratch slot size is changed between two pipelined draw calls, shader
invocations using the old and new scratch size setting may end up
being executed in parallel, causing their scratch offset calculations
to be based in a different partitioning of the scratch space, which
can cause their thread-local scratch space to overlap leading to
cross-thread scratch corruption.

I've been experimenting with alternative workarounds, like emitting a
PIPE_CONTROL with DC flush and CS stall between draw (or dispatch
compute) calls using different per-thread scratch allocation settings,
or avoiding reuse of the scratch BO if the per-thread scratch
allocation doesn't exactly match the original.  Both seem to be as
effective as this workaround, but they have potential performance
implications, while this should be basically for free.

Fixes over 40 failures in our CI system with spilling forced on
(including CTS, dEQP and Piglit failures) on a number of different
platforms from Gen4 to Gen9.  The 'glsl-max-varyings' piglit test
seems to be able to reproduce this bug consistently in the vertex
shader on at least Gen4, Gen8 and Gen9 with spilling forced on.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a84b5d43e2)
2016-06-15 09:29:13 +01:00
Francisco Jerez
2cf78b4851 i965: Keep track of the per-thread scratch allocation in brw_stage_state.
This will be used to find out what per-thread slot size a previously
allocated scratch BO was used with in order to fix a hardware race
condition without introducing additional stalls or memory allocations.
Instead of calling brw_get_scratch_bo() manually from the various
codegen functions, call a new helper function that keeps track of the
per-thread scratch size and conditionally allocates a larger scratch
BO.

v2: Handle BO allocation manually instead of relying on
    brw_get_scratch_bo (Ken).

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d960284e44)
2016-06-15 09:29:12 +01:00
Francisco Jerez
b9f69df93d i965: Fix scratch overallocation if the original slot size was already a power of two.
The bitwise arithmetic trick used in brw_get_scratch_size() to clamp
the scratch allocation to 1KB has the unintended side effect that it
will cause us to allocate 2x the required amount of scratch space if
the original per-thread scratch size happened to be already a power of
two.  Instead use the obvious MAX2 idiom to clamp the scratch
allocation to the expected range.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 013ae4a70a)
2016-06-15 09:29:12 +01:00
Kenneth Graunke
eaa8561230 i965: Fix encode_slm_size() to take a generation, not a device info.
In the Vulkan driver, we have the generation number (a compile time
constant) but not necessarily the brw_device_info struct.  I meant
to rework the function to take a generation number instead of a
brw_device_info pointer to accomodate this.  But I forgot, and left
it taking a brw_device_info pointer, while making Vulkan pass the
generation number (8, 9, ...) directly.  This led to crashes.

Brown paper bag fix for commit 87d062a940.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5a0d294d38)
2016-06-15 09:29:12 +01:00
Kenneth Graunke
9edc2f1828 i965: Don't leak scratch BOs for TCS/TES.
These need to be freed too.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 667e5cec76)
2016-06-15 09:29:12 +01:00
Nanley Chery
5e41ac197f anv/pipeline: Don't dereference NULL dynamic state pointers
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.

This fixes a segfault seen in the McNopper demo, VKTS_Example09.

v3 (Jason Ekstrand):
   - Fix disabled rasterization check
   - Revert opaque detection of color attachment usage

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a4a5917248)
2016-06-15 09:29:12 +01:00
Nanley Chery
cdeb3e8eb4 anv: Document and rename anv_pipeline_init_dynamic_state()
To reduce confusion, clarify that the state being copied is not dynamic.

This agrees with the Vulkan spec's usage of the term. Various sections
specify that the various pipeline state which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0d84a9ef9)
2016-06-15 09:29:12 +01:00
Samuel Pitoiset
0b71ef5e46 nvc0/ir: clamp the UBO index for compute on Kepler
We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f257abc1b)
2016-06-15 09:29:12 +01:00
Jimmy Berry
c01ebdc83e st/va: hardlink driver instances to gallium_drv_video.so
Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.

Note: some versions of libva can detect the gallium name and use the
backend. Although that behaviour seems inconsistent since it only works
for some platforms/backends.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0c0f841e5d)
2016-06-15 09:29:12 +01:00
Emil Velikov
501e8421f8 swr: automake: add missing -I flag
When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.

Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fcb5a75a66)
2016-06-15 09:29:12 +01:00
Emil Velikov
3162e2f9fc automake: add SWR to `make distcheck' gallium drivers
Will allows us to catch missing files and build issues before getting
the tarball out for general consumption.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f4d26856df)
2016-06-15 09:29:11 +01:00
Emil Velikov
766f852616 configure.ac: strip out the llvm-config -march/mtune flags
Otherwise drivers such as SWR that depend on providing their own values
will fail to build.

v2: Add -mcpu for good measure (Chuck)

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit bab5ab6940)
2016-06-15 09:29:11 +01:00
Chuck Atkins
b499d1062d swr: Add missing headers for package inclusion
CC: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c86fcaca72)
2016-06-15 09:29:11 +01:00
Emil Velikov
939cd6edac automake: get in-tree `make distclean' working again.
With earlier commit we've handled the `make distclean' out of tree
build, yet we failed to attribute that for in-tree builds the test
condition will return 1. Thus effectively the target will be considered
as "failed".

Fixes: b7f7ec7843 ("mesa: automake: distclean git_sha1.h when building
OOT")
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

(cherry picked from commit 8229fe68b5)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
2b6817c91c i965: Use the correct number of threads for compute shaders.
We were programming the number of threads per subslice, when we should
have been programming the total number of threads on the GPU as a whole.

Thanks to Curro and Jordan for helping track this down!

On Skylake GT3e:
- Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
- Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
- Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.

On Broadwell GT2:
- Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
- Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
- Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
  0.255654% (n=25).

On Haswell GT3e:
- Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
  by roughly 1.10x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
- Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
  0.432771% (n=64).

On Ivybridge GT2:
- Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
  by roughly 1.03x.
- Improves performance in Synmark's G/43CSDof by roughly 1.25x.
- No change in Synmark's Gl43CSCloth (n=28).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0fb85ac08d)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
9a118c79e7 i965: Assert that the scratch spaces are in range.
I don't know that anything actually guarantees this, but if we exceed
the limits, we may end up overflowing and trashing random buffers that
happen to be nearby in the VMA space, leading to rendering corruption,
hangs, or worse.

We should really fix this properly.  However, the pitfall has existed
for ages, so for now we should at least detect it.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 1db37ebecf)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
be426c46ab i965: Fix CS scratch size calculations on Ivybridge and Baytrail.
These are linear, not powers of two, and much more limited.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a42a93dc12)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
02f381bb17 i965: Fix Haswell CS per-thread scratch space encoding.
Most scratch stages use power of two sizes, in kilobytes, where
0 means 1kB.  But compute shaders on Haswell have a minimum of 2kB,
and use a representation where 0 = 2kB.

This meant that we were effectively telling the hardware to allocate
each thread twice as much space as we meant to, while simultaneously
not allocating that much space in the buffer, leading to overflows.

Note that the existing code is completely wrong for Ivybridge,
but that will take additional work to sort out, so I've left it
as is for now.  A subsequent commit will take care of that.

Together with the previous patches, this fixes rendering corruption
on Synmark's Gl43CSDof on Haswell.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 147a90d82a)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
e84116f364 i965: Account for poor address calculations in Haswell CS scratch size.
Curro figured this out by investigating the simulator.  Apparently
there's also a workaround in the Windows driver.  I'm not sure it's
actually documented anywhere.

We were underallocating the scratch buffer by a factor of 128/70.

v2: Rename threads_per_subslice to scratch_ids_per_subslice
    (suggested by Jordan Justen).

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a7d029d3df)
2016-06-15 09:29:11 +01:00
Kenneth Graunke
6c5c1bc1b9 i965: Allocate scratch space for the maximum number of compute threads.
We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.

Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
threads, EUs, and threads are executing.  We need to allocate enough
for every possible thread that could run.

Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2213ffdb4b)
2016-06-15 09:29:10 +01:00
Kenneth Graunke
fdcc6a855b i965: Set subslice_total on Gen7/7.5 platforms.
We'll use this for compute shader thread counts and scratch space
calculations shortly.

Note that subslices are referred to as "half slices" on Ivybridge.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9cd8f95809)
2016-06-15 09:29:10 +01:00
Kenneth Graunke
c9477e0a80 i965: Fix shared local memory size for Gen9+.
Skylake changes the representation of shared local memory size:

 Size   | 0 kB | 1 kB | 2 kB | 4 kB | 8 kB | 16 kB | 32 kB | 64 kB |
 -------------------------------------------------------------------
 Gen7-8 |    0 | none | none |    1 |    2 |     4 |     8 |    16 |
 -------------------------------------------------------------------
 Gen9+  |    0 |    1 |    2 |    3 |    4 |     5 |     6 |     7 |

The old formula would substantially underallocate the amount of space.
This fixes GPU hangs on Skylake when running with full thread counts.

v2: Fix the Vulkan driver too, use a helper function, and fix the table
    in the comments and commit message.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 87d062a940)
2016-06-15 09:29:10 +01:00
Ilia Mirkin
8d9bf67bba mesa: add drawbuffer argument to ClearNamedFramebufferfi
This was fixed in revision 47 of the ARB_dsa spec in Oct 22, 2015. Since
it's horrible to have differing APIs across library versions, we should
attempt to minimize the impact by backporting it as far as possible and
hope no one notices.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d7e015381)
2016-06-15 09:29:10 +01:00
Ilia Mirkin
ca009cf8ba GL: update glcorearb.h to svn 32433
This brings in the fixed glClearNamedFramebufferfi definition, as well
as a lot of GLsizei -> GLsizeiptr changes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92351a71a8)
2016-06-15 09:29:10 +01:00
Ilia Mirkin
7487d5cbdc GL: update glext to svn 32957
This brings in defines from GL_EXT_window_rectangles and fixes the
glClearNamedFramebufferfi definition.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f81374fd3e)
2016-06-15 09:29:10 +01:00
Anuj Phogat
72dbdf6f89 gallium: Fix region overlap conditions for rectangles with a shared edge
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
 -----------
|           |
 ------- ---
|       |   |
|       |   |
 ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 466b320163)
2016-06-15 09:29:10 +01:00
Anuj Phogat
dd1943f904 mesa: Fix region overlap conditions for rectangles with a shared edge
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
 rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
 to the destination rectangle bounded by the locations (dstX0, dstY 0)
 and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
 while the upper bounds are exclusive."

So, the rectangles sharing just an edge shouldn't overlap.
     -----------
    |           |
     ------- ---
    |       |   |
    |       |   |
     ------- ---

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit f8679badd4)
2016-06-15 09:29:10 +01:00
Jason Ekstrand
6eb0240a32 anv/entrypoints: Rework #if guards
This reworks the #if guards a bit.  When Emil originally wrote them, he
just guarded everything.  However, part of what anv_entrypoints_gen.py
generates is a hash table for looking up entrypoints based on their name.
This table *cannot* get out of sync between C and python regardless of
preprocessor flags.  In order to prevent this, this commit makes us use
void pointers in the dispatch table for those entrypoints which aren't
available.  This means that the dispatch table size and entry order is
constant and it should never get out-of-sync with the python.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8d37556ec9)
2016-06-15 09:29:10 +01:00
Jason Ekstrand
eb0197ad53 anv/entrypoints: Use the function pointer types provided by vulkan.h
This is a bit cleaner than generating the types ourselves when making the
table.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9ed0d9dd06)
2016-06-15 09:29:09 +01:00
Jason Ekstrand
242ac96a24 anv/entrypoints: Emit #if guards for all platforms
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d1a53f91ee)
2016-06-15 09:28:54 +01:00
Nicolai Hähnle
c03b4444d1 st/mesa: use base level size as "guess" when available
When an applications specifies mip levels _before_ setting a mipmap texture
filter, we will initially guess a single texture level. When the second level
image is created, we try to allocate the full texture -- however, we get the
base level size guess wrong if that size is odd. This leads to yet another
re-allocation of the texture later during st_finalize_texture.

Even worse, this re-allocation breaks a (reasonable) assumption made by
st_generate_mipmaps, because the re-allocation in the finalization call will
again allocate a single-level pipe texture (based on the non-mipmap texture
filter!). As a result, mipmap generation fails in interesting ways.

All of this can be avoided by just using the fact that we already know the
size of the base level.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 42624ea837)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
ad684cee3a anv: Remove the PhysicalDeviceLimits FINISHME
At this point, the limits are probably more-or-less correct.  If there is
an invalid limit, that's a bug not a FINSHME.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1e69930e4)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
ea24c9be4a anv/pipeline_cache: Allow for an zero-sized cache
This gets ANV_ENABLE_PIPELINE_CACHE=false working again.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4f5bbf804b)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
86dbf1ef4b anv/pipeline: Store the (set, binding, index) tripple in the bind map
This way the the bind map (which we're caching) is mostly independent of
the pipeline layout.  The only coupling remaining is that we pull the array
size of a binding out of the layout.  However, that size is also specified
in the shader and should always match so it's not really coupled.  This
rendering issues in Dota 2.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1a25db699)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
b1f217b5a9 anv/descriptor_set: Ensure that bindings are always in increasing order
Since applications are allowed to specify some set of bindings which need
not be dense they also need not be in order.  For most things, this doesn't
matter, but it could result getting the wrong dynamic offsets. This adds a
quick-and-dirty sort to ensure that everything is always in increasing
order of binding index.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c13c5ac561)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
a0be8d3d08 anv/descriptor_set: Add a type field in debug builds
This allows for some extra validation and makes it easier to see what's
going on when poking around in gdb.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e2265926f2)
2016-06-14 15:48:40 +01:00
Jason Ekstrand
901c78786f anv/descriptor_set: Set array_size to zero for non-existant descriptors
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cd21015abd)
2016-06-14 15:48:40 +01:00
Leo Liu
986159437d vl/dri3: support receiving new pixmap for front buffer
With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, should get a new
front buffer for it.

This also fixes Totem player playback corruption.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2ad443e4cc)
2016-06-14 15:48:39 +01:00
Leo Liu
ab75b22029 vl/dri3: get Makefile properly
From original commit, the macro "if HAVE_DRI3" was in Makefile.sources,
this file is shared with SCons, SCons is not able to parse this marco,
the SCons build failed. Jose quickly gave two approaches and quick fix
with his second approach, thanks Jose for the solutions and fixes.

This patch is Jose's first approach, and it's more proper, because the
dri3 c file should not be included to build when DRI3 is not enabled.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ef8500aab)
2016-06-14 15:48:39 +01:00
Daniel Czarnowski
5cae2ac47e glx: fix crash with bad fbconfig
GLX documentation states:
	glXCreateNewContext can generate the following errors: (...)
	GLXBadFBConfig if config is not a valid GLXFBConfig

Function checks if the given config is a valid config and sets proper
error code.

Fixes currently crashing glx-fbconfig-bad Piglit test.

v2: coding style cleanups (Emil, Topi)
    use DefaultScreen macro (Emil)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf804b4455)
2016-06-14 15:48:39 +01:00
Jason Ekstrand
7d515b26bb i965: Emit surface states for extra planes prior to gen8
When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8
but not for gen7 or earlier.  It all works, we just need to emit the states
for the extra planes.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 037ce5d734)
2016-06-14 15:48:39 +01:00
Marc-André Lureau
0e554f54dc virgl: fix checking fences
When calling virgl_fence_wait() with timeout=0,
virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE
for a busy resource, whereace virgl_fence_wait() should return TRUE for
a completed (non-busy) resource.

This fixes running supertuxkart in a VM (I could not reproduce locally
with vtest though there is a similar fix)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dc81b3ad43)
2016-06-14 15:48:39 +01:00
Nicolai Hähnle
201f357c52 st/mesa: directly compute level=0 texture size in st_finalize_texture
The width0/height0/depth0 on stObj may not have been set at this point.
Observed in a trace that set up levels 2..9 of a 2d texture, and set the base
level to 2, with height 1. This made the guess logic always bail.

Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat
redundant storage of width0/height0/depth0 and makes sure we always compute
pipe texture sizes that are compatible with the base level image of the
GL texture.

Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul.

v2:
- try to re-use an existing pipe texture when possible
- handle a corner case where the base level is not level 0 and it is of
  size 1x1x1

v3:
- ptHeight = ptWidth in cube map 1x1 case (suggested by Brian)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit bd5c41fe5f)
2016-06-14 15:48:39 +01:00
Ilia Mirkin
bf3d6d9601 st/mesa: use buffer usage history to set dirty flags for revalidation
We were previously unconditionally doing this for arrays and ubo's, and
ignoring texture/storage/atomic buffers. Instead use the usage history
to determine which atoms need to be revalidated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6e6fd911da)
2016-06-14 15:48:39 +01:00
Marek Olšák
f51e99f704 gallium/radeon: don't allocate DCC for non-renderable texture formats
R9G9B9E5 is the only uncompressed one hopefully.

This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d4d733e39d)
2016-06-14 15:48:39 +01:00
Nicolai Hähnle
b2afa23a40 tgsi/scan: add uses_derivatives (v2)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d3a584defe)
2016-06-14 15:48:39 +01:00
Ilia Mirkin
6f38259419 st/mesa: revalidate image atoms when a texture is updated
A texture may be redefined with _NEW_TEXTURE, which might have been
bound to a shader image slot. We have to revalidate the image atoms to
pick up on the new resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c81b090c92)
2016-06-14 15:48:39 +01:00
Ilia Mirkin
bba2299735 gk104/ir: fix conditions for adding a texbar
Sometimes a register source can actually be double- or even quad-wide.
We must make sure that the inserted texbars take that width into
account.

Based on an earlier patch by Samuel Pitoiset.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 71ad8a173f)
2016-06-14 15:48:38 +01:00
Dave Airlie
49c53a2987 i965/gen8: fix cull distance emission for tessellation shaders.
This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c295923d13)
2016-06-14 15:48:38 +01:00
Samuel Pitoiset
4306e01ece nv50/ir: use round toward 0 when converting doubles to integers
Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.

This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08ddfe7b2f)
2016-06-14 15:48:38 +01:00
Dave Airlie
b9920d2bba mesa/program_resource: return -1 for index if no location.
The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."

This fixes:
GL45-CTS.program_interface_query.output-built-in

[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

(cherry picked from commit 07403014c3)
2016-06-14 15:48:38 +01:00
Nicolai Hähnle
9bf30be693 radeonsi: set descriptor dirty mask on shader buffer unbind
Found randomly while skimming the code. This might have caused VM faults in
robustness tests.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ec2b52e2d9)
2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez
05d33806cd i965/gs/scalar: Fix load input for doubles
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b648ec17c)
2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez
507d25f6f1 i965/fs: fix offset when loading double vector input varyings
When we are not packing a double input varying, we might need to
read its data in a non-aligned to 64-bit offset, so we read
the wrong data. This is happening when using explicit locations
in varyings because Mesa disables packing varying for that case.

const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value could be not aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d6f82a294)
2016-06-14 15:48:38 +01:00
Samuel Iglesias Gonsálvez
4daa331e25 i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings
Data starts at suboffet 3 in 32-bit units (12 bytes), so it is not
64-bit aligned and the current implementation fails to read the data
properly. Instead, when there is is a double input varying, read it as
vector of floats with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb30727648)
2016-06-14 15:48:38 +01:00
Dave Airlie
fc0a469e4c glsl: geom shader max_vertices layout must match.
From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs".
"all geometry shader output vertex count declarations in a
program must declare the same count."

Fixes:
GL45-CTS.geometry_shader.output.conflicted_output_vertices_max

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4c86399378)
2016-06-14 15:48:38 +01:00
Dave Airlie
0ce3dc9a30 i965: don't use NumLayers for 3D textures.
For 3D textures we shouldn't be using NumLayers, we need
to get it from the depth.

This fixes:
GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ff2e569153)
2016-06-14 15:48:37 +01:00
Dave Airlie
89bc5f9a90 glsl: for anonymous struct matching use without_array() (v3)
With tessellation shaders we can have cases where we have
arrays of anon structs, so make sure we match using without_array().

Fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in

v2:
test lengths match as well (Ilia)
v3:
descend array lengths to check for matches as well (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1f66a4b689)
2016-06-14 15:48:37 +01:00
Dave Airlie
09f48203c5 glsl/ast: don't crash when func_name is NULL
This fixes a crash in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

If we can't find the func_name in one of these paths,
we have emitted an earlier error so just return here.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6702c15810)
2016-06-14 15:48:37 +01:00
Dave Airlie
997bcc45ec glsl: handle ast_aggregate in has_sequence_subexpression. (v2)
GL43-CTS.compute_shader.work-group-size does
uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 };

The initializer triggers the GLSL 4.30/GLES3 tests
for constant sequence subexpressions, so it doesn't
happen unless you are using those, so just return
false as this path is now reachable.

v2: update commit msg with diagnosis
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4336196b7f)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
a0e36438a8 nv50,nvc0: fix BGR10_A2UI vertex format
This is mostly academic as this is not reachable from GL, which only has
the packed RGB10_A2UI vertex format.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 092ec3920f)
2016-06-14 15:48:37 +01:00
Samuel Pitoiset
954829ebbb nvc0: do not clear surfaces bins in the validate function
We should not call nouveau_bufctx_reset() inside a validate function.
This only affects Fermi where images are aliased between 3D and CP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit be365f34f0)
2016-06-14 15:48:37 +01:00
Samuel Pitoiset
f12a16ec99 nvc0: re-validate images after launching a grid on Fermi
Images invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a computer shader
to help in fixing 3D<->CP interaction.

However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.

This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 43d3ecfb33)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
ceb9ed0e38 nvc0: reduce overhead from always marking images dirty
We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fd6bbc2ee2)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
5a63ae9f15 nvc0: reduce overhead from always marking buffers dirty
We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0f673db6f0)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
a95560bac5 nvc0: fix memory barrier flag handling
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e8ee161b16)
2016-06-14 15:48:37 +01:00
Ilia Mirkin
1adbe2f45c nvc0: mark bound buffer range valid
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 29abbeecd8)
2016-06-14 15:48:37 +01:00
Marek Olšák
ccc9783a98 r600g: write WAIT_UNTIL in the correct place
This has been wrong all along. Fixing this will allow removing useless
cache flushes.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
(cherry picked from commit 7746903d3a)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
c632590996 anv/blit: Use CLAMP_TO_EDGE for scaled blits
When upscaling you can end up interpolating between the edge pixel and one
past the edge.  Using CLAMP_TO_EDGE seems like the most reasonable thing to
do in this case.  This fixes two of the new Vulkan CTS tests in
dEQP-VK.api.copy_and_blit.blit_image.*

Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>

(cherry picked from commit 441194edd9)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
2830ae638c anv/copy: Account for the anv_surface.offset when creating a blit2d_surf
This was causing problems if the user tried to copy to/from the stencil
portion of a combined depth/stencil image.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9313a56816)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
5ca18b6a4b nir/spirv: Make a decoration switch complete
Getting rid of the default case makes the compiler warn if we are missing
cases.  While we're here, we also add the one missing case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 526a8de22d)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
bb4ff53a71 nir/spirv: Make unhandled decorations and capabilities non-fatal
glslang frequently throw bogus decorations into shaders.  While we are free
to assert-fail, it's a bit nicer to the application to just warn.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 62c6e94bd6)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
d5dc87a1ef nir/spirv: Add a way to print non-fatal warnings
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ed14d21d04)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
85579221a4 nir/spirv: Add string lookup tables for a couple of SPIR-V enums
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2e46a5d155)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
6eb39fa255 nir/spirv: Complete the list of capabilities
Previously we supported a subset of capabilities and just left a default
case for the others.  It's time to stop being lazy and actually audit the
capabilities.  This should bring them up-to-date with reality.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5a1e56f344)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
13999dc70d anv/pipeline: Add support for early depth stencil
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9fa958e95b)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
a3fce26907 i965/fs Add a wm_prog_data bit for has_side_effects
This is more accurate than calling
_mesa_active_fragment_shader_has_side_effects because it looks at whether
or not the SSBOs, images, or atomic buffers are actually written rather
than just existing in the program.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3fb289f957)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
7ddbe91435 anv/pipeline: Silently pass tests if depth or stencil is missing
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 56a178922f)
2016-06-14 15:48:36 +01:00
Jason Ekstrand
c6ca6f0728 anv/pipeline: Unify gen7/8 emit_ds_state
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bc7f7e1953)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
300737042c genxml/gen6,7,75: s/BackFace/Backface
This is more consistent with gen8+

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fdc3c5dd05)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
09b2be7a51 nir/spirv: Handle the WorkgroupSize builtin decoration
This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests
in the latest dev version of the Vulkan CTS.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1f7b54ed29)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
6efd37f30d nir/spirv: Use breaks instead of returns in constant handling
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b26cdd65e8)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
af2a278dfe anv/pipeline: Refactor specialization constant handling a bit
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a19ae36ce5)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
07f5e621cf nir/lower_indirect_derefs: Use the direct array deref for recursion
This fixes about 100 of the new Vulkan CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 45542f554c)
2016-06-14 15:48:35 +01:00
Jason Ekstrand
d0dddbf4ee anv/clear: Handle ClearImage on 3-D images
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 59f06ac389)
2016-06-14 15:48:35 +01:00
Francisco Jerez
cbb02ebd74 Revert "i965/fs: Allow scalar source regions on SNB math instructions."
This reverts commit c1107cec44.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB, the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode.  Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):

   es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
   es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
   es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_float_frag_xvary
   es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
   es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
   es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7244dc1e06)
2016-06-14 15:48:35 +01:00
Francisco Jerez
d5528d5f55 i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).
The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction.  This prevents
cmod propagation from (mis)optimizing:

 cmp.z.f0 tmp, ...
 mov.z.f0 null, tmp

into:

 cmp.z.f0 tmp, ...

which gives the negation of the flag result of the original sequence.
I originally noticed this while working on SIMD32 in the scalar
back-end, but the same scenario is likely to be possible in vec4
programs so this commit ports the bugfix with the same name from the
scalar back-end to the vec4 cmod propagation pass.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a2135c6fd9)
2016-06-14 15:48:35 +01:00
Emil Velikov
0e540b4a15 anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS
Otherwise we will fail to find the headers in some scenarios.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
(cherry picked from commit 7a3a0d9212)
2016-06-14 15:48:35 +01:00
Dave Airlie
911eddd37b mesa/get: return correct value for layer provoking vertex.
This fixes:
GL45-CTS.geometry_shader.layered_rendering.layered_rendering

on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d10ae20b96)
2016-06-10 12:36:36 +01:00
Samuel Pitoiset
2185edf699 nvc0: mark buffer texture range valid for shader images
Loosely based on radeonsi (Thanks to Nicolai).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 28590eb949)
2016-06-10 12:35:15 +01:00
Francisco Jerez
09f0e97d1c i965/fs: Reindent emit_zip().
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 060c8d245d)
2016-06-10 12:34:19 +01:00
Francisco Jerez
2db670cf3e i965/fs: Skip SIMD lowering destination zipping if possible.
Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out safely without breaking the program are rather
complex: The destination must be exactly one component of at most the
execution width of the lowered instruction, and all source regions of
the instruction must be either fully disjoint from the destination or
be aligned with it group by group.

v2: Don't handle partial source-destination overlap for simplicity
    (Jason).  No instruction count regressions with respect to v1 in
    either shader-db or the few FP64 shader_runner test-cases with
    partial overlap I've checked manually.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7aa76d66a1)
2016-06-10 12:33:14 +01:00
Anuj Phogat
5000556d5d blorp: Fix 16x multisample scaled blits
Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled
(with added 16x sample support) now passes with this patch.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 75da9c9933)
2016-06-10 12:32:17 +01:00
Dave Airlie
ab525a637a mesa/copyimage: report INVALID_VALUE for missing cube face
The specs says INVALID_VALUE for exceeding dimensions,
which is really what is happening here.

This fixes:
GL45-CTS.copy_image.non_existent_mipmap

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit af7bf610cf)
2016-06-10 12:31:22 +01:00
Dave Airlie
d130c53ac1 mesa/copyimage: fix num samples check to handle renderbuffers.
This test was only happening for textures, but there is
nothing in the spec to say this, so test it for all cases.

This fixes:
GL45-CTS.copy_image.invalid_target

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c0856eacf1)
2016-06-10 12:30:23 +01:00
Nanley Chery
669836e1be mesa/extensions: Fix ES1 extension reporting
Commit eda15abd84 , unintentionally
advertised these extensions in ES1 contexts. Undo this error.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c06cef7f9b)
2016-06-10 12:26:10 +01:00
Eric Engestrom
1dce03e4c1 st/osmesa: remove double-write (overwriting)
These two lines have been here since the file was created.
I'm guessing the second one was just for testing during dev, so it's the
one that's going away.

CoverityID: 1296205

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 17f4c723eb)
2016-06-10 12:08:02 +01:00
Emil Velikov
a7649abe9f Update version to 12.0.0-rc2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:35:59 +01:00
Emil Velikov
bcfda0a1fe mesa: automake: distclean git_sha1.h when building OOT
In the case of out-of-tree (OOT) builds, in particular when building
from tarball, we'll end up with the file in both srcdir and builddir.

We want the former to remain intact (since we need it on rebuild) while
the latter should be removed otherwise `make distclean' gets angry at
us.

Ideally there'll be a solution that feels a bit less of a hack. Until
then this does the job exactly as expected.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b7f7ec7843)
2016-06-07 12:35:53 +01:00
Emil Velikov
998e503592 mesa: automake: ensure that git_sha1.h.tmp has the right attributes
... when copied from git_sha1.h.

As the latter file can we lacking the write attribute, one should set it
explicitly. Otherwise we'll get a warning/failure at cleanup stage.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2c424e00c3)
2016-06-07 12:35:50 +01:00
Emil Velikov
5e3e292502 mesa: automake: add directory prefix for git_sha1.h
Otherwise the build will assume that we've talking about builddir, which
is not the case in the else statement.

Here the file is already generated and is part of the tarball.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 359d9dfec3)
2016-06-07 12:35:46 +01:00
Emil Velikov
3be5c6a9ec egl: android: don't add the image loader extension for !render_node
With earlier commit we introduced support for render_node devices, which
was couples with the use of the image loader extension.

As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.

That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.

Fixes: 34ddef39ce ("egl: android: add dma-buf fd support")

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1816c837c1)
2016-06-07 12:35:40 +01:00
Emil Velikov
a26ca04fe3 anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards
The generated sources should follow the example set by the vulkan
headers and our non-generated code. Namely: the code for all supported
platforms should be available, each one guarded by its respective
VK_USE_PLATFORM_*_KHR macro.

v2: Reword commit message.

Cc: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)
(cherry picked from commit b8e1f59d62)
2016-06-03 01:44:56 +01:00
Mauro Rossi
1a5d6a232f isl: add support for Android libmesa_isl static library
isl library is needed to build i965, libmesa_isl static library is added
to fix related Android building errors.

Any attempt to build libmesa_genxml as phony package module failed to deliver
gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9}

Due to constraints in Android Build System, libmesa_genxml is built as static,
at least one source is needed, so dummy.c is autogenerated for this scope,
libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES,
to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 278c2212ac)
2016-06-02 22:35:29 +01:00
Mauro Rossi
702a1121c9 android: libmesa_glsl: add a dependency on libmesa_nir static
Fixes the following building error:

target  C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp
In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0,
                 from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28:
external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory
compilation terminated.
build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed
make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1
make: *** Waiting for unfinished jobs....

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4143245c23)
2016-06-02 22:35:29 +01:00
Emil Velikov
9a21315ea9 isl: automake: don't include isl_format_layout.c in two lists.
Including the file in both ISL_FILES and ISL_GENERATED_FILES makes
the actual dependency list less obvious.

v2: Drop unrelated vulkan hunk (Jason).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit af1a0ae8ce)
2016-06-02 22:35:29 +01:00
Emil Velikov
94630ce0c7 automake: bring back the .PHONY git_sha1.h.tmp rule
With earlier commit 3689ef32af ("automake: rework the git_sha1.h rule,
include in tarball") we/I erroneously removed the PHONY rule and the
temporary file.

The former is used to ensure that the header is regenerated when on each
make invocation, while the latter helps us avoid the unneeded rebuild(s)
when the SHA1 hasn't changed.

Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit af2637aa32)
2016-06-02 22:35:29 +01:00
Christian König
6ad61d90ea radeon/uvd: fix the H264 level for Tonga v2
We support 5.2 for a while now.

v2: we even support 5.2 for H264, 5.1 is for HEVC.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3e75c3997)
2016-06-02 14:04:14 +01:00
Jordan Justen
a136b8bfe2 i965: Remove old CS local ID handling
The old method pushed data for each channels uvec3 data of
gl_LocalInvocationID.

The new method pushes 1 dword of data that is a 'thread local ID'
value. Based on that value, we can generate gl_LocalInvocationIndex
and gl_LocalInvocationID with some calculations.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0a3acff5b5)
2016-06-02 14:02:05 +01:00
Jordan Justen
52ba7abe1e i965: Enable cross-thread constants and compact local IDs for hsw+
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

One complication is that cross-thread constants are loaded into
registers before per-thread constants. Previously, our local IDs were
loaded before the uniform data and treated as 'payload' data, even
though they were actually pushed into the registers like the other
uniform data.

Therefore, in this patch we simultaneously enable a newer layout where
each thread now uses a single uniform slot for a unique local ID for
the thread. This uniform is handled specially to make sure it is added
last into the uniform push constant registers. This minimizes our
usage of push constant registers, and maximizes our ability to use
cross-thread constants for registers.

To swap from the old to the new layout, we also need to flip some
lowering pass switches to let our driver handle the lowering instead.
We also no longer force thread_local_id_index to -1.

v4:
 * Minimize size of patch that switches from the old local ID layout
   to the new layout (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b1f22c6317)
2016-06-02 14:01:31 +01:00
Jordan Justen
28ecf2b90e anv: Support new local ID generation & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ba9594f32)
2016-06-02 14:01:04 +01:00
Jordan Justen
ead833a395 i965: Support new local ID push constant & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 30685392e0)
2016-06-02 13:59:04 +01:00
Jordan Justen
ee77c4a099 i965: Add CS push constant info to brw_cs_prog_data
We need information about push constants in a few places for the GL
driver, and another couple places for the vulkan driver.

When we add support for uploading both a common (cross-thread) set of
push constants, combined with the previous per-thread push constant
data, things are going to get even more complicated. To simplify
things, we add push constant info into the cs prog_data struct.

The cross-thread constant support is added as of Haswell. To support
it we need to make sure all push constants with uniform values are
added to earlier registers. The register that varies per thread and
holds the thread invocation's unique local ID needs to be added last.

For now we add the code that would calculate cross-thread constatn
information for hsw+, but we force it (cross_thread_supported) off
until the other parts of the driver support it.

v4:
 * Support older local ID push constant layout as well. (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d437798ace)
2016-06-02 13:56:54 +01:00
Jordan Justen
a94be40ecc i965: Store number of threads in brw_cs_prog_data
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1b79e7ebbd)
2016-06-02 13:54:44 +01:00
Jordan Justen
632d7ef148 i965: Add nir based intrinsic lowering and thread ID uniform
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver specific nir lower code.

We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.

We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex.

v2:
 * Create variable during lowering pass. (Ken)

v3:
 * Don't create a variable, but instead just insert an intrisic call
   to load a uniform from the allocated location. (Jason)

v4:
 * Don't run this pass if thread_local_id_index < 0

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ef0957dac)
2016-06-02 13:53:58 +01:00
Jordan Justen
5513300f59 i965: Put CS local thread ID uniform in last push register
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.

It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.

The reason this is important is that the cross-thread push constant
registers are loaded first, and the per-thread push constant registers
are loaded after that. (Broadwell adds another push constant upload
mechanism which reverses this order, but we are ignoring this for
now.)

v2:
 * Add variable in intrinsics lowering pass
 * Make sure the ID is pushed last in assign_constant_locations, and
   that we save a spot for the ID in the push constants

v3:
 * Simplify code based with Jason's suggestions.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 04fc72501a)
2016-06-02 13:53:24 +01:00
Jordan Justen
33d0016836 i965: Add uniform for a CS thread local base ID
v4:
 * Force thread_local_id_index to -1 for now, and have
   fs_visitor::setup_cs_payload look at thread_local_id_index. This
   enables us to more easily cut over from the old local ID layout to
   the new layout, as suggested by Jason.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit fa279dfbf0)
2016-06-02 13:51:11 +01:00
Jordan Justen
169b700dfd i965: Add nir channel_num system value
v2:
 * simd16/32 fixes (curro)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8f48d23e0f)
2016-06-02 13:48:20 +01:00
Jordan Justen
33e985f8b9 nir: Make lowering gl_LocalInvocationIndex optional
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6f316c9d86)
2016-06-02 13:45:29 +01:00
Jordan Justen
c9de6190a0 glsl: Add glsl LowerCsDerivedVariables option
v2:
 * Move lower flag to context constants. (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7b9def3583)
2016-06-02 13:38:06 +01:00
Jason Ekstrand
05d88165d9 i965/fs: Copy the offset when lowering logical pull constant sends
This fixes 64 Vulkan CTS tests per gen

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1205999c22)
2016-06-02 13:37:30 +01:00
Dave Airlie
d1cf18497a glsl/distance: make sure we use clip dist varying slot for lowered var.
When lowering, we always want to use the clip dist varying.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8d4f4adfbd)
2016-06-02 13:36:45 +01:00
Kenneth Graunke
5a44d36b46 i965: Fix isoline reads in scalar TES.
Isolines aren't reversed.  commit 5b2d8c2273 fixed this for the vec4
TES backend, but not the scalar one.

Found while debugging GL45-CTS.tessellation_shader.
tessellation_control_to_tessellation_evaluation.gl_tessLevel.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 25e1b8d366)
2016-06-02 13:36:14 +01:00
Ian Romanick
0e54eebeed glsl: Use Geom.VerticesOut == -1 to specify unset
Because apparently layout(max_vertices=0) is a thing.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a428c955ce)
2016-06-02 13:35:18 +01:00
Ian Romanick
0ab1a3957a i965: If control_data_header_size_bits is zero, don't do EndPrimitive
This can occur when max_vertices=0 is explicitly specified.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b27dfa5403)
2016-06-02 13:34:43 +01:00
Ian Romanick
1398a9510f mesa: Fix bogus strncmp
The string "[0]\0" is the same as "[0]" as far as the C string datatype
is concerned.  That string has length 3.  strncmp(s, length_3_string, 4)
is the same as strcmp(s, length_3_string), so make it be strcmp.

v2: Not the same as strncmp(..., 3).  Noticed by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 049bb94d2e)
2016-06-02 13:33:53 +01:00
Ilia Mirkin
b265796c79 nir: allow sat on all float destination types
With the introduction of fp64 and fp16 to nir, there are now a bunch of
float types running around. A F1 2015 shader ends up with an i2f.sat
operation, which has a nir_type_float32 destination. Allow sat on all
the float destination types.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ca135a2612)
2016-06-02 13:32:52 +01:00
Alex Deucher
4a00da1662 radeonsi: fix the raster config setup for 1 RB iceland chips
I didn't realize there were 1 and 2 RB variants when this code
was originally added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd85e4a041)
2016-06-02 13:32:05 +01:00
Dave Airlie
e817522728 mesa/sampler: fix error codes for sampler parameters.
The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it,
however version 8 of it fixed this, and the GL specs also have
the fixed value in them.

Fixes:
GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6400144041)
2016-06-02 13:31:18 +01:00
Dave Airlie
915cc490d7 glsl: define some GLES3 constants in GLSL 4.1
The GLSL 4.1 spec adds:
gl_MaxVertexUniformVectors
gl_MaxFragmentUniformVectors
gl_MaxVaryingVectors

This fixes:
GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ebf4257a3)
2016-06-02 13:30:24 +01:00
Topi Pohjolainen
683c6940d8 i965: Add norbc debug option
This INTEL_DEBUG option disables lossless compression (also known
as render buffer compression).

v2: (Matt) Use likely(!lossless_compression_disabled) instead of
           !likely(lossless_compression_disabled)
    (Grazvydas) Update docs/envvars.html

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6ca118d2f4)
2016-06-02 13:28:22 +01:00
Topi Pohjolainen
2d483256d5 i965/gen9: Configure rbc buffers as plain for non-rbc tex views
Fixes rendering in Shadow of Mordor with rbc. Application writes
RGBA_UNORM texture filling it with values the application wants to
later on treat as SRGB_ALPHA.
Intel driver enables lossless compression for the buffer by the time
of writing. However, the driver fails to make sure the buffer can be
sampled as something else later on and unfortunately there is
restriction in the hardware for using lossless compression for srgb
formats which looks to extend itself to the sampling engine also.
Requesting srgb to linear conversion on top of compressed buffer
results the color values to be pretty much garbage.

Fortunately none of tracked benchmarks showed a regression with
this.

v2 (Matt): Add missing space

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 30e9e6bd07)
2016-06-02 13:27:53 +01:00
Kenneth Graunke
8c627af1f0 i965: Fix the passthrough TCS for isolines.
We weren't setting up several of the uniform values for the patch
header, so we'd crash when uploading push constants.  We at least
need to initialize them to zero.  We also had the isoline parameters
reversed, so it would also render incorrectly (if it didn't crash).

Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in
GL44-CTS.tessellation_shader.single.max_patch_vertices.

(*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a3dc99f3d4)
2016-06-02 13:27:23 +01:00
Dave Airlie
86e367a572 i965/xfb: skip components in correct buffer.
The driver was adding the skip components but always for buffer 0.

This fixes:
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ebb81cd683)
2016-06-02 13:26:50 +01:00
Dave Airlie
64015c03bb glsl/linker: fix multiple streams transform feedback.
e2791b38b4
mesa/program_interface_query: fix transform feedback varyings.

caused a regression in
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams
on radeonsi.

The problem was it was using the skip components varying to set
the stream id, when it should wait until a varying was written,
this just adds the varying checks in the right place.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1fe7bbb911)
2016-06-02 13:25:59 +01:00
Dave Airlie
99fcfd985e mesa/bufferobj: use mapping range in BufferSubData.
According to GL4.5 spec:
An INVALID_OPERATION error is generated if any part of the speci-
fied buffer range is mapped with MapBufferRange or MapBuffer (see sec-
tion 6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the Map-
BufferRange access flags.

So we should use the if range is mapped path.

This fixes:
GL45-CTS.buffer_storage.map_persistent_buffer_sub_data

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e891f7cf55)
2016-06-02 13:25:08 +01:00
Ilia Mirkin
7bc29c784a nv50/ir: fix error finding free element in bitset in some situations
This really only hits for bitsets with a size of a multiple of 32. We
can end up with pos = -1 as a result of the ffs, which we in turn decide
is a valid position (since we fall through the loop and i == 1, we end
up adding 32 to it, so end up returning 31 again).

Up until recently this was largely unreachable, as the register file
sizes were all 63 or 255. However with the advent of compute shaders
which can restrict the number of registers, this can now happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18d11c9989)
2016-06-02 13:24:08 +01:00
Timothy Arceri
b2b7f05da6 Revert "glsl: fix xfb_offset unsized array validation"
This reverts commit aac90ba292.

The commit caused a regression in:
piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom

Also the CTS test it was meant to fix seems like it may be bogus.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 98d40b4d11)
2016-06-02 13:21:36 +01:00
Francisco Jerez
eb56a2f250 i965/fs: Allow scalar source regions on SNB math instructions.
I haven't found any evidence that this isn't supported by the
hardware, in fact according to the SNB hardware spec:

 "The supported regioning modes for math instructions are align16,
  align1 with the following restrictions:
   - Scalar source is supported.
  [...]
   - Source and destination offset must be the same, except the case of
     scalar source."

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c1107cec44)
2016-06-02 13:20:45 +01:00
Francisco Jerez
c1269825cf i965/fs: Fix constant combining for instructions that cannot accept source mods.
This is the case for SNB math instructions so we need to be careful
and insert the literal value of the immediate into the table (rather
than its absolute value) if the instruction is unable to invert the
sign of the constant on the fly.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 06d8765bc0)
2016-06-02 13:20:15 +01:00
Francisco Jerez
f651a4bb2e i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 303ec22ed6)
2016-06-02 13:19:41 +01:00
Francisco Jerez
44029d4237 i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes.
Which requires using a bitset instead of a boolean flag to keep track
of the GRFs we've seen a generating instruction for already.  The
search loop continues until all instructions initializing the value of
the source VGRF have been found, or it is determined that coalescing
is not possible.

Fixes a few piglit test cases on Gen4-6 which were regressed by
6956015aa5 due to the different (yet
perfectly valid) ordering in which copy instructions are emitted now
by the simd lowering pass, which had the side effect of causing this
optimization pass to start corrupting the program in cases where a
VGRF-to-MRF copy instruction would be eliminated but only the last
instruction writing to the source VGRF region would be rewritten to
point to the target MRF.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4fe4f6e8a7)
2016-06-02 13:19:07 +01:00
Francisco Jerez
910fa7a824 i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation.
This will be required to correctly transform the destination of 8-wide
instructions that write a single GRF of a VGRF to MRF copy marked
COMPR4.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1898673f58)
2016-06-02 13:18:33 +01:00
Francisco Jerez
3b78304025 i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.
This will allow compute_to_mrf to handle cases where the source of the
VGRF-to-MRF copy is initialized by more than one instruction.  In such
cases we cannot rewrite the destination of any of the generating
instructions until it's known whether the whole VGRF source region can
be coalesced into the destination MRF, which will imply continuing the
search until all generating instructions have been found or it has
been determined that the VGRF and MRF registers cannot be coalesced.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 485fbaff03)
2016-06-02 13:18:00 +01:00
Francisco Jerez
dd96daa55e i965/fs: Fix compute-to-mrf VGRF region coverage condition.
Compute-to-mrf was checking whether the destination of scan_inst is
more than one component (making assumptions about the instruction data
type) in order to find out whether the result is being fully copied
into the MRF destination, which is rather inaccurate in cases where a
single-component instruction is only partially contained in the source
region, or when the execution size of the copy and scan_inst
instructions differ.  Instead check whether the destination region of
the instruction is really contained within the bounds of the source
region of the copy.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4b0ec9f475)
2016-06-02 13:17:26 +01:00
Francisco Jerez
a6011c6fc6 i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().
Compute-to-mrf was being rather heavy-handed about checking whether
instruction source or destination regions interfere with the copy
instruction, which could conceivably lead to program miscompilation.
Fix it by using regions_overlap() instead of the open-coded and
dubiously correct overlap checks.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bb61e24787)
2016-06-02 13:16:52 +01:00
Francisco Jerez
2d83aad693 i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 88f380a2dd)
2016-06-02 13:16:04 +01:00
Dylan Baker
665f57c513 Don't use python 3
Now there are not files that require python 3, so for now just remove
the python 3 dependency and use python 2. I think the right plan is to
just get all of the python ready for python 3, and then use whatever
python is available.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 604010a7ed)
2016-06-02 13:15:38 +01:00
Dylan Baker
7e62585ee8 genxml: change chbang to python 2
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ab31817fed)
2016-06-02 13:15:08 +01:00
Dylan Baker
4dd70617a1 genxml: use the isalpha method rather than str.isalpha.
This fixes gen_pack_header to work on python 2, where name[0] is unicode
not str.

Signed-off-by: Dylan Bake <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 12c1a01c72)
2016-06-02 13:14:38 +01:00
Dylan Baker
9ed6965749 genxml: require future imports for python2 compatibility.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a45a25418b)
2016-06-02 13:14:08 +01:00
Dylan Baker
aed6230269 genxml: mark re strings as raw
This is a correctness issue.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e5681e4d70)
2016-06-02 13:13:39 +01:00
Dylan Baker
f73a68ec37 genxml: Make classes descendants of object
This is the default in python3, but in python2 you get old style
classes. No one likes old-style classes.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit de2e9da2e9)
2016-06-02 13:13:09 +01:00
Dylan Baker
0c12887764 genxml: mark gen_pack_header.py as encoded in utf-8
There is unicode in this file, and I'm actually surprised that the
python interpreter hasn't gotten grumpy.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9f50e3572c)
2016-06-02 13:12:35 +01:00
Marek Olšák
145705e49c mesa: fix crash in driver_RenderTexture_is_safe
This just fixed the crash with the apitrace in bug report.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 8a10192b4b)
2016-06-02 13:11:43 +01:00
Dave Airlie
d3c92267e0 glsl/images: bounds check image unit assignment
The CTS test:
GL45-CTS.multi_bind.dispatch_bind_image_textures
binds 192 image uniforms, we reject this later,
but not until after we trash the contents of the
struct gl_shader.

Error now reads:
Too many compute shader image uniforms (192 > 16)
instead of
Too many compute shader image uniforms (2745344416 > 16)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f87352d769)
2016-06-02 13:10:50 +01:00
Ilia Mirkin
36e26f2ee2 nvc0/ir: fix spilling predicates to registers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4b1a167a2b)
2016-06-02 12:54:52 +01:00
Emil Velikov
9a56e7d25b Update version to 12.0.0-rc1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 19:20:34 +01:00
Emil Velikov
7ad2cb6f08 docs: rename release notes to 12.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 19:20:34 +01:00
Emil Velikov
a43a368457 nir: add the SConscript.nir to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 922b471777)
2016-05-30 19:20:33 +01:00
Rhys Kidd
f25fdf21e7 vc4: Fix doxygen warnings
Now that vc4 automated code documentation can be generated with
doxygen, fix the warnings issued by Doxygen 1.8.11.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:45 +01:00
Rhys Kidd
db975fa86c doxygen: Plumb through gallium/ to automated documentation
Add Gallium and the Gallium-based drivers to doxygen's automated
code documentation infrastructure.

Can be individually created with:

  cd $MESA_TOP_LEVEL/
  make -C doxygen/ gallium.tag

Benefits from the existing doxygen Makefile runners to clean up
afterwards with 'make clean'.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:45 +01:00
Emil Velikov
26f4638684 Revert "osmesa: don't try to bundle osmesa.def SConscript"
This reverts commit c07df0f201.

Now that the SCons build is back we need to include the files in the
tarball.
2016-05-30 17:53:45 +01:00
Andreas Fänger
9601815b4b scons: build osmesa swrast and gallium
This patch makes it possible to build classic osmesa/swrast on windows
again. It was removed in commit 69db422218.
Although there is a gallium version of osmesa now, the swrast version
still has more features lacking in llvmpipe, e.g. anisotropic filtering.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
[Emil Velikov: remove trailing whitespace]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:45 +01:00
Emil Velikov
3689ef32af automake: rework the git_sha1.h rule, include in tarball
As we'll need the file in the release tarball, rework the rule so that
the file is regenerated _only_ if we're in a git repository.

With this in place we can build vulkan (anv) from a release tarball.

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:45 +01:00
Emil Velikov
4cd9cd6abc automake: move the git_sha1.h rule a level up
This way we can reuse the header from other places like -
src/intel/vulkan and src/gallium. Only the former is hooked up atm.

Make sure .gitignore is updated, as well as all the users (the mesa
code does not need any changes).

Also ensure that the file is always created by adding it to the
BUILT_SOURCES target.

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:45 +01:00
Emil Velikov
13faddb6b8 mesa_glinterop: remove mesa_glinterop typedefs
As is there are two places that do the typedefs - dri_interface.h and
this header. As we cannot include the former in here, just drop the
typedefs and use the struct directly (as needed).

This is required because typedef redefinition is C11 feature which is
not supported on all the versions of GCC used to build mesa.

v2: Kill the typedef alltogether, as per Marek.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96236
Cc: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-30 17:53:44 +01:00
Emil Velikov
d43c894471 glx/glvnd: automake: include all the sources in libglx_la_SOURCES
Otherwise the headers will be missing from the release tarball.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:44 +01:00
Emil Velikov
f9db61d095 glx/glvnd: remove the final if defined($extension) guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:44 +01:00
Emil Velikov
3bf00b6c6a glx/glvnd: rework dispatch functions/indices tables lookup
Rather than checking if the function name maps to a valid entry in the
respective table, just create a dummy entry at the end of each table.

This allows us to remove some unnessesary "index >= 0" checks, which get
executed quite often.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:44 +01:00
Emil Velikov
eab7e54981 glx/glvnd: Use strcmp() based binary search in FindGLXFunction()
It will allows us to find the function within 6 attempts, out of the ~80
entry long table.

v2: calculate middle on each iteration, correctly set the lower limit.

Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:44 +01:00
Chuck Atkins
f9a35bf012 configure.ac: correct the xlib/xlib-gallium GLX detection for GLVND
Things have changed since commit a92910a ("glx: Refactor the configure
options for glx implementation choice (v3)") where only a single
configure option is used to control the GLX provider.

[Emil Velikov: Ensure that the check is moved after the detection code.]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 17:53:34 +01:00
Kyle Brenneman
22a9e00aab glx: Implement the libglvnd interface.
With reference to the libglvnd branch:

https://cgit.freedesktop.org/mesa/mesa/log/?h=libglvnd

This is a squashed commit containing all of Kyle's commits, all but two
of Emil's commits (to follow), and a small fixup from myself to mark the
rest of the glX* functions as _GLX_PUBLIC so they are not exported when
building for libglvnd. I (ajax) squashed them together both for ease of
review, and because most of the changes are un-useful intermediate
states representing the evolution of glvnd's internal API.

Co-author: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
2016-05-30 16:29:49 +01:00
Frederic Devernay
cee459d84d gallivm: initialize init_native_targets_once_flag correctly
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-05-30 16:13:52 +02:00
Ilia Mirkin
8cc80e396e nvc0/ir: fix emission of predicate spill to register
The lane mask only applies to real mov's, while here we're using PSET.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-30 10:07:01 -04:00
Ilia Mirkin
9444d71611 nvc0: fix some compute texture validation bits on kepler
(a) Make sure to update the TIC in case of an updated buffer address
(b) Mark newly-inactive textures dirty so that we update the handle in
set_tex_handles.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-30 10:07:01 -04:00
Dave Airlie
bac39dddcf mesa/xfb: report calculated size for XFB buffer objects.
This fixes:
GL45-CTS.direct_state_access.xfb_buffers

This test looks correct to me, we should work out the
size value and report it rather than using only the size
from the Range interface.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-30 21:18:54 +10:00
Emil Velikov
e7bd5b4b77 swr: automake: silence the python invocation
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:31:08 +01:00
Emil Velikov
04987ef229 swr: automake: attempt to fix the out-of-tree build
Make sure that the output folder is created otherwise the python scripts
yells at us.

Cc: 0xe2.0x9a.0x9b@gmail.com
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96238
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:31:07 +01:00
Emil Velikov
3a59a624d0 swr: remove LLVM dependency from source generation rules.
The dependencies should not mention any files external to the project.
If we want to do sanity checks for the LLVM installed on the system we
should do that in configure, yet again where is the merit which header
gets checked and which doesn't ?

Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:31:07 +01:00
Emil Velikov
b05b782b43 swr: add all the generators to the release tarball.
Namely the python scripts and the knobs.template.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:31:07 +01:00
Emil Velikov
38394b5d76 anv: automake: don't forget to cleanup dev_icd.json
Otherwise `make distcheck' will barf at us as the file is dangling.

Ideally this should be part of the clean-local hook, although we include
install-lib-links.mk which already has one.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:29:21 +01:00
Emil Velikov
220d8c99fa anv: automake: bring back VULKAN_ENTRYPOINT_CPPFLAGS
We should not have removed them in the first place. There's a subtle
difference between generating the complete sources and using them which
was not obvious as we nuked them.

Without this, the release tarball ends up without various hunks of the
generated sources, thus things fail at a later stage as we attempt to
build them.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:56 +01:00
Emil Velikov
82514f26d8 anv: automake: ship the json files in the release tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:53 +01:00
Emil Velikov
f80b10df8d softpipe: add sp_buffer.h to the sources list (release tarball)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:28:53 +01:00
Emil Velikov
2f43908395 freedreno: make sure we pick up ir3_nir_trig.py in the release tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:28:53 +01:00
Emil Velikov
36859022ea isl: add isl_priv.h to the sources list
Otherwise it will be missing from the release tarball.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:50 +01:00
Mauro Rossi
41d252e418 isl: move the sources lists to Makefile.sources
[Emil Velikov: use the file in the autoconf build]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:48 +01:00
Emil Velikov
b4f6c70397 isl: automake: list builddir before srcdir in the includes list
As seen elsewhere - we want to include the freshly built sources as
opposed the the (likely) stale ones in the srcdir.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:46 +01:00
Emil Velikov
53a2167e68 isl: automake: flatten the tests rules
Fold the unneeded extra variable tests_ldadd, the explicit sources
section (single file with the default extension) and flip the
check_PROGRAMS <> TESTS order (TESTS includes scripts, while
check_PROGRAMS is binaries only).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:43 +01:00
Emil Velikov
1eecc09584 isl: automake: remove unneeded install-lib-links.mk include
One uses the makefile to create compatibility symlinks (to
$top_builddir/libs) for shared libraries/modules. As we don't create any
here, there's no need to include the file.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:40 +01:00
Emil Velikov
afc1db739a isl: automake: remove unneeded SUBDIRS
As we do not include any other subdirs but self, we don't need to set
it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:28:37 +01:00
Mauro Rossi
779653489e genxml: move the sources (headers) list to Makefile.sources
[Emil Velikov: use the file in the autoconf build]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-30 10:26:36 +01:00
Emil Velikov
ace5403453 anv: bail out if anv_wsi_init() fails
Otherwise we'll end up setting up a device with no winsys integration.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
Hard-coding the rendernode name in anv_physical_device_init() is a bad
idea really. We could/should be using drmGetDevices() to get info on all
the devices (master/render/etc. node names, pci location etc.) and apply
our heuristics on top of that.

That can come up as a follow up change.
2016-05-30 10:26:36 +01:00
Emil Velikov
93e65fdcac anv: resolve wayland-only build
Ensure that the final X11/XCB hunk is guarded by the correct macro.
Otherwise we'll require the symbol even when building without said
platform.

Cc: Cedric Sodhi <manday@openmail.cc>
Reported-by: Cedric Sodhi <manday@openmail.cc>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-30 10:26:35 +01:00
Robert Foss
5068d307f9 anv: Fix use of uninitialized variable.
The return variable was not set for failure paths.
It has now been changed to VK_ERROR_INITIALIZATION_FAILED
for failure paths.

Coverity: 1358944
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: rebase against master, s/vulkan/anv/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-30 10:26:35 +01:00
Stanimir Varbanov
e382bc649b gallium: push offset down to driver
Push offset down to drivers when importing dmabuf. This is needed
to more fully support EGL_EXT_image_dma_buf_import when a non-zero
offset is specified.

Tesing has been done for freedreno, and compile tested following
gallium drivers:
nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-30 10:26:35 +01:00
Stanimir Varbanov
30d28d7c31 st/dri: cleanup image_from_fd/dma_buf paths
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-30 10:26:35 +01:00
Stanimir Varbanov
9d852a1f75 st/dri: add handling of R8 and GR88 DRI fourcc formats
This helps to import dmabuf buffers from DRM_FORMAT_R8 and
DRM_FORMAT_GR88 used for example by GStreamer for YUV to RGB
conversion using shaders.

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-30 10:26:35 +01:00
Bas Nieuwenhuizen
e9d3246a7a radeonsi: Don't offset OFFCHIP_BUFFERING on pre-VI cards.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96239
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-30 09:59:50 +02:00
Francisco Jerez
d8cf982f7d i965: Expose GL 4.3 on Gen8+.
ARB_compute_shader was the last feature missing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
4decc426c2 i965/fs: Skip gen4 pre/post-send dependency workaronds for the first/last block.
We know that there cannot be any destination dependency race if we
reach the beginning or end of the program without having found any
other instruction the send could possibly race with.  This avoids
emitting a pile of useless moves at the beginning or end of the
program in the most common case in which the program has a single
basic block only.

On the original i965 I get the following shader-db results:

 total instructions in shared programs: 3354165 -> 3215637 (-4.13%)
 instructions in affected programs: 3183065 -> 3044537 (-4.35%)
 helped: 13498
 HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
daf4a71883 i965/fs: Skip SIMD lowering source unzipping for regular scalar regions.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
6956015aa5 i965/fs: Factor out region zipping and unzipping from the SIMD lowering pass.
Just to make sure we keep the SIMD lowering pass tidy when we
introduce additional logic to try to optimize out the copy
instructions used to zip and unzip the destination and source regions
into multiple packed regions of the lowered instruction width.
Shouldn't cause any functional changes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
a9f00a9e53 i965/fs: Generalize regions_overlap() from copy propagation to handle non-VGRF files.
This will be useful in several places.  The only externally visible
difference (other than non-VGRF files being supported now) is that the
region sizes are now passed in byte units instead of in GRF units
because the loss of precision would have become a problem in the SIMD
lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
4db93592de i965/fs: Refactor offset() into a separate function taking the width as argument.
This will be useful in the SIMD lowering pass to avoid having to
construct a builder object of the known region width just to pass it
as argument to offset(), which doesn't do anything with it other than
taking the builder dispatch_width as region width.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
a5b4f63c15 i965/fs: Implement opt_sampler_eot() in terms of logical sends.
This makes the whole LOAD_PAYLOAD munging unnecessary which simplifies
the code and will allow the optimization to succeed in more cases
independent of whether the LOAD_PAYLOAD instruction can be found or
not.

The following patch is squashed in:

SQUASH: i965/fs: Add basic dataflow check to opt_sampler_eot().

The sampler EOT optimization pass naively assumes that the texturing
instruction provides all the data used by the FB write just because
they're standing next to each other.  The least we should be checking
is whether the source and destination regions of the FB write and
texturing instructions match.  Without this the previous seemingly
harmless patch would have caused opt_sampler_eot() to misoptimize a
shader from dota-2 causing DCE to eliminate all of its 78 instructions
except for the final sampler EOT message (!).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
a0d9aed268 i965/fs: Fix UB list sentinel dereference in opt_sampler_eot().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
2a166c13d4 i965/fs: Take opt_redundant_discard_jumps out of the optimization loop.
No shader-db regressions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
d5f2f32b11 i965/fs: Run SIMD and logical send lowering after the optimization loop.
There are two reasons why this is useful:

 - It avoids the introduction of an amount of partial writes emitted
   by the SIMD lowering pass to zip and unzip register regions early
   during optimization, which can make subsequent optimization less
   effective.

 - It substantially reduces the burden on the compiler when a large
   fraction of the instructions in the program need to be split (e.g.
   during SIMD32 builds).  Individual halves of split instructions
   will be optimized identically (if they can still be optimized at
   all), so doing it up front can duplicate the amount of instructions
   the optimizer has to deal with which causes the compilation time to
   explode in some cases due to the worse-than-linear runtime
   behaviour of the back-end.

It seems helpful to re-run a few optimization passes in cases where
any of the lowering passes was able to make progress.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
e9eb59ba68 i965/fs: Add FS_OPCODE_FB_WRITE_LOGICAL to has_side_effects().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:38 -07:00
Francisco Jerez
48d743c501 i965/fs: Allow constant propagation into logical send sources.
Logical sends are eventually lowered into a series of copies so they
can take almost anything as source.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:37 -07:00
Francisco Jerez
f1a607cf68 i965/fs: Let CSE handle logical sampler sends as expressions.
This will prevent some shader-db regressions when we start plumbing
logical sends through the optimizer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:37 -07:00
Francisco Jerez
b0c8e5e0c8 i965/fs: Pass a BAD_FILE register to the logical FB write when oMask is unused.
This will let the optimizer know that the sample mask value is unused
so its definition can be DCE'ed.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 23:41:37 -07:00
Timothy Arceri
aac90ba292 glsl: fix xfb_offset unsized array validation
This partially fixes CTS test:
GL44-CTS.enhanced_layouts.xfb_get_program_resource_api

The test now fails at a tes evaluation shader with unsized output arrays.

The ARB_enhanced_layouts spec says:

   "It is a compile-time error to apply xfb_offset to the declaration of an
   unsized array."

So this seems like a bug in the CTS.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-30 15:11:47 +10:00
Timothy Arceri
87fb5aa3e7 glsl: dont crash when attempting to assign a value to a builtin define
For example GL_ARB_enhanced_layouts = 3;

Fixes:
GL44-CTS.enhanced_layouts.glsl_contant_immutablity

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-30 12:47:58 +10:00
Dave Airlie
d98d6e6269 egl/dri3: don't crash on no context.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94925

Pointed out by Karol Herbst on irc.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-30 11:30:04 +10:00
Dave Airlie
e2791b38b4 mesa/program_interface_query: fix transform feedback varyings.
The spec says gl_NextBuffer and gl_SkipComponents need to be
returned to userspace in the program interface queries.

We currently throw those away, this requires a complete piglit
run to make sure no drivers fallover due to the extra varyings.

This fixes:
GL45-CTS.program_interface_query.transform-feedback-built-in

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-30 11:26:50 +10:00
Dave Airlie
6effdce92e glsl/ast: subroutineTypes can't be returned from functions.
These types can't be returned.

This fixes:
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types
for the return type case.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-30 11:25:30 +10:00
Timothy Arceri
db2a35193f glsl: use has_double() helper
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-05-30 11:01:40 +10:00
Timothy Arceri
8f4ac20b6f glsl: fix explicit uniform block alignment
This stops the offset being bumped again when and an explicit
alignment has already been applied.

Fixes alignment issues in:
GL44-CTS.enhanced_layouts.uniform_block_alignment

Note the test still fails due to unrelated issues with doubles.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-05-30 11:01:32 +10:00
Jordan Justen
7398a32c50 i965: Shrink stage_prog_data param array length
It appears we were over-allocating these arrays.

Previously we would use nir->num_uniforms directly for scalar
programs, and multiply it by 4 for vec4 programs.

Instead we should have been dividing by 4 in both cases to convert
from bytes to a gl_constant_value count. The size of gl_constant_value
is 4 bytes.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-29 09:59:55 -07:00
Ilia Mirkin
160063b110 nv50,nvc0: fix the max_vertices=0 case
This is apparently legal. Drop any emit/restarts, and pass a 1 to the
hardware.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-29 09:34:03 -04:00
Ilia Mirkin
f2e7268a55 st/mesa: fix setting of point_size_per_vertex in ES contexts
GL ES 2.0+ does not have a GL_PROGRAM_POINT_SIZE enable, unlike desktop
GL. So we have to go and check the last pre-rasterizer stage to see
whether it outputs a point size or not.

This fixes a number of dEQP tests that use a geometry or tessellation
shader to emit points primitives.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-05-29 09:34:03 -04:00
Marek Olšák
04a78068ff mesa: skip level checking for FramebufferTexture*D if texture is zero
From the OpenGL 4.5 core spec:
  "An INVALID_VALUE error is generated if texture is not zero and level is
  not a supported texture level for textarget, as described above."

Other FramebufferTexture functions already do the right thing.

This fixes the main menu in F1 2015.

Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-29 14:24:23 +02:00
Ilia Mirkin
60341ddd5c st/mesa: expose OES_shader_io_blocks when we have enough for ES 3.1
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-28 20:58:12 -04:00
Vinson Lee
884ac61722 swr: [rasterizer] Do not define _mm256_storeu2_m128i with icc.
Fix build error with icc.

  CXX      libswrAVX_la-swr_clear.lo
icpc: command line warning #10006: ignoring unknown option '-Wdelete-non-virtual-dtor'
In file included from ./rasterizer/jitter/jit_api.h(31),
                 from swr_context.h(30),
                 from swr_clear.cpp(24):
./rasterizer/common/os.h(135): error: expected an identifier
  void _mm256_storeu2_m128i(__m128i *hi, __m128i *lo, __m256i a)
       ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-05-28 14:26:54 -07:00
Thomas Hindoe Paaboel Andersen
df210ff24d i965: add missing return in if statement
Re-add the "return false" that was removed in 0c02d7002d

It seems that something went wrong when merging the patch. The patch
sent to the mailing list does not directly match what was committed.
https://lists.freedesktop.org/archives/mesa-dev/2016-May/118198.html

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-28 11:26:33 -07:00
Ilia Mirkin
c7731a0740 gk110/ir: fix unspilling of predicates from registers
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96258
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
2016-05-28 13:14:19 -04:00
Samuel Pitoiset
697237b71e nvc0: remove outdated surfaces validation code for GK104
This code was used for validating surfaces with compute but now we use
pipe_image_view instead. Anyway, surfaces support should be
re-introduced properly once OpenCL happens.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-28 15:50:07 +02:00
Samuel Pitoiset
f07ade6881 nvc0: do not always invalidate 3D CBs when using compute
Constant buffers are aliased between 3D and CP on Fermi, but we should
only invalidate them when a compute shader actually uses CBs and not
all the time after a lauching grid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-28 15:50:03 +02:00
Francisco Jerez
357495b94d i965: Update compute workgroup size limit calculation for SIMD32.
This should have the side effect of enabling the ARB_compute_shader
extension on Gen8+ hardware and all Gen7 platforms that didn't
previously expose it (VLV and IVB GT1) due to the number of hardware
threads per subslice being insufficient in SIMD16 mode.

v2: Bump workgroup size limit for GLES too (Jordan).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 23:29:06 -07:00
Francisco Jerez
46ce93ed22 i965: Add do32 debug option.
The do32 INTEL_DEBUG option causes the back-end to try to generate a
SIMD32 program when compiling a compute shader regardless of the
specified compute shader workgroup size, which will be useful for
testing SIMD32 code generation in the most common case in which the
workgroup size doesn't exceed the SIMD16 limit so SIMD32 codegen
wouldn't be automatically enabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
864737ce6c i965/fs: Build 32-wide compute shader when needed.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
37fd13ee2d i965/fs: Extend back-end interface for limiting the shader dispatch width.
This replaces the current fs_visitor::no16() interface with
fs_visitor::limit_dispatch_width(), which takes an additional
parameter allowing the caller to specify the maximum dispatch width a
shader can be compiled with.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
2d288cb9ea i965/fs: Implement SIMD32 register allocation support.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
7f10d3983b i965/fs: Remove pre-Gen7 register allocation class micro-optimization.
This was trying to save some one-time init on pre-Gen7 hardware under
the assumption that one would only ever need 1, 2, 4 and 8-wide
registers on those platforms.  However nothing guarantees that those
will be the only VGRF sizes used after lowering and optimization.  In
some cases we may end up with a temporary of different size being
allocated (e.g. by SIMD lowering to zip or unzip a multi-component
register region of a logical send instruction), and there is no
guarantee that they will be optimized away before register allocation
(especially since the compute_to_mrf coalescing pass is
rather... lacking...).  Instead just allocate classes for all possible
VGRF sizes up to MAX_VGRF_SIZE to avoid a crash in pq_test() when we
encounter a variable of any other size.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
1d5bf46ad1 i965/fs: Don't mutate multi-component arguments in sampler payload set-up.
The Gen5+ sampler message payload construction code steps through the
coordinate and derivative components by induction like 'coordinate =
offset(coordinate, bld, 1)', the problem is that while doing that it
may step one past the end of the coordinate vector causing an
assertion failure in offset() if it happens to be a (single component)
immediate.  Right now coordinates and derivatives are typically passed
as actual registers but that will no longer be the case when we start
propagating constants into logical messages.

Instead express coordinate components in closed form like
'offset(coordinate, bld, i)' -- The end result seems slightly more
readable that way and it allows passing the coordinate and derivative
registers by const reference instead of by value, so it seems like a
clean-up in its own right.

v2: Fold a few post-increment operators into the last MOV
    statement. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
ad8f66ed33 i965/fs: Fix multiple ACP interference during copy propagation.
This is more fallout from cf375a3333.
It's possible for multiple ACP entries to interfere with a given VGRF
write, so we need to continue iterating even if an overlapping entry
has already been found.

Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
c88b52745c i965/fs: Fix cmod propagation not to propagate non-identity cmod into CMP(N).
The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction.  This prevents
cmod propagation from (mis)optimizing:

 cmp.z.f0 tmp, ...
 mov.z.f0 null, tmp

into:

 cmp.z.f0 tmp, ...

which gives the negation of the flag result of the original sequence.
I could reproduce this easily with SIMD32 but I don't see any reason
why the problem would be SIMD32-specific, it was most likely working
by luck.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:06 -07:00
Francisco Jerez
8476233ae2 i965/fs: Estimate number of registers written correctly in opt_register_renaming.
The current estimate is incorrect for non-32b types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
437e65f9d9 i965/fs: Add (sub)reg_offset asserts to brw_reg_from_fs_reg.
These are completely ignored by the conversion to brw_reg, so they
better be zero.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
51dd6a60f5 i965/fs: Reset reg_offset of the original destination to zero in compute_to_mrf().
Prevents an assertion failure in the following commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
b9eab911ba i965/fs: Skip remove_duplicate_mrf_writes() during SIMD32 runs.
The pass is disabled in SIMD16 dispatch mode for the same reason, it
cannot handle instructions that write multiple MRF registers at once.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
796238d9e6 i965/fs: Use SIMD8 SSBO GET_BUFFER_SIZE message regardless of the dispatch width.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
29e4717251 i965/fs: Don't emit duplicated SSBO GET_BUFFER_SIZE instruction unnecessarily.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
a55452530f i965/fs: Emit fixed width memory fence opcode regardless of the dispatch width.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
ae730049c6 i965/fs: Return 32 bit mask from fs_builder::sample_mask().
This doesn't actually handle the FS case, just add an assertion for
the moment so I don't forget to update it later on for SIMD32 fragment
shader dispatch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
8b6edee679 i965/fs: Emit fixed-width null register regardless of the dispatch width.
brw_null_vec() cannot handle widths over 16 but it doesn't really
matter what width we specify for null registers because destination
regions have no width field at the hardware level.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
298320280f i965/fs: Fix half() to handle more exotic register files.
horiz_offset() is able to deal with a superset of the register files
currently special-cased in half().  Just call horiz_offset() in all
cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
8c9601ef7b i965/fs: Fix horiz_offset() to handle ARF and HW GRF register files.
We'll hit these in some cases during SIMD lowering in 32-wide
programs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
7d430fc05e i965/fs: Clean up remaining uses of fs_inst::reads_flag and ::writes_flag.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:05 -07:00
Francisco Jerez
ecd7a7255a i965/fs: Keep track of flag dependencies with byte granularity during scheduling.
This prevents false dependencies from being created between
instructions that write disjoint 8-bit portions of the flag register
and OTOH should make sure that the scheduler considers dependencies
between instructions that write or read multiple flag subregisters
at once (e.g. 32-wide predication or conditional mods).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
0fec265373 i965/fs: Track flag register liveness with byte granularity.
This is required for correctness in presence of multiple 8-wide flag
writes (e.g. 8-wide instructions with a conditional mod set) which
update a different portion of the same 16-bit flag subregister.  Right
now we keep track of flag dataflow with 16-bit granularity and
consider flag writes to have killed any previous definition of the
same subregister even if the write was less than 16 channels wide,
which can cause live flag register updates to be dead code-eliminated
incorrectly.

Additionally this makes sure that we handle 32-wide flag writes and
reads which may span multiple flag subregisters so the current
approach of just setting/testing a single bit from the live set
wouldn't have worked.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
df1aec763e i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.
v2: Codestyle fixes (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
ece41df247 i965/fs: Expose arbitrary channel execution groups to the IR.
This generalizes the current fs_inst::force_sechalf flag to allow
specifying channel enable groups other than 0 or 8.  At some point it
will likely make sense to fix the vec4 generator to support arbitrary
execution groups and then move the definition of fs_inst::group into
backend_instruction (e.g. so we can do FP64 in the VEC4 back-end).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
81bc6de8c0 i965/ir: Make BROADCAST emit an unmasked single-channel move.
Alternatively we could have extended the current semantics to 32-wide
mode by changing brw_broadcast() to emit multiple indexed MOV
instructions in the generator copying the selected value to all
destination registers, but it seemed rather silly to waste EU cycles
unnecessarily copying the exact same value 32 times in the GRF.

The vstride change in the Align16 path is required to avoid assertions
in validate_reg() since the change causes the execution size of the
MOV and SEL instructions to be equal to the source region width.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
41562eb8f3 i965/fs: Allow specifying arbitrary quarter control to FIND_LIVE_CHANNEL.
This makes FIND_LIVE_CHANNEL behave like a normal instruction for
non-zero quarter control.  On Gen8+ we just leave the quarter control
field of the emitted FBL instruction set to the default value so the
hardware applies the expected shift to the execution mask signals.  On
Gen7 we apply the offset manually by specifying a non-zero subregister
offset in the source region of the FBL instruction.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
a5a0810960 i965/fs: Allow specifying arbitrary execution sizes up to 32 to FIND_LIVE_CHANNEL.
Due to a Gen7-specific hardware bug native 32-wide instructions get
the lower 16 bits of the execution mask applied incorrectly to both
halves of the instruction, so the MOV trick we currently use wouldn't
work.  Instead emit multiple 16-wide MOV instructions in 32-wide mode
in order to cover the whole execution mask.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:04 -07:00
Francisco Jerez
1e3c58ffaf i965/fs: Lower 32-wide scratch writes in the generator.
The hardware has messages that can write 32 32bit components at once
but the channel enable mask gets messed up.  We need to split them
into several 16-wide scratch writes for the channel enables to be
applied correctly.  The SIMD lowering pass cannot be used for this
because scratch writes are emitted rather late during register
allocation long after SIMD lowering has been done.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:29:02 -07:00
Francisco Jerez
a7d319c00b i965/fs: Implement scratch reads and writes of 4 GRFs at a time.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:59 -07:00
Francisco Jerez
fe5cdde2f9 i965/eu: Fix Gen7+ DP scratch message size calculation on Gen7.
Gen7 hardware expects the block size field in the message descriptor
to be the number of registers minus one instead of the log2 of the
number of registers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:59 -07:00
Francisco Jerez
fc7107de1d i965/eu: Set execution size explicitly for memory fence send message.
We don't want to emit a 32-wide send message in 32-wide programs.  The
memory fence message should have the same effect regardless of the
execution size (as long as it's valid) so just set it to one.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:59 -07:00
Francisco Jerez
5c887326c5 i965/eu: Consider QtrCtrl 3Q-4Q in typed surface message descriptor setup.
In SIMD32 programs the compiler is responsible for providing the
appropriate half of the sample mask in the message header, so the
first and third quarters both map to the first slot group of the
provided 16-bit half, while the second and fourth quarters map to the
second slot group -- IOW they should be equivalent to 1Q and 2Q modulo
two.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:59 -07:00
Francisco Jerez
448340d31f i965/fs: Clean up remaining uses of dispatch_width in the generator.
Most of these are bugs because the intended execution size of an
instruction and the dispatch width of the shader aren't necessarily
the same (especially in SIMD32 programs).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:59 -07:00
Francisco Jerez
7f28ad8c4d i965/eu: Remove brw_codegen::compressed and ::compressed_stack.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:59 -07:00
Francisco Jerez
646213168e i965/eu: Use current exec size instead of p->compressed in surface message generation.
This was kind of an abuse of p->compressed, dataport send message
instructions are always uncompressed.  Use the current execution size
instead since p->compressed is on its way out.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:28:46 -07:00
Francisco Jerez
492286e90b i965/fs: No need to reset predicate control after emitting some instructions.
Trivial clean-up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
8ef5637729 i965/fs: Pass current execution size to brw_IF() and brw_DO().
This gets IF and DO instructions working in SIMD32 programs.  brw_IF()
and brw_DO() should probably behave in the same way as other generator
functions that emit control flow instructions and just figure out the
right execution size by themselves from the current execution controls
specified through the brw_codegen argument.  Changing that will
require updating lots of Gen4-5 clipper code though, so for the moment
just pass the current value redundantly from the FS generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
fdae8b9f91 i965/eu: Stop using p->compressed to specify the exec size of control flow instructions.
p->compressed won't work for SIMD32, we should just be using the
execution size value specified via p->current instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
0b4cd91071 i965/fs: Extend region width calculation to allow arbitrary execution sizes.
Instead of just halving the execution size when the instruction is
compressed hoping that it will give a legal source region width, we
can calculate the maximum legal width value in closed form from the
component size and stride.  This makes sure that brw_reg_from_fs_reg()
always returns a valid hardware region even for virtual 32-wide
instructions (e.g. send-like instructions) that would seem to exceed
the hardware region width limit after halving.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Kenneth Graunke
dabaf4fb96 i965/fs: Pass the compression mode to brw_reg_from_fs_reg().
Curro is planning to eliminate p->compressed, so let's avoid using it
here and just pass in the value directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[ Francisco Jerez: Pass boolean flag instead of brw_compression enum. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
3340a66fce i965/fs: Simplify per-instruction compression control setup in generator.
By using the new compression/group control interface.  This will allow
easier extension to support arbitrary channel enable groups at the IR
level.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
c78edcea8b i965/fs: No need to set compression control at the top of generate_code().
The right value is dependent on the specific IR instruction being
generated so it has to be reset in every iteration of the loop anyway.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
c19c3d3a52 i965/eu: Fix a bunch of compression control bugs in the generator.
Most of these were resetting quarter control to zero incorrectly even
though everything they needed to do was disable instruction
compression -- The brw_SAMPLE() case was doing the right thing but it
can be simplified slightly by using the new compression control
interface.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
3dffd81583 i965/eu: Define alternative interface for setting compression and group controls.
This implements some simple helper functions that can be used to
specify the group of channel enable signals and compression enable
that apply to a brw_inst instruction.

It's intended to replace brw_set_default_compression_control
eventually because the current interface has a number of shortcomings
inherited from the Gen-4-5-centric representation of compression and
group controls as a single non-orthogonal enum: On the one hand it
doesn't work for specifying arbitrary group controls other than 1Q and
2Q, which are frequently useful in SIMD32 and FP64 programs.  On the
other hand the current interface forces you to update the compression
*and* group controls simultaneously, which has been the source of a
number of generator bugs (a bunch of them fixed in this series),
because in many cases we would end up resetting the group controls to
zero inadvertently even though everything we wanted to do was disable
instruction compression -- The latter seems especially unfortunate on
Gen6+ hardware which have no explicit compression control, so we would
end up bashing the quarter control field of the instruction for no
benefit.

Instead of a single function that updates both at the same time
introduce separate interfaces to update one or the other independently
preserving the current value of the other (which typically comes from
the back-end IR so it has to be respected).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
5db4d62395 i965/fs: Remove FS_OPCODE_PACK_STENCIL_REF virtual instruction.
It's just a byte MOV with strided source.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:10 -07:00
Francisco Jerez
29ce110be6 i965/fs: Remove extract virtual opcodes.
These can be easily represented in the IR as a MOV instruction with
strided source so they seem rather redundant.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:09 -07:00
Francisco Jerez
9dcb8ff6a1 i965: Define brw_int_type() helper.
Intended as a (partial) inverse of type_sz().  Will be useful in the
next commit and some other SIMD32 generator changes I have queued up.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:09 -07:00
Francisco Jerez
bb89beb26b i965/fs: Remove manual splitting of DDY ops in the generator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:22:02 -07:00
Francisco Jerez
982c48dc34 i965/fs: Remove manual unrolling of BFI instructions from the generator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:23 -07:00
Francisco Jerez
95272f5c7e i965/fs: Drop Gen7 CMP SIMD unrolling workaround from the generator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:23 -07:00
Francisco Jerez
f14b9ea6e6 i965/fs: Drop lowering code for a few three-source instructions from the generator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:23 -07:00
Francisco Jerez
117a9a0a64 i965/fs: Set default access mode to Align1 for all instructions in the generator.
Currently the generator code for most opcodes honours the default
access mode (which should typically be Align1 in the scalar back-end),
but generate_code() doesn't set it explicitly which means that the
access mode from a previous instruction could leak into the following
ones if you did something special and weren't careful enough to save
and restore the previous access mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
3a541d0c0b i965/fs: Remove handcrafted math SIMD lowering from the generator.
Most of this wouldn't have worked for SIMD32 and had various
dispatch_width and compression control bugs.  It's mostly dead now
with SIMD lowering of math instructions turned on in the compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
cf5443f984 i965/fs: Limit SIMD width of various virtual opcodes to the maximum supported value.
Which is 16 or 8 in most cases.  This will make sure that 32-wide
virtual instructions get chopped up into chunks of their maximum
execution size.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
197833caa3 i965/fs: Lower LOAD_PAYLOAD instructions of unsupported width.
Only per-channel LOAD_PAYLOAD instructions can be lowered, which
should cover everything that comes in from the front-end.

LOAD_PAYLOAD instructions used to construct actual message payloads
cannot be easily lowered because they contain headers and vectors of
variable type that aren't necessarily channel-aligned -- We shouldn't
find any of them in the program at SIMD lowering time though because
they're introduced during logical send lowering.

An alternative that may be worth considering would be to re-run the
SIMD lowering pass after LOAD_PAYLOAD lowering instead of this patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
9eea3df29f i965/fs: Lower DDY instructions to SIMD8 during SIMD lowering time
...on hardware lacking compressed Align16 support.  Will allow
simplifying the generator code and fixing it for SIMD32 codegen.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
12ae87abb1 i965/fs: Apply usual FPU-like execution size restrictions to MULH.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
dea9c1df89 i965/fs: Calculate maximum execution size of MOV_INDIRECT correctly.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
122e031548 i965/fs: Assert that IF instruction with embedded compare has legal exec_size.
We shouldn't encounter these right now but if we did it wouldn't be
possible for the SIMD lowering pass to split it into multiple
instructions because of its side effects on control flow, so just
assert in order to kill the program.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
98c8bef01c i965/fs: Implement HSW BFI exec size workarounds in the SIMD lowering pass.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
88d9cc1563 i965/fs: Implement workaround for IVB CMP dependency race in the SIMD lowering pass.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:22 -07:00
Francisco Jerez
a6bf5f88c7 i965/fs: Enforce common regioning restrictions by SIMD splitting.
This change addresses a number of hardware restrictions on the source
and destination regions and other execution controls of regular
FPU-like instructions that in some cases can be avoided by reducing
the execution size of the instruction.  Some of these restrictions
(e.g. the one about 3src instructions not supporting compression on
some hardware) are currently being worked around case by case in the
generator with ad-hoc splitting code that is buggy in several ways
(e.g. doesn't handle non-trivial execution controls which would break
SIMD32 code), but it seems cleaner to implement as many restrictions
as we can in a single lowering pass since that will allow us to
simplify some of the surrounding code considerably and also make sure
that we don't forget applying them in the future.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
2b5adb942b i965/fs: Enforce extended math exec size limits during SIMD lowering.
This teaches the SIMD lowering pass about the hardware limits on the
execution size of math instructions, which will allow simplifying the
generator code and at the same time get rid of a number of bugs in the
manual SIMD unrolling done currently that prevent SIMD32 codegen from
working.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
a8e7b4f1d9 i965/fs: Handle SAMPLEINFO consistently like other texturing instructions.
Seems like this texturing opcode was missing its logical counterpart
which would prevent it from taking advantage of the SIMD lowering
infrastructure, define it and plumb it through the back-end.  At some
point we'll likely want to emit a single SAMPLEINFO message shared
among all channels irrespective of this change, but for the moment
this should be enough to get the intrinsic working in SIMD32 mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
99b5476d33 i965/fs: Lower math into Gen4-5 send-like instructions in lower_logical_sends.
The benefit is we will be able to use the SIMD lowering pass to unroll
math instructions of unsupported width and then remove some cruft from
the generator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
e531b7907a i965/fs: Add missing get_latency_gen7() cases for the Gen7 pull constant opcodes.
This was causing the scheduler to be rather optimistic about the
latency of pull constant opcodes on Gen7+.  This might seem to
increase the cycle count estimate calculated by the scheduler itself
for some shaders, even though the actual cycle count should actually
be decreased.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
ed4d0e41ac i965/fs: Rename Gen4 physical varying pull constant load opcode.
For consistency with the Gen7 variant.  I'm not doing the same to the
uniform pull constant message at this point because the non-GEN7 one
is still overloaded to be either an expression-like logical
instruction or a Gen4-specific physical send message.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
64a6cb87f1 i965/fs: Implement promotion of varying pull loads on Gen4 during SIMD lowering.
Varying pull constant loads inherit the same limitation of pre-ILK
hardware that requires expanding SIMD8 texel fetch instructions to
SIMD16, we can deal with pull constant loads in the same way it's done
for texturing during SIMD lowering.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
d8a3294ac2 i965/fs: Hide varying pull constant load message setup behind logical opcode.
This will allow the SIMD lowering pass to split 32-wide varying pull
constant loads (not natively supported by the hardware) into 16-wide
instructions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:21 -07:00
Francisco Jerez
0bc5ad8d19 i965/fs: Avoid constant propagation when the type sizes don't match.
The case where the source type of the instruction is smaller than the
immediate type could be handled by calculating the portion of the
immediate read by the instruction (assuming that the source channels
are aligned with the destination channels of the copy) and then
representing the same value as an immediate of the source type
(assuming such an immediate type exists), but the code below doesn't
do that, so just bail for the moment.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:20 -07:00
Francisco Jerez
52cc80d859 i965/fs: Fix CSE temporary copy for some LOAD_PAYLOAD corner cases.
If the LOAD_PAYLOAD instruction only has header sources it's possible
for the number of registers written to be less than or equal to the
SIMD component size, in which case it would take the single-MOV path
at the bottom which would cause the channel enable masks to be applied
incorrectly to the header contents and/or cause it to write past the
end of the allocated temporary.  If the instruction is either
LOAD_PAYLOAD or doesn't write exactly one component the MOV path is
going to mess up the program so just don't use it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:20 -07:00
Francisco Jerez
c5f224145a i965/fs: Handle instruction predication in SIMD lowering pass.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:20 -07:00
Francisco Jerez
1760c24b4b i965/fs: No need to unzip SIMD-periodic sources during SIMD lowering.
If the source value is going to the same for all SIMD-lowered chunks
of the instruction there should be no need to unzip the value into
multiple temporary registers one for each lowered chunk.  As a side
effect this fixes SIMD lowering of instructions with a vector
immediate source.  In the long term it *might* still be worth fixing
offset() to handle vector immediates correctly though, this should be
good enough for the moment.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:20 -07:00
Francisco Jerez
168163f5f0 i965/fs: Generalize is_uniform() to is_periodic().
This will be useful in the SIMD lowering pass.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:20 -07:00
Francisco Jerez
b736e78ddb i965/fs: Fix byte_offset() for MRF/ARF/FIXED_GRF regs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 23:19:20 -07:00
Francisco Jerez
2db9dd5aeb i965/fs: Fix off-by-one region overlap comparison in copy propagation.
This was introduced in cf375a3333 but
the blame is mine because the pseudocode I sent in my review comment
for the original patch suggesting to do things this way already had
the off-by-one error.  This may have caused copy propagation to be
unnecessarily strict while checking whether VGRF writes interfere with
any ACP entries and possibly miss valid optimization opportunities in
cases where multiple copy instructions write sequential locations of
the same VGRF.

Cc: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-05-27 23:19:20 -07:00
Ronie Salgado
8f538d9ae0 anv/cmd_buffer: Don't delete command buffers in ResetCommandPool()
v2 (Jason Ekstrand): Destroy command buffers in DestroyCommandPool().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95034
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-27 18:56:33 -07:00
Brian Paul
747754f027 gallium/util: another s/unsigned/enum pipe_prim_type/ for clang
Trivial.
2016-05-27 18:42:21 -06:00
Jason Ekstrand
b93b5935a7 anv: Try the first 8 render nodes instead of just renderD128
This way, if you have other cards installed, the Vulkan driver will still
work.  No guarantees about WSI working correctly but offscreen should at
least work.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95537
2016-05-27 17:18:33 -07:00
Jason Ekstrand
e023c104f7 anv: strdup the device path into the physical device
This way we don't have to assume that the string coming in is a piece of
constant data that exists forever.
2016-05-27 17:18:33 -07:00
Jason Ekstrand
9048dee328 anv/formats: Exit early for unsupported formats 2016-05-27 17:17:09 -07:00
Jason Ekstrand
10bc9f7024 anv/formats: Map VK_FORMAT_UNDEFINED to ISL_FORMAT_UNSUPPORTED
At one point in time, we may have used the mapping to ISL_FORMAT_RAW for
certain buffer surfaces but that time has long since passed.  This fixes a
bug where doing format queries on VK_FORMAT_UNDEFINED would assert-fail.
2016-05-27 17:17:09 -07:00
Jason Ekstrand
b16326c740 anv/clear: Remove an unused variable 2016-05-27 17:17:09 -07:00
Brian Paul
8beb6f3c9c gallium/util: another unsigned -> enum pipe_prim_type change
gcc didn't warn about the unsigned / enum pipe_prim_type mismatch
between the .c and .h file.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-27 17:55:05 -06:00
Jordan Justen
47e2a57fe9 i965/compute: Fix uniform init issue when SIMD8 is skipped
In d8347f12ea, we added support for
skipping SIMD8 generation when the program local size is too large for
SIMD8 to be usable. This change was missed in that commit.

This bug would impact gen7 platforms when the compute shader local
size is greater than 512, and gen8 platforms when the local size is
greater than 448.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-27 16:44:00 -07:00
Bas Nieuwenhuizen
65d4ba6f20 docs: Mention GL4.3 and ES3.1 support for nvc0 and radeonsi
v2: also update the introductory text.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-28 01:04:03 +02:00
Jason Ekstrand
fb2a5ceb32 anv: Emit DRAWING_RECTANGLE once at driver initialization
Also, we don't actually need it for clipping because meta always colors
inside the lines and, for all other operations, the user is required to set
a scissor.  Since DRAWING_RECTANGLE stalls the GPU, we want to emit it as
little as possible.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 15:18:11 -07:00
Jason Ekstrand
3a83c176ea anv/cmd_buffer: Only emit PIPE_CONTROL on-demand
This is in contrast to emitting it directly in vkCmdPipelineBarrier.  This
has a couple of advantages.  First, it means that no matter how many
vkCmdPipelineBarrier calls the application strings together it gets one or
two PIPE_CONTROLs.  Second, it allow us to better track when we need to do
stalls because we can flag when a flush has happened and we need a stall.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 15:18:09 -07:00
Jason Ekstrand
7120c75ec3 genxml: Make PIPE_CONTROL::CommandStreamerStallEnable a boolean
This has been declared as a uint since SNB but it's only one bit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 15:18:07 -07:00
Jason Ekstrand
b26bd6790d anv/clear: Only clear the render area when doing subpass clears
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 15:18:04 -07:00
Jason Ekstrand
5432487792 anv: Move push constant allocation to the command buffer
Instead of blasting it out as part of the pipeline, we put it in the
command buffer and only blast it out when it's really needed.  Since the
PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a
stall which we would like to avoid.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 15:17:43 -07:00
Bas Nieuwenhuizen
2cee0d0f9c radeonsi: enable OpenGL 4.3
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-27 22:28:11 +02:00
Dave Airlie
0438bc76e2 nouveau: enable GL 4.3 on kepler/fermi
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-28 05:52:13 +10:00
Marek Olšák
43550f25ed radeonsi: always reserve output space for tess factors
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dave Airlie <airlied@redhat.com>
2016-05-27 21:40:43 +02:00
Dave Airlie
c44513a1f3 glsl/linker: call link_uniform blocks on linked shader.
The old code called this on the prelinked shader list,
but at this point we have the linked shader, so we should
call the interface on that alone.

This fixes a regression in:
dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.13
introduced in
5b2675093e
glsl: handle implicit sized arrays in ssbo

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96228
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reported-by: Mark James
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-28 05:35:53 +10:00
Dave Airlie
f0254fdd07 mesa/get: drop unused extension checks.
These all show up as unused warnings here, so drop them for now.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-28 05:29:23 +10:00
Bas Nieuwenhuizen
4717d5a2d3 gallium/ddebug: Add passthrough for query_memory_info.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-27 20:00:07 +02:00
Jason Ekstrand
0482efdc93 nir/inline: Also rewrite param derefs for texture instructions
Without this, samplers get left hanging as derefs to variables that don't
actually exist.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-27 10:28:27 -07:00
Jason Ekstrand
2522180845 nir/inline: Break the guts of rewrite_param-derefs into a helper
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-27 10:28:27 -07:00
Jason Ekstrand
d19c406395 nir/inline: Make the rewrite_param_derefs helper work on instructions
Now that we have the better nir_foreach_block macro, there's no reason to
use the archaic block version for everything.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-27 10:28:27 -07:00
Jason Ekstrand
2fcba404f8 nir/inline: Don't use foreach_instr_safe unless we need to
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-27 10:28:27 -07:00
Roland Scheidegger
9247570d42 gallivm: eliminate a unnecessary AND with unorm lerps
Instead of doing a add and then mask out the upper bits, we can
simply do a add with a half wide type (this, of course, assumes
the hw can actually do it...), so we'll get the required zero
in the upper bits automatically.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-27 19:11:28 +02:00
Roland Scheidegger
17d685c426 gallium/util: use enum pipe_prim_type instead of unsigned some more
There were complaints from a mingw build:
u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’
to ‘pipe_prim_type’ [-fpermissive]

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-27 19:11:28 +02:00
Brian Paul
2318d2015a svga: remove unneeded casts in get_query_result_vgpu9() calls
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-27 10:01:12 -06:00
Brian Paul
9be122e9b0 svga: use MAYBE_UNUSED to silence release-build warnings
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-27 10:00:56 -06:00
Ben Widawsky
8314dd7ff2 isl: Fix some tautological-compare warnings
Fixes:
isl.c:62:22: warning: self-comparison always evaluates to true [-Wtautological-compare]
    assert(ISL_DEV_GEN(dev) == dev->info->gen);
                      ^~
isl.c:63:33: warning: self-comparison always evaluates to true [-Wtautological-compare]
    assert(ISL_DEV_USE_SEPARATE_STENCIL(dev) == dev->use_separate_stencil);

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-26 21:59:17 -07:00
Ilia Mirkin
4ccf8c952a mesa: add support for GLSL ES 3.20 version string
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-26 21:25:53 -04:00
Ilia Mirkin
faae9ab2ee mapi: expose new functions in GL ES 3.2
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-26 21:25:53 -04:00
Ilia Mirkin
df2881381a nvc0/ir: handle a load's reg result not being used for locked variants
For a load locked, we might not use the first result but the second
result is the predicate result of the locking. In that case the load
splitting logic doesn't apply (which is designed for splitting 128-bit
loads). Instead we take the predicate and move it into the first
position (as having a dead result in first def's position upsets all
sorts of things including RA). Update the emitters to deal with this as
well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-26 21:23:49 -04:00
Ilia Mirkin
04ecad97ff nvc0/ir: avoid generating illegal instructions for compute constbuf loads
For user-supplied constbufs, fileIndex is 0. In that case, when we
subtract 1, we'll end up loading from constbuf offset -16. This is
illegal, and there are asserts to avoid it. Normally we'd just DCE it,
but no point in generating the instructions if they're not going to be
used.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-26 21:23:49 -04:00
Rob Clark
4f98c94be7 gallium/util: fix build break
Missing #include caused build breaks after 21a3fb9cd.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-26 20:59:08 -04:00
Jason Ekstrand
9f9f229359 nir/spirv: Allow pointless variable decorations on inputs
SPIR-V specifies that a bunch of stuff gets applied to types.  This means
taht a local variable could get, for instance, an array stride.  Just
because it's pointless doesn't mean you'll never see it.
2016-05-26 17:10:50 -07:00
Brian Paul
1ec45a1948 gallium/util: use enum pipe_prim_type in u_prim.h functions
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul
7a49b41436 util/indices: move duplicated assignments out of switch cases
Spotted by Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul
46be65c681 gallium: change pipe_draw_info::mode to be pipe_prim_type
Makes debugging with gdb a little nicer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul
a25ae485a6 util/indices,svga: s/unsigned/enum pipe_prim_type/
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul
21a3fb9cd8 util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul
45078e8890 svga: s/unsigned/enum pipe_resource_usage/ for buffer usage variables
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:18 -06:00
Brian Paul
d21a309c6c svga: s/unsigned/enum pipe_prim_type/ for primitive type variables
Proper enum types were only added recently.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
90afd7b7ef svga: fix test for unfilled triangles fallback
VGPU10 actually supports line-mode triangles.  We failed to make use of
that before.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
2c07c40d2f svga: clean up and improve comments in svga_draw_private.h
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
0f983e1793 util/indices: implement unfilled (tri->line) conversion for adjacency prims
Tested with new piglit gl-3.2-adj-prims test.

v2: re-order trisadj and tristripadj code, per Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
d6c2c7d710 util/indices: implement provoking vertex conversion for adjacency primitives
Tested with new piglit gl-3.2-adj-prims test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
479d364c39 util/indices: assert that the incoming primitive is a triangle type
The unfilled index translator/generator functions should only be
called when the primitive mode is one of the triangle types.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
26de558072 util/indices: formatting, whitespace fixes in u_unfilled_indices.c
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
24eadb4810 util/indices: improve comments in u_indices.h
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-05-26 17:44:17 -06:00
Brian Paul
5393238765 svga: fix primitive mode (point/line/tri) test for unfilled primitives
The original mode test was valid before we had GS support.

Regression tested with full piglit run.  Though, I don't think we have
any piglit tests that exercise drawing unfilled adjacency primitives.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-26 17:44:17 -06:00
Ian Romanick
b7af108d3e i965: Enable GL_OES_shader_io_blocks
Only one dEQP io_blocks test fails.  This test fails for the same reason
as the match_different_member_struct_names test in a previous commit.

dEQP-GLES31.functional.separate_shader.validation.io_blocks.match_different_member_struct_names

v2: Add to release notes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-26 16:24:25 -07:00
Ian Romanick
660240da9e glsl: Allow shader interface blocks in GLSL ES
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-26 16:24:25 -07:00
Ian Romanick
7a3093efcc glsl: Add a has_shader_io_blocks helper
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-26 16:24:25 -07:00
Ian Romanick
f0902ee813 mesa: Add extension tracking for GL_OES_shader_io_blocks
v2: Also support GL_EXT_shader_io_blocks.  It's pretty much identical to
the OES extension.  Suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-26 16:24:25 -07:00
Ian Romanick
326a269c77 mesa: Only validate SSO shader IO in OpenGL ES or debug context
v2: Move later in series to avoid issues with Gallium drivers and debug
contexts.  Suggested by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-26 16:23:53 -07:00
Ian Romanick
3722c76001 mesa: Remove old validate_io function
The new validate_io catches all of the cases (and many more) that the
old function caught.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-26 16:22:25 -07:00
Ian Romanick
bd3f15cffd mesa: Additional SSO validation using program_interface_query data
Fixes the following dEQP tests on SKL:

dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_smooth_fragment_flat
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_implicit_explicit_location_1
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_element_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_none
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_order
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_centroid_fragment_flat
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_length
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_precision
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location_type
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_centroid
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_smooth
dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_name

It regresses one test:

dEQP-GLES31.functional.separate_shader.validation.varying.match_different_struct_names

Hoever, this test is based on language in the OpenGL ES 3.1 spec that I
believe is incorrect.  I have already submitted a spec bug:

https://www.khronos.org/bugzilla/show_bug.cgi?id=1500

v2: Move spec quote about built-in variables to the first place where
it's relevant.  Suggested by Alejandro.

v3: Move patch earlier in series, fix rebase issues.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v2]
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> [v2]
2016-05-26 16:21:01 -07:00
Ian Romanick
cfff746297 mesa: Track the additional data in gl_shader_variable
The interface type, interpolation mode, precision, the type of the
outermost structure, and whether or not the variable has an explicit
location will be used for SSO validation on OpenGL ES.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-26 16:19:16 -07:00
Jason Ekstrand
15e553daf0 nir: Make nir_const_value a union
There's no good reason for it to be a struct of an anonymous union.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96221
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-26 16:03:44 -07:00
Kenneth Graunke
e7776fa947 i965: Use the buffer object size for VERTEX_BUFFER_STATE's size field.
commit 7c8dfa78b9 (i965/draw: Use the real size for vertex
buffers) changed how we programmed the VERTEX_BUFFER_STATE size field.

Previously, we programmed it to the size of the actual underlying BO,
which is page-aligned, and potentially much larger than the GL buffer
object.  This violated the ARB_robust_buffer_access spec.

With that change, we started programming it based on the range of data
we expect the draw call to actually access - which is based on the
min_index and max_index information provided to glDrawRangeElements().

Unfortunately, applications often provide inaccurate range information
to glDrawRangeElements().  For example, all the Unreal demos appear to
draw using a range of [0, 3] when the index buffer's actual index range
is [0, 5].  Such results are undefined, and we are absolutely allowed
to restrict access to the range they specified.  However, the failure
mode is usually that nothing draws, or misrendering with wild geometry,
which is kind of bad for a common mistake.  And people tend to assume
the range information isn't that important when data is in VBOs.

There's no real advantage, either.  ARB_robust_buffer_access only
requires us to restrict access to the GL buffer object size, not
the range of data we think they should access.  Doing that allows
buggy applications to still function.  (Note that we still use this
information for busy-tracking, so if they try to overwrite the data
with glBufferSubData, they'll still hit a bug.)  This seems to be
safer.

We may want to provide the more strict range as a debug option,
or scan the VBO and warn against bogus glDrawRangeElements in
debug contexts.  That can be done as a later patch, though.

Makes Unreal demos draw again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-26 15:56:41 -07:00
Samuel Pitoiset
e01a482182 nvc0: invalidate textures/samplers between 3D and CP on Fermi
Like constant buffers, samplers and textures are aliased on Fermi and
we need to invalidate the state when switching from 3D to CP and vice
versa.

This fixes rendering issues in the UE4 demos.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-26 23:51:22 +02:00
Jason Ekstrand
9f0bc0f2b3 anv: Stop linking against libmesa.la and libdri_test_stubs.la
This brings the final size of an optimized non-debug build of the Vulkan
driver down to 2.9 MB as opposed to 8.7 MB for the dri driver.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
057259655e i965: Don't link libmesa or libdri_test_stubs into tests
Now that the compiler has been completely separated from libmesa, we no
longer need these.  We can make the tests much smaller by not linking them
in.  This also ensures that anyone who runs make check won't accidentally
put in any dependencies from the compiler to the rest of mesa core.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
870ff6cd38 i965: Move compiler debug functions to intel_screen.c
They reference the compiler so they shouldn't go in libi965_compiler.la.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
327161a48d i965/test: Remove the fragment/vertex_program field from test visitors
None of them are actually using it.  It's a relic of an older compiler
interface that required a gl_program.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
e0ae10c49a i965: Move brw_new_shader to brw_link.cpp
That's where brw_link_shader lives and they seem to go together.  Also,
this gets it out of libi965_compiler.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
5136b67915 i965: Move brw_nir_lower_uniforms.cpp to i965_FILES
This gets it out of i965_compiler.la

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
5e43ba7e9e i965: Move brw_create_nir to brw_program.c
This way it's no longer part of libi965_compiler.la since it depends on
GLSL and ARB program stuff.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
86a2447eec i965/nir: Move the type_size_*_bytes functions to brw_nir.h
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
58d1e82d32 ptn: Include nir.h
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Jason Ekstrand
32210dea8e compiler: Move glsl_to_nir to libglsl.la
Right now libglsl.la depends on libnir.la so putting it in libnir.la
adds a dependency on libglsl.la that goes the wrong direction.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:13:38 -07:00
Ben Widawsky
ddcfc35f62 i965/sklgt4: Implement depth/timestamp write w/a
The stated bug describes a scenario in which a post sync write operation for
depth or timestamp can be ignored. There are two workarounds suggested, the
first and easier is to simply do a cs stall when we do these type of writes.
The second option is to do a PIPE_CONTROL flush after the post sync but before
the data is required.

Generally, I believe the data written out is consumed by the application on the
CPU side and so doing the easier of the two is ideal. Furthermore, these queries
aren't tremendously common in the perf sensitive apps I have looked at. However,
there could be cases where a shader stage might directly consume the data, and
as a result option 2 may be desirable.

This patch goes with the easier solution for now.

gen9lp bug_de_id=2137196

By itself, this does *not* fix any of the GT4 hangs we're currently
experiencing.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-26 14:08:17 -07:00
Ben Widawsky
f1fa8b4a1c i965/bxt: Add 2x6 variant
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-26 14:06:43 -07:00
Bas Nieuwenhuizen
43d7305a40 radeonsi: Allow TES distribution between shader engines.
The R_028B50_VGT_TESS_DISTRIBUTION value is copied from
amdgpu-pro. Smaller values in the ACCUM fields seem to
decrease the performance advantage from this patch, higher
values don't seem to matter.

v2: Add distribution mode field enums.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
f91c85b29b radeonsi: Process multiple patches per threadgroup.
Using more than 1 wave per threadgroup does increase performance
generally.  Not using too many patches per threadgroup also
increases performance. Both catalyst and amdgpu-pro seem to
use 40 patches as their maximum, but I haven't really seen
any performance increase from limiting the number of patches
to 40 instead of 64.

Note that the trick where we overlap the input and output LDS
does not work anymore as the insertion of the tess factors
changes the patch stride.

v2: - Add comment about LDS assumptions.
    - Add constant for buffer size.
    - Fix code style.

v3: - Correct limits for not splitting patches between waves.
    - Set max num_patches to 40 as in the proprietary driver.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
fd0a7a382f radeonsi: Add barrier before writing the tess factors.
The factors may be stored to LDs by another invocation than
the invocation for vertex 0.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
fee3160af9 radeonsi: Enable dynamic HS.
This allows running the TES on different CU's than the
TCS which results in performance improvements.

v2: Only write the control word from one invocation.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
26f436132b radeonsi: Remove LDS layout user SGPR's from TES.
They are unused.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
a4e2146a9d radeonsi: Use buffer loads and stores for passing data from TCS to TES.
We always try to use 4-component loads, as LLVM does not combine loads
and they bypass the L1 cache.

We can't use a similar strategy for stores and this is especially
notable with the tess factors, as they are often set with separate
MOV's per component in the TGSI.

We keep storing to LDS and the LDS space, so we can load the outputs
later, either due to the shader, of for wrting the tess factors.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
6217716e8f radeonsi: Store inputs to memory when not using a TCS.
We need to copy the VS outputs to memory. I decided to do this
using a shader key, as the value depends on other shaders.

I also switch the fixed function TCS over to monolithic, as
otherwisze many of the user SGPR's need to be passed to the
epilog, which increases register pressure, or complexity to
avoid that. The main body of the fixed function TCS is not
that interesting to precompile anyway, since we do it on
demand and it is very small.

v2: Use u_bit_scan64.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
7846fa8768 radeonsi: Add offchip buffer address calculation.
Instead of creating a memory area per patch and per vertex, we put
the same attribute of every vertex & patch together. Most loads
and stores access the same attribute across all lanes, only for
different patches and vertices.

For the TCS this results in tightly packed data for 4-component
stores.

For the TES this is not the case as within a patch the loads
often also access the same vertex. However if there are < 4
vertices/patch, this still results in a reduction of the number
of cache lines. In the LDS situation we only do better than worst
case if the data per patch < 64 bytes, which due to the
tessellation factors is pretty much never.

We do not use hardware swizzling for this. It would slightly reduce
the number of executed VALU instructions, but I had issues with
increased wait times that I haven't been able to solve yet.
Furthermore, the tbuffer_store intrinsic does not support both
VGPR offset and an index, so we have a problem storing
indirectly indexed outputs. This can be solved by temporarily
storing arrays in LDS and then copying them, but I don't think
that is worth the effort. The difference in VALU cycles
hardware swizzling gives is about 0.2% of total busy cycles.
That is without handling the array case.

I chose for attributes instead of components as they are often
accessed together, and the software swizzling takes VALU cycles
for calculating offsets.

v2: - Rename functions to get_tcs_tes_buffer_address.
    - multiply by 16 as late as possible.
    - Use  tgsi_full_src_register_from_dst.
    - Remove some bad comments.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
c49e68dc4b radeonsi: Add user SGPR for the layout of the offchip buffer.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
d9a0c54f6f radeonsi: Use correct parameter index for LS_OUT_LAYOUT.
This happens to be in the right position, but that changes
when TCS/TES get new parameters.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
3e7a7a9a65 radeonsi: Add buffer load functions.
v2: - Use llvm.admgcn.buffer.load instrinsics for new LLVM.
    - Code style fixes.

v3: - Code style fix.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
9fdb778702 radeonsi: Define build_tbuffer_store_dwords earlier to support new users.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
5c34562d7c radeonsi: Add offchip tessellation parameters.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen
d27ff7d683 radeonsi: Add buffer for offchip storage between TCS and TES.
The buffer is quite large, but should only be allocated if the
application uses tessellation. Most non-games don't.

v2: - Use the correct register for SI.
    - Add define for block size.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-26 22:07:04 +02:00
Rob Clark
6e51fe75a4 tgsi: fix coverity out-of-bounds warning
CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local:
Overrunning array of 2 16-byte elements at element index 2 (byte offset
32) by dereferencing pointer &inst.Dst[i].

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 15:17:49 -04:00
Rob Clark
3d66ba971e tgsi: fix out of bounds access
Not sure why coverity calls this an out-of-bounds read vs out-of-bounds
write.

CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local:
Overrunning array r of 3 16-byte elements at element index 3 (byte
offset 48) using index chan (which evaluates to 3).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 15:17:49 -04:00
Anuj Phogat
0c02d7002d i965: Don't use fast copy blit in case of logical operations other than GL_COPY
XY_FAST_COPY_BLT command doesn't have a field for raster operation. So, fall
back to using XY_SRC_COPY_BLT to handle those cases.

Fixes piglit test gl-1.1-xor-copypixels when fast copy blit is enabled
for all tiling formats.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-26 10:57:09 -07:00
Anuj Phogat
97f0f91cc1 i965/gen9: Remove the halign/valign field setup code in fast copy blit
Experimentation with different values of src/dst horizontal/vertical
alignment showed that these fileds are not used on gen9 hardware.

A recent update in graphics specs has removed these fields from
XY_FAST_COPY_BLT command.

Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chad Versace <chad.versace@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-26 10:57:09 -07:00
Samuel Pitoiset
c52e92ec3a nvc0: allow to monitor MP perf counters with compute shaders
To read out MP perf counters we use a compute shader and need to upload
input data like a 64-bits addr used to store the values and a sequence
ID for synchronization. Currently, this input data is uploaded as user
uniforms which means that it's sticked to c0[], but if a compute shader
from a real application is used, monitoring those performance counters
will just overwrite some data and miserably crash.

Instead, sticking the 64-bits addr and the sequence into the driver
constant buffer seems like much better and will allow to monitor
counters with GL 4.3 apps.

Tested on GF119 and GK110, but should not hurt anything on GK104.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-26 19:34:57 +02:00
Kristian Høgsberg Kristensen
329d115ac6 mesa: Move robustness code to main/robustness.c
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 09:37:17 -07:00
Kristian Høgsberg Kristensen
d7d729b965 docs: Mark GL_KHR_robustness done for GLES3.2 as well
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 09:36:36 -07:00
Plamena Manolova
a0674ce5c4 egl: Additional attribute validation for eglCreatePbufferSurface
eglCreatePbufferSurface should generate an EGL_BAD_MATCH error if:
1: The EGL_TEXTURE_FORMAT attribute is EGL_NO_TEXTURE and EGL_TEXTURE_TARGET
is something other than EGL_NO_TEXTURE
2: EGL_TEXTURE_FORMAT is something other than EGL_NO_TEXTURE and
EGL_TEXTURE_TARGET is EGL_NO_TEXTURE.

This fixes the dEQP-EGL.functional.negative_api.create_pbuffer_surface test.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-26 08:02:48 -07:00
Marek Olšák
8539c9bf31 gallium/radeon: add the kernel version into the renderer string
Example:
Gallium 0.4 on AMD TONGA (DRM 3.2.0 / 4.5.0, LLVM 3.9.0)

My kernel version is pretty long already (4.5.0-amd-01025-g32791c1)
and adding "kernel" into the string would make too it long for glxinfo
to display.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-26 16:53:46 +02:00
Marek Olšák
53f33619a4 winsys/amdgpu: add back multithreaded command submission
Ported from the initial amdgpu winsys from the private AMD branch.

The thread creates the buffer list, submits IBs, and cleans up
the submission context, which can also destroy buffers.

3-5% reduction in CPU overhead is expected for apps submitting a lot
of IBs per frame. This is most visible with DMA IBs.

v2: use a semaphore instead of a busy loop in amdgpu_ws_queue_cs
    add another amdgpu_cs_sync_flush call into amdgpu_bo_map

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-26 16:43:45 +02:00
Lars Hamre
c626a86586 gallium/tgsi: use _mesa_roundevenf in micro_rnd
Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-roundeven-float
fs-roundeven-vec2
fs-roundeven-vec3
fs-roundeven-vec4
vs-roundeven-float
vs-roundeven-vec2
vs-roundeven-vec3
vs-roundeven-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-roundeven-float
gs-roundeven-vec2
gs-roundeven-vec3
gs-roundeven-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-26 07:59:15 -06:00
Emil Velikov
d519f59a9f .mailmap: use Jakob Bornecrantz's personal email
The VMware one is bouncing.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-26 13:57:32 +01:00
Ilia Mirkin
f998e5dc6b nvc0: add note about where the viewport mask would go
Not piping this all the way through yet, but no better place to note
this down. This will can be used with NV_viewport_array2.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-26 08:46:29 -04:00
Ilia Mirkin
b634936d3b nvc0: enable 32 textures on kepler+
For fermi, this likely will require use of linked tsc mode. However on
bindless architectures, we can have as many as we want. As it stands,
the AUX_TEX_INFO has 32 teture handles reserved.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-26 08:46:13 -04:00
Alejandro Piñeiro
2ed9563e79 glsl: add unit tests data vertex/expected outcome for uninitialized warning
v2: fix 025 test. Add three more tests (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-26 09:19:36 +02:00
Alejandro Piñeiro
eee00274fa glsl: add warning-test
It executes compiler-glsl on all the available shaders, and it checks
that the outcome is the expected.

Bash code based on the already existing optimization-test

v2: rebasing: use --version option

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-26 09:19:17 +02:00
Alejandro Piñeiro
68c23d2d04 glsl: add just-log option for the standalone compiler.
Add an option in order to ask to just print the InfoLog, without any
header or separator. Useful if we want to use the standalone compiler
to track only the warning/error messages.

v2: all printfs goes on its own line (Ian Romanick)
v3: rebasing: move just_log to standalone.h/cpp

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-26 08:46:05 +02:00
Alejandro Piñeiro
66ff04322e glsl: do not raise uninitialized warning with out function parameters
It silence by default warnings with function parameters, as the
parameters need to be processed in order to have the actual and the
formal parameter, and the function signature. Then it raises the
warning if needed at verify_parameter_modes where other in/out/inout modes
checks are done.

v2: fix comment style, multi-line condition style, simplify check,
    remove extra blank (Ian Romanick)
v3: inout function parameters can raise the warning too (Ian
    Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-26 08:39:17 +02:00
Alejandro Piñeiro
b9f90ef652 glsl: add a empty set_is_lhs on ast_node
Just to allow to call set_is_lhs on any ast_node without a casting. Useful
when processing a ast_node list that we know it contain ast_expression.

v2: comment out new_value to avoid unused parameter warning (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-26 08:39:07 +02:00
Dave Airlie
5b2675093e glsl: handle implicit sized arrays in ssbo
The current code disallows unsized arrays except at the end of
an SSBO but it is a bit overzealous in doing so.

struct a {
	int b[];
	int f[4];
};

is valid as long as b is implicitly sized within the shader,
i.e. it is accessed only by integer indices.

I've submitted some piglit tests to test for this.

This also has no regressions on piglit on my Haswell.
This fixes:
GL45-CTS.shader_storage_buffer_object.basic-syntax
GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO

This patch moves a chunk of the linker code down, so
that we don't link the uniform blocks until after we've
merged all the variables. The logic went something like:

Removing the checks for last ssbo member unsized from
the compiler and into the linker, meant doing the check
in the link_uniform_blocks code. However to do that the
array sizing had to happen first, so we knew that the
only unsized arrays were in the last block. But array
sizing required the variable to be merged, otherwise
you'd get two different array sizes in different
version of two variables, and one would get lost
when merged. So the solution was to move array sizing
up, after variable merging, but before uniform block
visiting.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-26 12:42:10 +10:00
Dave Airlie
4d70fd1bc7 glsl: fix error message on uniform block mismatch
This looks like a cut-paste from above.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-26 12:40:41 +10:00
Dave Airlie
c952c0e713 glsl/ast: assign explicit_xfb_buffer from correct place
This fixes:
GL44-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.data_pass_through

As the OUT_TC interface structures weren't matching because
one of them had explicit_xfb_buffer set when it shouldn't.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-26 12:17:03 +10:00
Bruce Cherniak
c8835a5924 swr: [rasterizer] Correctly select optimized primitive assembly.
Indexed primitives were always using cut-aware primitive assembly,
whether primitive_restart was enabled or not.  Correctly pass down
primitive_restart and select optimized PA when possible.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-05-25 18:47:16 -05:00
Kenneth Graunke
978ab88858 docs: Mention i965/gen8+ supports GL 4.2 in release notes. 2016-05-25 14:22:56 -07:00
Kenneth Graunke
72ba9c3160 docs: Update GL_OES_copy_image status. 2016-05-25 14:22:30 -07:00
Kenneth Graunke
0f0f357b77 i965: Enable OES_copy_image (and EXT) on Gen8+ and Baytrail.
For now, only enable it on platforms that actually support ETC2.

At this point, Broadwell is only failing 5 (out of 8358) dEQP tests:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.
   srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d
   srgb8_alpha8_rgb10_a2ui.renderbuffer_to_cubemap
   srgb8_alpha8_rgb10_a2ui.renderbuffer_to_renderbuffer
   srgb8_alpha8_rgb10_a2.renderbuffer_to_texture2d
   srgb8_alpha8_rgb9_e5.renderbuffer_to_texture3d

These fail with all methods (meta, blorp, blitter, memcpy).

All are blacklisted from the Android mustpass list, which makes me
wonder whether there's an issue with the tests.  The formats in
question work with other targets, and the targets in question work
with other formats...

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Kenneth Graunke
88a630121d i965: Implement a BLORP path for CopyImage and prefer it over Meta.
We're dropping Meta in favor of BLORP everywhere we can.

This also fixes bugs when copying cubemaps to 2D, which is currently
broken in the meta pass.  BLORP just works.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Kenneth Graunke
2822c8a078 i965: Make the CopyImage BLT path bail for stencil images.
The BLT can't handle S8 because it's W-tiled (at least without
additional funny business, and I'm not sure we care).  Disallow
it so it falls back to the CPU path, which works.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Kenneth Graunke
c51702bdc8 i965: Also copy stencil miptree data.
The Meta path handles this, but the CPU/BLT fallbacks did not.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Kenneth Graunke
45d6818021 i965: Make a helper function for CopyImage of a miptree.
Currently, it only contains the BLT/CPU fallbacks, so the name is a bit
too generic.  But eventually this will use BLORP as well, at which point
the name will make more sense.

The next patch will introduce a second call.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Kenneth Graunke
2dc98d9a15 i965: Combine src/dest tex vs. rb checks in intel_copy_image_sub_data.
This simplifies things a little - now we only have one (tex or rb?)
if-ladder for src, and a second for dst, rather than four.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Kenneth Graunke
1b39c5efca i965: Account for MinLayer in CopyImageSubData's blitter/CPU paths.
Fixes Piglit's arb_copy_image-texview test with the Meta path disabled
(so we hit the blitter/CPU fallback paths).

v2: Add MinLayer even for cube maps (suggested by Ilia).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-25 14:17:29 -07:00
Rob Clark
231dcb19f9 freedreno/ir3: cmdline compiler for glsl
Use glsl/libstandalone.la to add support for taking glsl src files (in
addition to .tgsi) as input.  Then glsl->nir and feed the result into
the ir3 backend as normal.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-25 16:31:15 -04:00
Rob Clark
0f982bb67d glsl: split out libstandalone
Split standalone glsl_compiler into a libstandalone.la and a thin
main.cpp.  This way drivers can re-use the glsl standalone frontend in
their own standalone compilers.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-25 16:31:15 -04:00
Rob Clark
ec434d940d android: drop build of standalone glsl_compiler
It's only a tool for debugging the glsl compiler, and should not be
installed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-25 16:31:15 -04:00
Matt Turner
61847d7708 i965: Mark fallthrough in switch statement.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-05-25 12:44:34 -07:00
Matt Turner
83c6749ddb i965: Assert that a depth_mt exists when using HiZ.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-05-25 12:44:34 -07:00
Matt Turner
4a5e92ac70 nir: Strengthen assertion that 'out' is nonnull.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-05-25 12:44:34 -07:00
Matt Turner
44809f2371 spirv: Mark default cases unreachable().
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-05-25 12:44:34 -07:00
Matt Turner
469a1c56a6 isl: Mark default cases unreachable.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-05-25 12:44:34 -07:00
Matt Turner
47dca31606 isl: Remove useless qualifier from return type.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-05-25 12:44:34 -07:00
Samuel Pitoiset
71c30bd87c nvc0: add descriptions for hardware perf counters/metrics
The GALLIUM_HUD does not yet expose a description for each events, but
this might be useful for developers who want to have a long description
of hw perf counters directly in the source code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-25 21:06:49 +02:00
Brian Paul
89e4de20fa mesa: 80-column wrapping for _context_lost_GetSynciv()
Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2016-05-25 12:23:12 -06:00
Brian Paul
ae7c4a6f98 mesa: add GLAPIENTRY to new _context_lost_X functions
To fix MSVC build.  Any function which goes into the dispatch table
needs to have the GLAPIENTRY (__stdcall) tag.

Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2016-05-25 12:23:12 -06:00
Giuseppe Bilotta
1b62b47f6f scons: support 2.5.0
The get_implicit_deps changed in SCons 2.5, expecting a callable rather
than a path as third argument. Detect the SCons versions and set the
argument appropriately to support both 2.5 and earlier versions.

This closes #95211.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95211
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-25 12:23:12 -06:00
Giuseppe Bilotta
8c00fe3970 scons: whitespace cleanup
This text transformation was done automatically via the following shell
command:

$ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \;

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-25 12:23:12 -06:00
Alejandro Piñeiro
8c29bba242 i965/fs: take into account doubles when emitting system values
Fixes the following cts test:
GL42-CTS.vertex_attrib_64bit.limits_test

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-25 20:14:22 +02:00
Kristian Høgsberg Kristensen
89bb4be91e i965: Fix shadowing of 'height' parameter
The nested declaration of 'height' shadows a parameter and uses
uninitialized memory. Fix by renaming to 'plane_height' which also makes
the code clearer.

This would typically break the bo size computation, but we don't use
that except when mmaping, and we don't mmap YUV buffers much.

Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-25 09:42:55 -07:00
Kristian Høgsberg Kristensen
595224f714 mesa: Add .gitignore entries for make check binaries
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
2016-05-25 09:41:44 -07:00
Kristian Høgsberg Kristensen
85008db1d5 i965: Enable GL_KHR_robustness
GL_KHR_robustness adds the GL_CONTEXT_LOST error and five new entry
points that we already implement.  This patch adds a new dispatch table
that returns GL_CONTEXT_LOST from all entry points and implements the
GL_LOSE_CONTEXT_ON_RESET strategy by setting that table when we learn
that we've lost the context.

With the GL_CONTEXT_LOST reporting in place and dispatch for the new
entry points we can turn on GL_KHR_robustness.

Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-25 09:41:44 -07:00
Emil Velikov
f036eea2cf .mailmap: Use Chia-I Wu personal e-mail.
The LunarG one is bouncing.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-25 17:38:06 +01:00
Emil Velikov
4b79f82836 .mailmap: Use my (Emil Velikov) personal e-mail.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-25 17:35:48 +01:00
Ilia Mirkin
21c1754306 docs: add missing GL_OES/EXT_gpu_shader5 enablement note
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-25 09:50:22 -04:00
Ilia Mirkin
601a5195eb glsl: add GL_EXT_clip_cull_distance define, add helpers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2016-05-25 09:50:07 -04:00
Brian Paul
9690ab0cdf tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer
Print "GEOM" instead of "2", for example.

v2: also update the text parsing code, per Ilia.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-25 07:21:23 -06:00
Brian Paul
2b773fcf00 tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-25 07:21:23 -06:00
Jason Ekstrand
998829f404 nir/spirv: Handle location decorations on structure members 2016-05-24 21:12:56 -07:00
Jason Ekstrand
961369d597 nir/spirv: Add explicit handling for all decorations
From time to time we have had cases where glslang has added a decoration we
don't handle and it has caused problems.  This audit ensures that, for
every decoration, we either handle it or hit an unreachable() with an
accurate description of why we don't have to.
2016-05-24 21:12:56 -07:00
Jason Ekstrand
6f89e51c84 i965/draw: Use the correct buffer index for interleaved VBO sizes
The buffer_range_* arrays are indexed by buffer index not element index.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-24 20:50:35 -07:00
Jordan Justen
e58fabc93a i965/gen7: Fix gl_HelperInvocation
It appears that UV immediates aren't working on Ivy Bridge. In this
case, a signed version will work, and this fixes the piglit
tests/spec/glsl-4.50/execution/helper-invocation.shader_test test.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-24 15:44:06 -07:00
Emil Velikov
e384d75b12 mesa_glinterop: make GL interop version field bidirectional
This allows clear and easy communication between the two.

Caller: Requesting information (struct vN)
Callee: I know how to deal with older version (vN-1) only. Here is your
data and the version I support.
Caller: Older version ? Sure I'll cap all access to the fields provided
by the older version (vN-1)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
0e983276b9 mesa_glinterop: drop mesa_glinterop_device_info::interop_version
One cannot use a single version to control both export_in and export_out
versions. Using this forces us to always extend/bump both structs at the
same time.

An alternative scheme is coming with next patch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
f8a114aa5c st/dri: add note about GL interop version checks
... and make them more explicit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
923bdbf48c mesa_glinterop: rename MESA_GLINTEROP_INVALID_{VALUE,VERSION}
Be more explicit what it actually does.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
c196de23ae mesa_glinterop: s/struct_version/version/
OCD polish for consistency with other mesa interfaces.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
cb0708c843 mesa_glinterop: fix GL interop *_VERSION comments
Using the macro to set the version is wrong and ill-advised. Please don't
do it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
a3eb8702fb mesa_glinterop: remove inclusion of EGL header
Analogous to previous commit, but for EGL.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
8472045b16 mesa_glinterop: remove inclusion of GLX header
Since we only need partial information about the GLX symbols we can
forward declare them and drop the include. Obviously each user of the
said API will needs more than what's provides, so they'll include the
GLX header.

If they don't, the compiler will give us a nice warning ;-)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
b5f9820d90 mesa_glinterop: remove unneeded GLAPI/GLAPIENTRY/APIENTRYP symbols
These come from windows.h, gl.h, glcorearb.h and/or glext.h.

The interop interface is aimed at non-Windows platforms while the macros
are used/derived due to Windows specifics. Thus we can safely remove
them.

Strictly speaking there should be GLXAPIENTRY/EGLAPIENTRY and alike
macros, although a) there is no GLX ones and b) this brings us even
further from decoupling the file from the GLX/EGL header dependency.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:03:00 +01:00
Emil Velikov
bcf9e47653 mesa_glinterop: replace GL types with their native counterpart.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:02:56 +01:00
Emil Velikov
2e726144f9 mesa_glinterop: use generic variable types for the GL interop
Thus we can preserve the ABI, while avoiding the inclusion of some/all
of the following:

  EGL/egl.h
  GL/gl.h
  GL/glcorearb.h
  GLES/gl.h
  GLES2/gl2.h
  GLES3/gl3.h
  GLES3/gl31.h

This will allow us to build/use it alongside any combination of APIs.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:02:08 +01:00
Emil Velikov
cbf29d90ba mesa_glinterop: use consistent naming scheme for GL interop
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:02:08 +01:00
Emil Velikov
0d31bfd71a Revert "mesa: Build EGL without X11 headers after interop patchset"
This reverts commit 4e2c9a0435.

The solution was incomplete and fragile. An alternative one is coming
shortly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-24 23:02:05 +01:00
Ian Romanick
c8d9ed5ea1 docs: Note that GL_OES_geometry_shader and GL_OES_tessellation_shader are started
The GL_OES_geometry_shader work is on the oes_shader_io_blocks branch
of idr's fd.o repository.

The GL_OES_tessellation_shader work is on the tess-gles branch
of kwg's fd.o repository.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-24 12:45:46 -07:00
Emil Velikov
7e196cd170 c11/threads: resolve link issues with -O0
Add weak symbol notation for the pthread_mutexattr* symbols, thus making
the linker happy. When building with -O1 or greater the optimiser will
kick in and remove the said functions as they are dead/unreachable code.

Ideally we'll enable the optimisations locally, yet that does not seem
to work atm.

v2: Add the AX_GCC_FUNC_ATTRIBUTE([weak]) hunk in configure.

Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2016-05-24 20:21:31 +01:00
Tim Rowley
0ceed1701d swr: [rasterizer] remove containers.hpp
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-24 13:29:37 -05:00
Tim Rowley
1e3e22efb5 swr: [rasterizer core] remove utility dead code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-24 13:29:29 -05:00
Tim Rowley
dc34479b8c swr: [rasterizer core] buckets fixes
1. Don't clear bucket descriptions to fix issues with sim level
   buckets getting out of sync.

2. Close out threadviz file descriptors in ClearThreads().

3. Skip buckets for jitter based buckets when multithreaded. We need
   thread local storage through llvm jit functions to be fixed before
   we can enable this.

4. Fix buckets StopCapture to correctly detect capture complete.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-24 13:29:21 -05:00
Tim Rowley
3074a2b4fa swr: [rasterizer core] move centroid setup out of CalcCentroidBarycentrics
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-24 13:29:14 -05:00
Tim Rowley
9a2a4ecb39 swr: [rasterizer jitter] implement InstanceID/VertexID in fetch jit
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-24 13:28:47 -05:00
Ian Romanick
7fc4a82007 mesa: Silence unused parameter warnings
Neither shProg nor name was used.  Remove them both.

main/shader_query.cpp:779:53: warning: unused parameter ‘shProg’ [-Wunused-parameter]
 program_resource_location(struct gl_shader_program *shProg,
                                                     ^
main/shader_query.cpp:780:72: warning: unused parameter ‘name’ [-Wunused-parameter]
                           struct gl_program_resource *res, const char *name,
                                                                        ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-24 11:04:08 -07:00
Ian Romanick
78399cf170 glsl/linker: Silence unused parameter warning
The parameter is required for the interface.

glsl/link_uniforms.cpp:689:61: warning: unused parameter ‘record_type’ [-Wunused-parameter]
                             bool row_major, const glsl_type *record_type,
                                                             ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-24 11:04:05 -07:00
Kristian Høgsberg Kristensen
2bb935be2e dri: Add YVU formats
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen
1be1114e6b i965: Allow creating planar YUV __DRIimages
Lift the resctriction we had before and allow creation of images with
multiple planes. We still require all the planes to be within the same
bo.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen
654e950cba i965: Invoke lowering pass for YUV textures
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen
44997fc0c1 i965: Support textures with multiple planes
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen
3352f2d746 i965: Create multiple miptrees for planar YUV images
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen
6eede87631 i965: Refactor intel_set_texture_image_bo() to create_mt_for_dri_image()
This function now only creates the mt and we then call
intel_set_texture_image_mt() in intel_image_target_texture_2d() to set
it for the texture image.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen
8ceb7c7d9b i965: Use intel_set_texture_image_mt() in intelSetTexBuffer2()
Create the mt for the drawable bo directly and call our new
intel_miptree_create_for_bo() helper instead.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen
40e9be4a5c i965: Add new intel_set_texture_image_mt() helper
This factors out the work of setting up a miptree as the backing for a
texture image into a new helper.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen
a41b57679f nir: Add a lowering pass for YUV textures
This lowers sampling from YUV textures to 1) one or more texture
instructions to sample each plane and 2) color space conversion to RGB.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen
50c24c3ff3 nir: Handle NULL in nir_copy_deref()
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen
29921ee987 nir: Add new 'plane' texture source type
This will be used to select the plane to sample from for planar
textures.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-24 10:14:56 -07:00
Brian Paul
39b7b8b906 mesa: log buffer ID numbers in decimal, not hexadecimal
All the other error messages use decimal.  Let's be consistent.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-24 10:26:26 -06:00
Brian Paul
ce1cc70e27 mesa: use enum name in bind_buffer_object() error message
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-24 10:26:26 -06:00
Brian Paul
55c19527a6 mesa: raise error for glEnable(GL_VERTEX_ARRAY), etc. in core profile
Otherwise, if the call executes normally we'll hit an assertion later
in the VBO code when we draw something.  Note that these cases were
already handled correctly for the glIsEnabled() function (and the API
checks were copied from there).

Tested with new piglit gl-3.1-enable-vertex-array test.

v2: fix compat/es mix-up, per Ilia.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-24 10:26:26 -06:00
Nicolas Boichat
a9b2b5e241 docs/egl: Android platform can also be build using autotools
We added support for Android build using autotools (configure),
update the documentation to reflect that.

Signed-off-by: Nicolas Boichat <drinkcat@google.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-24 16:24:54 +01:00
Juan A. Suarez Romero
e79aa19d88 i965: fix double-precision vertex inputs measurement
For double-precision vertex inputs we need to measure them in dvec4
terms, and for single-precision vertex inputs we need to measure them in
vec4 terms.

For the later case, we use type_size_vec4() function. For the former
case, we had a wrong implementation based on type_size_vec4().

This commit introduces a proper type_size_dvec4() function, that we use
to measure vertex inputs.

Measuring double-precision vertex inputs as dvec4 is required because
ARB_vertex_attrib_64bit states that these uses the same number of
locations than the single-precision version. That is, two consecutives
dvec4 would be located in location "x" and location "x+1", not "x+2".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-24 10:06:29 +02:00
Ilia Mirkin
ccd58015a2 docs: true up nvc0 status - images, etc
Images aren't supported on maxwell, but neither is tessellation. Don't
overly confuse matters by trying to expose those subtleties in the
GL3.txt file/relnotes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Dave Airlie <airlied@redhat.com>
2016-05-23 23:47:11 -04:00
Ilia Mirkin
856587909c st/mesa: enable ARB_ES3_1_compatibility when ES 3.1 would be exposed
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-23 23:47:11 -04:00
Ilia Mirkin
5878254545 mesa: remove separate enable for KHR_robust_buffer_access_behavior
This extension appears to be a strict subset of the ARB version. Also
remove it from GL3.txt since it doesn't seem relevant.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 23:47:11 -04:00
Timothy Arceri
72449c477e glsl: add support for explicit components to frag outputs
V2: fix error checking for arrays and components. V1 was
only taking into account all the array elements and all the
components of one of the varyings during the comparision
and treating the other as a single slot/component.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-24 12:46:48 +10:00
Ilia Mirkin
37266dfb7c mesa: add view classes for 3d astc formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-23 22:34:37 -04:00
Ilia Mirkin
979bcb9f42 glsl: add EXT_clip_cull_distance support based on ARB_cull_distance
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-23 22:22:06 -04:00
Ilia Mirkin
f236f1f506 nvc0: expose robust buffer access
We apparently pass all the relevant CTS tests. There are probably some
shortcomings, but they can be addressed down the line.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-23 22:22:05 -04:00
Jason Ekstrand
9f5ccaf4dc i965: Use ISL for surface format introspection
With this, we can delete the surface format table in brw_surface_formats.c
because all of the information we need is now in ISL.
2016-05-23 19:12:34 -07:00
Jason Ekstrand
d68acde1cb anv/formats: Use isl_format_supports* for format introspection 2016-05-23 19:12:34 -07:00
Jason Ekstrand
7374d006b6 isl: Add per-gen format introspection
This is just a copy-and-paste from brw_surface_formats.c.  For the
supports_vertex_fetch function, we do a bit more work so that it properly
handles Bay Trail.
2016-05-23 19:12:34 -07:00
Jason Ekstrand
03a82dc5d1 isl: Add the ISL_FORMAT_R32G32_FLOAT_LD format 2016-05-23 19:12:34 -07:00
Jason Ekstrand
35a514e6ff isl: Add support for quering the string name of a format 2016-05-23 19:12:34 -07:00
Jason Ekstrand
75d10dff0b i965: Enable ARB/KHR_robust_buffer_access_behavior on BYT and HSW+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
1a092fcf3b main: Add extension enable bits for KHR_robust_buffer_access_behavior
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
66e137ecf1 nir/lower_samplers: Protect against sampler index overflow
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
27b9481d03 glsl: Add an option to clamp block indices when lowering UBO/SSBOs
This prevents array overflow when the block is actually an array of UBOs or
SSBOs.  On some hardware such as i965, such overflows can cause GPU hangs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
ac242aac3d glsl/linker: Add a helper variable for compiler options
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
aec10a1d5b i965/draw: Use the real size for index buffers
Previously, we were using the size of the whole BO which may be
substantially larger than the actual index buffer size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
7c8dfa78b9 i965/draw: Use the real size for vertex buffers
Previously, we were using the size of the BO which may be substantially
larger than the actual vertex buffer size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
a643bc6246 i965/draw: Use 3-channel formats for vertex fetch when possible.
For a long time, several of the 3-channel vertex formats didn't exist so we
faked them with 4-channel versions.  Starting with Sandy Bridge, we can use
R16G16B16_FLOAT and 8 and 16-bit integer formats become available on
Haswell and Bay Trail.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
ab3d8d5ea4 i965/surface_formats: Update the VB column for new formats added on BYT
Bay Trail and Haswell added a bunch of new vertex formats.  There was also
the addition of 64-bit passthrough formats for BDW+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
d5b4ab2c5f i965/draw: Properly handle rounding when dividing by InstanceDivisor
The old code always divided rounded down and then subtracted 1.  What we
wanted was to divide rounded up and then subtract 1 which is equivalent to
subtracting 1 and then dividing rounded down.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
ad42ab473c i965/draw: Account for BaseInstance in VBO bounds
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
ad3deec8ca i965/draw: Use worst-case VBO bounds if brw->num_instances == 0
Previously, we only handled the "I don't know what's going on" case for
things with InstanceDivisor == 0.  However, in the DrawIndirect case we can
get num_instances == 0 and we don't know what's going on with the instanced
ones either.  This commit makes the worst-case bound the default and then
conservatively tightens the bound.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
8892519751 i965/draw: Delay when we get the bo for vertex buffers
The previous code got the BO the first time we encountered it.  However,
this can potentially lead to problems if the BO is used for multiple arrays
with the same buffer object because the range we declare as busy may not be
quite right.  By delaying the call to intel_bufferobj_buffer, we can ensure
that we have the full range for the given buffer.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
a01a1eb9e4 i965/draw: Stop relying on min_index == -1 for invalid index bounds
The vbo layer passes an index_bounds_valid flag that we should be using
instead.  This also fixes a bug when min_index == -1 and basevertex != 0
where we were actually comparing min_index + basevertex == -1 which was
false and we were getting the wrong buffer-sizing path.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
a7011922f1 vbo: Declare the index range invalid for DrawTransformFeedback
Right now, we're setting the range to [0, 0] which is obviously bogus.
Instead, we should set it to be invalid like we do for DrawIndirect.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Jason Ekstrand
df6ec2aba5 vbo: Declare the index range invalid for DrawIndirect
Right now, we're just setting the range to [0, MAX_UINT32] which, while
correct isn't helpful.  With DrawIndirect, you can't really know what the
actual range is so we may as well flag it as being an invalid range.  This
is what we do for draws with index buffer which is similar (the indices
aren't statically known) if a bit simpler.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Ilia Mirkin
21f3df0820 mesa/teximage: fix GL_FLOAT in comment
Noticed by Brian. Trivial.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-23 21:44:41 -04:00
Timothy Arceri
2d9308012c glsl: fix explicit location validation for doubles
Previously we would fail to find a match for the second half of a
dvec4 as 'i' would get incremented to 1 before we added the var to
the array at component 0.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-24 11:30:51 +10:00
Dave Airlie
33397bf7fd docs: update ARB_cull_distance status.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 11:27:58 +10:00
Dave Airlie
5c10d47bae st/mesa: reenable culling
Now the lowering pass is fixed, reenable ARB_cull_distance.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 11:27:54 +10:00
Dave Airlie
a88c5d7e55 i965: reenable ARB_cull_distance.
Now the lowering pass is fixed we can reenable culling.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 11:27:29 +10:00
Dave Airlie
a08c4ebbe8 glsl: rewrite clip/cull distance lowering pass
The last version of this broke clipping, and I had to spend
sometime getting this working properly.

I had to introduce a third pass to count the clip/cull totals,
all due to one messy corner case. We have a piglit test
tes-input-gl_ClipDistance.shader_test
that doesn't actually output the clip distances, it just passes
them like a varying from TCS->TES, the older lowering pass worked
but to lower clip/cull we need to know the total number of clip+culls
used to defined the new variable correctly, and to offset culls
properly.

This adds an extra pass that works out the sizes for clip/cull,
then lowers gl_ClipDistance then gl_CullDistance into the new
gl_ClipDistanceMESA.

The pass checks using the fixed array sizes code if they array
has been referenced, or is actually never used, and ignores
it in the latter case.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 11:27:29 +10:00
Dave Airlie
8c628ab13e glsl: make max array trackers ints and use -1 as base. (v2)
This fixes a bug that breaks cull distances. The problem
is the max array accessors can't tell the difference between
an never accessed unsized array and an accessed at location 0
unsized array. This leads to converting an undeclared unused
gl_ClipDistance inside or outside gl_PerVertex to a size 1
array. However we need to the number of active clip distances
to work out the starting point for the cull distances, and
this offset by one when it's not being used isn't possible
to distinguish from the case were only the first element is
accessed. I tried to use ->used for this, but that doesn't
work when gl_ClipDistance is part of an interface block.

So this changes things so that max_array_access is an int
and initialised to -1. This also allows unsized arrays to
proceed further than that could before, but we really shouldn't
mind as they will get eliminated if nothing uses them later.

For initialised uniforms we no longer change their array
size at runtime, if these are unused they will get eliminated
eventually.

v2: use ralloc_array (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 11:27:29 +10:00
Nanley Chery
2ae493d686 anv/formats: Make alpha blending a property of render targets
In agreement with the SNB PRM, alpha blending is a property that render
targets may or may not support.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 17:26:17 -07:00
Nanley Chery
9721be6681 i965: Unset alpha blend for R10G10B10_SNORM_A2_UNORM
This format does not support alpha blending, according to the SNB PRM.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 17:26:17 -07:00
Dave Airlie
8b89c92ef6 i965: deindent blorp code.
gcc6 warns about this.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 10:14:31 +10:00
Dave Airlie
e257284481 glsl: reindent line in ast_function.cpp
This fixes a warning with gcc -Wmisleading-indentation.

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 10:14:31 +10:00
Ilia Mirkin
82d756f3af mesa: allow GL_FRAMEBUFFER_DEFAULT_LAYERS to be queried with ES geometry
When we have the geometry extensions, enable querying of the new param.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-23 20:03:40 -04:00
Ilia Mirkin
2dabd49704 mesa: allow xfb to be active in GLES when geometry shader is enabled.
OES_geometry_shader has wording to allow xfb when using Draw*Indirect
and DrawElements.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-23 20:03:20 -04:00
Ilia Mirkin
2e8e1e8909 main: check driver float texture support before upgrading to 16F/32F
When passing in GL_RGBA or other base formats, we will try to upgrade
the format to whatever the passed in type was. However not all drivers
(notably nv30) support 32F textures, and so this would lead to crashes
down the line. Only upgrade when the relevant extensions are available.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-23 20:00:39 -04:00
Ilia Mirkin
1e99a46b44 st/mesa: update inst->info along with inst->op
Otherwise we still have TGSI_OPCODE_CMP's info, which causes a number of
later logic to go wrong. This fixes

dEQP-GLES2.functional.shaders.functions.control_flow.return_in_if_vertex

on nv30.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-23 19:58:53 -04:00
Bas Nieuwenhuizen
533d1e9085 glsl: Use correct mode for split components.
The mode should stay the same as the original struct. In
particular, shared should not be changed to temporary.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-24 09:55:38 +10:00
Kenneth Graunke
1c1873b93b mesa: Implement glGet*(GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED).
Technically, this was introduced with GL 4.4.  However, I believe it
was intended to be retroactive.  As far as I know, AMD has never
supported primitive restart with patches, while NVidia and Intel do.
This necessitated the need for a query which would allow applications
to figure out whether this was usable or not.

I decided to expose it everywhere ARB_tessellation_shader is exposed.
(It's also in both OES and EXT_tessellation_shader.)

Enable this for i965 and Gallium drivers which expose the capability.

v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin).

Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-23 16:44:22 -07:00
Kenneth Graunke
70048eb1e3 gallium: Add a pipe cap for whether primitive restart works for patches.
Some hardware supports primitive restart on patch primitives, and other
hardware does not.  Modern GL and ES include a query for this feature;
adding a capability bit will allow us to answer it.

As far as I know, AMD hardware does not support this feature, while
NVIDIA and Intel hardware does.  However, most Gallium drivers do not
appear to support tessellation shaders yet.  So, I've enabled it for
nvc0 and disabled it everywhere else.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-23 16:44:11 -07:00
Francisco Jerez
015035027b i965/fs: Mark UBO uniform pull constant loads as force_writemask_all.
This lets the rest of the backend know that the uniform pull constant
load opcodes don't respect channel enables -- Without this the
register allocator has no way to know that the return payload of a
pull constant load is not per-channel and spills of the destination
will be broken under non-uniform control flow.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:07:23 -07:00
Francisco Jerez
7eb4966887 i965/fs: Allow spilling of non-contiguous registers.
This should be working fine now.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:21 -07:00
Francisco Jerez
6fc5dd5b6a i965/fs: Calculate the (un)spill block size correctly.
Currently the spilling code attempts to guess the scratch message
block size from the dispatch width of the shader, which is plain wrong
for SIMD-lowered instructions (frequently but not exclusively
encountered in SIMD32 shaders) or for instructions with register
region data types of size other than 32 bit.

Instead try to use the SIMD component size of the instruction which in
some cases will allow the dataport to apply the correct channel mask
to the scratch data read or written.  In the spill case the block size
needs to be clamped to the number of MRF registers reserved for
spilling.  In the unspill case I didn't even bother because we
currently have no 100% accurate way to determine whether a source
region is per-channel or whether it contains things like headers that
don't respect channel boundaries -- That's fine, because the unspill
is marked force_writemask_all we can just use the largest allowable
scratch message size.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:21 -07:00
Francisco Jerez
11260cc54f i965/fs: Set exec_all on spills not matching the channel layout of the instruction.
This prevents the application of an incorrect channel mask by the
scratch write instruction for spilled variables that don't have an
exact one-to-one correspondence between channels of the variable and
32-bit components of the scratch write instruction.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:21 -07:00
Francisco Jerez
bb67c467a4 i965/fs: Set exec_all on unspills.
This makes sure that unspills restore the exact contents of the
variable in scratch space into the GRF without applying channel
masking, which is incorrect under control flow for things like message
headers or vectors of heterogeneous types that don't properly respect
channel boundaries.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:20 -07:00
Francisco Jerez
07e67cc266 i965/fs: Move scratch block size calculation into the caller of emit_(un)spill.
This makes emit_(un)spill even more stupid by removing the logic that
decides what execution size each scratch read or write send message
should have and instead relying on the caller to specify an
appropriate execution size via the builder argument.  This makes sense
because the caller will need to act differently based on the scratch
message width (e.g. emit an additional unspill before the instruction
if the execution width and channel layout of the spill doesn't match
the instruction's).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:20 -07:00
Francisco Jerez
284c8fbcef i965/fs: Make emit_spill/unspill static functions taking builder as argument.
This seems cleaner than exposing an implementation detail of
brw_fs_reg_allocate.cpp to the world, and will give the caller control
over the instruction execution flags (e.g. force_writemask_all) that
are applied to the scratch read and write instructions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:20 -07:00
Francisco Jerez
70023c40c6 i965/fs: Apply execution controls from the instruction to scratch messages.
Until now the execution controls (e.g. channel group,
force_writemask_all, exec_size) of the instruction had been completely
ignored by spilling, even though that can lead to a mismatch between
the channel mask applied to the contents of the (un)spilled memory and
the GRF source or destination of the instruction.  In some cases we'll
actually want the (un)spill messages to be marked force_writemask_all
regardless of whether the instruction has it set, but that will have
to be handled specially by the caller.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:20 -07:00
Francisco Jerez
e98cf03114 i965/fs: Fix signedness of local variables and arguments of emit_(un)spill.
To avoid some some spurious warnings about comparison signedness in
the following commits.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:20 -07:00
Francisco Jerez
f471d3eede i965/fs: Factor out calculation of the block of MRFs reserved for spilling.
And as we're at it fix the calculation to allocate a larger block of
registers for 32-wide dispatch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 14:05:20 -07:00
Plamena Manolova
21edd24c0d egl: Add OpenGL_ES to API string regardless of GLES version
According to the EGL specifications eglQueryString(EGL_CLIENT_APIS)
should return a string containing a combination of "OpenGL", "OpenGL_ES"
and "OpenVG", any other values would be considered invalid. Due to this
when the API string is constructed, the version of GLES should be
disregarded and "OpenGL_ES" should be attached once instead of
"OpenGL_ES2" and "OpenGL_ES3".

Fixes:
dEQP-EGL.functional.negative_api* and
dEQP-EGL.functional.query_context.simple.query_api

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-23 13:46:01 -07:00
Rob Clark
46ff17559b freedreno/ir3: disable cp for indirect src's
The variable-indexing tests always had a few random fails, which I
usually couldn't reproduce when running tests manually.  Somehow
recently this got a lot worse.  I ported a couple of the shaders to
GLES to see what blob does, and it also seems to be avoiding to cp
indirect srcs.  So I guess indirect w/ instructions other than cat1
(mov) are not totally reliable.  Let's just switch that off until
this is better understood.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-23 15:57:13 -04:00
Samuel Pitoiset
c3c4370299 nvc0: do not invalidate compute constbufs on Kepler
Constbufs are only aliased on Fermi and this will reduce the number of
flushes when we switch between 3d and compute.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-23 20:56:29 +02:00
Rob Clark
5245d845b6 nir/validate: fix null deref coverity warning
CID 1265536 (#1 of 2): Explicit null dereferenced (FORWARD_NULL)6.
var_deref_op: Dereferencing null pointer parent.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-23 10:14:50 -04:00
Nicolas Boichat
0cbc90c57c mesa: dri: Add shared glapi to LIBADD on Android
/system/vendor/lib/dri/*_dri.so actually depend on libglapi: without
this, loading the so file fails with:
cannot locate symbol "__emutls_v._glapi_tls_Context"

On non-Android (non-bionic) platform, EGL uses the following
workflow, which works fine:
  dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL);
  dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL);

However, bionic does not respect the RTLD_GLOBAL flag, and the dri
library cannot find symbols in libglapi.so, so we need to link
to libglapi.so explicitly. Android.mk already does this.

Signed-off-by: Nicolas Boichat <drinkcat@google.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: s/explicitely/explicitly/]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-23 13:25:51 +01:00
Nicolas Boichat
27d713a004 configure.ac: Add support for Android builds
Add support for EGL android platform.

Also, detect when --host finishes with -android. In that case, we
do not set _GNU_SOURCE, and define autoconf symbol HAVE_ANDROID, so
that Android-specific workarounds can be applied.

Signed-off-by: Nicolas Boichat <drinkcat@google.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: Rebase on top of HAVE_EGL_PLATFORM_NULL removal]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-23 13:23:39 +01:00
Emil Velikov
960d854a98 anv: remove define _DEFAULT_SOURCE
The build systems already add this as applicable. There's no need to
have this in the source file.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-23 12:09:11 +01:00
Emil Velikov
1b64d1247d gbm: remove define _DEFAULT_SOURCE
The build systems already add this as applicable. There's no need to
have this in the source file.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-23 12:09:11 +01:00
Emil Velikov
efe4beb717 gbm: remove define _BSD_SOURCE
The build systems already add this as applicable. There's no need to
have this in the source file.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-23 12:09:11 +01:00
Jiri Slaby
a6ce91fe52 glxcmds: glXGetFBConfigs, fix screen bounds
Bounds of screen are 0 (inclusive) and ScreenCount(dpy) (exclusive).
The upper bound was too ScreenCount(dpy) (inclusive).

This causes a crash invoked by java3d which passes down an invalid
screen:
6  0x00007f0e5198ba70 in <signal handler called> () at /lib64/libc.so.6
7  0x00007f0e14531e14 in glXGetFBConfigs (dpy=<optimized out>, screen=1, nelements=nelements@entry=0x7f0dab3c522c) at glxcmds.c:1660
8  0x00007f0e14532f7f in glXChooseFBConfig (dpy=<optimized out>, screen=<optimized out>, attribList=0x7f0dab3c54e0, nitems=0x7f0dab3c535c) at glxcmds.c:1611
9  0x00007f0e1478d29b in find_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so
10 0x00007f0e1478d3dc in find_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so
11 0x00007f0e1478d567 in find_AA_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so
12 0x00007f0e1478d728 in find_DB_AA_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so
13 0x00007f0e1478d97c in Java_javax_media_j3d_X11NativeConfigTemplate3D_chooseOglVisual () at /usr/lib64/libj3dcore-ogl.so

While ScreenCount(dpy) is actually 1:
(gdb) p dpy->nscreens
$2 = 1
screen=1 is passed to glXGetFBConfigs.

Fix this typo in glXGetFBConfigs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95456
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-23 12:07:47 +01:00
Elie TOURNIER
0f738fa23e doxygen: Add missing modules to Windows runner
Acked-by: Rhys Kidd <rhyskidd@gmail.com>
2016-05-23 12:07:47 +01:00
Emil Velikov
793574afad egl: add missing link against $(CLOCK_LIB)
Some platforms require separate library in order to resolve the
clock_gettime() symbol. Add the link or the build will fail.

Fixes: 70299474f5 ("egl: add EGL_KHR_reusable_sync to egl_dri")

Cc: Dongwon Kim <dongwon.kim@intel.com>
Reported-by: Pali Rohár <pali.rohar@gmail.com>
Tested-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-23 12:07:47 +01:00
Emil Velikov
d67e757d11 egl: android: remove explicit glFlush call
The DRI flush extension should already do the same thing.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Herring <robh@kernel.org>
2016-05-23 12:07:47 +01:00
Emil Velikov
9b3c7481c6 egl: android: drop dri2_create_image_android_native_buffer argument
The drv is no longer used/needed as of last commit.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
2016-05-23 12:07:47 +01:00
Emil Velikov
38ef6f5f60 egl: android: directly use dri2_create_image_dma_buf()
Make the function non static so that we can use it directly from the
android platform code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
2016-05-23 12:07:47 +01:00
Emil Velikov
2cd687ce97 configure.ac: error out when building from git without python3
Bail early, as opposed to later on during the build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-23 12:07:47 +01:00
Emil Velikov
a155cdaace vl/drm: don't call close(-1) in vl_drm_screen_create error path
Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-05-23 12:07:47 +01:00
Emil Velikov
ed3f6ccce0 st/xa: don't call close(-1) in xa_tracker_create error path
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-05-23 12:07:46 +01:00
Emil Velikov
6e00a1e6cb st/dri: don't call close(-1) in dri{2, kms_}_init_screen error path
Add separate labels and jump to the correct one as needed.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-05-23 12:07:46 +01:00
Eric Engestrom
7362bb3e21 vk/intel: use negative VK_NO_PROTOTYPES scheme
3d0fac7aca changed all
VK_PROTOTYPES to VK_NO_PROTOTYPES
This brings the Intel header in line with the rest of the Vulkan code.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-05-23 12:07:46 +01:00
Rob Herring
8aeb6d768b gbm: Add map/unmap functions
This adds map and unmap functions to GBM utilizing the DRIimage extension
mapImage/unmapImage functions or existing internal mapping for dumb
buffers. Unlike prior attempts, this version provides a region to map and
usage flags for the mapping. The operation follows the same semantics as
the gallium transfer_map() function.

This was tested with GBM based gralloc on Android.

Signed-off-by: Rob Herring <robh@kernel.org>
[Emil Velikov: drop no longer relevant hunk from commit message.]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:46 +01:00
Rob Herring
1f4869a208 configure.ac: add pthreadstubs support
Add pthreadstubs to avoid pulling in full pthreads library. GBM will be the
first user.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:46 +01:00
Rob Herring
0a4275b534 gbm: rename gbm_dri_bo_{map,unmap} to gbm_dri_bo_{map,unmap}_dumb
In preparation to add public map/unmap functions, rename the existing
gbm_dri_bo_{map,unmap} functions to indicate that they are only for dumb
buffers.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-23 12:07:46 +01:00
Rob Herring
e8431a630d st/dri: Add support for DRIimage extension mapImage/unmapImage
Implement support for mapImage/unmapImage functions in version 12 of the
DRIimage extension.

Signed-off-by: Rob Herring <robh@kernel.org>
[Emil Velikov: align/indent the map/unmap vfuncs]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:46 +01:00
Rob Herring
a0f06f168f DRI: Add DRIimage map and unmap functions
Add mapImage and unmapImage functions to DRIimage extension for mapping
and unmapping DRIimages for CPU access. The caller provides the region of
the image to map and is returned a pointer to the beginning of the region
and the stride (which could be different from the original).

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:46 +01:00
Rob Herring
bdfa635f72 gbm: Add Android build support
In order to use libgbm for gralloc, add it to the Android build.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:46 +01:00
Rob Herring
64a005e3ee gbm: add Android gallium_dri.so library loading support
GBM needs the same special gallium_dri.so loading as EGL for Android, so
copy over the same hunk from the EGL code.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:46 +01:00
Rob Herring
7d79eec456 gbm: split out source file to Makefile.sources
In preparation to add Android build support, split out the source file
lists to Makefile.sources

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>

[Emil Velikov: Whitespace cleanup.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-23 12:07:46 +01:00
Rob Herring
fc1806e041 Android: Move setting DEFAULT_DRIVER_DIR to shared location
Move the defining of DEFAULT_DRIVER_DIR path to a common location so both
EGL and GBM can use it.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-23 12:07:45 +01:00
Emil Velikov
6ce11e7e2c c11/threads: create mutexattrs only when needed
If the mutexattrs are the default one can just pass NULL to
pthread_mutex_init. As the compiler does not know this detail it
unnecessarily creates/destroys the attrs.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-23 12:07:45 +01:00
Andres Gomez
4424bf5da4 configure: added xcb to dri3 modules to pkg-conf
This fixes a recent linking error in libvulkan_common

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-05-23 11:21:34 +02:00
Juan A. Suarez Romero
3c9096eea4 glsl/linker: dvec3/dvec4 consume twice input vertex attributes
From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):

"A program with more than the value of MAX_VERTEX_ATTRIBS
active attribute variables may fail to link, unless
device-dependent optimizations are able to make the program
fit within available hardware resources. For the purposes
of this test, attribute variables of the type dvec3, dvec4,
dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
count as consuming twice as many attributes as equivalent
single-precision types. While these types use the same number
of generic attributes as their single-precision equivalents,
implementations are permitted to consume two single-precision
vectors of internal storage for each three- or four-component
double-precision vector."

This commits makes dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4,
dmat4x3 and dmat4 consume twice as many attributes as equivalent
single-precision types.

v3: count doubles as consuming two attributes (Dave Airlie)
v4: make reference to spec (Michael Schellenberger Costa)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2016-05-23 10:48:07 +02:00
Francisco Jerez
b46867cd37 i965/fs: do not depend on std140 alignment rules for UBO loads
The previous implementation relied on the std140 alignment rules to
avoid handling misalignment in the case where we are loading more than
2 double components from a vector, which requires to emit a second load
message.

This alternative implementation deals with misalignment and is more
flexible going forward.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-05-23 08:56:57 +02:00
Iago Toral Quiroga
38b719d624 nir: handle double-precision in fsign, fsat, fnot and frcp
I think these are not strictly necessary since the floats in them
should be automatically promoted to doubles when operated with
double sources, but it makes things more explicit at least.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-23 08:54:37 +02:00
Iago Toral Quiroga
3f73039ade nir: handle double-precision in fabs, frsq and fsqrt
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-23 08:54:28 +02:00
Dave Airlie
3466db3969 glsl/parser: handle multiple layout sections with AST nodes.
For geometry/compute inputs and tess control outputs, we create
an AST node to keep track of some things. However if we have
multiple layout sections, we don't ever link the node into the AST.

This is because we create the node on the rightmost layout declaration
and don't pass it back in so it gets linked at the end of the parsing
of the rightmost.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:20:01 +10:00
Dave Airlie
aaa69c79cd glsl: allow layout qualifier overrides with ARB_shading_language_420pack
GLSL 4.20 allows overriding the layout qualifiers.

This helps fix:
GL45-CTS.shading_language_420pack.qualifier_override_layout

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
6f2dc0d044 subroutines: handle explicit indexes properly
The code didn't deal with explicit function indexes properly.
It also handed out the indexes at link time, when we really
need them in the lowering pass to create the correct if ladder.

So this patch moves assigning the non-explicit indexes earlier,
fixes the lowering pass and the lookups to get the correct values.

This fixes a few of:
GL45-CTS.explicit_uniform_location.subroutine-index-*

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
5fe912831c mesa/subroutines: fix reset on bindpipeline
Fixes:
GL45-CTS.shader_subroutine.subroutine_uniform_reset

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
7fa0250f94 mesa/subroutines: count number subroutines properly.
The code was implementing the ACTIVE_SUBROUTINE_UNIFORMS
incorrectly, using the number of types not the number of
uniforms. This is different than the locations as the
locations may be sparsly allocated.

This fixes:
GL43-CTS.shader_subroutine.four_subroutines_with_two_uniforms

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
22db9b10eb mesa/subroutines: don't generate error in GetSubroutineIndex.
GLSL spec says this doesn't generate an error.

Fixes:
GL45-CTS.explicit_uniform_location.subroutine-loc

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
3b8b6be7bb glsl/ast: for geom shaders allow stream flags in input flags.
This fixes:
GL45-CTS.shader_subroutine.subroutines_with_separate_shader_objects

Since we set the stream flags earlier on all geom shaders, we
shouldn't fall over later if we find one.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
93b3b6af3c glsl/linker: skip inactive explicit locations.
This fixes a crash in:
GL45-CTS.explicit_uniform_location.subroutine-loc-negative-link-max-num-of-locations

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
c714731653 glsl: fix subroutine uniform .length().
This fixes .length() on subroutine uniform arrays, if
we don't find the identifier normally, we look up the corresponding
subroutine identifier instead.

Fixes:
GL45-CTS.shader_subroutine.arrays_of_arrays_of_uniforms
GL45-CTS.shader_subroutine.arrayed_subroutine_uniforms

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:57 +10:00
Dave Airlie
432ac19c1a glsl/linker: link error on too many subroutine functions.
This fixes:
GL45-CTS.explicit_uniform_location.subroutine-index-negative-link-max-num-of-indices

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:56 +10:00
Dave Airlie
18b0a13e80 glsl: produce a linker error for a subroutine uniform with no functions.
If a subroutine uniform is declared with no functions backing it,
that isn't legal, so we should fail to link.

Fixes:
GL43-CTS.shader_subroutine.subroutine_uniform_wo_matching_subroutines

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:56 +10:00
Dave Airlie
b572b599ef glsl: validate subroutine types match function signature.
This fixes:
GL43-CTS.shader_subroutine.subroutines_incompatible_with_subroutine_type

It just makes sure the signatures match as well as the return
types.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:19:56 +10:00
Dave Airlie
ba3414d832 arb_shader_subroutine: check active subroutine limit
_mesa_GetActiveSubroutineUniformiv needs to check
against the number of types here.

Noticed while playing with ogl conform.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 16:18:25 +10:00
Ilia Mirkin
74e71cbfcb nv30: don't assert when running out of registers
This happens with dEQP tests. The code doesn't at all protect against
this condition, so while unhandled, this is an expected situation.

Also avoid using more than the first 16 registers for nv3x vertex
programs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-22 22:57:18 -04:00
Ilia Mirkin
36ff09cdfe nouveau: allow allocating non-object-backed buffers
On nv30, for example, there is no hardware index buffer support. So all
of those will be created entirely in user memory.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-22 22:57:18 -04:00
Tobias Klausmann
96f390ff35 llvm/softpipe: Enable cull_distance as draw supports it.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:04:37 +10:00
Dave Airlie
e6d9389366 tgsi: remove culldist semantic.
This isn't used anymore in the tree, culldist's
are part of the clipdist semantic, we could in theory
rename it, but I'm not sure there is much point, and
I'd have to be careful with virgl.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:03:44 +10:00
Dave Airlie
d17062a40e draw: stop using CULLDIST semantic.
The way the HW works doesn't really fit with having
two semantics for this.

The GLSL compiler emits 2 vec4s and two properties,
this makes draw use those instead of CULLDIST semantics.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:03:40 +10:00
Emil Velikov
bddb3b5375 virgl: remove unused state_tracker/graw.h include
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 11:02:17 +10:00
Dave Airlie
62c728f7d8 mesa/queryobject: return INVALID_VALUE if offset < 0 (v2)
This fixes:
GL45-CTS.direct_state_access.queries_errors

The ARB_direct_state_access spec agrees.

v2: move check down further (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-23 07:33:03 +10:00
Samuel Pitoiset
a7fad12931 nvc0/ir: fix indirect access for images
When the array doesn't start at 0 we need to account for su->tex.r.
While we are at it, make sure to avoid out of bounds access by masking
the index.

This fixes GL45-CTS.shading_language_420pack.binding_image_array.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reported-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-22 23:06:16 +02:00
Ilia Mirkin
cb9a51d1f6 nv30: reset the stencil mask when fast-clearing
Apparently the stencil mask applies to clears on nv30/nv40. Reset it to
0xff before doing a stencil clear. This fixes gl-1.0-readpixsanity and
a number of other piglit tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-22 14:48:56 -04:00
Ilia Mirkin
f57a8440d5 nv30,nv50: add PIPE_SHADER_CAP_PREFERRED_IR support
The mesa state tracker has recently started to query this.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-22 14:05:36 -04:00
Ilia Mirkin
9f19ccff9c nvc0: fix setting of tess_mode in various situations
This fixes a lot of INVALID_VALUE errors reported by the card when
running dEQP tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-05-22 11:58:22 -04:00
Ilia Mirkin
d6edae7090 nv50/ir: fix prog info init
Left over from the pre-mainline tess support. Adapt to use the new
defines.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-05-22 11:58:22 -04:00
Ilia Mirkin
035b1097db nvc0/ir: return 0 for gl_TessCoord.z for non-triangles modes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-05-22 11:58:22 -04:00
Matt Turner
bdc9c20df0 mesa: Unlock mutex on error path.
Caught by Coverity (CID 1362021). Caused by commit 015f2207c.
2016-05-22 07:01:35 -07:00
Timothy Arceri
a83e9afbe4 i965: remove redundant NULL check
We would have segfaulted in the above code if prog could be NULL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-22 23:08:08 +10:00
Eduardo Lima Mitev
7dce4793b7 anv/nir_apply_pipeline_layout: Pass the nir_src from the nir_tex_src
nir_instr_rewrite_src() expects a nir_src and it is currently being fed a
nir_tex_src. This will crash something.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-21 19:57:31 +02:00
Samuel Pitoiset
30b93141aa nvc0: expose GLSL version 420 on GF100
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:33:06 +02:00
Samuel Pitoiset
d04050071d nvc0: enable ARB_shader_image_load_store on GF100
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:33:03 +02:00
Samuel Pitoiset
362e17a712 nvc0/ir: add a lowering pass for surfaces on Fermi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:32:58 +02:00
Samuel Pitoiset
b663db44ba nvc0/ir: add emission for SULDB and SUSTx
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:32:56 +02:00
Samuel Pitoiset
cd88d1a171 nvc0/ir: add emission for OP_SULEA
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:32:54 +02:00
Samuel Pitoiset
8aa1fd321d nv50/ir: fix tex constraints for surface coords on Fermi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:32:49 +02:00
Ilia Mirkin
be4caaf247 nv50/ir: use moveSources to condense sources
This makes sure that rIndirectSrc and other things stay updated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-21 18:32:46 +02:00
Samuel Pitoiset
879bd2ea0c nvc0: bind images on fragment and compute shaders for Fermi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 18:32:41 +02:00
Samuel Pitoiset
e7d2ef42a5 nvc0/ir: don't check the format for surface stores on Kepler
Initially to make sure the format doesn't mismatch and won't produce
out-of-bounds access, we checked that both formats have exactly the same
number of bytes, but this should not be checked for type stores.

This fixes serious rendering issues in the UE4 demos (tested with
realistic and reflections).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 16:50:28 +02:00
Samuel Pitoiset
5e32cc9192 nv50/ir: fix a comment in canDualIssue()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 16:50:25 +02:00
Samuel Pitoiset
70834d05cd nv50/ir: fix SUSTx constraints on Kepler
To prevent out-of-bounds access and format mismatch we add a predicate
on sustp, but we have to account for it when the sources are condensed
because a predicate is a source. Using the range 3:6 will only condense
the input data and it's always the case. This also fixes constraints
when an indirect access is used.

This ensures that sources are correctly aligned.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-21 16:06:14 +02:00
Kenneth Graunke
9c0d16adc1 i965: Just read the existing tally on EndTransformFeedback if paused.
If the transform feedback object is paused when ending, then there are
no new snapshots to add to the tally.  In fact, we haven't written a
starting snapshot, so we'd best not try and compute (end - start).

Just load the existing tally so we can convert it to the number of
vertices written and store it to the final result location.

This is the Haswell+ equivalent of the previous commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-20 19:55:42 -07:00
Kenneth Graunke
915f7c25fa i965: Don't write a counter snapshot on EndTransformFeedback if paused.
If the transform feedback object is paused, then we've already written
an ending counter snapshot.  We don't want to write another one.

This fixes assertions in GL33-CTS.transform_feedback.api_errors_test,
which calls EndTransformfeedback after PauseTransformFeedback.  On the
next BeginTransformFeedback, we tried to tally up the results, and saw
an odd number of snapshots (due to the double-end), and tripped an
assertion.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-20 19:55:40 -07:00
Kenneth Graunke
47fbe178fa mesa: Call TransformFeedback driver hooks before setting flags.
This way, the driver's EndTransformFeedback() hook can tell whether the
transform feedback operation was paused.  It's also convenient to have
Paused remain false until the driver's PauseTransformFeedback hook
finishes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-20 19:55:26 -07:00
Kenneth Graunke
f7eb95a526 nir: Fix crash in nir_lower_wpos_center().
Otherwise we rewrote the fadd to use itself, causing crashes in
validation.  Instead, start after the last use like we should.

A brown paper bag fix.  Fixes crashes in several Vulkan tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-20 16:33:24 -07:00
Dave Airlie
0970c563d6 nir: remove dead glsl variables before lowering io.
For cull distance GLSL will let unsized unused arrays get
into the backend, we should nuke those straight away, to
save caring about them later.

This fixes:
arb_separate_shader_objects/linker/large-number-of-unused-varyings
as a side effect (even without culling changes).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-21 08:56:45 +10:00
Kenneth Graunke
de45da6a8c spirv: Handle the PixelCenterInteger execution mode.
This isn't allowed by Vulkan, but might be useful someday for
SPIR-V in OpenGL (if that ever becomes a thing).  It's easy enough
to hook up, and as precedent, we already do so for OriginLowerLeft.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 14:44:22 -07:00
Kenneth Graunke
9b8b3f7501 i965: Delete dead dFdy flipping code.
Rob's nir_lower_wpos_ytransform() pass flips dFdy in the opposite case
of what I expected, so we always take the negate_value case.  It doesn't
really matter.

v2: Write src0 before src1 in ADD instructions (requested by Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:30:09 -07:00
Kenneth Graunke
08bc74e694 i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height.
Now that we handle flipping and other gl_FragCoord transformations
via a uniform, these key fields have no users.

This patch actually eliminates the associated recompiles.  The Tomb
Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable
number.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:30:09 -07:00
Kenneth Graunke
dac10e8a13 i965, anv: Use NIR FragCoord re-center and y-transform passes.
This handles gl_FragCoord transformations and other window system vs.
user FBO coordinate system flipping by multiplying/adding uniform
values, rather than recompiles.

This is much better because we have no decent way to guess whether
the application is going to use a shader with the window system FBO
or a user FBO, much less the drawable height.  This led to a lot of
recompiles in many applications.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:30:08 -07:00
Kenneth Graunke
6e5d86c07a nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers.
nir_lower_wpos_ytransform() is great for OpenGL, which allows
applications to choose whether their coordinate system's origin is
upper left/lower left, and whether the pixel center should be on
integer/half-integer boundaries.

Vulkan, however, has much simpler requirements: the pixel center
is always half-integer, and the origin is always upper left.  No
coordinate transform is needed - we just need to add <0.5, 0.5>.
This means that we can avoid using (and setting up) a uniform.

I thought about adding more options to nir_lower_wpos_ytransform(),
but making a new pass that never even touched uniforms seemed simpler.

v2: Use normal iterator rather than _safe variant (noticed by Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:30:00 -07:00
Kenneth Graunke
12ab7fc6ac nir: Don't use ffma in nir_lower_wpos_ytransform().
ffma is an explicitly fused multiply add with higher precision.
The optimizer will take care of promoting mul/add to fma when
it's beneficial to do so.

This fixes failures on Gen4-5 when using this pass, as those platforms
don't actually implement fma().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-20 14:29:04 -07:00
Kenneth Graunke
b8b1b1c34c nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform.
These also need flipping!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke
4b7577fad8 nir: Make lower_wpos_ytransform_block a void function.
The return value was used for the old nir_foreach_block callback system,
but at this point it no longer means anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke
88ea960aa7 nir: Make nir_lower_wpos_ytransform() match FragCoord by location.
gl_FragCoord is a shader input with location == VARYING_SLOT_POS.
ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS,
but it isn't called gl_FragCoord.  We do want to transform it.

Matching by location guarantees we catch both.

Fixes several fp tests on a branch which uses this pass on i965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke
c9192fcbd2 nir: Add interp_var_at_offset flipping.
The Y-offset needs flipping as well, similar to ddy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke
287f099db1 nir: Fix fddy swizzles in nir_lower_wpos_ytransform().
The original value might have been swizzled.  That's taken care of in
the fmul source - we don't want to reswizzle it again.

Fixes validation failures in glsl-derivs-varyings on a branch of mine
which uses this pass in i965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:29:04 -07:00
Kenneth Graunke
7fe9a19302 nir: Fix wpos_ytransform lowering state_slot swizzle.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-20 14:28:30 -07:00
Kenneth Graunke
1539009bf0 i965: Fix brw_regs_equal() for NaN and positive/negative zero.
We'd like the comparisons to mean "the exact same bits".  Comparing
doubles won't do that for NaN values or positive vs. negative zero.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:28:06 -07:00
Dave Airlie
b19a0d506d virgl: handle cull distance cap.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-21 06:19:54 +10:00
Rob Herring
2235b80f2a virgl: Add missing texture transfer_inline_write
transfer_inline_write cannot be NULL and the virgl renderer doesn't support
inline writes for textures, so add the default version.

This fixes a crash in st_TexSubImage since commit fb9fe352ea ("st/mesa:
use transfer_inline_write for memcpy TexSubImage path").

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-21 06:07:18 +10:00
Kristian Høgsberg Kristensen
12dc89d844 anv: Merge in my TODO list items
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2016-05-20 10:35:57 -07:00
Matt Turner
015f2207cf mesa: Replace uses of Shared->Mutex with hash-table mutexes
We were locking the Shared->Mutex and then using calling functions like
_mesa_HashInsert that do additional per-hash-table locking internally.

Instead just lock each hash-table's mutex and use functions like
_mesa_HashInsertLocked and the new _mesa_HashRemoveLocked.

In order to do this, we need to remove the locking from
_mesa_HashFindFreeKeyBlock since it will always be called with the
per-hash-table lock taken.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-20 10:05:09 -07:00
Matt Turner
aded1160e5 hash: Add _mesa_HashRemoveLocked() function.
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-20 10:05:09 -07:00
Matt Turner
fb5dcb81cc i965: Pass nir_src/nir_dest by reference.
Cuts 6K of .text.

   text    data     bss     dec     hex filename
5772372  264648   29320 6066340  5c90a4 lib/i965_dri.so before
5766074  264648   29320 6060042  5c780a lib/i965_dri.so after

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 10:04:06 -07:00
Mark Janes
9ca5ec2a31 glsl: Guard against NULL dereference
This trivially corrects mesa 3ca1c221, which introduced a check that
crashes when a match is not found.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-20 09:52:49 -07:00
Nanley Chery
9b8c4000d0 anv: Enable textureCompressionASTC_LDR on Gen9+
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery
0d2847e177 anv/format: Reorder ASTC mappings to match ISL enum ordering
Keep the lists consistent for ease of use.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery
f3ed3a0a15 genxml: Expand SKL's SurfaceFormat field width for ASTC
In the expanded field, only ASTC format enums have the MSB set to 1.
Expanding the field width makes the process of handling these formats
identical to the way other formats are handled.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery
a141576887 isl: Handle npot ASTC block dimensions on Gen9+
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Nanley Chery
de86fb875d isl: Add 2D ASTC format layouts and enums
Also, make changes needed for successful compilation and registration
as a texture compression mode.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-20 09:27:11 -07:00
Youry Metlitsky
4e2c9a0435 mesa: Build EGL without X11 headers after interop patchset
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-20 08:44:18 -07:00
Rob Clark
df361fc58c nir/validate: assume() that hashtable entry exists
At this point, it would require a logic error in nir_validate to not
have already populated this hashtable entry, but coverity doesn't
realize that:

CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)3.
dereference: Dereferencing a null pointer entry.

CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)3.
dereference: Dereferencing a null pointer entry.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark
fcd6b3f42b nir: coverity unitialized pointer read
Not sure how coverity arrives at the conclusion that we can read comp[j]
unitialized (around line 204), other than not being aware that ncomp is
greater than 1 so it won't underflow in the 'if (tex->is_array)' case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark
53c48feae0 nir: coverity sign-extension fix
Not 100% sure, but I think being an unsigned literal will help:

CID 1358505 (#1 of 1): Unintended sign extension
(SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension:
load1->def.num_components with type unsigned char (8 bits, unsigned) is
promoted in load1->def.num_components * (load1->def.bit_size / 8) to
type int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If load1->def.num_components * (load1->def.bit_size /
8) is greater than 0x7FFFFFFF, the upper bits of the result will all be
1.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark
bb993da795 nir/glsl_to_nir: quell some uninit_member coverity errors
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Matt Turner <mattst88@gmail.com>
2016-05-20 11:13:50 -04:00
Rob Clark
3a1bbd6a0a freedreno/ir3: need to lower fmod too
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-20 11:13:50 -04:00
Mark Janes
a2d28ddc01 i965: Fix strerror error code sign
This trivial fix to error-handling corrects the sign of drm error
codes before passing them to strerror.

Identified by Coverity: CID1358581
2016-05-20 05:58:18 -07:00
Jason Ekstrand
eb384daae8 nir/spirv: Handle the NonReadable decoration on struct members 2016-05-19 21:18:59 -07:00
Jason Ekstrand
ea8c11fdc2 anv/pipeline: Bounds-check resource indices when robuts_buffer_access is enabled 2016-05-19 21:18:59 -07:00
Jason Ekstrand
902628bce6 anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled 2016-05-19 21:18:59 -07:00
Jason Ekstrand
23090b51e0 anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment
Originally we removed the instruction, changed the source, and then
re-inserted it.  This works, but nir_instr_rewrite_src is a bit more
obviously correct.
2016-05-19 21:18:59 -07:00
Jason Ekstrand
c29ffea6d1 anv/device: Add a boolean for robust buffer access 2016-05-19 21:18:59 -07:00
Jason Ekstrand
d5b4638d6a anv: Add a TODO file 2016-05-19 20:09:31 -07:00
Dave Airlie
3ca1c2216d glsl: handle same struct redeclaration (v2)
This works around a bug in older version of UE4, where a shader
defines the same structure twice. Although we aren't sure this is correct
GLSL (it most likely isn't) there are enough UE4 based things out there
we should deal with this.

This drops the error to a warning if the struct names and contents match.

v1.1: do better C++ on record_compare declaration (Rob)
v2: restrict this to desktop GL only (Ian)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-20 11:22:52 +10:00
Matt Turner
8a65b5135a i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz.
Ken suggested instead of a big and complicated optimization pass, to
just recognize the operations here. It's certainly less code and a lot
prettier, but it seems to actually perform worse for currently unknown
reasons.

total instructions in shared programs: 8923452 -> 8904108 (-0.22%)
instructions in affected programs: 814563 -> 795219 (-2.37%)
helped: 3336
HURT: 10

total cycles in shared programs: 66970734 -> 66651476 (-0.48%)
cycles in affected programs: 10582686 -> 10263428 (-3.02%)
helped: 2438
HURT: 691

total spills in shared programs: 1811 -> 1789 (-1.21%)
spills in affected programs: 85 -> 63 (-25.88%)
helped: 4

total fills in shared programs: 3143 -> 3109 (-1.08%)
fills in affected programs: 167 -> 133 (-20.36%)
helped: 4

LOST:   2
GAINED: 36

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-19 17:27:49 -07:00
Matt Turner
75dccf5ac2 i965: Add infrastucture for sample lod-zero operations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-19 17:27:49 -07:00
Matt Turner
07353599e0 i965/fs: Add and use get_nir_src_imm().
The next patch wants to inspect the LOD argument and do something
different if it's 0.0f. But at that point we've emitted a MOV for it and
we just have a register to look at.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-19 17:27:49 -07:00
Ilia Mirkin
8bf5493899 nvc0: account for shader-allocated local memory needs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-19 20:20:23 -04:00
Ilia Mirkin
5c6b8cc7d0 nv50/ir: treat addresses as local
Address registers are always loaded right before use. Don't treat them
as "global", which will cause them to be put into the function's
linkage, and will make the register allocator hold onto that
register until the end of the function.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-19 20:20:23 -04:00
Tim Rowley
65c2abf6fd swr: [rasterizer] utility functions for shared libs
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:18 -05:00
Tim Rowley
6deb9f7f2c swr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD
llvm changed the mask type to vector of ints with 3.8.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:12 -05:00
Tim Rowley
600528168b swr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:06 -05:00
Tim Rowley
6d212cccf0 swr: [rasterizer jitter] add instancing to non-gather fetch path
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:27:01 -05:00
Tim Rowley
63d7ed835a swr: [rasterizer core] move MultisampleTrait static from header to cpp
Move a MultisampleTrait static from header to cpp as clang seemed to get
confused with some specializations in the header vs some in cpp.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:54 -05:00
Tim Rowley
c969ef2d42 swr: [rasterizer core] clang override for _mm_undefined*
Not supported in older xcode versions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:49 -05:00
Tim Rowley
da75160039 swr: [rasterizer common] add OSX to unix portability sections
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:44 -05:00
Tim Rowley
4997169779 swr: [rasterizer] rename _aligned_malloc to AlignedMalloc
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:38 -05:00
Tim Rowley
2e4ef23523 swr: [rasterizer jitter] rename MEMCPY function to MEMCOPY
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:30 -05:00
Tim Rowley
aebbd2f7dd swr: [rasterizer common] guard definition of __cdecl/__stdcall
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:24 -05:00
Tim Rowley
82e335ce67 swr: [rasterizer common] include cstddef for offsetof
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:19 -05:00
Tim Rowley
759d8cf3a3 swr: [rasterizer core] removed tabs that snuck in
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:14 -05:00
Tim Rowley
8e39d410f1 swr: [rasterizer core] code style cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:08 -05:00
Tim Rowley
b914217c25 swr: [rasterizer core] add dummy code for cygwin build
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:26:02 -05:00
Tim Rowley
a0747c4ce3 swr: [rasterizer core] move variable query outside loop
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:25:54 -05:00
Tim Rowley
f2a1f894ba swr: [rasterizer core] utility function for getenv
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:25:48 -05:00
Tim Rowley
4a58b21ef7 swr: [rasterizer common] portable threadviz buckets
Output with slashes instead of backslashes for unix/linux.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:25:30 -05:00
Tim Rowley
2031baffb5 swr: [rasterizer common] foreground win32 assert dialog
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:25:24 -05:00
Tim Rowley
33d4c2c798 swr: [rasterizer core] use parens to disambiguate operator precedence
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 16:25:06 -05:00
Tim Rowley
9475251145 swr: standardize linkage and check for unresolved symbols
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-19 13:27:33 -05:00
Tim Rowley
6423004d85 swr: fix swr linkage so that static llvm works
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-19 13:27:33 -05:00
Tim Rowley
8987460b9e swr: PIPE_CAP_CULL_DISTANCE cap request response
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 13:27:33 -05:00
Tim Rowley
78572c9b0b docs: add swr to GL3.txt
v2: not on gl3.3 list until gl3.2 is complete

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-19 13:27:17 -05:00
Leo Liu
2f90d11d86 st/va: use drm render node for wayland display type
With xwayland, vainfo use VA_DISPLAY_WAYLAND as default and it fails
and fails when specify display with  `vainfo --display wayland`.
In fact wayland support for libva uses drm path to connect device,
and should use drm pipe loader to create screen.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-19 09:40:33 -04:00
Marek Olšák
f6742859b7 gallium/radeon: small cleanups in r600_texture_transfer_map
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-19 12:35:50 +02:00
Marek Olšák
54737aabb9 gallium/radeon: don't set PB_USAGE in winsyses
There is no point.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-19 12:35:50 +02:00
Marek Olšák
f330b7a14f gallium/radeon: handle VRAM_GTT placements as having slow CPU reads
not sure if we should include GTT WC too

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-19 12:35:50 +02:00
Marek Olšák
5e14d0ac2c gallium/radeon: ignore PIPE_TRANSFER_MAP_DIRECTLY
Only st/xa is using this, which is irrelevant to us.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-19 12:35:50 +02:00
Marek Olšák
51cf04cf0e radeonsi: add a workaround for a bug in LLVM <= 3.8
This is not directly applicable to stable and needs to be backported.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-19 12:35:50 +02:00
Eduardo Lima Mitev
7671687713 i965/fs: Silence warnings related to use of uninitialized values
brw_fs.cpp: In function ‘const unsigned int* brw_compile_fs(const [...]
brw_fs.cpp:6093:64: warning: ‘simd16_grf_start’ may be used uninitialized [...]
       prog_data->base.dispatch_grf_start_reg = simd16_grf_start;

brw_fs.cpp:5996:29: note: ‘simd16_grf_start’ was declared here
    uint8_t simd8_grf_start, simd16_grf_start;

brw_fs.cpp:6094:52: warning: ‘simd16_grf_used’ may be used uninitialized [...]
       prog_data->reg_blocks_0 = brw_register_blocks(simd16_grf_used);

brw_fs.cpp:5997:29: note: ‘simd16_grf_used’ was declared here
    unsigned simd8_grf_used, simd16_grf_used;

(and more)

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-19 09:05:18 +02:00
Eric Anholt
a507dcc160 vc4: Size transfer temporary mappings appropriately for full maps of 3D.
We don't really support reading/writing of 3D textures since the hardware
doesn't do 3D, but we do need to make sure that a pipe_transfer for them
has enough space to store the image.  This was previously not a problem
because the state tracker only mapped a slice at a time until
fb9fe352ea.  Fixes glean glsl1 tests, which
all have setup of a 3D texture at the start.
2016-05-18 17:30:07 -07:00
Nanley Chery
7ac08adfb4 anv/device: Fix viewportBoundsRange
Align with the spec requirement that the range must be at least
[−2 × maxViewportDimensions, 2 × maxViewportDimensions − 1]. Our
hardware supports this.

Fixes dEQP-VK.api.info.device.properties

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-18 16:01:50 -07:00
Dave Airlie
61b6789252 glsl/linker: attempt to match anonymous structures at link
This is my attempt at fixing at least one of the UE4 bugs with GL4.3.

If we are doing intrastage matching and hit anonymous structs, then
we should do a record comparison instead of using the names.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-19 08:16:50 +10:00
Mark Janes
4dfa89e33c anv/batch_chain: free pointers for error cases
Trivial fix to improperly handled cleanup during
VK_ERROR_OUT_OF_HOST_MEMORY.

Identified by Coverity: CID 1358908 and 1358909
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-18 15:14:22 -07:00
Wang He
f21b7d1e5c st/nine: Minor change to support musl libc
A few changes to support musl libc as well.

In particular fpu_control.h is glibc specific.
fenv.h doesn't enable to do exactly what we want either,
so instead use assembly directly.

Signed-off-by: Wang He <xw897002528@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Patrick Rudolph
de39231134 st/nine: Enable D3DPMISCCAPS_PERSTAGECONSTANT
Nine already supports the feature.
There are no failing WINE tests for per stage constants.
Enabling D3DPMISCCAPS_PERSTAGECONSTANT as it fixes
https://github.com/iXit/Mesa-3D/issues/205

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
839f417634 st/nine: Turn on thread_submit by default when on different device
The last remaining issues with thread_submit have been resolved,
thus turn it when on a different device (the case where is is
beneficial).

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
9cae3cdc89 st/nine: Fix usage of rasterizer multisample bit.
pipe_rasterizer multisample bit should be enabled only when really
wanting to do multisampling, thus we should disable when not having
msaa render target.
This fixes some depth calculation precision issues on radeon.
Also disable it when depth and stencil tests are disabled, since in that
case multisampling is same as not multisampled.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
f297e7de0f st/nine: ATOC has effect only with ALPHATESTENABLE
ATOC extension does something only when alpha test is enabled.
Use a second bit to encode the difference with ATIATOC.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
edc5cdced5 st/nine: Add debug string for ATOC
We were missing a debug string for this format.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
4e89dcf0c4 st/nine: Add asserts for output/input packing
Nine doesn't support vs output/ps input packing.
We haven't found any application requiring that,
and implementing it properly is complex.

Add asserts for now.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
aeddda0c3a st/nine: Use correct PIPE_HANDLE_USAGE flag for frontbuffer copy
When taking screenshots we do a copy from the frontbuffer
to an allocated buffer (which we then copy to a ram buffer).

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
ca7c78a88e st/nine: Fix output shift calculation
We were getting it wrong for negative values.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
b8d95d4087 st/nine: Fix CheckDeviceFormat advertising for surfaces
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
6ef231c80f st/nine: Improve buffer placement
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
7639033973 st/nine: Fix buffer bind flags
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
0f6e31823d st/nine: Fix buffer locking flags handling
Our behaviour was not entirely similar to what
the docs and our tests describe.

Drop d3dlock_buffer_to_pipe_transfer_usage.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Patrick Rudolph
f45b9894e5 st/nine: Improve logging
Add missing DBG calls in dtors.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Patrick Rudolph
f3fa7e3068 st/nine: Use WINE thread for threadpool
Use present interface 1.2 function ID3DPresent_CreateThread
to create the thread for threadpool.
Creating the thread with WINE prevents some rarely occuring crashes.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Patrick Rudolph
72be473ad1 st/nine: Don't present if window is occluded
The problem is that if one d3d present call fails,
because of our occlusion check in present method,
the next presentation call will send the same pixmap to the Xserver again,
without waiting it is released, which is wrong.

Move the present call after occlusion check to return and prevent
Xpixmaps errors.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Patrick Rudolph
c673c46ccf st/nine: Use new function to query for resolution mismatch
Any third party app might change the current screen resolution.
Poll for resolution mismatch to force a device reset.
Required for non ex devices only.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Patrick Rudolph
dae9a91727 st/nine: Implement IPresent version 1.2
Implement presentation interface version 1.2:
* ID3DPresent_ResolutionMismatch
    Poll for resolution mismatch.
    A third party app might have changed resolution,
    which requires a device reset.
* ID3DPresent_CreateThread
    Create a thread in WINE to allow nine to use Windows API
    functions. Required for multi-threaded presentation.
    In single-threaded presentation mode the calling thread is
    already known to WINE.
* ID3DPresent_WaitForThread
    Wait for a wine thread to terminate.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
2e149a2bf0 st/nine: Implement BumpEnvMap for ff
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
c4e85202cb st/nine: Format conversion for volumes in UpdateTexture
We were doing the conversion for surfaces, but not yet
volumes. Now that volumes can do conversion, use it.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
23e2a235dc st/nine: Remove one useless function output
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
10e548c0c9 st/nine: Add support for X8L8V8U8
X8L8V8U8 support should be common. Some more recent cards
do support this format, but not L6V5U5.

Add fallback for this format to have it alwaus supported.

L6V5U5 conversion rule apparently differs a bit from the normal
spec, and thus the gallium equivalent format leads to slightly
wrong colors. Since some recent cards do not support it, do not
support it either.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
258ca1823c st/nine: Add format fallback with conversion to volumes
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
755fbcdf24 st/nine: Add format fallback with conversion to surfaces
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
52cb8e33c3 gallium/util: Implement util_format_translate_3d
This is the equivalent of util_format_translate, but for volumes.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
89344a80fc st/nine: Fix Pointsize in programmable shader
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
ae0fdd8a40 st/nine: Fix ff pointscale computation
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
c4af309973 st/nine: Fix header of GetIndices
There is a mistake in the online documentation,
the function only has 2 arguments.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
3e9d01ff39 st/nine: Increase minor d3dadapter9drm ABI
Version 0.1 allows to assume that the second
element of the IDirect3D* structures will
be a pointer to the internal nine vtable.

This is useful if the gallium nine user wants
to wrap some interfaces.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
2d51c817cd st/nine: Fix leak after ctor failures
Previously ctor failures would not unreference
the device.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
7fc8391d23 st/nine: Add ColorFill test for compressed textures
ColorFill should contain alignment checks
for compressed textures.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
d11d913987 st/nine: PositionT and Tessfactor are forbidden as PS input
According to wine tests, they are forbidden as PS input,
which makes sense.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
44068af92e st/nine: Fix some shader failures not triggering error
Some failures during shader translation would not
raise errors before this patch.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
a77d8cd710 st/nine: Forbid POSITION0 for PS3.0
POSITION0 input is forbidden for PS3.0 apparently.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
217d969746 st/nine: Rework UpdateTexture Checks
Our code did match the user documentation of the function
quite well (except for format check).

However the DDI documentation and wine tests show that
documentation was not correct. Thus adapt our code to
fit the best possible to the -real- spec.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
4c77673de7 st/nine: Use bufs instead of Flags for Clear
bufs doesn't contain depthstencil if
there is z buffer mismatch. This is the behaviour
we want.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
f7c3d27d18 d3dadapter9: Add ddebug, rbug and trace support
Add support for ddebug, rbug and trace

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Axel Davy
0ae3c8ece7 radeon: Change AA sample locations for EG+
This sets the AA location to the d3d11
spec.
EG/NI 8X MSAA is left as is. Not sure
why it was set different to Cayman, so
lets it as is.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-18 23:37:14 +02:00
Axel Davy
11e4987135 radeonsi: Mixed colorbuffer formats are unsupported
Besides depth/stencil, the hardware doesn't support
mixed formats.

The GL state tracker doesn't make use of them.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-18 23:37:14 +02:00
Axel Davy
fc3533c088 radeonsi: Change default behaviour for undefined COLOR0
d3d 9 needs COLOR0 to be 1.0 on all channels when
undefined. 0.0 for the others is fine.
GL behaviour is undefined.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-18 23:37:14 +02:00
Axel Davy
a221f40dbb r600g: Change default behaviour for undefined COLOR0
d3d 9 needs COLOR0 to be 1.0 on all channels when
undefined. 0.0 for the others is fine.
GL behaviour is undefined.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-18 23:37:14 +02:00
Axel Davy
7e05e4c388 r600: Change default behaviour for undefined COLOR0
d3d 9 needs COLOR0 to be 1.0 on all channels when
undefined. 0.0 for the others is fine.
GL behaviour is undefined.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-18 23:37:14 +02:00
Christian Schmidbauer
f5d6ed5702 st/nine: Clean up WINAPI definition
As Emil pointed out, only gcc, clang and MSVC compatibility is required.
Hence the check for GNUC can be skipped, as __i386__ and __x86_64__ are
only defined for gcc/clang, not for MSVC.

Remove the #undef which has been there for historic reasons, when wine
dlls for nine have been built inside mesa. Instead use #ifndef in order
to avoid redefining WINAPI from MSVC's headers.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Axel Davy <axel.davy@ens.fr>
2016-05-18 23:37:14 +02:00
Brian Paul
243fd02858 svga: add another debug_printf() in svga_screen_create()
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-18 14:58:35 -06:00
Brian Paul
96909ef128 spirv: add switch case for nir_texop_txf_ms_mcs in vtn_handle_texture()
Mark it as unreachable.  Silences a compiler warning:

spirv/spirv_to_nir.c:1397:4: warning: enumeration value
'nir_texop_txf_ms_mcs' not handled in switch [-Wswitch]
    switch (instr->op) {
    ^

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-05-18 14:57:45 -06:00
Matt Turner
9c290b1e54 Revert "i965/urb: fixes division by zero"
This reverts commit 2a8aa1e3de.
2016-05-18 12:48:50 -07:00
Ardinartsev Nikita
2a8aa1e3de i965/urb: fixes division by zero
Fixes regression introduced by af5ca43f26

Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
2016-05-18 11:09:37 -07:00
Matt Turner
caab3cd536 mesa: fclose() filename on error.
Pretty useless, as it's in debugging code. Found by Coverity (CID
1257016).
2016-05-18 11:09:37 -07:00
Matt Turner
cbb0e3a7e8 i965/fs: Assert that nir_op_extract_*'s src1 is a constant. 2016-05-18 11:09:37 -07:00
Matt Turner
6a4ff51f7a glsl: Check that layout is non-null before dereferencing.
layout should only be null for structs, but it's checked everywhere else
and confuses Coverity (CID 1358495).

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 11:09:37 -07:00
Matt Turner
53f64a8404 egl/dri2: Don't check return result of mtx_unlock().
Coverity (CID 1358496) warns that the cleanup code doesn't unlock the
mutex (which is arguably kind of stupid, since the only case that can
happen is when mtx_unlock() failed!). But, mtx_unlock() isn't going to
fail -- the mutex was locked by this thread just a few lines above it.
2016-05-18 11:09:37 -07:00
Matt Turner
b1e6d069da spirv: Properly size the src[] array.
Operations like nir_op_bitfield_insert have four arguments, and Coverity
isn't privy to the fact that 4-argument operations aren't possible here,
so it thinks this can lead to memory corruption. Just increase the size
of the array to quell any fears.
2016-05-18 11:09:37 -07:00
Matt Turner
0a548eb56f isl: Mark default cases in switch unreachable.
To silence -Wmaybe-uninitialized warnings.
2016-05-18 11:09:37 -07:00
Ian Romanick
7619aed41d glsl/linker: Ensure the first stage of an SSO pipeline has input locs assigned
Previously an SSO pipeline containing only a tessellation control shader
and a tessellation evaluation shader would not get locations assigned
for the TCS inputs.  This would lead to assertion failures in some
piglit tests, such as arb_program_interface_query-resource-query.

That piglit test still fails on some tessellation related subtests.
Specifically, these subtests fail:

'GL_PROGRAM_INPUT(tcs) active resources' expected 2 but got 3
'GL_PROGRAM_INPUT(tcs) max length name' expected 12 but got 16
'GL_PROGRAM_INPUT(tcs,tes) active resources' expected 2 but got 3
'GL_PROGRAM_INPUT(tcs,tes) max length name' expected 12 but got 16
'GL_PROGRAM_OUTPUT(tcs) active resources' expected 15 but got 3
'GL_PROGRAM_OUTPUT(tcs) max length name' expected 23 but got 12

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-05-18 10:53:50 -07:00
Ian Romanick
79bbff9def glsl/linker: Don't include interface name for built-in blocks
Commit 11096ec introduced a regression in some piglit tests (e.g.,
arb_program_interface_query-resource-query).  I did not notice this
regression because other (unrelated) problems caused failed assertions
in those same tests on my system... so they crashed before getting to
the new failure.

v2: Use is_gl_identifier.  Suggested by Tim.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-05-18 10:53:34 -07:00
Ian Romanick
2ef4b5bc93 glsl: Assert that inputs have a location assigned
This catches a problem previously undetected until deep in the backend.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 10:53:34 -07:00
Ian Romanick
cf9220b11f glsl/linker: Fix trivial typos in comments
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 10:53:34 -07:00
Ian Romanick
d2579728c9 glsl/linker: Fix some formatting to match current coding conventions
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 10:53:34 -07:00
Ian Romanick
02e4753777 glsl/linker: Silence unused parameter warning
The use of the parameter was removed in d6b92028.

glsl/link_varyings.cpp:1390:39: warning: unused parameter ‘separate_shader’ [-Wunused-parameter]
                                   bool separate_shader)
                                       ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 10:53:34 -07:00
Ian Romanick
75c9aa6670 glsl/linker: Silence unused parameter warning
The parameter appears to have been unused since the function was added
in commit 12ba6cfb.  Remove it.

glsl/linker.cpp:2886:60: warning: unused parameter ‘prog’ [-Wunused-parameter]
 match_explicit_outputs_to_inputs(struct gl_shader_program *prog,
                                                            ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 10:53:34 -07:00
Ian Romanick
f687b8e178 i965: Silence unused parameter warnings
The only place that actually used the type parameter was the GS visitor,
and it was always passed glsl_type::int.  Just remove the parameter.

brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter]
                                            const glsl_type *type)
                                                             ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-18 10:53:34 -07:00
Daniel Scharrer
1d628ea09d mesa: Don't advertise GLES 3.1 without compute support
The MaxComputeWorkGroupInvocations constant is used in
compute_version_es2() instead of extensions->ARB_compute_shader
as ES has lower requirements than desktop GL.

Both i965 and gallium set this constant before enabling compute support.

Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-05-18 18:21:21 +02:00
Rob Clark
5827a1dc4b mesa/st: don't leak name
Pointed out by coverity.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-18 09:20:22 -04:00
Brian Paul
877a8026c7 svga: null out all sampler views if start=num=0
Because the CSO module handles sampler views for fragment shaders
differently than vertex/geom shaders, VS/GS shader sampler views
aren't explicitly unbound like for FS sampler vers.  This code
checks for the case of start=num=0 and nulls out the sampler views.
Fixes a assert regression in piglit's arb_texture_multisample-
sample-position test.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-17 19:20:36 -06:00
Brian Paul
fe430b0310 st/mesa: remove unused st_context::default_texture
The code which used this was removed quite a while ago.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-17 19:20:36 -06:00
Brian Paul
5888c47cc9 cso: remove / add some comments
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-17 19:20:36 -06:00
Eric Anholt
18260d0582 vc4: Add support for vertex color clamping in the rasterizer.
This gets us precompile of vertex shaders at the state tracker level as
well.
2016-05-17 18:09:58 -07:00
Eric Anholt
474e2bbcc1 vc4: Move tgsi_to_nir to precompile time.
Now we have an immutable nir shader in our shader's CSO that we can clone
and lower/optimize.
2016-05-17 18:07:39 -07:00
Eric Anholt
734fe41092 vc4: Mark the driver as supporting fragment color clamping in rast.
We always clamp fragment colors, since they're always 8-bit unorm, so
there's no need to have us compile separate shaders based on
GL_ARB_color_buffer_float.  This gives us precompilation of fragment
programs to the vc4_shader_state_create() level.
2016-05-17 18:07:39 -07:00
Eric Anholt
8835eb689b vc4: Enable sharing shaders across contexts.
This allows the same pipe_shader_state to be referenced from multiple
contexts.  Since our pipe_shader_state is treated as immutable (other than
the variant number) within the driver, this is no problem.
2016-05-17 18:07:39 -07:00
Eric Anholt
62087cb9b8 vc4: Switch to using nir_load_front_face.
This will be generated by glsl_to_nir, and it turns out that this is a
more code-efficient path than the floating point math, anyway.

No change on shader-db, but drops an instruction in piglit's
glsl-fs-frontfacing.
2016-05-17 18:07:39 -07:00
Eric Anholt
0700e4c0c7 vc4: Drop the dead export_linkage array.
This came from deriving from freedreno.
2016-05-17 18:07:39 -07:00
Eric Anholt
24e7e3d3fc vc4: Fix a -Wformat-security warning.
This is apparently enabled as an error in Android builds, and the compiler
can't tell that the return value is safe.
2016-05-17 18:07:39 -07:00
Alex Deucher
86f51d7958 radeonsi: add new polaris11 pci ids
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-17 17:49:50 -04:00
Alex Deucher
768320b497 radeonsi: add new polaris10 pci ids
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-17 17:49:50 -04:00
Kenneth Graunke
dc657a8201 i965: Make brw_reg_from_fs_reg() halve exec_size when compressed.
In a5d7e144ea, Connor generalized the
exec_size halving code to handle more cases.  As part of this, he made
it not halve anything if the region accessed falls completely in a
single register.

Unfortunately, it started producing some invalid regions:

-add(16)  g6<1>F  g10<8,8,1>UW    -g1<0,1,0>F    { align1 compr };
-add(16)  g8<1>F  g12<8,8,1>UW    -g1.1<0,1,0>F  { align1 compr };
+add(16)  g6<1>F  g10<16,16,1>UW  -g1<0,1,0>F    { align1 compr };
+add(16)  g8<1>F  g12<16,16,1>UW  -g1.1<0,1,0>F  { align1 compr };

Here, the UW source region completely fits within a register.  However,
we have to use instruction compression because the destination region
spans two registers.  <16,16,1> is invalid because it's compressed.

To handle this, skip the "everything fits in one register" case and
fall through to the exec_size halving case when compressed.

Fixes hundreds of Piglit regressions on GM965.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95370
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-17 14:40:37 -07:00
Kenneth Graunke
062ad81669 i965: Move compression decisions before brw_reg_from_fs_reg().
brw_reg_from_fs_reg() needs to know whether the instruction will be
compressed or not.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95370
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-17 14:40:31 -07:00
Kenneth Graunke
9a1936d965 i965: Enable ES 3.2 sample shading extensions.
This enables:
- GL_OES_sample_shading
- GL_OES_sample_variables
- GL_OES_shader_multisample_interpolation

On Gen8, we pass all the CTS tests, and all but 4 of the dEQP-GLES31
tests (dealing with 1x/2x MSAA at half rate sampling).  We believe
those 4 dEQP-GLES31 tests are incorrect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-17 14:27:29 -07:00
Jordan Justen
1ff212bfd3 anv: Fix warning: unused variable ‘cs_prog_data’
This was introduced in 8a80af2820.

Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-17 14:09:56 -07:00
Mauro Rossi
0e81336550 android: fix building error in libmesa_st_mesa
Fixes the following building error due to libmesa_nir dependency:

In file included from external/mesa/src/mesa/state_tracker/st_glsl_to_nir.cpp:44:0:
external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory
 #include "nir_opcodes.h"
                         ^
compilation terminated.
build/core/binary.mk:706: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.o' failed
make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.o] Error 1
make: *** Waiting for unfinished jobs....

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-17 17:07:28 -04:00
Nicolai Hähnle
941756f092 radeonsi: force level zero on image instructions in non-fragment shaders (v2)
Section 8.9 (Texture Functions) of the OpenGL Shading Language 4.5
specification:

   However, automatic level of detail is computed only for fragment shaders.
   Other shaders operate as though the base level of detail were computed as
   zero.

and Section 8.9.3 (Texture Gather Functions):

   When performing a texture gather operation, the minification and
   magnification filters are ignored, and the rules for LINEAR filtering in
   the OpenGL Specification are applied to the base level of the texture
   image to identify the four texels i_0 j_1, i_1 j_1, i_1 j_0, and i_0 j_0.

Of course, explicit LOD or derivative variants work in all shader types.

This fixes several GL4x-CTS.texture_gather.* tests.

v2: TG4 is always level zero (thanks, Ilia)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:40 -05:00
Nicolai Hähnle
988fd6c922 radeonsi: emit TXQ in separate functions
TXQ is sufficiently different that having in it in the same code path as
texture sampling/fetching opcodes doesn't make much sense.

v2: guard against NULL pointer dereferences

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-05-17 15:28:40 -05:00
Nicolai Hähnle
d464bfd12a winsys/amdgpu: cleanup error handling in amdgpu_ctx_create
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:40 -05:00
Nicolai Hähnle
fef08af99c winsys/amdgpu: avoid ioctl call when fence_wait is called without timeout
When user fences are used, we don't need the kernel for polling.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:39 -05:00
Nicolai Hähnle
0558564200 gallium/radeon: add radeon_emitted to check for non-trivial IBs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:39 -05:00
Nicolai Hähnle
5e89b027b9 gallium/radeon: use radeon_emit_array
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:39 -05:00
Nicolai Hähnle
c23273532e gallium/radeon: use radeon_emit
Mostly generated using a sed-script, with manual fix-up for multi-line
statements.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:38 -05:00
Nicolai Hähnle
4ac555e9e5 st/mesa: fix reversed copyimage canonical format
The format_desc swizzle describes where in the array each color channel
comes from - but the existing code was written as if each entry in the
swizzle described the meaning of an array element.

Fixes piglit's arb_copy_image-format-swizzle.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 15:28:38 -05:00
Jordan Justen
6c9f35bb73 Revert "HACK: Don't re-configure L3$ in render stages pre-BDW"
This reverts commit 41af9b2e51.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94468
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jordan Justen
8a80af2820 anv: Port L3 cache programming from i965
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jordan Justen
aa41de080d anv/gen7: Add memory barrier to vkCmdWaitEvents call
We also have this barrier call for gen8 vkCmdWaitEvents.

We don't implement waiting on events for gen7 yet, but this barrier at
least helps to not regress CTS cases when data caching is enabled.
Without this, the tests would intermittently report a failure when the
data cache was enabled.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jordan Justen
8ee31828c6 anv: Keep track of whether the data cache should be enabled in L3
If images or shader buffers are used, we will enable the data cache in
the the L3 config.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jordan Justen
ff41738871 genxml/hsw: Add L3 cache control registers
These were added to the i965 driver in
5912da45a6.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jan Vesely
47b390fe45 Treewide: Remove Elements() macro
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-17 15:28:04 -04:00
Jan Vesely
322cd2457c r600g,sb: Don't use standard macro name
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-05-17 15:28:03 -04:00
Jason Ekstrand
b6c4d46a58 anv/formats: Add support for VK_FORMAT_B4G4R4A4_UNORM pre-gen8 2016-05-17 12:17:22 -07:00
Jason Ekstrand
45c93384e5 anv: Add a devinfo argument to the get_format functions 2016-05-17 12:17:22 -07:00
Jason Ekstrand
100db3d31c anv/formats: Set the swizzle to RGB1 when using an RGBA format to fake RGB
This way we get correct sampling from RGB formats that are faked as RGBA.
This should also cause it to disable rendering and blending on those
formats.  We should be able to render to them and, on Broadwell and above,
we can blend on them with work-arounds.  However, we'll add support for
that more properly later when it's deemed useful.  For now, disabling
rendering and blending should be safe.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
ce375fba41 anv/formats: Refactor anv_get_format
The new code removes the switch statement and instead handles depth/stencil
as up-front special cases.  This allows for potentially more complicated
color format handling in the future.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
34198d798c anv: Use 16 bits for the isl_format in anv_format
This way the entire anv_format structure fits in 32 bits
2016-05-17 12:17:22 -07:00
Jason Ekstrand
7cae59012d anv/formats: Use the isl_channel_select enum for the swizzle 2016-05-17 12:17:22 -07:00
Jason Ekstrand
8ed429a4f0 anv/formats: Add an anv_get_format helper
This commit removes anv_format_for_vk_format and adds an anv_get_format
helper.  The anv_get_format helper returns the anv_format by-value.  Unlike
anv_format_for_vk_format the format returned by anv_get_format is 100%
accurate and includes any tweaks needed for tiled vs. linear.
anv_get_isl_format is now just a wrapper around anv_get_format that picks
off just the isl_format.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
13f5cee663 anv/format: Simplify anv_format
Now that we have VkFormat introspection and we've removed everything that
tried to use anv_format for introspection, we no longer need most of what
was in anv_format.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
c1c004e5b2 anv/formats: Delete validate_GetPhysicalDeviceFormatProperties
All it ever did was some extra logging that was useful when initially
bringing up Dota2.  We don't need it anymore.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
aad56f3ee7 anv/image: Use aspects for computing full usage 2016-05-17 12:17:22 -07:00
Jason Ekstrand
fbc23d93e0 anv: Remove the anv_format member from anv_image 2016-05-17 12:17:22 -07:00
Jason Ekstrand
be94a23b44 anv/wsi: Use vk_format_info for asserts rather than anv_format 2016-05-17 12:17:22 -07:00
Jason Ekstrand
63dbb2c60a anv/copy: Use the linear format from the image for the buffer block size
Because the buffer is exposed to the user, the block size is defined to
always exactly be the size of the actual vulkan format.  This is the same
size (it had better be) as the linaer image format.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
c87429c5f1 anv/image: Stop using anv_format for image create validation 2016-05-17 12:17:22 -07:00
Jason Ekstrand
990a7420b6 anv/image: Make heavier use of aspects 2016-05-17 12:17:22 -07:00
Jason Ekstrand
369b8bf402 anv/copy: Use the color_surf from the image to get the block size 2016-05-17 12:17:22 -07:00
Jason Ekstrand
9102e88364 anv: Change render_pass_attachment.format to a VkFormat 2016-05-17 12:17:22 -07:00
Jason Ekstrand
ffc502ce0c anv: Add helpers to provide simple VkFormat introspection
As much as I hate adding yet more format introspection, there are times
when the VkFormat is sufficient and we don't want to round-trip through
isl_format.  For these times, the new vk_format_info.c/h files provide some
simple driver-agnostic VkFormat introspection.  This intended to be
specific to Vulkan but not to any driver whatsoever.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
97ba402cc3 anv/image: Use get_isl_format when creating buffer views 2016-05-17 12:17:22 -07:00
Jason Ekstrand
234ecf26c6 anv/image: Add an aspects field
This makes several checks easier and allows us to avoid calling
anv_format_for_vk_format in a number of cases.
2016-05-17 12:17:22 -07:00
Jason Ekstrand
1bda8d06e5 anv: Make format_for_descriptor return an isl_format 2016-05-17 12:17:22 -07:00
Jason Ekstrand
263a8cb52d anv/wayland: Don't allow non-renderable formats 2016-05-17 12:17:22 -07:00
Jason Ekstrand
eb6baa3174 anv/wsi: Make WSI per-physical-device rather than per-instance
This better maps to the Vulkan object model and also allows WSI to at least
know the hardware generation which is useful for format checks.
2016-05-17 12:17:22 -07:00
Adam Jackson
2ad9d6237a glapi/gen: Copy some GL 1.0 enum details into ARB_viewport_array
Otherwise the instances in the extension XML override the core
definitions, and we stop knowing their sizes in indirect_size_get.c

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-05-17 15:04:56 -04:00
Adam Jackson
f4983b194d glapi: Define PURE for Sun Studio as well
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-05-17 15:04:56 -04:00
Adam Jackson
f1dd8dd6b6 glapi/glx: Mark byteswap functions as _X_UNUSED (v2)
Squashes the one remaining warning in the xserver build.

v2: Also clean up some non-standard whitespace (Ian Romanick)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-05-17 15:04:56 -04:00
Adam Jackson
ea08a5bcf6 glapi: Harden GLX request size processing (v2)
v2: Use == not is for equality testing (Dylan Baker)

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-05-17 15:04:56 -04:00
Adam Jackson
88cfc9ddaa glapi: Add the safe_{add,mul,pad} functions from xserver
We're about to update the generator scripts to use these, easier not to
vary between client and server.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-05-17 15:04:56 -04:00
Adam Jackson
7bc5c7f586 glapi: Fix whitespace droppings when printing the license header
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-05-17 15:04:56 -04:00
Rob Clark
1e93b0caa1 mesa/st: add support for NIR as possible driver IR
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
2016-05-17 14:22:46 -04:00
Rob Clark
2bbb140be3 mesa/st: move things around a bit in st_create_fp_variant()
Prep work for next patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-17 14:22:46 -04:00
Rob Clark
8f9a46dccb mesa/st: add nir pass for lowering builtin uniforms
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-17 14:22:46 -04:00
Emil Velikov
52addd90d1 scons: gallium: link against nir as needed
... otherwise we'll produce uncomplete binaries with introduction of NIR
as alternative IR with next commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-17 14:22:46 -04:00
Jason Ekstrand
265487aedf i965/fs: Add an allow_spilling flag to brw_compile_fs
This allows us to disable spilling for blorp shaders since blorp state
setup doesn't handle spilling.  Without this, blorp fails hard if you run
with INTEL_DEBUG=spill.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Francisco Jerez <currojerez@riseup.net>
2016-05-17 10:20:11 -07:00
Ilia Mirkin
dd4b44efc0 nvc0/ir: fix shared atomic lowering to preserve shared memory location
We were always doing atomics on shared memory location 0 instead of the
originally supplied location. Make sure to pass through the original
symbol and any indirection.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org # note: expect minor conflict
2016-05-17 11:22:01 -04:00
Rob Clark
b65bd3dee5 freedreno/ir3: fix compiler warning
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-17 10:05:20 -04:00
Rob Clark
e8beffb1b3 nir/validate: dump annotated shader with error msgs
Log all the errors, and at the end dump the shader w/ error annotations
to make it easier to see where the problems are.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-17 10:05:20 -04:00
Rob Clark
54ecfcc162 nir/validate: assert() -> validate_assert()
Prep work for next patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-17 10:05:20 -04:00
Rob Clark
a0ef26c1c2 nir/print: add support for print annotations
Caller can pass a hashtable mapping NIR object (currently instr or var,
but I guess others could be added as needed) to annotation msg to print
inline with the shader dump.  As the annotation msg is printed, it is
removed from the hashtable to give the caller a way to know about any
unassociated msgs.

This is used in the next patch, for nir_validate to try to associate
error msgs to nir_print dump.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-05-17 10:05:20 -04:00
Alejandro Piñeiro
e5e412cd27 i965: Expose OpenGL 4.2 for gen8+
ARB_vertex_attrib_64bit was the only feature missing.

v2: we can expose 4.2 instead of 4.1 (Ian Romanick)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:55 +02:00
Alejandro Piñeiro
f051eae25a docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+
v2: label as done for i965/gen8+ instead of i965 (Kenneth Graunke)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:55 +02:00
Alejandro Piñeiro
59b5441fd9 i965: Enable ARB_vertex_attrib_64bit for gen8+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero
d6281a9d95 i965: take care of doubles when lowering VS inputs
Input attributes can require 2 vec4 or 1 vec4 depending on whether they
are double-precision or not.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero
7ea09511ca i965/fs: calculate first non-payload GRF using attrib slots
When computing where the first non-payload GRF starts, we can't rely on
the number of attributes, as each attribute can be using 1 or 2 slots
depending on whether they are a dvec3/4 or other.

Instead, we need to use the number of slots used by the attributes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero
b7423b485e i965/vec4: use attribute slots to calculate URB read length
Do not use total attributes because a dvec3/dvec4 attribute requires two
slots. So rather use total attribute slots.

v2: do not use loop to calculate required attribute slots (Kenneth
Graunke)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero
b0fb08e179 i965: take care of doubles when remapping VS attributes
Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for
anything else.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:54 +02:00
Juan A. Suarez Romero
80535873bb nir: add double input bitmap
This bitmap tracks which input attributes are double-precision.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:54 +02:00
Juan A. Suarez Romero
ccfe25f758 i965/fs: shuffle 32bits into 64bits for doubles
VS Thread Payload handles attributes in URB as vec4, no matter if they
are actually single or double precision.

So with double-precision types, value ends up in the registers split in
32bits chunks, in different positions.

We need to shuffle the chunks to get the doubles correctly.

v2:
 * Extra blank line. Add { } on if body (Ian Romanick)
 * Use dest directly (Kenneth Graunke)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 09:05:47 +02:00
Alejandro Piñeiro
96c276dda9 i965/fs: half exec_size when dealing with 64 bits attributes
The HW has a restriction that only vertical stride may cross register
boundaries. Until now this was only handled on VGRFs at
rw_reg_from_fs_reg, but it is also needed for attributes.

v2:
 * Remove reference to commit id on commit message (Juan Suarez)
 * Simplify code that compute final exec_size (Ian Romanick)
 * Use REG_SIZE on that same code (Kenneth Graunke)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 07:34:40 +02:00
Alejandro Piñeiro
1ff32ae8b2 i965: passthru formats cannot be used width edge flag enabled
Add an assertion to detect this case.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 07:34:40 +02:00
Antia Puentes
8b0a334b5e i965: Configure how to store *64*PASSTHRU vertex components
From the Broadwell specification, structure VERTEX_ELEMENT_STATE
description:

   "When SourceElementFormat is set to one of the *64*_PASSTHRU
    formats,  64-bit components are stored in the URB without any
    conversion. In this case, vertex elements must be written as 128
    or 256 bits, with VFCOMP_STORE_0 being used to pad the output
    as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component into
    the URB, Component 1 must be specified as VFCOMP_STORE_0 (with
    Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit
    vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0
    in order to output a 256-bit vertex element. Likewise, use of
    R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0
    in order to output a 256-bit vertex element."

Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for
dvec3 and dvec4 vertex elements.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 07:34:40 +02:00
Alejandro Piñeiro
71150b73c8 i965: get the proper vertex surface type for doubles on gen8+
This commit adds support for PASSTHRU format when pushing
double-precision attributes.

Check glarray->Doubles in order to know if we should choose a format
that does a conversion to float, or just passthru the 64-bit double.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-17 07:34:40 +02:00
Ilia Mirkin
b1d74e9486 nvc0/ir: make sure out-of-bounds buffer loads/atomics get a 0 result
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-17 01:27:29 -04:00
Timothy Arceri
4fb4fd0b6b glsl: make reserved_varying_slot() static
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-17 15:06:39 +10:00
Timothy Arceri
1d752823af glsl: include per-patch varyings when generating reserved slot bitfield
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-17 15:06:27 +10:00
Timothy Arceri
00441829e7 glsl: don't incorrectly eliminate patches with explicit locations
These varying have a separate location domain from per-vertex varyings
and need to be handled separately.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-17 15:06:21 +10:00
Timothy Arceri
3f477f0ea5 glsl: remove remainings tabs in link_varyings.cpp
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-17 15:06:16 +10:00
Timothy Arceri
6d5f7557fb glsl: fix location and component packing validation on patches
These varyings have a separate location domain from per-vertex varyings
and need to be handled separately.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-17 15:06:12 +10:00
Kenneth Graunke
aae0865dc0 i965: Enable ARB_shader_precision on Gen8+.
I recently fixed a bug in the Piglit tests:
https://lists.freedesktop.org/archives/piglit/2016-May/019802.html

With that patch in place, we pass all the tests.  So, turn it on.

We could probably expose this earlier than Gen8, but the extension
says that OpenGL 4.0 is required, and all of our tests are written
against GLSL 4.00 (which is only supported on Gen8+).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 17:52:45 -07:00
Jose Fonseca
cf010de6ee vl/dri: Move the DRI3 check out of sources include into C.
Fixes SCons build.

Trivial.  Built locally with SCons and autotools.
2016-05-16 21:50:43 +01:00
Leo Liu
5e2072c711 st/vdpau: add dri3 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
c122c74dca vl/dri3: implement functions for get and set timestamp
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
9f50a79b8f vl/dri3: handle PresentCompleteNotify event
and get timestamp calculated based on the event's reply

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
e8282178ab st/va: add dri3 support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
8d7ac0a4e4 vl/dri3: implement DRI3 BufferFromPixmap
We also need render to the front buffer of temporary X pixmap,
this is the case of when we using opengl as video out for vaapi.
the basic implementation is to pass pixmap ID to X server, and
then X will return dma-buf fd, we will get the buffer object
through this dma-buf fd.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
858b329c2c vl/dri3: add support for resizing
When drawable size changed, PresentConfigureNotify event will be
emitted, by handling the event to re-allocate resized buffer.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
96580ad593 vl/dri3: implement funciton for get dirty area
This will clear presentation area not covered by video content

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
b0bd908284 vl/dri3: implement function for flush frontbuffer
Request drawable content in pixmap by calling DRI3 PresentPixmap,
and handle PresentIdleNotify event.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
e1223282db vl/dri3: add back buffers support
This implements DRI3 PixmapFromBuffer. Create buffer objects, and
associate it to a dma-buf fd, and then pass this fd with a pixmap
ID to X server for creating pixmap object; also add a function
for wait events.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
69ba9be4d2 vl/dri3: implement flushing for queued events
also place holder for present events handling

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
758b1bbaa7 vl/dri3: register present events
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
672e8d5e7e vl/dri3: set drawable geometry
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Leo Liu
12e5220e34 vl/dri3: add DRI3 support and implement create and destroy
Required functions into place for implementation, create screen
with device fd returned from X server, also bail out to DRI2
with certain conditions.

v2: -organize the error out path (Axel)
    -squash previous patch 1 and 2 into one (Emil)

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-05-16 16:28:51 -04:00
Dave Airlie
30e437bd76 mesa/version.c: enable cull distance in version check.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-17 06:08:31 +10:00
Ian Romanick
11096ecc39 glsl/linker: Include the interface name for input and output blocks
On my oes_shader_io_blocks branch, this fixes 71
dEQP-GLES31.functional.program_interface_query.* tests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2016-05-16 11:18:03 -07:00
Ian Romanick
7c11589eb4 glsl/linker: Use canonical format for ARB_program_interface_query spec quotes
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 11:18:03 -07:00
Mark Janes
fd854c1add i965: check tcs for NULL dereference
Coverity issue 1361544 found an instance where the tcs variable is
checked for NULL, but unconditionally dereferenced later in the same
function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 11:11:11 -07:00
Matt Turner
bf91034d44 i965: Mark is_lossless_compressed_aux UNUSED to silence warning.
Used only in assert().
2016-05-16 11:08:55 -07:00
Matt Turner
1385018a72 genxml: Use llroundf() and store to appropriate type.
Both functions return uint64_t, so I expect the masking/shifting should
be done on 64-bit types.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-16 11:06:15 -07:00
Matt Turner
4191551262 nir: Mark nir_start_block()/nir_impl_last_block() with returns_nonnull.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 11:06:15 -07:00
Matt Turner
377ab2f2d7 util: Add ATTRIBUTE_RETURNS_NONNULL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 11:06:15 -07:00
Jan Vesely
40c6d54e76 clover: grid_offset should be padded with 0 not 1
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 13:58:14 -04:00
Iago Toral Quiroga
71465179fc i965: Expose OpenGL 4.0 for gen8+
ARB_gpu_shader_fp64 was the only feature missing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:34 +02:00
Iago Toral Quiroga
b1d21e1159 docs: Mark ARB_gpu_shader_fp64 as done for i965/gen8+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
309d285c6b i965: Enable ARB_gpu_shader_fp64 for gen8+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
58f304defe i965/tes/scalar: Fix load input for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
61197b8d5d i965/tcs/scalar: fix store output for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
cda3435ea8 i965/tcs/scalar: fix load input for doubles
v2: do not write to the original indirect_offset since that is
    an expression that could be used somewhere else (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
66192b3c16 i965/fs: fix nir_intrinsic_store_output for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
3cce67aff0 i965/fs: fix number of output components for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
0297f1021a i965/vec4: handle doubles in type_size_vec4()
The scalar backend uses this to check URB input sizes.

v2: Removed redundant break after return (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
8c6d147373 i965/fs: support doubles with shared variable stores
This is pretty much the same we do with SSBOs.

v2: do not shuffle in-place, it is not safe since the original 64-bit data
    could be used after the write, instead use a temporary like we do
    for SSBO stores (Iago)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
943f9442bf i965/fs: support doubles with ssbo stores
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
b9aa66aa51 i965/fs: add shuffle_64bit_data_for_32bit_write helper
This does the inverse operation of shuffle_32bit_load_result_to_64bit_data
and we will use it when we need to write 64-bit data in the layout expected
by untyped write messages.

v2 (curro):
- Use subscript() instead of stride()
- Assert on the input types rather than silently retyping.
- Use offset() instead of horiz_offset(), drop the multiplier definition.
- Drop the temporary vgrf and force_writemask_all.
- Make component_i const.
- Move to brw_fs_nir.cpp

v3 (curro):
- Pass dst and src by reference.
- Simplify allocation of tmp register.
- Move to brw_fs_nir.cpp.
- Get rid of the temporary.

v3 (Iago):
- Check that the src and dst regions do not overlap, since that would
  typically be a bug in the caller.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
33f7ec18ac i965/fs: support doubles with SSBO loads
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
8aa01ac596 i965/fs: support doubles with shared variable loads
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
6eab06b866 i965/fs: Add do_untyped_vector_read helper
We are going to need the same logic for anything that reads
doubles via untyped messages (CS shared variables and SSBOs). Add a
helper function with that logic so that we can reuse it.

v2:
- Make this a static function instead of a method of fs_visitor (Iago)
- We only support types with a size of 4 or 8 (Curro)
- Avoid retypes by using a separate vgrf for the packed result (Curro)
- Put dst parameter before source parameters (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
b86d4780ed i965/fs: support doubles with UBO loads
UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD
instruction, which reads 16 bytes (a vec4) of data from memory. For dvec
types this only provides components x and y. Thus, if we are reading
more than 2 components we need to issue a second load at offset+16 to
read the next 16-byte chunk with components w and z.

UBO loads with non-constant offset emit a load for each component
in the vector (and rely in CSE to fix redundant loads), so we only
need to consider the size of the data type when computing the offset
of each element in a vector.

v2 (Sam):
- Adapt the code to use component() (Curro).

v3 (Sam):
- Use type_sz(dest.type) in VARYING_PULL_CONSTANT_LOAD() call (Curro).
- Add asserts to ensure std140 vector alignment rules are followed
  (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
58f1804c4f i965/fs: fix pull constant load component selection for doubles
UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4 starting at a
constant offset that is 16-byte aligned. If we need to access an unaligned
offset we emit a load with an aligned offset and use the remaining constant
offset to select the component into the vec4 result that we are interested
in. This component must be computed in units of the type size, since that
is what fs_reg::set_smear expects.

This patch does this change in the two places where we use this message:
In demote_pull_constants when we lower uniform access with constant offset
into the pull constant buffer and in UBO loads with constant offset.

v2 (Sam):
- Fix set_smear() in fs_visitor::lower_constant_loads(), take into account
source type instead and remove MAX2 (Curro).
- Improve changes to nir_intrinsic_load_ubo case in nir_emit_intrinsic()
(Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Francisco Jerez
71fd4942d1 i965/fs: Fix and document component().
This fixes a number of bugs of component() by reimplementing it in
terms of horiz_offset(): Handling of base registers starting at a
non-zero subreg_offset, handling of strided registers and overflow of
subreg_offset into reg_offset.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
e209134f71 i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles
v2 (Curro):
   - Assert on scale == 1 when shuffling 64-bit data.
   - Remove type_slots, use type_sz(vec4_result.type) instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Iago Toral Quiroga
50b7676dc4 i965/fs: add shuffle_32bit_load_result_to_64bit_data helper
There will be a few places where we need to shuffle the result of a 32-bit
load into valid 64-bit data, so extract this logic into a separate helper
that we can reuse.

v2 (Curro):
- Use subscript() instead of stride()
- Assert on the input types rather than retyping.
- Use offset() instead of horiz_offset(), drop the multiplier definition.
- Don't use  force_writemask_all.
- Mark component_i as const.
- Make the function name lower case.

v3 (Curro):
- Pass src and dst by reference.
- Move to brw_fs_nir.cpp

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:33 +02:00
Francisco Jerez
4d9c461e53 i965/fs: Stop using the LOAD_PAYLOAD instruction in lower_simd_width.
Instead of using the LOAD_PAYLOAD instruction (emitted through the
emit_transpose() helper that is no longer useful and this commit
removes) which had to be marked force_writemask_all in some cases,
emit a series of moves to apply proper channel enable signals to the
destination.  Until now lower_simd_width() had mainly been used to
lower things that invariably had a basic block-local temporary as
destination so it didn't seem like a big deal, but I found it to be
the reason for several Piglit regressions in my SIMD32 branch and
Igalia discovered the same issue independently while working on FP64
support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Iago Toral Quiroga
9149fd6817 i965/fs: fix copy/constant propagation regioning checks
We were not accounting for subreg_offset in the check for the start
of the region.

Also, fs_reg::regs_read() already takes the stride into account, so we
should not multiply its result by the stride again. This was making
copy-propagation fail to copy-propagate cases that would otherwise be
safe to copy-propagate. Again, this was observed in fp64 code, since
there we use stride > 1 often.

v2 (Sam):
- Rename function and add comment (Jason, Curro).
- Assert that register files and number are the same (Jason).
- Fix code to take into account the assumption that src.subreg_offset
is strictly less than the reg_offset unit (Curro).
- Don't pass the registers by value to the function, use
'const fs_reg &' instead (Curro).
- Remove obsolete comment in the commit log (Curro).

v3 (Sam):
- Remove the assert and put the condition in the return (Curro).
- Fix function name (Curro).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:32 +02:00
Iago Toral Quiroga
789eecdb79 i965/fs: fix copy propagation from load payload
We were not considering the case where the load payload is writing to
a destination with a reg_offset > 0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:32 +02:00
Iago Toral Quiroga
cf375a3333 i965/fs: fix copy propagation of partially invalidated entries
We were not invalidating entries with a src that reads more than one register
when we find writes that overwrite any register read by entry->src after
the first. This leads to incorrect copy propagation because we re-use
entries from the ACP that have been partially invalidated. Same thing for
entries with a dst that writes to more than one register.

v2 (Sam):
- Improve code by defining regions_overlap() and using it instead of a
loop (Curro).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Francisco Jerez
ea1ef49a16 i965/fs: Reindent register offset calculation of try_copy_propagate().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Francisco Jerez
0fb19806c0 i965/fs: Simplify and fix register offset calculation of try_copy_propagate().
try_copy_propagate() was special-casing UNIFORM registers (the
BAD_FILE, ARF and FIXED_GRF cases are dead, see the assertion at the
top of the function) and then failing to take into account the
possibility of the instruction reading from a non-zero offset of the
destination of the copy.  The VGRF/ATTR handling takes it into account
correctly, and there is no reason we couldn't use the exact same logic
for the UNIFORM file aside from the fact that uniforms represent
reg_offset in different units.  We can work around that easily by
defining an additional constant with the right unit reg_offset is
expressed in.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Iago Toral Quiroga
7aa53cd725 i965/fs: disallow type change in copy-propagation if types have different sizes
Because the semantics of source modifiers are type-dependent, the type of the
original source of the copy must be kept unmodified while propagating it into
some instruction, which implies that we need to have the guarantee that the
meaning of the instruction is going to remain the same after we have changed
the types. Whenthe size of the new type is different from the size of the old
type the new and old instructions cannot possibly be equivalent because the new
instruction will be reading more data than the old one was.

Prevents that we turn this:

load_payload(8) vgrf17:DF, |vgrf4+0.0|:DF 1sthalf
mov(8) vgrf18:DF, vgrf17:DF 1sthalf
load_payload(8) vgrf5:DF, vgrf18:DF, vgrf20:DF NoMask 1sthalf WE_all
load_payload(8) vgrf21:UD, vgrf5+0.4<2>:UD 1sthalf
mov(8) vgrf22:UD, vgrf21:UD 1sthalf

into:

load_payload(8) vgrf17:DF, |vgrf4+0.0|:DF 1sthalf
mov(8) vgrf18:DF, |vgrf4+0.0|:DF 1sthalf
load_payload(8) vgrf5:DF, |vgrf4+0.0|:DF, |vgrf4+2.0|:DF NoMask 1sthalf WE_all
load_payload(8) vgrf21:UD, vgrf5+0.4<2>:UD 1sthalf
mov(8) vgrf22:DF, |vgrf4+0.4|<2>:DF 1sthalf

where the semantics of the last instruccion have changed.

v2 (Curro):
  - Update commit log and add comment to explain the problem better.
  - Simplify the condition.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Iago Toral Quiroga
ac9b966aac i965/fs: Fix copy propagation of load payload for double operands
Specifically, consider the size of the data type of the operand to compute
the number of registers written.

v2 (Sam):
- Fix line width (Jordan).
- Add an assert (Jordan).
- Use REG_SIZE in the calculation of regs_written (Curro)

v3 (Sam):
- Fix assert and calculation of regs_written (Curro).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 09:55:32 +02:00
Francisco Jerez
70dc19f9d6 i965/fs: Fix propagation of copies with strided source.
This has likely been broken since we started propagating copies not
matching the offset of the instruction exactly
(1728e74957).  The copy source stride
needs to be taken into account to find out the offset at the origin
that corresponds to the offset at the destination of the copy which is
being read by the instruction.  This has led to program miscompilation
on both my SIMD32 branch and Igalia's FP64 branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Iago Toral Quiroga
17decd940c i965/fs: fix subreg_offset overflow in byte_offset()
This can happen if the register already has a non-zero subreg_offset
when byte_offset() is called.

v2 (Sam):
- Refactor byte_offset() (Jordan).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-16 09:55:32 +02:00
Kenneth Graunke
2fd79ebe8f i965: Fix JIP to skip over sibling do...while loops.
We've apparently always been botching JIP for sequences such as:

do
    cmp.f0.0 ...
    (+f0.0) break
    ...
    do
        ...
    while
    ...
while

Because the "do" instruction doesn't actually exist, the inner "while"
is at the same depth as the "break".  brw_find_next_block_end() thus
mistook the inner "while" as the end of the loop containing the "break",
and set the "break" to point to the wrong place.

Only "while" instructions that jump before our instruction are relevant.
We need to ignore the rest, as they're sibling control flow nodes (or
children, but this was already handled by the depth == 0 check).

See also commit 1ac1581f38.

This prevents channel masks from being screwed up, and fixes GPU
hangs(*) in dEQP-GLES31.functional.shaders.multisample_interpolation.
interpolate_at_sample.centroid_qualified.multisample_texture_16.

The test ended up executing code with no channels enabled, and that
code contained FIND_LIVE_CHANNEL, which returned 8 (out of range for
a SIMD8 program), which then was used in indirect GRF addressing,
which randomly got a boolean value (0xFFFFFFFF), interpreted it as
a sample ID, OR'd it into an indirect send message descriptor,
which corrupted the message length, sending a pixel interpolator
message with mlen 15, which is illegal.  Whew :)

(*) Technically, the test doesn't GPU hang currently, but only
    because another bug prevents it from issuing pixel interpolator
    messages entirely...with that fixed, it hangs.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 00:20:07 -07:00
Kenneth Graunke
2f02fad6b3 i965: Make a "does this while jump before our instruction?" helper.
I need to use this in an additional place.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-16 00:19:53 -07:00
Kenneth Graunke
b6f250d7f2 i965: Send the minimal number of STATE_BASE_ADDRESS packets.
STATE_BASE_ADDRESS stalls the whole pipeline, and the documentation
cautions us to emit it as little as possible for better performance.

We recently put some hacks in BLORP to try and avoid emitting it
if it was already set correctly.  However, this wasn't quite minimal:
if BLORP is the first operation (i.e. glClear()), then it would emit
it, and subsequent draw calls would emit it again.

This caused a small drop in performance in GPUTest Triangle when
switching from Meta to BLORP.

Unlike most packets, STATE_BASE_ADDRESS isn't influenced by GL state:
it needs to be emitted once per batch, before most other commands, or
whenever we change the program cache BO.  It's also valid in both the
3D and compute pipelines, which makes it even more unique.

This patch removes it from the atom mechanism and instead directly
calls it as part of every draw, compute dispatch, or BLORP operation.
We introduce a new flag indicating that STATE_BASE_ADDRESS has already
been emitted this batch, and if so, skip doing it again.  When we make
a new program cache BO, we simply reset the flag, so the next operation
will emit it again.  When we flush/reset the batch, we reset the flag.

This guarantees that we'll emit STATE_BASE_ADDRESS only when we have to.
It's also less code than the old atom mechanism.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16 00:11:51 -07:00
Kenneth Graunke
97179c606c i965: Combine Gen4-7 and Gen8+ state base address emitters.
We're about to start calling it directly, and this means the callers
won't have to think about generations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16 00:11:50 -07:00
Kenneth Graunke
7b70a12e1c i965: Move Gen4-5 programs to brw_upload_programs() too.
This way all the programs are in one place again, and it also should
make some future STATE_BASE_ADDRESS related changes possible.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16 00:11:49 -07:00
Kenneth Graunke
b23b099a0b i965: Mark brw const in brw_state_dirty and callers.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-16 00:11:43 -07:00
Kenneth Graunke
8e71ac731b glsl: Don't do constant propagation in opt_constant_folding.
opt_constant_folding is supposed to fold trees of constants into a
single constant.  Surprisingly, it was also propagating constant values
from variables into expression trees - even when the result couldn't be
folded together.  This is opt_constant_propagation's job.

The ir_dereference_variable::constant_expression_value() method returns
a clone of var->constant_value.  So we would replace the dereference
with a constant, propagating it into the tree.

Skip over ir_dereference_variable to avoid this surprising behavior.
However, add code to explicitly continue doing it in the constant
propagation pass, as it's useful to do so.

shader-db statistics on Broadwell:

total instructions in shared programs: 8905349 -> 8905126 (-0.00%)
instructions in affected programs: 30100 -> 29877 (-0.74%)
helped: 93
HURT: 20

total cycles in shared programs: 71017030 -> 71015944 (-0.00%)
cycles in affected programs: 132456 -> 131370 (-0.82%)
helped: 54
HURT: 45

The only hurt programs are by a single instruction, while the helped
ones are helped by 1-4 instructions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-15 23:59:39 -07:00
Kenneth Graunke
db8fcbbaf9 glsl: Avoid excess tree walking when folding ir_dereference_arrays.
If an ir_dereference_array has non-constant components, there's no
point in trying to evaluate its value (which involves walking down
the tree and possibly allocating memory for portions of the subtree
which are constant).

This also removes convoluted tree walking in opt_constant_folding(),
which tries to fold constants while walking up the tree.  No need to
walk down, then up, then down again.

We did this for swizzles and expressions already, but I was lazy
back in the day and didn't do this for ir_dereference_array.

No change in shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-15 23:59:33 -07:00
Kenneth Graunke
329fe93210 glsl: Consolidate duplicate copies of constant folding.
We could probably clean this up more (maybe make it a method), but at
least there's only one copy of this code now, and that's a start.

No change in shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-15 23:59:20 -07:00
Kenneth Graunke
3bf27a9a00 glsl: Remove bonus tree walking in opt_constant_folding().
It looks like this was missed when converting opt_constant_folding()
from a hierarchical visitor to an rvalue visitor in 6606fde3.

ir_rvalue_visitor already processes values on the way back up the tree,
so we will have already visited every child node.  There's no point in
doing it again.

No change in shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-15 23:59:10 -07:00
Kenneth Graunke
8e59670bcf glsl: Make opt_constant_variable() bail in useless cases.
The pass ultimately skips over any entries with assignment_count != 1,
so there's no need to do further work once we've determined that there
are multiple assignments.

The constant value could be a large array (i.e. uvec4[327]), at which
point skipping the constant_expression_value() call (and the clone()
call within) can save us piles of memory.

No change in shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-15 23:59:05 -07:00
Kenneth Graunke
c907ca6c8d i965: Flip interpolateAtOffset's y offset when necessary.
Fixes 4 dEQP-GLES31.functional.shaders.multisample_interpolation tests:
- interpolate_at_offset.no_qualifiers.default_framebuffer
- interpolate_at_offset.centroid_qualifier.default_framebuffer
- interpolate_at_offset.sample_qualifier.default_framebuffer
- interpolate_at_offset.array_element.default_framebuffer

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-15 23:50:52 -07:00
Kenneth Graunke
6d65b0c6dc nir: Add a nir->info.uses_interp_var_at_offset flag.
I've added this to nir_gather_info(), but also to glsl_to_nir() as a
temporary measure, since the i965 GL driver today doesn't use
nir_gather_info() yet.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-15 23:50:28 -07:00
Kenneth Graunke
d4d7e1516b glsl: Drop bad ASSERT_TRUE in gl_CullDistance link_varyings test.
I don't know what the intention was here, but this function returns
void.  We can't assert anything about its return value.

Fixes "make check" failures.

v2: Also fix prototype for the function (caught by Jordan).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-15 23:49:19 -07:00
Jan Vesely
9525f33164 clover: Handle PIPE_SHADER_IR_NIR in switch
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-15 20:05:10 -04:00
Rob Clark
277818ecfb freedreno/ir3: small standalone compiler cleanup
Don't hard-code the gpu-id anymore.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:48 -04:00
Rob Clark
f06343d6ea nir: forward-declare 'struct gl_shader_program'
Drop extra #include which is otherwise unneeded (and makes this header
difficult to include from outside of src/mesa).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-15 17:25:48 -04:00
Rob Clark
79d6409a14 nir: return progress from lower_idiv
With algebraic-opt support for lowering div to shift, the driver would
like to be able to run this pass *after* the main opt-loop, and then
conditionally re-run the opt-loop if this pass actually lowered some-
thing.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-15 17:25:48 -04:00
Rob Clark
f8840f471d freedreno/ir3: lower fdiv
Not sure how we didn't hit this already, but since we want fdiv
converted into mul + rcp, we should set this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:48 -04:00
Rob Clark
53cde5e295 freedreno/ir3: handle VARYING_SLOT_PNTC
In the glsl->tgsi path, this already gets translated to VAR8, which
matches up with rasterizer->sprite_coord_enable.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:48 -04:00
Rob Clark
2f1581059b freedreno/ir3: disable TGSI specific hacks in nir case
When we got NIR directly from state tracker (vs using tgsi_to_nir) we
need to realize this and skip some TGSI specific hacks.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:48 -04:00
Rob Clark
784086f3c1 freedreno/ir3: add support for NIR as preferred IR
For now under debug flag, since only suitable for debugging/testing.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:47 -04:00
Rob Clark
8b24f7b440 nir: fix comment typo about f2d/d2f
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-15 17:25:47 -04:00
Ilia Mirkin
be2b13e3bf nv50/ir: avoid asserts when the state tracker feeds us bogus inputs
INTERP is defined (by me) to have to have a INPUT source. However the
state tracker does not always obey this. This happens due to varying
packing logic introducing additional mov's which can't always be undone.
Instead of just giving up, we instead try harder to find the original
input. This won't always be possible, for example with indirect
accesses. There's not much we can (easily) do about that though.

This fixes the remaining interpolateAt* failures in dEQP:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at*

some of which were asserting due to INTERP_* being passed a non-input.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-15 14:12:56 -04:00
Ilia Mirkin
9323d084ac nvc0: don't try to go through the push path for indirect draws
This fixes

dEQP-GLES31.functional.draw_indirect.draw_elements_indirect.*.default_attribute

These tests were causing a const vbo to be set up, and were small enough
draws that the logic was trying to go via the push path (which emits
data directly into the cmd stream rather than uploading a user vbo).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-15 10:48:39 -04:00
Ilia Mirkin
2ef3cdb07e nvc0/ir: make sure to align the second arg of TXD to 4, as we do for TEX
This was handled in handleTEX(), however the way the logic works, those
extra arguments aren't added on by then, so it did nothing. Instead we
must duplicate that bit here. GK110 appears to complain about
MISALIGNED_GPR, however it's reasonable to believe that GK104 has the
same requirements.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95403
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-15 10:48:39 -04:00
Tobias Klausmann
8c02939794 nv50,nvc0: add support for cull distances
Cull distances are just a special case of clip distances as far as the
hardware is concerned. Make sure that the relevant "planes" are enabled,
and flip the clip mode to cull for those.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: add enables on nvc0, add nv50 support]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
2016-05-15 10:48:39 -04:00
Ilia Mirkin
2ad970ecf4 st/mesa: disable cull distance for now
The pass that st/mesa relies on to combine clip and cull distances has
been reverted, so we can't expose ARB_cull_distance until that is
resolved.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-15 10:48:38 -04:00
Jason Ekstrand
09e041d61d i965: Use blorp for all clears
We used to use a meta path on gen8 but we haven't since c7cf17ae75.  We
might as well delete the meta path since blorp works on all gens.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
1cfb4bc890 i965: Use blorp for all stencil blits
We used to use a meta path because blorp didn't support 16x MSAA.  Now it
does, so we don't need the meta paths anymore.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
64f2907030 i965: Use blorp for all updownsample blits
We used to use a meta path because blorp didn't support 16x MSAA.  Now it
does, so we don't need the meta paths anymore.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
f5febc83a7 i965/blorp: Add support for 16x MSAA
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
a32315bd19 i965: move brw_meta_set_fast_clear_color to brw_meta_util.c
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
36529f670f i965; Move brw_meta_get_*_rect to brw_meta_util.c
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
21034f1b08 i965: Move brw_is_color_fast_clear_compatible to brw_meta_util
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
b05c68fc8a i965: Move brw_get_rb_for_slice to brw_meta_util
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 14:18:21 -07:00
Jason Ekstrand
672cffee0f i965/blorp: Get rid of the blorp_prog_data_int() helper
The helper was initially created to allow us to set reasonable defaults as
we mutated the brw_blorp_prog_data structure in preparation for NIR.  Now
that everything is going through brw_blorp_compile_nir_shader() which fully
fills out the brw_blorp_prog_data structure, we don't need the helper.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:54 -07:00
Jason Ekstrand
c228ea8345 i965/blorp: Delete the old blorp shader emit code
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:54 -07:00
Jason Ekstrand
c18da26abf i965/blorp: Stop doing f2i(i2f(sample_id))
NIR gets kind of awkward when you have a 3-component vector with two floats
and one int.  This led to us accidentally going through float for the
sample index.  It doesn't hurt anything but it also isn't needed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
e503da61c6 i965/blorp: Refactor coordinate munging
The original code-flow tried to map original blorp.  This puts things more
where they belong and simplifies some of the logic.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
8636937dd6 i965/blorp: Add bilinear blending support to the NIR path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
6bd7bd6633 i965/blorp: Add support for averaging resolves to the NIR path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
c7269c1551 i965/blorp: Add MSAA encode/decode support to the NIR path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
df8c2936cd i965/blorp: Add support for W-[de]tiling to the NIR path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
6adb8d6d3a i965/blorp: Add support for discard-based bounds checks to the NIR path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
4bdace0791 i965/blorp: Add initial support for NIR-based blit shaders
Many of the more complex cases still fall back to the old shader builder.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
b0275ad0c9 i965/blorp: Refactor getting the blit kernel into a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
6df3d75206 i965/blorp: Use NIR for clear shaders
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95373
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
bb45f42f55 i965/blorp: Create the program key in get_clear_kernel
There's no reason to be passing a whole struct around just for a single
boolean.  We can create it later when we actually need to use it as a key.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
c1fe8859d3 i965/blorp: Add a helper for compiling NIR shaders
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:53 -07:00
Jason Ekstrand
353eadb170 blorp: Add initial state setup support for SIMD8 dispatch
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:52 -07:00
Jason Ekstrand
cd5a2905cf i965/blorp: Add a param array to prog_data
This array allows the push constants to be re-arranged on upload.  The
actual arrangement will, eventually, come from the back-end compiler.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:52 -07:00
Jason Ekstrand
c46cbe19f4 i965/blorp: Add a prog_data_init helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-14 13:34:52 -07:00
Jason Ekstrand
50e5e1f747 i965/fs: Implement the new NIR MCS texturing
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:49 -07:00
Jason Ekstrand
f47faa4316 nir: Add texture opcodes and source types for multisample compression
Intel hardware does a form of multisample compression that involves an
auxilary surface called the MCS.  When an MCS is in use, you have to first
sample from the MCS with a special opcode and then pass the result of that
operation into the next sample instrucion.  Normally, we just do this
ourselves in the back-end, but we want to expose that functionality to NIR
so that we can use MCS values directly in NIR-based blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:44 -07:00
Jason Ekstrand
87a41e862b nir/builder: Add a helper for grabbing multiple channels from an ssa def
This is similar to nir_channel except that it lets you grab more than one
channel by providing a mask.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:40 -07:00
Jason Ekstrand
fc58cb543f nir/builder: Generate the alu helpers directly in python
There's no reason for having a macro *and* a python generator.  We can
easily just do the whole thing in python.  This has the advantage that we
are no longer definining ALU# macros which conflict with the ones in
brw_fs_builder.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:38 -07:00
Jason Ekstrand
a0e6e5f21f i965/fs: Use MRF0 for the repclear message
This is what BLORP does.  Making them match cuts down on the noise when
looking at AUB diffs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:33 -07:00
Jason Ekstrand
5a68df87da i965/blorp: Simplify the sample layout calculation
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:30 -07:00
Jason Ekstrand
bee160b31b i965/fs: Organize prog_data by ksp number rather than SIMD width
The hardware packets organize kernel pointers and GRF start by slots that
don't map directly to dispatch width.  This means that all of the state
setup code has to re-arrange the data from prog_data into these slots.
This logic has been duplicated 4 times in the GL driver and one more time
in the Vulkan driver.  Let's just put it all in brw_fs.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:25 -07:00
Jason Ekstrand
7be100ac9a i965/gen7_wm: Move where we set the fast clear op
This better matches gen8 state setup

Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:21 -07:00
Jason Ekstrand
1ec466d0ff i965/fs: Stop setting dispatch_grf_start_reg from the visitor
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:18 -07:00
Jason Ekstrand
082768af30 i965/fs: Clean up the logic in compile_fs a bit
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:13 -07:00
Jason Ekstrand
b0f8768905 i965/state: Clean up WM/PS state to pull more things out of prog_data
Now that we have a persample_shading bit in prog_data we can reduce the
amount the state setup code needs to be looking at the GL state.  In
particular, it no longer pulls anything directly out of the
gl_fragment_program and no longer depends on NEW_FRAGMENT_PROGRAM.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:10 -07:00
Jason Ekstrand
712a980add i965/fs: Rework the persample shading key/prog_data bits
This commit reworks and simplifies the way we handle persample shading in
the shader key and prog_data.  The previous approach had three different
key bits that had slightly different and hard-to-decern meanings while the
new bits are far more clear.  This commit changes it to two easily
understood bits that communicate everything we need:

 1) key->persample_interp: means that the user has requested persample
    interpolation through the API.  This is equivalent to having
    SAMPLE_SHADING enabled and having MIN_SAMPLE_SHADING_VALUE set high
    enough that you actually get multiple per-sample invocations.

 2) key->multisample_fbo: means that the shader will be running on an
    actual multi-sampled framebuffer.

This commit also adds a new "persample_dispatch" bit to prog_data which
indicates that the shader should be run in persample mode.  This way the
state setup code doesn't have to look at the fragment program or GL state
and can just pull that data out of the prog_data.

In theory, this shuffle could mean more recompiles.  However, in practice,
we were shoving enough state into the key before that we were probably
hitting a recompile on every per-sample shader anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:05 -07:00
Jason Ekstrand
a2f50d87b6 nir: Add an info bit for uses_sample_qualifier
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:33:52 -07:00
Kenneth Graunke
59156b2e96 i965: Fix undefined df bits in brw_reg comparisons.
Commit 5310bca024 added a new "double df"
field to the brw_reg struct, adding an extra 4 bytes of data that isn't
usually initialized (or may contain irrelevant garbage if the struct is
mutated).  This means that it's no longer safe to memcmp().

Instead, add a brw_regs_equal() function which ignores the extra df bits
unless they matter.  To keep the implementation cheap, we wrap the first
set of fields in a union/struct so that we can use a single DWord
comparison.

v2: Drop unnecessary casts (caught by Francisco Jerez).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-05-14 00:18:37 -07:00
Dave Airlie
9f8867d877 i965: disable cull distance temporarily.
I'll fix this up on Monday, so leave the docs changes in place.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-14 11:39:34 +10:00
Dave Airlie
7a6d55826e Revert "glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4)"
This reverts commit ad355652c2.

This broke a bunch of clip tests.
2016-05-14 11:39:34 +10:00
Ian Romanick
a608e946b5 docs: Mark GL_OES_shader_io_blocks as started
Watch the oes_shader_io_blocks of my fd.o Mesa GIT repo for progress.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-13 17:48:46 -07:00
Kristian Høgsberg Kristensen
4e959cf9f9 docs: update ARB_cull_distance status.
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-05-13 16:32:14 -07:00
Kristian Høgsberg Kristensen
c564348a2e i965: Add support for GL_ARB_cull_distance
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-13 16:28:25 -07:00
Ilia Mirkin
a1c2444792 st/mesa: flip y coordinate of interpolateAtOffset for winsys
This fixes a few dEQP tests like

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.no_qualifiers.default_framebuffer

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-13 19:17:41 -04:00
Ilia Mirkin
0d8e850195 glsl: make sure that textureProj(bias) variants are only exposed in fs
Many were already marked as fs_only, but not all. This fixes the
remaining ir_txb entries.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-13 19:17:26 -04:00
Ilia Mirkin
37c8f4c609 glsl: be more strict when validating shader inputs
interpolateAt* can only take input variables or an element of an input
variable array. No structs.

Further, GLSL 4.40 relaxes the requirement to allow swizzles, so enable
that as well.

This fixes the following dEQP tests:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.negative.interpolate_struct_member
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.interpolate_struct_member
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.negative.interpolate_struct_member

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-13 19:17:26 -04:00
Ilia Mirkin
5239f1e0c9 glsl: make sure that interpolateAt arguments are variables
In the case of a constant, it might have been propagated through and
variable_referenced() returns NULL. Error out in that case.

Fixes 3 dEQP tests:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.negative.interpolate_constant
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.interpolate_constant
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.negative.interpolate_constant

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-13 19:17:26 -04:00
Tobias Klausmann
8f45f4f3ca mesa/st: Add support for GL_ARB_cull_distance (v2)
v2: don't bother with cull dist varyings except to assert.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-14 08:28:23 +10:00
Tobias Klausmann
2be258ea18 gallium: Add a pipe cap for arb_cull_distance
This lets us safely enable or disable the extension as needed

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-14 08:28:17 +10:00
Tobias Klausmann
d656736bbf glsl: Add arb_cull_distance support (v3)
v2: make too large array a compile error
v3: squash mesa/prog patch to avoid static compiler errors in bisect

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-14 08:28:08 +10:00
Tobias Klausmann
ad355652c2 glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4)
This will come in handy when we want to lower gl_CullDistance into
gl_CullDistanceMESA.

[airlied: drop separate APIs for clip/cull - just use single API
to call both passes.]

v3: reexamine my sanity, this was pretty broken, the new code
creates one copy of gl_ClipDistanceMESA, as the clip distance
varying and lowers everything into that in two passes, one for clips
one for culls.
v4: rework using the passes in clip/cull sizes, instead of the
array sizes.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-14 08:28:07 +10:00
Dave Airlie
dd3390e12f glsl: rename lower_clip_distance to lower_distance.
This just renames the file in anticipation of adding cull lowering,
and renames the internals.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-14 08:27:40 +10:00
Tobias Klausmann
eb18fea707 mesa/main: Add support for GL_ARB_cull_distance (v2)
airlied:
v2: rename LowerClipDistance to LowerCombinedClipCullDistnace.
I don't think we want any other behaviour with any current hw.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-14 08:27:29 +10:00
Tobias Klausmann
f2a2e08e01 glapi: Add GL_ARB_cull_distance
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-14 08:27:10 +10:00
Nanley Chery
6674d018f7 anv/copy: Fix copying Images from Buffers with larger dimensions
This function previously assumed that the Buffer and Image had matching
dimensions. However, it is possible to copy from a Buffer with larger
dimensions than the Image. Modify the copy function to enable this.

v2: Use ternary instead of MAX for setting bufferExtent (Jason Ekstrand)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95292
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Matthew Waters <matthew@centricular.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-13 14:08:57 -07:00
Maarten Lankhorst
ff5c312623 .mailmap: Fix my email addresses.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>
2016-05-13 12:28:05 +02:00
Nicolai Hähnle
a694c20ecf radeonsi/sid_tables: rename reg_table to sid_reg_table
This is purely cosmetic, making it easier to assign blame for space used
in the binary in case somebody else makes a similar cleanup effort in the
future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:39 -05:00
Nicolai Hähnle
c7f73a70f0 radeonsi/sid_tables: store offset into global fields table instead of pointer
This avoids relocations in the final binary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:39 -05:00
Nicolai Hähnle
54ab39caaf radeonsi/sid_tables: store strings by offset instead of by pointer
This saves some space and avoids the need for relocations.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:39 -05:00
Nicolai Hähnle
ca8f71f4cb r600: remove TABLE_SIZE macro
Use ARRAY_SIZE instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:38 -05:00
Nicolai Hähnle
43ac091e4c r600: move alu_op_table to .c file
So that it gets compiled and emitted only once, saving space is the final
binary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:38 -05:00
Nicolai Hähnle
390c740b99 r600: move cf_op_table to .c file
So that it gets compiled and emitted only once, saving space is the final
binary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:38 -05:00
Nicolai Hähnle
a180e1d22d r600: move fetch_op_table to .c file
So that it gets compiled and emitted only once, saving space is the final
binary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:37 -05:00
Nicolai Hähnle
6d350fb13f r600: protect r600_isa.h with extern "C"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-05-13 01:03:37 -05:00
Bas Nieuwenhuizen
ac77fb74a0 gallium/ddebug: Implement launch_grid.
Does not implement dumping info.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-13 07:43:46 +02:00
Bas Nieuwenhuizen
22b35122fa gallium/ddebug: Support compute states.
v2: Reuse the macro for bind & delete.

Note that may not be able to share the delete long-term as
pipe_compute_state contains members not in pipe_shader_state,
and we need to distinguish the pointer location if we add that
struct to the union.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-13 07:43:37 +02:00
Bas Nieuwenhuizen
5efe477b13 gallium/ddebug: Add passthrough for get_compute_param.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-13 07:39:12 +02:00
Ian Romanick
8f05a0a4c0 nir: Remove empty visit_call_src and visit_load_const_src functions
The guts were removed in dfb3abba.  It has been almost exactly a year,
so I dont think we're going to "decide we want [predication] back."

Silences several "unused parameter" warnings:

nir/nir.c: In function ‘visit_call_src’:
nir/nir.c:1052:32: warning: unused parameter ‘instr’ [-Wunused-parameter]
 visit_call_src(nir_call_instr *instr, nir_foreach_src_cb cb, void *state)
                                ^
nir/nir.c:1052:58: warning: unused parameter ‘cb’ [-Wunused-parameter]
 visit_call_src(nir_call_instr *instr, nir_foreach_src_cb cb, void *state)
                                                          ^
nir/nir.c:1052:68: warning: unused parameter ‘state’ [-Wunused-parameter]
 visit_call_src(nir_call_instr *instr, nir_foreach_src_cb cb, void *state)
                                                                    ^
nir/nir.c: In function ‘visit_load_const_src’:
nir/nir.c:1058:44: warning: unused parameter ‘instr’ [-Wunused-parameter]
 visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb,
                                            ^
nir/nir.c:1058:70: warning: unused parameter ‘cb’ [-Wunused-parameter]
 visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb,
                                                                      ^
nir/nir.c:1059:28: warning: unused parameter ‘state’ [-Wunused-parameter]
                      void *state)
                            ^

v2: Add some comments in nir_foreach_src suggested by Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
2016-05-12 16:47:14 -07:00
Ian Romanick
098166e1bc nir: Silence unused parameter warnings
These cases had the parameter removed:

nir/nir_lower_vec_to_movs.c: In function ‘try_coalesce’:
nir/nir_lower_vec_to_movs.c:124:66: warning: unused parameter ‘shader’ [-Wunused-parameter]
 try_coalesce(nir_alu_instr *vec, unsigned start_idx, nir_shader *shader)
                                                                  ^
nir/nir_lower_io.c: In function ‘load_op’:
nir/nir_lower_io.c:147:32: warning: unused parameter ‘state’ [-Wunused-parameter]
 load_op(struct lower_io_state *state,
                                ^

These cases had the parameter (void) silenced because the parameter was
necessary for an interface:

nir/glsl_to_nir.cpp:1900:32: warning: unused parameter 'ir' [-Wunused-parameter]
 nir_visitor::visit(ir_barrier *ir)
                                ^
nir/nir.c: In function ‘remove_use_cb’:
nir/nir.c:802:35: warning: unused parameter ‘state’ [-Wunused-parameter]
 remove_use_cb(nir_src *src, void *state)
                                   ^
nir/nir.c: In function ‘remove_def_cb’:
nir/nir.c:811:37: warning: unused parameter ‘state’ [-Wunused-parameter]
 remove_def_cb(nir_dest *dest, void *state)
                                     ^

Number of total warnings in my build reduced from 2543 to 2538
(reduction of 5).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-12 16:46:41 -07:00
Leo Liu
bd9ae72459 vl/dri: fix close fd error out
fd should be set to -1 only if it got closed by pipe_loader_release.

Signed-off-by: Leo Liu <leo.liu@amd.com>
2016-05-12 18:26:48 -04:00
Samuel Pitoiset
988b09f9ac nvc0: fix indentation in nvc0_invalidate_resource_storage()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-12 21:37:08 +02:00
Samuel Pitoiset
abb3401095 nvc0: save some CPU cycles in nvc0_context_unreference_resources()
This reduces the number of loop iterations for invalidating buffers
and images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-12 21:37:08 +02:00
Samuel Pitoiset
b8f0b00a9a nvc0: invalidate texture buffers for compute
This is a pretty rare situation but this can happen though.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-12 21:37:08 +02:00
Tim Rowley
2785f2f2d7 swr: properly expose compressed format support
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-12 14:12:18 -05:00
Jason Ekstrand
5186545d66 anv: Don't advertise shaderImageGatherExtended
We don't actually support all of the extended gather functionality so we
shouldn't be advertising it.
2016-05-12 10:57:00 -07:00
Rob Clark
9d3cc80b75 nir: glsl_get_bit_size() should take glsl_type
It's what all the call-sites once, so gets rid of a bunch of inlined
glsl_get_base_type() at the call-sites.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-12 13:39:40 -04:00
Topi Pohjolainen
b19cff1639 i965/gen9: Enable lossless compression
I tried first creating the auxiliary buffer the same time with the
color buffer. That, however, led me into a situation where we would
later create the rest of the mip-levels and the compression would
need to be disabled (it is only supported for single level buffers).

Here we try to create it on demand just before the hardware starts
to render. This is similar what we do with fast clear buffers,
their creation is deferred until the first clear.
This setup also gives the opportunity to detect if the miptree
represents the temporaty texture used internally in the mesa core.
This texture is mostly written by cpu and therefore enabling
compression for it doesn't make much sense.

Note that a heuristic is included. Floating point formats are not
enabled yet as they are only seen to hurt performance.

Some highlights with window system driver kept fixed to default
and only the application driver changing:

Manhattan: 8.32152% +/- 0.355881%
Offscreen: 9.09713% +/- 0.340763%

Glb trex: 8.46231% +/- 0.460624%
Offscreen: 9.31872% +/- 0.463743%

v2 (Ben): Re-use msaa layout type for single sampled case.
v3: Moved the deferred allocation of mcs to brw_try_draw_prims() and
    brw_blorp_blit_miptrees() instead.
v4: (Ken): Drop MIPTREE_LAYOUT_ACCELERATED_UPLOAD when allocating mcs.
    Do not enable for scanout buffers

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:37 +03:00
Topi Pohjolainen
cd9e97a020 i965: Set render state for lossless compressed
v2: Add support for blorp and removed the support for meta
v3 (Ben): Add assertion on compressed non-fast clear - must
          be partial clear.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:37 +03:00
Topi Pohjolainen
cda8c2a911 i965/wm: Don't sample lossless compressed as multisampled
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:37 +03:00
Topi Pohjolainen
683dda0083 i965/gen9: Setup MCS for compressed texture surfaces
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:37 +03:00
Topi Pohjolainen
1a05aeeb1c i965/blorp: Do not resolve lossless compressed blit sources
Blorp blits use sampling engine which is capable of resolving
on the fly. Buffers are still resolved for blitter engine. Current
understanding is that blitter doesn't understand lossless compression.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:37 +03:00
Topi Pohjolainen
01ba26d0b0 i965/blorp: Prepare blits for lossless compression
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:37 +03:00
Topi Pohjolainen
84066ebd63 i965: Deferred allocation of mcs for lossless compressed
Until now mcs was associated to single sampled buffers only for
fast clear purposes and it was therefore the responsibility of the
clear logic to allocate the aux buffer when needed. Now that normal
3D render or blorp blit may render with mcs enabled also, they need
to prepare the mcs just as well.

v2: Do not enable for scanout buffers
v3 (Ben):
   - Fix typo in commit message.
   - Check for gen < 9 and return early in brw_predraw_set_aux_buffers()
   - Check for gen < 9 and return early in intel_miptree_prepare_mcs()
v4: Check for msaa_layput and number of samples to determine if
    lossless compression is to used. Otherwise one cannot distuingish
    between fast clear with and without compression.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:26 +03:00
Topi Pohjolainen
1ca02b6ebb i965: Add flag telling if miptree is for client consumption
Consider later on adding specific disable flags such as

MIPTREE_LAYOUT_DISABLE_AUX_MCS   = 1 << 3, /* CCS_D */
MIPTREE_LAYOUT_DISABLE_AUX_CCS_E = 1 << 4,
MIPTREE_LAYOUT_DISABLE_AUX       = MIPTREE_LAYOUT_DISABLE_AUX_MCS |
                                   MIPTREE_LAYOUT_DISABLE_AUX_CCS_E,

and equivalent boolean/enums into miptree.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:22 +03:00
Topi Pohjolainen
a6e0f1cc7f i965: Add helper for lossless compression support
v2: Check explicitly against base type of GL_FLOAT instead of
    using _mesa_is_format_integer_color(). Otherwise we miss
    GL_UNSIGNED_NORMALIZED.
v3 (Ben): Also call intel_miptree_supports_non_msrt_fast_clear()
          in order to really check everything.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:22 +03:00
Topi Pohjolainen
874c5f05db i965/gen9: Prepare surface state setup for lossless compression
v2 (Ben): Use combination of msaa_layout and number of samples
          instead of introducing explicit type for lossless
          compression (intel_miptree_is_lossless_compressed()).
v3 (Ben): Do not set fast claer state in surface state setup.
          Moved into brw_postdraw_set_buffers_need_resolve()
          using a separate patch.
v4: Support for blorp
v5 (Ben): Re-use gen8_get_aux_mode()

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:22 +03:00
Topi Pohjolainen
a8544267fd i965/gen8: Expose auxiliary mode resolver
Also use the opportunity to drop the unused surface type argument.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:22 +03:00
Topi Pohjolainen
94926492d8 i965: Relax assertion of halign == 16 for lossless compressed aux
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:22 +03:00
Topi Pohjolainen
ba9f954e60 i965/blorp: Set full resolve for lossless compressed
v2 (Ben): Introduce union for fast clear and resolve ops

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:49:22 +03:00
Topi Pohjolainen
58e7392e12 i965/blorp: Do not skip fast color clear with new color
This hasn't been visible before. It showed up with lossless
compression with:

dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgb8

Current fast clear logic kicks color resolves even for gpu sampling.
In the test case this results into trashing of the fast color clear
state between two subsequent clears, and therefore each clear is
performed correctly.
With lossless compression the resolves are unnecessary and therefore
the clear state indicates that the buffer is already cleared. Without
considering if the previous color value was the same as the new,
clears that need to be performed are skipped and the buffer ends up
holding old pixel values.

v2 (Ken): Fix the comparison for gen < 9

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-05-12 19:48:47 +03:00
Kenneth Graunke
12dcad1b42 i965: Enable scalar GS by default.
I'd originally left this off because Orbital Explorer was hanging the
GPU, but it seems to be working these days.  There have been a bunch
of changes since then, so we probably fixed something.

On my Broadwell laptop, both Synmark/GSCloth and Orbital Explorer seem
to run at approximately the same framerate in either mode.  This is
despite large reductions in instruction count for Synmark, and large
increases for Orbital Explorer.  It apparently just doesn't matter.

Switching to scalar mode will gain us fp64 support in the next release,
as vec4-mode support isn't yet ready.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-12 01:01:42 -07:00
Kenneth Graunke
607fb0f13d i965: Reduce the SIMD8 GS push constant threshold from 32 to 24.
Three Shadow of Mordor geometry shaders increase by a single
instruction, but the number of spills/fills in Orbital Explorer
is reduced from 194:1279 -> 82:454.  No other programs are affected.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-12 01:01:42 -07:00
Kenneth Graunke
3aa542c657 i965: Delete bogus assertion in emit_gs_input_load().
This looks like leftover cruft from an earlier attempt at writing
point size hacks.  Each vertex has its own copy of gl_PointSize,
so accessing any vertex other than 0 would cause this to fail.

The tests seem to work fine without it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-12 01:01:42 -07:00
Kenneth Graunke
1c41cb58de i965: Support instanced GS inputs in the scalar backend.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-12 01:01:36 -07:00
Kenneth Graunke
5fc3772650 i965: Use an early return for the push case in emit_gs_input_load().
Just trying to keep things from getting too ugly in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-12 00:59:08 -07:00
Kenneth Graunke
e9ca952581 i965: Drop BRW_NEW_BLORP from stipple and line parameter packets.
BLORP never touches these, and they're all non-pipelined.  Some
are fairly large packets as well.

I haven't tried to benchmark this; the effect is likely to be small.
However, we may as well stop the pointless papercuts; maybe they'll
add up someday.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-12 00:54:37 -07:00
Jakob Sinclair
18f7c88dd6 glsl: fixed uninitialized pointer
Class "ir_constant" had a bunch of constructors where the pointer member
"array_elements" had not been initialized. This could have lead to unsafe
code if something had tried to write anything to it. This patch fixes
this issue by initializing the pointer to NULL in all the constructors.
This issue was discovered by Coverity.

CID: 401603, 401604, 401605, 401610

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-05-12 09:46:36 +02:00
Ilia Mirkin
ba3f0b6d59 nvc0: fix gl_SampleMaskIn computation
The SAMPLEMASK semantic should only return the bits set covered by the
current invocation. However we were always retrieving the covmask, which
returns the covered samples of the whole pixel.

When not doing per-sample invocation, this is precisely what we want.
However when doing per-sample invocation, we have to select the
sampleid'th bit and only return that. Furthermore, this means that we
have to have a 1:1 correlation for invocations and samples.

This fixes most

dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*

tests. A few failures remain due to disagreements about nr_samples==1
logic as well as what happens with MSAA x2 RTs when the shading fraction
is 0.5.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-11 20:39:27 -04:00
Ilia Mirkin
f5fe903002 nv50/ir: generalize interp fixups to be able to fixup anything
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-11 20:39:26 -04:00
Jason Ekstrand
66a442687f .mailmap: Update the e-mail addresses for Kristian Høgsberg
This changes it to use his personal e-mail and adds his @intel.com address

Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-11 12:32:14 -07:00
Jason Ekstrand
7e759fbd60 .mailmap: Use Connor Abbott's personal e-mail 2016-05-11 12:27:15 -07:00
Giuseppe Bilotta
9c3392cb3a Add .mailmap
This adds a first tentative .mailmap file, to canonicize contributor
name/emails in shortlogs and other statistical endeavours.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-11 12:21:46 -07:00
Jason Ekstrand
f1dcc7976a i965: Stop splitting fma() prior to optimization
According to the GLSL spec, if the user uses the fma() intrinsic to
generate a precise-consumed value, and you have it in your hardware, you
shouldn't split it.  For a while now, we've been splitting all ffma's
up-front and then planned to fuse them later which isn't valid.  Correctly
handling the GLSL behaviour fixes rendering corruptions in Tomb Raider.
The only reason why doing this possibly helped before was for ARB programs
which is handled by the previous commit.

Shader-db results on Haswell:

   total instructions in shared programs: 7560300 -> 7561510 (0.02%)
   instructions in affected programs: 56265 -> 57475 (2.15%)
   helped: 86
   HURT: 291

The only shaders in the database that are affected are from "Shadow of
Mordor" which is the first app in our database to use fma().  We could, at
some point in the future, split inexact ffma opcodes which would fix the
shader-db regressions since Shadow of Mordor doesn't ues precise.  However,
this fixes a bug now and and the shader-db impact is fairly small.

Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-11 11:44:35 -07:00
Jason Ekstrand
47f01e538a ptn: Emit mul+add for MAD
Unlike fma() in GLSL, MAD in ARB programs is 100% splittable.  Just emit
the split version and let the optimizer fuse them later.

Shader-db results on Haswell:

   total instructions in shared programs: 7560379 -> 7560300 (-0.00%)
   instructions in affected programs: 143928 -> 143849 (-0.05%)
   helped: 443
   HURT: 250

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-11 11:44:35 -07:00
Jason Ekstrand
1b72c31e1f nir/algebraic: Separate ffma lowering from fusing
The i965 driver has its own pass for fusing mul+add combinations that's
much smarter than what nir_opt_algebraic can do so we don't want to get the
nir_opt_algebraic one just because we didn't set lower_ffma.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-11 11:44:35 -07:00
Rob Clark
5886d1bad1 anv: fix build break
Previous rename of lower-output-to-temps pass predated merging of anv,
and apparently vulkan wasn't enabled in my local builds so overlooked
this when rebasing.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-11 14:03:24 -04:00
Rob Clark
697382eb61 mesa/st: split the type_size calculation into it's own file
We'll want to re-use this for NIR.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-11 12:20:12 -04:00
Rob Clark
0e5a369879 glsl: export accessor for builtin-uniform descriptors
We'll need this for a nir pass to lower builtin-uniform access.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-11 12:20:12 -04:00
Rob Clark
dfbabc6bad nir/lower-io: add support for lowering inputs
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-11 12:20:11 -04:00
Rob Clark
595f9d5476 nir/lower-io: split out some helper fxns
Prep work to reduce the noise in the next patch.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-11 12:20:11 -04:00
Rob Clark
b085016f94 nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries
Since it will gain support to lower inputs, give it a more generic name.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-11 12:20:11 -04:00
Rob Clark
47fcef9a20 nir: move callsite of lower_outputs_to_temporaries
Going to convert this pass to parameterized lower_io_to_temporaries, and
we want the user to be able to specify whether to lower outputs or
inputs or both.  The restriction of running this pass before validate
to avoid output reads no longer applies.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-11 12:20:11 -04:00
Rob Clark
5261947260 nir: lower-io-types pass
A pass to lower complex (struct/array/mat) inputs/outputs to primitive
types.  This allows, for example, linking that removes unused components
of a larger type which is not indirectly accessed.

In the near term, it is needed for gallium (mesa/st) support for NIR,
since only used components of a type are assigned VBO slots, and we
otherwise have no way to represent that to the driver backend.  But it
should be useful for doing shader linking in NIR.

v2: use glsl_count_attribute_slots() rather than passing a type_size
    fxn pointer

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-11 12:20:11 -04:00
Rob Clark
b10cc24519 nir: passthrough-edgeflags support
Handled by tgsi_emulate for glsl->tgsi case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-11 12:20:11 -04:00
Rob Clark
3a939d034e nir: add lowering pass for glBitmap
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-11 12:20:11 -04:00
Rob Clark
12c18ce476 nir: add lowering pass for glDrawPixels
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-11 12:20:11 -04:00
Rob Clark
b26645a00f nir: add lowering pass for y-transform
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-11 12:20:11 -04:00
Rob Clark
e1d80f8603 gallium: add NIR as a possible IR
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-11 12:20:11 -04:00
Rob Clark
425dc4c4b3 gallium: refactor pipe_shader_state to support multiple IR's
The goal is to allow the pipe driver to request something other than
TGSI, but detect whether what is getting is TGSI vs what it requested.
The pipe drivers will always have to support TGSI (and convert that into
whatever it is that they prefer), but in some cases we should be able to
skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir).

I think pipe_compute_state should get similar treatment.  Currently,
afaict, it has one user and one consumer, which has allowed it to be
sloppy wrt. supporting alternative IR's.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-11 12:20:11 -04:00
Rob Clark
4500d17245 freedreno: fix multi-layer transfer_map's
The use of transfer_inline_write() in TexSubImage path (see fb9fe352ea)
exposed a bug for "layer_first" resources (ie. a4xx) not setting correct
layer_stride.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-11 12:03:21 -04:00
Juan A. Suarez Romero
9bea018994 glsl: use var with initializer on global var validation
Currently, when cross validating global variables, all global variables
seen in the shaders that are part of a program are saved in a table.

When checking a variable this already exist in the table, we check both
are initialized to the same value. If the already saved variable does
not have an initializer, we copy it from the new variable.

Unfortunately this is wrong, as we are modifying something it is
constant. Also, if this modified variable is used in
another program, it will keep the initializer, when it should have none.

Instead of copying the initializer, this commit replaces the old
variable with the new one. So if we see again the same variable with an
initializer, we can compare if both are the same or not.

v2: convert tabs in whitespaces (Kenenth Graunke)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-11 13:50:04 +02:00
Jordan Justen
2c1c060b03 util/ralloc: Remove double zero'ing of rzalloc buffers
Juha-Pekka found this back in May 2015:
<1430915727-28677-1-git-send-email-juhapekka.heikkila@gmail.com>

From the discussion, obviously it would be preferable to make
ralloc_size no longer return zeroed memory, but Juha-Pekka found that
it would break Mesa.

In <56AF1C57.2030904@gmail.com>, Juha-Pekka mentioned that patches
exist to fix i965 when ralloc_size is fixed to not zero memory, but
the patches have not made their way to mesa-dev yet.

For now, let's stop doing the double zeroing of rzalloc buffers.

v2:
 * Move ralloc_size code to rzalloc_size, and add a comment as
   suggested by Ken.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 22:54:46 -07:00
Jonathan Gray
e3d43dc5ea genxml: avoid using a GNU make pattern rule
% pattern rules are a GNU extension.  Convert the use of one to a
inference rule to allow this to build on OpenBSD.

v2: inference rules can't have additional prerequisites so add a target
rule to still depend on gen_pack_header.py

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-10 20:54:33 -07:00
Roland Scheidegger
430797843a gallivm: improve dumping of bitcode
Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode.
Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple
modules can be dumped (albeit it might still overwrite previous modules,
particularly the modules from draw tend to always have the same name).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-11 04:43:35 +02:00
Vinson Lee
8d639138c7 swr: [rasterizer] Include cmath for std::isnan and std::isinf.
This patch fixes this build error.

  CXX      rasterizer/memory/libswrAVX_la-ClearTile.lo
In file included from rasterizer/memory/ClearTile.cpp:34:0:
./rasterizer/memory/Convert.h: In function ‘uint16_t Convert32To16Float(float)’:
./rasterizer/memory/Convert.h:170:9: error: ‘__builtin_isnan’ is not a member of ‘std’
     if (std::isnan(val))
         ^
./rasterizer/memory/Convert.h:170:9: note: suggested alternative:
<built-in>: note:   ‘__builtin_isnan’
./rasterizer/memory/Convert.h:176:14: error: ‘__builtin_isinf_sign’ is not a member of ‘std’
     else if (std::isinf(val))
              ^
./rasterizer/memory/Convert.h:176:14: note: suggested alternative:
<built-in>: note:   ‘__builtin_isinf_sign’

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95180
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-05-10 17:11:05 -07:00
Jason Ekstrand
a5660bf1f8 i965/blorp: Don't blend integer values during MSAA resolves
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-10 15:32:00 -07:00
Jason Ekstrand
4f4f393bf3 meta/blit: Don't blend integer values during MSAA resolves
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-10 15:31:50 -07:00
Jason Ekstrand
203c786a73 i965/fs: Default all constants to a location of -1
Otherwise constants which aren't live get an undefined constant location.
When we go to set up param and pull_param we end up assigning all unused
uniforms to slot 0.  This cases the Vulkan driver to segfault because it
doesn't have pull_param.

This fixes bugs in the Vulkan driver introduced in c3fab3d000.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2016-05-10 15:25:30 -07:00
Dave Airlie
d36d11ad90 st/glsl_to_tgsi: attach image to correct instruction for samples
This fixes a crash (but not the test):
GL45-CTS.shader_texture_image_samples_tests.functional_test

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-11 06:55:09 +10:00
Dave Airlie
07df3b81ff mesa: move MESA_MAP_NOWAIT_BIT up away from GL_MAP_PERSISTENT_BIT
This was colliding badly and making
GL45-CTS.buffer_storage.map_persistent_texture
fail on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-11 06:54:56 +10:00
Dave Airlie
b230d51a18 mesa/meta: check for signed/unsigned int conversion for pbo getteximage
When doing GetTexSubImage using a PBO, we should check if it involves
a signed/unsigned conversion and bail if it does, just like in the
other cases.

This fixes:
GL33-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo
on Haswell at least.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95324
Reviewed-by: Matt Turer <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-11 06:52:20 +10:00
Matt Turner
8bb156a261 i965: Handle BRW_OPCODE_DO on Gen6+ in brw_instruction_name().
This became a problem after the recent disassembler changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 12:12:46 -07:00
Bas Nieuwenhuizen
3d21720d31 radeonsi: Set declared tessellation LDS size to hardware size.
The calculated limit gave problems on SI as it was > 32 KiB
and the hardware LDS size on SI is only 32 KiB. It isn't
correct anyway when processing multiple patches in a threadgroup.

As we potentially have any number of patches such that the
used LDS is at most the hardware LDS size, and exact size
per patch is not known at compile time, this seems like
the only valid bound.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-10 20:14:55 +02:00
Rob Clark
8623e599fc freedreno/ir3: size input/output arrays properly
We index into these based on var->data.driver_location, which might have
gaps (ie. two inputs, one w/ drvloc 0 and other 2).  This shows up in
(for example) 'bin/copyteximage 1D', but was only noticed recently due
to additional asserts.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-10 13:17:27 -04:00
Ian Romanick
2483a9a08c ir_to_mesa: Emit smarter ir_binop_logic_or for vertex programs
Continue using ADD in the other case because a fragment shader backend
could fuse the ADD with a MUL to generate a MAD for ((x && y) || z).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-10 09:22:18 -07:00
Ian Romanick
f7328f9afd prog: Delete all remains of OPCODE_SNE, OPCODE_SEQ, OPCODE_SGT, and OPCODE_SLE
There is nothing left that can generate them.  These used to be
generated by ir_to_mesa or by the assembler for various NV extensions
that have been removed.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-10 09:22:18 -07:00
Ian Romanick
fd63e77998 ir_to_mesa: Do not emit OPCODE_SEQ or OPCODE_SNE
Nothing that consumes the output of this backend consumes them
navtively.  This is *not* the way i915 has implemented these
instructions, but, as far as I am able to tell, this is the way both the
Cg compiler and the HLSL compiler implement these operations.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-10 09:22:18 -07:00
Ian Romanick
15e6a1a3be ir_to_mesa: Do not emit OPCODE_SLE or OPCODE_SGT
Nothing that consumes the output of this backend consumes them
navtively.  This is the way i915 has implemented these instructions
since it began consuming GLSL.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-10 09:22:18 -07:00
Samuel Pitoiset
e46ac18ebe nvc0: enable compute support by default on GK110+
Compute support seems to be pretty stable now, and according to piglit
it doesn't seem to break 3D state.

As a side effect, this will expose ARB_compute_shader on GK110/GK208.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-10 17:47:01 +02:00
Marek Olšák
2b58bc4461 gallium/radeon: don't flush the GFX IB if DMA doesn't depend on it
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
fb89f06698 radeonsi: consolidate radeon_add_to_buffer_list calls for DMA
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
60946c0d60 gallium/radeon: add a heuristic for better (S)DMA performance
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
bb74152597 gallium/radeon: flush if DMA IB memory usage is too high
This prevents IB rejections due to insane memory usage from
many concecutive texture uploads.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
70934de00e radeonsi: add new SDMA texture copy code
This implements:
- Linear-to-linear partial copies. (unaligned)
- Tiled-to-linear and linear-to-tiled partial copies.
  (unaligned except 1-2 Bpp)
- Tiled-to-tiled partial copies aligned to 8x8.

v2: Extend the SDMA L2T VM fault workaround to T2L.
    - Same algorithm, just applied to T2L.
      (and using a 0-based address and surface.bo_size instead of buf->size)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
a512da36ae gallium/radeon: fix (S)DMA read-after-write hazards
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
f837c37f02 radeonsi: raise the max size for SDMA buffer copies
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
faa4f0191d radeonsi: remove SDMA texture copy code
Most of this has never worked according to the new test.

The new code will be radically different.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
498a40cae8 radeonsi: only expose *_init_*dma_functions from (S)DMA files
just normalizing the interfaces

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
3af28e558f gallium/radeon: implement randomized SDMA texture copy testing (v2)
v2: - adjustments for exercising all important SDMA code paths
    - decrease the probability of getting huge sizes (faster testing)
    - increase the probability of getting power-of-two dimensions
    - change the memory cap to 128MB (faster testing)
    - better detect which engine has been used

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
f475c9fb07 gallium/radeon: discard CMASK or DCC if overwriting a whole texture by DMA
v2: simplify the conditionals

Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
2f173b8e13 gallium/radeon: use a common function for DMA blit preparation
this is more robust and probably fixes some bugs already

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
2af4b637d8 gallium/radeon: split out code for discarding DCC
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
c85d0c17d9 gallium/radeon: rename r600_texture_disable_cmask -> discard_cmask
because it doesn't decompress

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
fb9fe352ea st/mesa: use transfer_inline_write for memcpy TexSubImage path
This allows drivers to use their own fast path for texture uploads.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
871d2aff24 gallium/radeon: fix partial layered transfers of cube (array) textures
a staging cube texture with array_size % 6 != 0 doesn't work very well

just use 2D_ARRAY or 2D for all staging textures

Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
c2377b394b gallium/radeon: align alignments for better buffer reuse
It's for the buffer cache.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
544967faf5 gallium/radeon: use gart_page_size instead of hardcoded 4096
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
bfa8a00920 winsys/radeon: use gart_page_size instead of private size_align
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Marek Olšák
9d8c283f28 winsys/amdgpu: move gart_page_size to struct radeon_winsys
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-10 17:20:09 +02:00
Roland Scheidegger
e4cf8717de gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir
Those aren't really interesting, however outputting them is helpful when
trying to feed the IR to llvm llc (or opt) for debugging.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-10 17:08:16 +02:00
Roland Scheidegger
5c200894c8 gallivm: use InternalLinkage instead of PrivateLinkage for texture functions
At least with MCJIT the disassembler will crash otherwise when trying to
disassemble such functions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-10 17:08:16 +02:00
Roland Scheidegger
8b66e2647d gallivm: disable avx512 features
We don't target this yet, and some llvm versions incorrectly enable it based
on cpu string, causing crashes.
(Albeit this is a losing battle, it is pretty much guaranteed when the next
new feature comes along llvm will mistakenly enable it on some future cpu,
thus we would have to proactively disable all new features as llvm adds them.)

This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested)

Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com

CC: <mesa-stable@lists.freedesktop.org>
2016-05-10 17:08:16 +02:00
Jose Fonseca
94e8653a3b Revert "nir: Try to warn when C99 extensions are used in nir headers."
This reverts commit 99474dc29b.

-Wpedantic is too verbose, even when applied to just a few includes.

We'll just have to deal with the issues as they come.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-10 03:29:24 -07:00
Samuel Iglesias Gonsálvez
4c9006f957 i965/fs: fix MOV_INDIRECT exec_size for doubles
In that case, the writes need two times the size of a 32-bit value.
We need to adjust the exec_size, so it is not breaking any hardware
rule.

v2:
  - Add an assert to verify type size is not less than 4 bytes (Jordan).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez
75ada43a3a i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT
v2:
- Fix assert's line width (Topi).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez
03687ab77f i965/fs: demote_pull_constants() did not take into account double types
The constants could be double, and it was allocating size for float types
for the destination register of varying pull constant loads.

Then the fs_visitor::validate() will complain.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez
c3fab3d000 i965/fs: push first double-based uniforms in push constant buffer
When there is a mix of definitions of uniforms with 32-bit or 64-bit
data type sizes, the driver ends up doing misaligned access to double
based variables in the push constant buffer.

To fix this, this patch pushes first all the 64-bit variables and
then the rest. Then, all the variables would be aligned to
its data type size.

v2:
- Fix typo and improve comment (Jordan).
- Use ralloc(NULL,...) instead of rzalloc(mem_ctx,...) (Jordan).
- Fix typo (Topi).
- Use pointers instead of references in set_push_pull_constant_loc() (Topi).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-05-10 11:25:09 +02:00
Iago Toral Quiroga
193cb67a84 i965/fs: recognize writes with a subreg_offset > 0 as partial
Usually, writes to a subreg_offset > 0 would also have a stride > 1
and we would recognize them as partial, however, there is one case
where this does not happen, that is when we generate code for 64-bit
imemdiates in gen7, where we produce something like this:

mov(8) vgrf10:UD, <low 32-bit>
mov(8) vgrf10+0.4:UD, <high 32-bit>

and then we use the result with a stride of 0, as in:

mov(8) vgrf13:DF, vgrf10<0>:DF

Although we could try to avoid this issue by producing different code
for this by using writes with a stride of 2, that runs into other
problems affecting gen7 and the fact is that any instruction that
writes to a subreg_offset > 0 is a partial write so we should really
recognize them as such.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:09 +02:00
Iago Toral Quiroga
34ed61b334 i965/fs/lower_simd_width: Fix registers written for split instructions
When the original instruction had a stride > 1, the combined registers
written by the split instructions won't amount to the same register space
written by the original instruction because the split instructions will
use a stride of 1. The current code assumed otherwise and computed the
number of registers written by split instructions as an equal share based
on the relation between the lowered width and the original execution size
of the instruction.

It is only after the split, when we interleave the components of the result
from the lowered instructions back into the original dst register, that the
original stride takes effect and we write all the registers specified by
the original instruction.

Just make the number of register written the same as the vgrf space we
allocate for the dst of the split instruction.

Fixes crashes in fp64 tests produced as a result of assigning incorrectly the
number of registers written by split instructions, which led to incorrect
validation of the size of the writes against the allocated vgrf space.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:09 +02:00
Iago Toral Quiroga
9741cff1ec i965/fs: rename our lower_d2f pass to lower_d2x
Since it no longer handles conversions from double to float but from
double to various other 32-bit types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:09 +02:00
Iago Toral Quiroga
efaf62a40a i965/fs: implement i2d and u2d
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Iago Toral Quiroga
c63a6f2149 i965/fs: implement d2i and d2u
These need the same treatment as d2f, so generalize our d2f lowering to cover
these too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Iago Toral Quiroga
e0c45182e3 i965/fs: implement d2b
v2: Use subscript() instead of stride() (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Iago Toral Quiroga
80f60a4302 i965/fs: implement fsign() for doubles
v2 (Sam):
  - Fix indentation (Kenneth)
  - Simplify code (Kenneth)

v3: Use subscript() instead of stride() (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Iago Toral Quiroga
c9ecd651e6 i965/fs: add null_reg_df
Probably not needed since we fix the dst type of comparisons
automatically, but for consistency with the rest of null_reg_*
functions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Iago Toral Quiroga
e8a8fc9563 i965/fs: We only support 32-bit integer ALU operations for now
Add asserts so we remember to address this when we enable 64-bit
integer support, as suggested by Connor and Jason.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Iago Toral Quiroga
9e5ce151a4 i965/fs: handle fp64 opcodes in brw_do_channel_expressions
In the case of the pack opcode we are already doing the
lowering in NIR, so no need to do it here. The unpack opcode
operates on scalars, so it should not be lowered.

In the case of frexp_sig and frexp_exp, they are lowered in
lower_instructions, so we don't have to care about them.

All the remaining opcodes involve conversions from and to doubles
and are business as usual.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Connor Abbott
a644b0939d i965/fs: add support for f2d and d2f
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Connor Abbott
9e1b3ea199 i965/fs: add a pass for legalizing d2f
We need to do this late, in order to avoid partial writes during the
optimization loop.

v2: Use subscript() instead of stride().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:08 +02:00
Connor Abbott
2286a74e3b i965/fs: fix dst width calculation in CSE
v2 (Sam):
- Fix line width (Topi).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:08 +02:00
Connor Abbott
fccd15524f i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
v2: Account for the stride of the dst (Iago)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:07 +02:00
Connor Abbott
6b6d68ae07 i965/fs: fix is_copy_payload() for doubles
v2 (Sam):
- LOAD_PAYLOAD treats each header source as a 32B block
  regardless of the datatype. Drop the change (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:07 +02:00
Connor Abbott
e83f51d54e i965/fs: fix compares for doubles
The destination has to have the same source as the type, or else the
simulator will complain. As a result, we need to emit a CMP that
outputs a 64-bit wide result and then do a strided MOV to pick out the
low 32 bits of each channel.

v2: Use subscript() instead of stride() (Curro)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:07 +02:00
Connor Abbott
a5d7e144ea i965/fs: extend exec_size halving in the generator
The HW has a restriction that only vertical stride may cross register
boundaries. Previously, this only mattered for SIMD16 instructions where
we needed to use the same regioning parameters as the equivalent SIMD8
instruction but double the exec size. But we need to do the same
splitting for 64-bit instructions as well as instructions with a stride
of 2 (which effectively consume 64 bits per element). Fix up the code to
do the right thing instead of special-casing SIMD16.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:07 +02:00
Connor Abbott
4f3888c1ca i965/fs: fix assign_constant_locations() for doubles
Uniform doubles will read two registers, in which case we need to mark
both as being live.

v2 (Sam):
  - Use a formula to get the number of registers read with proper
    units (Curro).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:07 +02:00
Connor Abbott
cc64c9e441 i965/fs: use byte_offset() in offset() for uniforms
This makes things more consistent, and also fixes the offset calculation
for double uniforms.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:07 +02:00
Connor Abbott
fe949949a9 i965/fs: handle uniforms in byte_offset()
v2: Do it only for uniforms (Iago)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:06 +02:00
Connor Abbott
1f51aada3f i965/fs: fix type_size() for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:06 +02:00
Iago Toral Quiroga
935e0e305d i965/fs: optimize unpack double
When we are actually unpacking from a double that we have previously
packed from its 32-bit components we can bypass the pack operation
and source from its arguments directly.

v2 (Sam):
- Fix line overflow (Topi)
- Bail if the parent instruction's source is not SSA (Connor)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:06 +02:00
Iago Toral Quiroga
ba1907f040 i965/fs: optimize pack double
When we are actually creating a double using values obtained from a
previous unpack operation we can bypass the unpack and source from
the original double value directly.

v2:
- Style changes (Topi)
- Bail is parent instruction's src is not SSA (Connor)

v3: Use subscript() instead of stride() (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:06 +02:00
Connor Abbott
7782f39e75 i965/fs/nir: translate double pack/unpack
v2 (Sam):
- Fix line overflow (Topi).

v3: Use subscript() instead of stride() (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:06 +02:00
Connor Abbott
fd763177c1 i965/fs: add a pass for lowering PACK opcodes
v2: Use subscript() instead of stride() (Curro)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:06 +02:00
Connor Abbott
ba582e58cd i965/fs: add PACK opcode
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:05 +02:00
Francisco Jerez
cc3bae5cd7 i965/fs: Introduce helper to extract a field from each channel of a register.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-05-10 11:25:05 +02:00
Connor Abbott
d17cdacba3 i965/fs: always pass the bitsize to brw_type_for_nir_type()
v2 (Sam):
- Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float()

v3 (Sam):
- Fix line width (Topi).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:05 +02:00
Connor Abbott
a308bae58f i965/fs: add support for printing double immediates
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:05 +02:00
Connor Abbott
0f2e227d5c i965/fs: don't propagate 64-bit immediates
They can only be used with 1-src instructions, which practically (since
we should've constant-propagated away all 1-src instructions with 64-bit
immediates in NIR) means that they must be kept in separate MOV's and
can't be propagated.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:05 +02:00
Connor Abbott
0f1690fd95 i965/fs: use the NIR bit size when creating registers
v2 (Iago):
  - Squashed bits from 'support double precission constant operands for
    the implementation of 64-bit emit_load_const'.
  - Do not use BRW_REGISTER_TYPE_D for all 32-bit registers since that breaks
    asserts and functionality for some piglit tests. Just keep 32-bit types
    untouched and add 64-bit support.
  - Use DF instead of Q for 64-bit registers. Otherwise the code we generate
    will use Q sometimes and DF others and we hit unwanted DF/Q conversions,
    so always use DF.

v3 (Sam):
  - Mark 'reg_type' occurrences as const (Topi).

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Tapani Palli <tapani.palli@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:04 +02:00
Connor Abbott
76de7af8e2 i965: fixup uniform setup for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:04 +02:00
Iago Toral Quiroga
3210870b34 i965: two-argument instructions can only use 32-bit immediates
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:04 +02:00
Iago Toral Quiroga
3d10adf603 i965: fix brw_abs_immediate() for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:04 +02:00
Iago Toral Quiroga
830d87840c i965: fix brw_saturate_immediate() for doubles
v2 (Sam):
  - Mark 'size' as const (Topi).
  - Add comment to explain that we do copies 64-bits regardless of the
    type (Topi)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-10 11:25:03 +02:00
Connor Abbott
7bcc4cccad i965: fix is_zero(), is_one() and is_negative_one() for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:03 +02:00
Connor Abbott
2ae409286c i965: fix brw_negate_immediate() for doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:03 +02:00
Connor Abbott
cbf7c7f099 i965/eu: add support for DF immediates
v2 (Sam):
  - Remove 'however' from the comment (Topi)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:03 +02:00
Connor Abbott
c0a1cd24a8 i965: add support for disassembling DF immediates
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:03 +02:00
Connor Abbott
bb175db16b i965: add support for getting/setting DF immediates
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:03 +02:00
Connor Abbott
5310bca024 i965: add brw_imm_df
v2 (Iago)
  - Fixup accessibility in backend_reg

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:02 +02:00
Topi Pohjolainen
9add73f641 i965/eu: Allow 3-src float ops with doubles
v2:
  - set 3src_src_type for BRW_REGISTER_TYPE_DF (Connor)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:02 +02:00
Connor Abbott
367e762a71 i965/disasm: fix disasm of 3-src doubles
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:02 +02:00
Topi Pohjolainen
45066a6a59 i965: Tell backend register about double precision type
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Tapani P\344lli <tapani.palli@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:02 +02:00
Topi Pohjolainen
520b3b2fd1 i965: Determine size of double precision float register
This is used to determine how many registers an instruction reads and
writes as well as for offseting register region into a desired component.

v2 (Connor): rebase on master

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Tapani P\344lli <tapani.palli@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:02 +02:00
Topi Pohjolainen
e88cf0f2d2 i965: Lower DFRACEXP/DLDEXP
v2 (Connor): rebase on master which moved this to brw_link.cpp
v3 (Sam):
- Only enable DFREXP_DLDEXP_TO_ARITH in process_glsl_ir(). This is
used for doubles. Single floating point op is lowered by NIR.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:02 +02:00
Connor Abbott
30424fd25a i965: use pack/unpackDouble lowering
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:01 +02:00
Connor Abbott
bea2f8beb5 i965: use double lowering pass
v2: also lower trunc, ceil, floor, fract and roundEven (Iago)
v3: also lower mod for doubles (Sam)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez
d00a239b28 freedreno/ir3: lower lrp when operating with double operands
Lower lrp when operating with double operands because float version of
lrp is also lowered.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez
93e690830a i965: enable lrp lowering for doubles
Broadwell and previous generations does not support lrp instruction
operating with doubles.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-10 11:25:01 +02:00
Dave Airlie
008feb3687 st/glsl_to_tgsi: brown paper bag for the input offsets fix.
Oops, thanks compiler.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-10 14:41:21 +10:00
Dave Airlie
4d8a71f7f1 glsl: check geometry output vertices limits.
This fixes:
GL45-CTS.geometry_shader.limits.max_output_vertices

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-10 14:26:03 +10:00
Dave Airlie
13c68e1447 mesa/vbo: fix check for zero aliases with 2/10/10/10
This fixes:
GL33-CTS.gtf33.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_attrib

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-10 14:24:49 +10:00
Eduardo Lima Mitev
60a5d02416 nir/print: Print memory qualifiers in a variable declaration
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-10 06:22:05 +02:00
Eduardo Lima Mitev
7f7f58f17f glsl: Apply memory qualifiers to vars inside named block interfaces
This is missing and memory qualifiers are currently being ignored for SSBOs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-10 06:21:55 +02:00
Dave Airlie
f75a26d1ba st/glsl_to_tgsi: handle offsets from inputs
This fixes:
GL45-CTS.gpu_shader5.texture_gather_offset_color_repeat

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-10 13:14:29 +10:00
Rob Clark
aa730aca20 scripts: bump git_reviewer.pl --git-min-percent default
Bump up default percentage of commits required to be auto-picked for CC.
Seems from a bit of trial-and-error to come up with a more reasonable
list of CC's this way.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-09 19:30:28 -04:00
Kenneth Graunke
e034d80fe1 Revert "Revert "i965: Switch to scalar TCS by default.""
This reverts commit bd326c229c.

Now that we've fixed the GPU hangs, let's turn it back on.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-09 16:20:27 -07:00
Kenneth Graunke
5ce405ba0f i965: Actually assign binding table offsets for the TCS.
As far as I can tell, this was just entirely missing...honestly, I'm
not sure how anything worked at all.

Caught by noticing GPU hangs in image load store tests with scalar TCS,
but probably has broader implications.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-09 16:20:18 -07:00
Kenneth Graunke
e0e7280db0 i965: Clamp "Maximum VP Index" to 1 when gl_ViewportIndex isn't written.
fs_visitor::emit_urb_writes skips writing the VUE header for shaders
that don't write gl_PointSize, gl_Layer, or gl_ViewportIndex.  This
leaves their values uninitialized.  Kristian's nearby comment says:

"But often none of the special varyings that live there are written
 and in that case we can skip writing to the vue header, provided the
 corresponding state properly clamps the values further down the
 pipeline."

However, we were clamping gl_ViewportIndex to [0, 15], so we would end
up using a random viewport.  To fix this, detect when the shader doesn't
write gl_ViewportIndex, and clamp it to [0, 0].

The vec4 backend always writes zeros to the VUE header, so it doesn't
suffer from this problem.  With vec4-style HWord writes, we can write
the header and position together in a single message.  In the FS world,
we would need 4 extra MOVs of 0 and a longer message, or a separate
OWord write.  It's likely cheaper to just clamp the value.

Fixes DiRT Showdown and Bioshock Infinite, which only rendered half of
the screen - the lower left of two triangles.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93054
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-09 15:31:27 -07:00
Jordan Justen
e74812dbfe i965/hsw: Fix brw_store_data_imm*
For Gen6 through Haswell dword 1 is MBZ. In gen 8 it becomes part of
the 64-bit address.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-09 15:05:08 -07:00
Kenneth Graunke
96d43f2d08 i965: Reimplement ARB_transform_feedback2 on Haswell and later.
My old implementation accumulated <start, end> pairs in a buffer,
and eventually processed that data on the CPU.  This meant flushing
the batchbuffer and waiting for it to completely execute before we
could map it, resulting in really long stalls.  We could also run out
of space in the buffer, and have to do this early.

Instead, we can use Haswell's MI_MATH command to do the (end - start)
subtraction, as well as the multiplication by 2 or 3 to convert from
the number of primitives written to the number of vertices written.
We still need to CS stall to read the counters, but otherwise everything
is completely pipelined - there's no CPU<->GPU synchronization required.
It also uses only 80 bytes in the buffer, no matter what.

Improves performance in Manhattan on Skylake GT3e at 800x600 by
6.1086% +/- 0.954166% (n=9).  At 1920x1080, improves performance
by 2.82103% +/- 0.148596% (n=84).

v2: Fix number of primitives -> number of vertices calculation for
    GL_TRIANGLES (I was multiplying by 4 instead of 3.)  Caught by
    Jordan Justen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-09 15:00:01 -07:00
Kenneth Graunke
fdb6c1887f i965: Add a brw_load_register_reg64 helper.
It appears that we can't do this in a single command (like we do for
MI_LOAD_REGISTER_IMM) - the Skylake simulator gets rather grumpy about
the command length if I try to combine them.  No matter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-09 15:00:01 -07:00
Kenneth Graunke
4c71c8a74a i965: Only enable ARB_query_buffer_object for newer kernels on Haswell.
On Haswell, we need version 6 of the kernel command parser in order to
write the math registers.  Our implementation of ARB_query_buffer_object
heavily relies on MI_MATH, so we should only advertise it when MI_MATH
is available.  We also need MI_LOAD_REGISTER_REG, which requires version
7 of the command parser.

To make these checks easier, introduce a screen->has_mi_math_and_lrr
flag that will be set when both commands are supported.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-09 14:59:58 -07:00
Dave Airlie
2d41eb313f mesa/objectlabel: don't return info on genned but never bound textures.
This fixes some cases in the CTS KHR debug tests where it uses
glIsTexture to find an invalid ID and then call GetObjectLabel.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-10 06:06:09 +10:00
Dave Airlie
bbc6a27590 mesa: don't use genned but unnamed xfb objects.
If we try to draw or query an XFB object that hasn't been bound,
we shouldn't return any information.

This fixes a couple if cases in:
GL33-CTS.transform_feedback.api_errors_test

The ObjectLabel test is inspired by another test.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-10 06:06:09 +10:00
Samuel Pitoiset
eafe3905d9 nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_*
We don't need them for compute shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-09 21:58:56 +02:00
Jordan Justen
2e2aa992ff mesa/compute: Fix indirect dispatch buffer size check on 32-bit systems
2655265fcb, but for compute.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-09 11:16:39 -07:00
Rob Clark
57763ee735 freedreno/ir3: fix fallout from new block iterators
Since this is potentially modifying the block structure of the shader,
it needs the _safe() version of the iterator.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-09 13:52:29 -04:00
Nicolai Hähnle
fe102f7677 radeonsi: workaround for tesselation on SI
We request more than 32KB of LDS here, which SI doesn't have. Since LLVM
recently started checking the size of declared LDS allocations, all shaders
involved in tesselation fail to compile on SI.

Note that the entire calculation here seems wrong, given how we calculate
indices for generic attributes, so the number ends up wrong on CI+ as well.
A proper solution is clearly needed, but this patch should serve as a band-aid
for SI in the meantime.

Also note that the real size of the LDS allocation in hardware is independent
from what we tell LLVM, so this is really more of a "cosmetic" change.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-09 11:52:46 -05:00
Nicolai Hähnle
d8f3e8e626 radeonsi: always allocate export memory for pixel shaders
Experiments with framebuffer-no-attachments type draw calls have shown that
NULL exports stall terribly unless we ensure that export memory is allocated
by the SPI.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-09 11:52:46 -05:00
Nicolai Hähnle
ad1782cfb5 radeonsi: expose performance counters as 64 bit
This is useful for shader-related counters, since they tend to quickly
exceed 32 bits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-09 11:52:46 -05:00
Rob Clark
f096096b77 nir/search: fix typo
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-09 12:46:24 -04:00
Tim Rowley
b65f7ec450 gallium: enable intel jitevents profiling
LLVM when configured with "intel jitevents" enabled can inform
VTune about dynamic code, so individual shaders are attributed
profiling data and the resulting assembly can be examined.

Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-05-09 11:25:02 -05:00
Bruce Cherniak
0062c5f09b swr: Add missing break in query switch statement.
Missed a switch break in query stat collection when refactoring queries.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-05-09 11:21:47 -05:00
Rob Clark
f33083a216 freedreno/ir3: allow for additional VS sysval inputs
There are a total of four possible currently, rather than 2.  So we need
to be prepared for the input array to grow by 16 components.  We could
get away with less if we could pack sysval inputs..  and the way this is
handled currently isn't really the nicest thing.  But it's a tactical
fix for an issue hit in:

GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-09 11:51:59 -04:00
Emil Velikov
a0d9279e3b docs: add news item and link release notes for 11.1.4/11.2.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-09 14:28:20 +01:00
Emil Velikov
0c5752b672 docs: add sha256 checksums for 11.2.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-09 14:25:08 +01:00
Emil Velikov
f746aa348e docs: add release notes for 11.2.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-09 14:25:07 +01:00
Emil Velikov
596c881162 docs: add sha256 checksums for 11.1.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-09 14:25:04 +01:00
Emil Velikov
f93d8a885c docs: add release notes for 11.1.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-09 14:25:02 +01:00
Jose Fonseca
c521f2d737 scons: Improve Python module dependency discovery.
Several NIR scripts were using `from ... import ...` syntax, which wasn't
supported.

Using Python standard libary's modulefinder solves the problem with less
effort and hacks.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-09 14:19:24 +01:00
Marek Olšák
172bfdaa9e r300g: add support for PIPE_FORMAT_x8R8G8B8_*
And set endian swap for packed formats the way it should be done
in theory.

This allows big endian to work again, but it can still be buggy.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789

Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-09 13:11:40 +02:00
Daniel Stone
e54b2e902a Revert "i965: Always use Y-tiled buffers on SKL+"
This commit broke Weston, Mutter, and xf86-video-modesetting, on KMS.

In order to use Y-tiled buffers, the kernel requires the tiling mode to
be explicitly named through the I915_FORMAT_MOD_Y_TILED AddFB2 modifier;
it disallows any attempt to infer the buffer's tiling mode.

As the GBM API does not have a way to extract modifiers for a buffer,
this commit broke all users of GBM on SKL+. Revert it for now, until we
get a way to extract modifier information from GBM, and also let GBM
users inform the implementation that it intends to use the modifiers.

This reverts commit 6a0d036483.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Hans de Goede <hdegoede@redhat.com>
2016-05-09 10:35:55 +01:00
Dave Airlie
920d78a32c mesa/shader_query: add missing subroutines cases
ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types.

Fixes:
GL43-CTS.program_interface_query.subroutines-vertex
GL43-CTS.program_interface_query.subroutines-tess-control
GL43-CTS.program_interface_query.subroutines-tess-eval
GL43-CTS.program_interface_query.subroutines-geometry
GL43-CTS.program_interface_query.subroutines-fragment
GL43-CTS.program_interface_query.subroutines-compute

Reviewed-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-09 06:30:52 +10:00
Kenneth Graunke
742bc53d04 spirv: Fix structure splitting with per-vertex interface arrays.
We want to use interface_type, not vtn_var->type.  They're normally
equivalent, but for geometry/tessellation per-vertex interface arrays,
we need to unwrap a level.

Otherwise, we tried to iterate a structure members but instead used
an array length.  If the array length was longer than the number of
fields in the structure, we'd crash.

Fixes the CreatePipelineGeometryInputBlockPositive layer validation
test.

v2: Just use glsl_without_array() on the vtn_var type
    (requested by Jason Ekstrand).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-07 15:44:41 -07:00
Kenneth Graunke
1896682d27 compiler: Add a C wrapper for glsl_type::without_array().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2016-05-07 15:44:41 -07:00
Nicolai Hähnle
b9e6e8e7d4 radeonsi: fix undefined behavior (memcpy arguments must be non-NULL)
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:59 -05:00
Nicolai Hähnle
146927ce7b radeonsi: fix some reported undefined left-shifts
One of these is an unsigned bitfield, which I suspect is a false positive, but
gcc 5.3.1 complains about it with -fsanitize=undefined.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:59 -05:00
Nicolai Hähnle
60d2fc233b gallium/radeon: clean left-shift undefined behavior
Shifting into the sign bit of a signed int is undefined behavior.
Unfortunately, there are potentially many places where this happens using
the register macros.

This commit is the result of running

sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g"

on all header files in gallium/{r600,radeon,radeonsi}.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:59 -05:00
Nicolai Hähnle
62b7958cd0 gallium: fix various undefined left shifts into sign bit
Funnily enough, some of these were turned into a compile-time error by gcc
with -fsanitize=undefined ("initializer is not a constant").

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:59 -05:00
Nicolai Hähnle
945c6887ab compiler/glsl: do not downcast list sentinel
This crashes gcc's undefined behaviour sanitizer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:46:58 -05:00
Nicolai Hähnle
bdad1393a0 mesa/main: fix another undefined left shift
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:45:04 -05:00
Nicolai Hähnle
3e1cf8bf3f mesa/main: define _NEW_xxx flags as unsigned shifts
Since 1 << 31 complains about undefined behaviour; the others are changed
only for consistency.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-07 16:44:33 -05:00
Bas Nieuwenhuizen
6291f19f71 radeonsi: Compute correct LDS size for fragment shaders.
No sure where the 36 came from, but we clearly need at least
48 bytes per attribute per primitive.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-06 21:40:17 +02:00
Eric Anholt
a1f698881e vc4: Add support for loading immediate values in QIR.
This will be used for resetting the uniform stream in the presence of
branching, but may also be useful as an optimization to reduce how many
uniforms we have to copy out per draw call (in exchange for increasing
icache pressure).
2016-05-06 10:25:55 -07:00
Eric Anholt
890dc19eeb vc4: Make vc4_qpu_validate() produce more verbose failures.
Seeing the expansion of a QPU_GET_FIELD in an assert isn't very
informative, and it's hard find what's going wrong without getting a dump
of the instruction that failed.
2016-05-06 10:25:55 -07:00
Eric Anholt
8e2d0843c0 vc4: Add a small QIR validate pass.
This has caught a couple of bugs during loop development so far, and I
should probably have written it long ago.
2016-05-06 10:25:55 -07:00
Eric Anholt
daaa9d579d vc4: Fix the src count on exp2/log2.
Found by the upcoming QIR validate pass.
2016-05-06 10:25:55 -07:00
Eric Anholt
d36b28402f vc4: Reuse QPU disasm's cond flags in QIR.
In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.
2016-05-06 10:25:55 -07:00
Eric Anholt
419fee92ee vc4: When emitting an instruction to an existing temp, mark it non-SSA.
Prevents a bug in the later control-flow support series.
2016-05-06 10:25:55 -07:00
Eric Anholt
1387e722cd vc4: Make sure that we don't overwrite the signal for PROG_END.
We should have already emitted a NOP due to the last instruction being a
TLB or VPM write.  However, if you disable dead code elimination then you
might get dead code at the end, and that dead code might have the signal
bits set to something non-default, at which point you die in assertion
failure.
2016-05-06 10:25:55 -07:00
Samuel Pitoiset
44de03b0f8 nvc0: unreference images when the context is destroyed
Like other resources, we need to unreference all images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-06 15:15:32 +02:00
Jose Fonseca
8ae78f7d28 nir: Remove spurious return from void function.
Left over from 450c061362.

Trivial.  Built locally with clang and gcc.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95296
2016-05-06 12:03:34 +01:00
Marek Olšák
901f57dff5 radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4
Vulkan always sets this. It only affects in-place Z decompression.
This is recommended for performance, but what app uses MSAA depth
texturing?

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-06 12:56:47 +02:00
Marek Olšák
4489d75a58 r600g: use the hw MSAA resolving if formats are compatible
This allows resolving RGBA into RGBX.
This should improve HL2 Lost Coast performance.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-06 12:56:47 +02:00
Kenneth Graunke
bd326c229c Revert "i965: Switch to scalar TCS by default."
This reverts commit b593737ed8.

Apparently it causes GPU hangs on some image load store tests.
Let's turn it back off until we figure out why.
2016-05-05 18:03:23 -07:00
Leo Liu
fef0e993a1 st/omx/enc: fix incorrect reference picture order for B frames
Stacking frames is for driver that's capable to do dual instances
encoding. Such feature is not enabled for B frames currently.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-05-05 19:26:43 -04:00
Jason Ekstrand
7bc987abe0 i965/fs: Move handling of samples_identical into the switch statement
This is where we handle texop_texture_samples so it makes things more
consistent.
2016-05-05 16:25:21 -07:00
Jason Ekstrand
3ba228f997 i965/fs: Simplify texture destination fixups
There are a few different fixups that we have to do for texture
destinations that re-arrange channels, fix hardware vs. API mismatches, or
just shrink the result to fit in the NIR destination.  These were all being
done in a somewhat haphazard manner.  This commit replaces all of the
shuffling with a single LOAD_PAYLOAD operation at the end and makes it much
easier to insert fixups between the texture instruction itself and the
LOAD_PAYLOAD.

Shader-db results on Haswell:

   total instructions in shared programs: 6227035 -> 6226669 (-0.01%)
   instructions in affected programs: 19119 -> 18753 (-1.91%)
   helped: 85
   HURT: 0

   total cycles in shared programs: 56491626 -> 56476126 (-0.03%)
   cycles in affected programs: 672420 -> 656920 (-2.31%)
   helped: 92
   HURT: 42
2016-05-05 16:25:21 -07:00
Jason Ekstrand
7de0ae634e i965/fs: stop inclinding glsl/ir.h in brw_fs.h
We are no longer using anything from GLSL IR in the FS backend.
2016-05-05 16:25:21 -07:00
Jason Ekstrand
a815499294 i965/fs: Merge nir_emit_texture and emit_texture
The fs_visitor::emit_texture helper originated when we still had both NIR
and IR visitors for the FS backend.  Since the old visitor was removed,
emit_texture serves no real purpose beyond arbitrarily splitting
heavily-linked code across two functions.
2016-05-05 16:25:21 -07:00
Connor Abbott
4fab8dd5ea nir: remove now-unused nir_foreach_block*_call()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:42 -07:00
Connor Abbott
7c36f9eb52 vc4: fixup for new nir_foreach_block()
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-05-05 16:19:41 -07:00
Connor Abbott
582815d9ea ir3: fixup for new nir_foreach_block() 2016-05-05 16:19:41 -07:00
Jason Ekstrand
31fc4a2528 nir/lower_double_ops: fixup for new nir_foreach_block()
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Jason Ekstrand
450c061362 nir/lower_double_pack: fixup for new nir_foreach_block()
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Jason Ekstrand
8c807cc2a6 nir/gather_info: fixup for new foreach_block()
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Connor Abbott
331b9f73a2 nir/lower_two_sided_color: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Connor Abbott
d40fbbc27e nir/lower_tex: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Connor Abbott
8a7fe634d2 nir/lower_outputs_to_temporaries: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Kenneth Graunke
b593737ed8 i965: Switch to scalar TCS by default.
Normally, we expect SIMD8 shaders to be more instructions than SIMD4x2
shaders, as it takes four instructions to operate on a vec4, rather than
a single instruction.  However, the benefit is that it can process 8
objects per shader thread instead of 2.

Surprisingly, the shader-db statistics show an improvement in both
instruction and cycle counts:

Synmark: -31.25% instructions, -29.27% cycles, 0 hurt.
Tessmark: -36.92% instructions, -37.81% cycles, 0 hurt.
Unigine Heaven: -3.42% instructions, -17.95% cycles, 0 hurt.
Shadow of Mordor:
   +13.24% instructions (26 with fewer instructions, 45 with more),
   -5.23% cycles (44 with fewer cycles, 27 with more cycles).

Presumably, this is because the SIMD8 URB messages are a much more
natural fit than the SIMD4x2 URB messages - there's a ton less header
setup.

I benchmarked Shadow of Mordor and Unigine Heaven on my Skylake GT3e,
and the performance seems to be the same or increase ever so slightly
(< 1 FPS difference).  So I believe it's strictly superior.

There's also a lot more optimization potential we can do in scalar mode.

This will also help us finish fp64 support, as scalar support is going
to land much sooner than vec4-mode support.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
Kenneth Graunke
bc0062c54a nir: Optimize out stores of undefs.
There are a couple of cycle count changes in shader-db, but it's
basically a wash.

However, with the Broadwell scalar TCS backend enabled, many
Shadow of Mordor shaders benefit from this patch.  Because we don't
batch up output writes for TCS, vec4 outputs might not have all
components defined.  Many output writes have a value of undef,
which is useless.

With scalar TCS, stats for tessellation shaders on Broadwell:

total instructions in shared programs: 1283000 -> 1280444 (-0.20%)
instructions in affected programs: 34302 -> 31746 (-7.45%)
helped: 71
HURT: 0

total cycles in shared programs: 10798768 -> 10780682 (-0.17%)
cycles in affected programs: 158004 -> 139918 (-11.45%)
helped: 71
HURT: 0

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
Kenneth Graunke
c7a8b32700 nir: Replace vecN(undef, undef, ...) with a single undef.
shader-db statistics on Broadwell:

total instructions in shared programs: 8963409 -> 8962455 (-0.01%)
instructions in affected programs: 60858 -> 59904 (-1.57%)
helped: 318
HURT: 0

total cycles in shared programs: 71408022 -> 71406276 (-0.00%)
cycles in affected programs: 398416 -> 396670 (-0.44%)
helped: 199
HURT: 51

GAINED: 1

The only shaders affected were in Dota 2 Reborn.

It also sets up for the next optimization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
Kenneth Graunke
49ea7454a1 nir: Rename opt_undef_alu to opt_undef_csel; update comments.
This better reflects what it does.  I plan to add other ALU
optimizations as well, so the old name would be confusing.

In preparation for that, also move the file comments about csels
above the opt_undef_csel function, and delete the ones about there
not being other optimizations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
Kenneth Graunke
a808ba5965 i965: Rework passthrough TCS checks.
According to Timothy, using program_string_id == 0 to identify the
passthrough TCS is going to be problematic for his shader cache work.

So, change it to strcmp() the name at visitor creation time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
Tim Rowley
ff8c0c9a35 swr: [rasterizer core] Faster modulo operator in ProcessVerts
Avoid % operator, since we know that curVertex is always incrementing.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:50:11 -05:00
Tim Rowley
2be7c3e780 swr: [rasterizer] Small warning cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:50:03 -05:00
Tim Rowley
b39c530f88 swr: [rasterizer] Add SWR_ASSUME / SWR_ASSUME_ASSERT macros
Fix static code analysis errors found by coverity on Linux

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:56 -05:00
Tim Rowley
db084f48eb swr: [rasterizer] Miscellaneous backend changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:48 -05:00
Tim Rowley
3951a2109e swr: [rasterizer] Add support for X24_TYPELESS_G8_UINT format
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:42 -05:00
Tim Rowley
909aee07f8 swr: [rasterizer jitter] Fix printing bugs for tracing.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:29 -05:00
Tim Rowley
bc084e6b3d swr: [rasterizer memory] Add missing store tiles function
Storing color hot tile to 8bit w-major stencil format.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:23 -05:00
Tim Rowley
5332c9d931 swr: [rasterizer jitter] Add asserts for supported formats in fetch shader
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:18 -05:00
Tim Rowley
6e89227054 swr: [rasterizer core] Fix thread allocation
Fix windows in 32-bit mode when hyperthreading is disabled on Xeons.

Some support for asymmetric processor topologies.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:11 -05:00
Tim Rowley
c2f5d2daa8 swr: [rasterizer core] Fix threadviz support in buckets
Need to do lazy eval of the threadviz knob since order of globals
is undefined.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:49:04 -05:00
Tim Rowley
1eb211c4a4 swr: [rasterizer] Whitespace cleanup and misc changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-05-05 14:48:55 -05:00
Nicolai Hähnle
d97e333ea4 radeonsi: mark descriptor loads as using dynamically uniform indices
This tells LLVM to always use SMEM loads for descriptors. It fixes a
regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test
that was caused by LLVM r268259 (but the proper fix is really here in Mesa).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-05 12:21:40 -05:00
Matt Turner
f01d92f473 i965/fs: Don't follow pow with an instruction with two dest regs.
Beginning with commit 7b208a73, Unigine Valley began hanging the GPU on
Gen >= 8 platforms.

Evidently that commit allowed the scheduler to make different choices
that somehow finally ran afoul of a hardware bug in which POW and FDIV
instructions may not be followed by an instruction with two destination
registers (including compressed instructions). I presume the conditions
are more complex than that, but the internal hardware bug report (BDWGFX
bug_de 1696294) does not contain much more information.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94924
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1]
Tested-by:  Mark Janes <mark.a.janes@intel.com> [v1]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-05-05 10:18:28 -07:00
Bruce Cherniak
9d86a5eea7 swr: Remove stall waiting for core query counters.
When gathering query results, swr_gather_stats was
unnecessarily stalling the entire pipeline.  Results are now
collected asynchronously, with a fence marking completion.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2016-05-05 10:50:09 -05:00
Dave Airlie
76a36ac3ea mesa/ubo: add missing compute cases for ubo/atomic buffers
This fixes: GL43-CTS.compute_shader.resource-ubo

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-05 20:29:02 +10:00
Dave Airlie
2dd3fc3cac mesa/compute: drop pointless casts.
We already are a GLintptr, casting won't help.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-05 20:28:41 +10:00
Thomas Hindoe Paaboel Andersen
76a423efe0 mesa: remove null check before free
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-05-05 09:50:38 +02:00
Thomas Hindoe Paaboel Andersen
3a6763f0a0 freedreno: remove null check before free
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-05-05 09:34:01 +02:00
Thomas Hindoe Paaboel Andersen
8698194313 nir: fix assert for wildcard pairs
The assert was null checking dest_arr_parent twice. The intention
seems to be to check both dest_ and src_.

Added in d3636da9

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-05-05 09:33:02 +02:00
Brian Paul
be5010c4b8 glapi: fix parameter type for GetSamplerParameterIuivEXT() in es_EXT.xml
The function returns GLuint, not GLfloat values.

v2: also fix the OES function

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-04 14:49:39 -06:00
Brian Paul
54d203a319 mesa: include texture format in glGenerateMipmap error message
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-04 14:49:39 -06:00
Brian Paul
a62f031bc3 main: uses casts to silence some _mesa_debug() format warnings
Silences warnings with 32-bit Linux gcc builds and MinGW which doesn't
recognize the ‘t’ conversion character.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2016-05-04 14:49:39 -06:00
Jordan Justen
51300a0387 docs: Mark GL_ARB_query_buffer_object as done for i965/hsw+
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-04 11:23:17 -07:00
Jordan Justen
f00c399bae i965: Implement ARB_query_buffer_object for HSW+
v2:
 * Declare loop index variable at loop site (idr)
 * Make arrays of MI_MATH instructions 'static const' (idr)
 * Remove commented debug code (idr)
 * Updated comment in set_query_availability (Ken)
 * Replace switch with if/else in hsw_result_to_gpr0 (Ken)
 * Only divide GL_FRAGMENT_SHADER_INVOCATIONS_ARB by 4 on
   hsw and gen8 (Ken)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-04 11:23:17 -07:00
Jordan Justen
357ff91359 i965/gen6+: Add load register immediate helper functions
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 11:23:17 -07:00
Jordan Justen
959e1e9e66 i965/hsw+: Add support for copying a register
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 11:23:17 -07:00
Jordan Justen
aad14a22cb i965/gen6+: Add support for storing immediate data into a buffer
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 11:23:17 -07:00
Jordan Justen
ac0bbf9ef3 i965: Add MI_MATH reg defs for HSW+
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 11:23:17 -07:00
Jordan Justen
9f581f8f24 i965: Add brw_store_register_mem32
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 11:23:17 -07:00
Jordan Justen
c54e5c2fb2 i965: Use offset instead of index in brw_store_register_mem64
This matches the byte based offset of brw_load_register_mem*.

The function is also moved into intel_batchbuffer.c like
brw_load_register_mem*.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 11:23:10 -07:00
Jan Vesely
77959ce07b r600,compute: create vtx buffer for text + rodata
Reserve buffer id 2

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-05-04 13:09:18 -04:00
Rob Clark
2e117a7649 freedreno: allow ctx->draw_vbo to fail
Pretty much only happens if shader variant compile fails.  But in this
case, if we haven't emitted cmdstream, we don't want to set needs_flush.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
291ac872a4 freedreno: move shader-stage dirty bits to global dirty flag
This was always a bit overly complicated, and had some issues (like
ctx->prog.dirty not getting reset at the end of the batch).  It also
required some special hacks to avoid resetting dirty state on binning
pass.  So just move it all into ctx->dirty (leaving some free bits
for future shader stages), and make FD_DIRTY_PROG just be the union
of all FD_SHADER_DIRTY_*.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
a48cccacf3 freedreno/a4xx: fix bogus offset for f32x24s8 stencil restore
fixes: $piglit/bin/fbo-clear-formats GL_ARB_depth_buffer_float

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
e7c64041e9 freedreno: add some debug_asserts() to catch insane offsets
Ofc won't catch *all* faults, but at least helpful for catching offsets
which are completely bogus.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
1f2bc64f31 freedreno/a4xx: deal with VS which do not write position
Fixes $piglit/bin/glsl-1.40-tf-no-position

a3xx may need similar?

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
a6ad30202c freedreno/ir3: remove a couple redundant is_flow()s
Now that the opc's encode the instruction category (making them unique)
we no longer need to check the category in addition to the opc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
f0a1f3de27 freedreno/ir3: cp small negative integers too
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
1f04d4bf59 freedreno/ir3: fix # of registers
The instruction encoding allows for more registers, but at least on
a3xx/a4xx they don't actually exist.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
173871dfb9 freedreno/ir3: lower immeds to const
Helps reduce register pressure and instruction counts for immediates
that would otherwise require a mov into gpr.

total instructions in shared programs:          4455332 -> 4369297 (-1.93%)
total dwords in shared programs:                8807872 -> 8614432 (-2.20%)
total full registers used in shared programs:   263062 -> 250846 (-4.64%)
total half registers used in shader programs:   9845 -> 9845 (0.00%)
total const registers used in shared programs:  1029735 -> 1466993 (42.46%)

                 half       full      const      instr     dwords
    helped           0       10415           0       17861        5912
      hurt           0        1157       21458         947          33

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
b15c7fc268 freedreno/ir3: add ir3_cp_ctx
Needed in next commit.. just split out to reduce noise.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:55 -04:00
Rob Clark
b9985e5bde add REVIEWERS and get_reviewer.pl script
Copied from linux kernel (where it is called MAINTAINERS and
get_maintainer.pl), with minimal changes to script (to recognize
mesa src tree rather than linux kernel src tree, and to avoid
accidentaly CC'ing Linus Torvalds on mesa patches), and slimmed
down MAINTAINER file syntax to recognize that we don't really have
subsystem "maintainers" in the same sense as the linux kernel (ie. no
different mailing lists and git trees per subsystem).

The main point is to automate slapping on the correct CC's for patches
via git's --cc-cmd feature, more than anything else.

I didn't attempt to fully populate the REVIEWERS file, by a long shot.
This is an opt-in system and anyone else can add their own entries.

To utilize:

  git send-email --cc-cmd ./scripts/get_reviewer.pl ...

or to configure it to be the default:

  git config sendemail.cccmd ./scripts/get_reviewer.pl

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-04 11:25:46 -04:00
Ilia Mirkin
38fcf7cbad nouveau/video: properly detect the decoder class for availability checks
The kernel is now more strict with the class ids it exposes, so we need
to check the G98 and MCP89 classes as well as the GT215 class. This
effectively caused us to decide there were no decoding capabilities on
newer kernel for VP3 chips.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95251
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
2016-05-04 10:45:07 -04:00
Kenneth Graunke
0332963d19 i965: Delete stale perf_debug().
MOCS for 3DSTATE_SO_BUFFER has existed for ages.
2016-05-04 02:29:03 -07:00
Kenneth Graunke
3a886721ed i965: Silence unused variable warning
I added this when deleting some unnecessary code in a rebase.
2016-05-04 00:46:31 -07:00
Juan A. Suarez Romero
97989059b9 mesa/main: handle double uniform matrices properly
When computing the offset in the uniform storage table, take into account
the size multiplier so double precision matrices are handled correctly.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-04 08:08:12 +02:00
Samuel Iglesias Gonsálvez
2ab2d2e588 nir: Separate 32 and 64-bit fmod lowering
Split 32-bit and 64-bit fmod lowering as the drivers might need to
lower them separately inside NIR depending on the HW support.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-04 08:07:49 +02:00
Samuel Iglesias Gonsálvez
b902377a56 nir/lower_double_ops: lower mod()
There are rounding errors with the division in i965 that affect
the mod(x,y) result when x = N * y. Instead of returning '0' it
was returning 'y'.

This lowering pass fixes those cases.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-04 08:07:49 +02:00
Matt Turner
9f81434c5f i965: Define GEN_GE/GEN_LE macros in terms of GEN_LT.
GEN_LT has a straightforward implementation on which we can build the
GEN_GE and GEN_LE macros.

Suggested-by:  Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 22:34:01 -07:00
Matt Turner
affaae197f i965: Add disassembler support for remaining opcodes.
For opcodes that changed meaning on different generations, we store a
pointer to a secondary table and the table's size in a tagged union in
place of the mnemonic and number of sources.

Acked-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 22:34:00 -07:00
Matt Turner
b89b0a03f2 i965: Make opcode_descs and gen_from_devinfo() static.
The previous commit replaced direct uses of opcode_descs with calls to
the wrapper function, which should be the only method of accessing
opcode_descs's data. As a result gen_from_devinfo() can also be made
static.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 22:34:00 -07:00
Matt Turner
0ff4912cf4 i965: Actually check whether the opcode is supported.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 22:33:59 -07:00
Matt Turner
667408b889 i965: Merge inst_info and opcode_desc tables.
I merged opcode_desc into inst_info (instead of the other way around)
because inst_info was sorted by opcode number.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 22:33:42 -07:00
Matt Turner
d01596613b i965: Move inst_info from brw_eu_validate.c to brw_eu.c.
Drop the uses of 'enum gen' to a plain int, so that we don't have to
expose the bitfield definitions and GEN_GE/GEN_LE macros to other users
of brw_eu.h. As a result, s/.gen/.gens/ to avoid confusion with
devinfo->gen.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 22:33:42 -07:00
Francisco Jerez
1530e27534 i965/disasm: Wrap opcode_desc look-up in a function.
The function takes a device info struct as argument in addition to the
opcode number in order to disambiguate between multiple opcode_desc
entries for different instructions with the same opcode number.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1]

[v2] mattst88: Put brw_opcode_desc() in brw_eu.c instead of moving it
               there in a later patch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v2]
[v3] mattst88: Return NULL if opcode >= ARRAY_SIZE(opcode_descs)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-03 22:32:40 -07:00
Francisco Jerez
1cc7573162 i965: Pass devinfo pointer to is_3src() helpers.
This is not strictly required for the following changes because none
of the three-source opcodes we support at the moment in the compiler
back-end has been removed or redefined, but that's likely to change in
the future.  In any case having hardware instructions specified as a
pair of hardware device and opcode number explicitly in all cases will
simplify the opcode look-up interface introduced in a subsequent
commit, since the opcode number alone is in general ambiguous.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 18:06:21 -07:00
Francisco Jerez
c55dc77ab1 i965: Pass devinfo pointer to brw_instruction_name().
A future series will implement support for an instruction that happens
to have the same opcode number as another instruction we support
already on a disjoint set of hardware generations.  In order to
disambiguate which instruction it is brw_instruction_name() will need
some way to find out which device we are generating code for.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-03 18:06:21 -07:00
Kenneth Graunke
7d9143ad88 i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode.
Unlike most shader stages, the Hull Shader hardware makes us explicitly
tell it how many threads to dispatch and manually configure the channel
mask.  One perk of this is that we have a lot of flexibility - we can
run it in either SIMD4x2 or SIMD8 mode.

Treating it as SIMD8 means that shaders with 8 or fewer output vertices
(which is overwhemingly the common case) can be handled by a single
thread.  This has several intriguing properties:

- Accessing input arrays with gl_InvocationID as the index is a simple
  SIMD8 URB read with g1 as the header.  No indirect addressing required.
- Barriers are no-ops.
- We could potentially do output shadowing to combine writes, as the
  concurrency concerns are gone.  (We don't do this yet, though.)

v2: Drop first_non_payload_grf change, as it was always adding 0
    (caught by Jordan Justen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-03 16:28:00 -07:00
Kenneth Graunke
75881bed9e i965: Rework the TCS passthrough shader to use NIR.
I'm about to implement a scalar TCS backend, and I'd rather not
duplicate all of this code there.

One change is that we now write the tessellation levels from all
TCS threads, rather than just the first.  This is pretty harmless,
and was easier.  The IF/ENDIF needed for that are gone; otherwise
the generated code is basically identical.

I chose to emit load/store intrinsics directly because it was easier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-03 16:27:52 -07:00
Brian Paul
ef5a31fc06 gallium/util: change assertion to conditional in util_bitmask_destroy()
If we fail to create a context in the VMware driver we call this function
unconditionally to free a bunch of bit vectors.  Instead of asserting on
a null pointer, just no-op.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 15:40:49 -06:00
Brian Paul
68116dcd5a cso: null-out previously bound sampler states
If, for example, we previously had 2 sampler states bound and now we
are binding one, we'd leave the second sampler state unchanged.
This change nulls-out the second sampler state in this situation.
We're already doing the same thing for sampler views.

This silences an occasional warning issued by the VMware driver when
the number of sampler views and sampler states disagreed.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-05-03 15:40:49 -06:00
Brian Paul
05abaa65c7 svga: try to flag surfaces for sampling, in addition to rendering
This silences some warnings when we try to sample from surfaces that were
created for drawing, such as when blitting from one of the framebuffer
surfaces.  We were already doing the opposite situation (adding a bind
flag for rendering to surfaces declared as texture sources).

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-03 15:40:48 -06:00
Brian Paul
abc6432d54 svga: fix copying non-zero layers of 1D array textures
Like cube maps, we need to convert the z information to a layer index.
Also rename the *_face vars to *_face_layer to make things a little more
understandable.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-03 15:40:48 -06:00
Brian Paul
b94f73c150 svga: clean up svga_pipe_blit.c
Remove dead code.  Fix formatting.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-05-03 15:40:48 -06:00
Brian Paul
8842be1132 rbug: s/Elements/ARRAY_SIZE/
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-03 15:40:48 -06:00
Brian Paul
7f641916bf freedreno: s/Elements/ARRAY_SIZE/
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-05-03 15:40:48 -06:00
Brian Paul
b91975714d trace: s/Elements/ARRAY_SIZE/
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-03 15:40:48 -06:00
Brian Paul
e193c5dd59 ilo: s/Elements/ARRAY_SIZE/
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-03 15:40:48 -06:00
Brian Paul
951bf8b4a6 i915g: s/Elements/ARRAY_SIZE/
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-05-03 15:40:48 -06:00
Samuel Pitoiset
5658ddc7fe nvc0: compute a percentage for metric-achieved_occupancy
metric-issue_slot_utilization and metric-branch_efficiency are already
computed as percentages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-03 23:18:50 +02:00
Samuel Pitoiset
10ec27760a nvc0: display some performance metrics with a percentage
This makes more sense for them.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-03 23:18:50 +02:00
Samuel Pitoiset
64937615a0 nvc0: store the driver query type for performance metrics
This will allow to use percentages for some metrics because the Gallium
HUD doesn't allow to display floating point numbers and 0 is printed
instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-03 23:18:50 +02:00
Samuel Pitoiset
a9bc3211f5 nvc0: fix exposing of metric-issue_slots for SM21/SM30
This is most likely a copy-paste error when I reworked this area few
weeks ago. For SM20, metric-issue_slots is equal to inst_issued because
there is only one pipeline, so the metric is not exposed there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
2016-05-03 23:18:50 +02:00
Mark Janes
0af8a7d50c mesa/objectlabel: handle NULL src string
This prevents a crash when a NULL src is passed with a non-NULL length.

fixes: dEQP-GLES31.functional.debug.object_labels.query_length_only
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95252

Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-05-03 14:07:31 -07:00
Dave Airlie
265fe9dce8 glsl: subroutine types cannot be used in constructors.
This fixes two of the cases in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-04 06:44:45 +10:00
Dave Airlie
3110a0aa23 glsl: resource is a reserved keyword in GLSL 4.20 as well
resource just appears in GLSL 4.20 without any fanfare.

Fixes GL43-CTX.CommonBugs.CommonBug_ReservedNames

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-04 06:44:45 +10:00
Jan Vesely
ebbe31d57c gallium,utils: Fix trivial sign compare warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 12:00:09 -04:00
Knut Andre Tidemann
c68a9cdaac anv: fix hang during generation of dev_icd.json.
Fixes: b370ec7c76 ("anv: tweak the %.json rule")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-03 11:42:47 +01:00
Anuj Phogat
883f3662db swrast: Add texfetch_funcs entries for astc 3d formats
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
63432eb370 mesa: Enable translation between astc 3d gl formats and mesa formats
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
54cac7ad96 mesa: Handle astc 3d formats in _mesa_get_compressed_formats()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
dcfea1d7eb mesa: Handle astc 3d formats in _mesa_base_tex_format()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
cf85ef1618 mesa: Account for astc 3d formats in _mesa_is_astc_format()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
38cd8145a8 mesa: Add a helper function is_astc_3d_format()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
72dfe0242d mesa: Add the missing defines for GL_OES_texture_compression_astc
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
57451e0fc1 mesa: Align the values of #define's in glheader.h
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
0306110fa9 mesa: Add OES_texture_compression_astc to extension table and gl_extensions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
059f36c671 mesa: Add entries for astc 3d formats initializing struct gl_format_info
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
705216dbed mesa: Add mesa formats for astc 3d formats
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
24bb6ee8b6 glapi: Update dispatch XML files for OES_texture_compression_astc.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
63a7a9d115 mesa: Account for block depth in _mesa_format_image_size()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
87bf66daa9 mesa: Handle 3d block sizes in _mesa_compute_compressed_pixelstore
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
84a44844f2 mesa: Handle 3d block sizes in teximage error checks
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
ec60b3da69 mesa: Handle 3d block sizes in getteximage error checks
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:18 -07:00
Anuj Phogat
5713461ae7 mesa: Add an assert for BlockDepth in _mesa_get_format_block_size()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:17 -07:00
Anuj Phogat
9163c37349 mesa: Add a helper function to query 3D block sizes
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:17 -07:00
Anuj Phogat
6abb1b4984 mesa: Add block depth field in struct gl_format_info
This will be later required for 3D ASTC formats.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-05-03 03:43:17 -07:00
Dave Airlie
c4a0cd4662 mesa/copyimage: make sure number of samples match.
This fixes
GL43-CTS.copy_image.samples_missmatch
which otherwise asserts in the radeonsi driver.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-03 20:13:29 +10:00
Dave Airlie
5989a2937f mesa/objectlabel: don't do memcpy if bufSize is 0 (v2)
This prevents GL43-CTS.khr_debug.labels_non_debug from
memcpying all over the stack and crashing.

v2: actually fix the test.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-03 20:12:59 +10:00
Dave Airlie
30823f997b mesa/textureview: move error checks up higher
GL43-CTS.texture_view.errors checks for GL_INVALID_VALUE
here but we catch these problems in the dimensionsOK check
and return the wrong error value.

This fixes:
GL43-CTS.texture_view.errors.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-03 20:12:52 +10:00
Marek Olšák
5541e11b9a gallium/radeon: remove stencil_tile_split from metadata
this is a leftover from the days when depth-stencil buffers were
allocated by the DDX

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
20a77397fa gallium/radeon: remove tile_mode_array_valid flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
c8aac4fc0d winsys/amdgpu: pass PIPE_CONFIG to addrlib on texture import
This hasn't been needed, but I think we should set it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
dc970c4f4e winsys/amdgpu: read NUM_BANKS from buffer metadata
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
02f90cef7d radeonsi: remove unused tile mode getters
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
b9e3e87069 radeonsi: just read tile mode arrays in SDMA setup
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
0c2cba1ec6 radeonsi: just read tile mode arrays in SI DMA setup
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
c3ca54aee9 radeonsi: just read tile mode arrays in DB setup
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
ef45825708 gallium/radeon: add radeon_surf::macro_tile_index
for indexing cik_macrotile_mode_array

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
ed4fd542de winsys/radeon: drop support for kernels lacking tile mode array queries
This will allow us to simplify a lot of code around tiling.

Kernel 3.10 is required for SI support.
Kernel 3.13 is required for CIK support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
3d956b4bc0 st/mesa: fix blit-based GetTexImage for non-finalized textures
This fixes getteximage-depth piglit failures on radeonsi.

Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
77af6bcc26 winsys/radeon: count buffer size only once
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
3e3c43418e winsys/amdgpu: count buffer size only once
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
f98ba4123c winsys/amdgpu: loosen up requirements for how much memory IBs can use
ported from winsys/radeon.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
9ec00c23c2 radeonsi: when parsing dmesg, skip empty lines
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-02 22:49:25 +02:00
Marek Olšák
9983efca76 radeonsi: use the hw MSAA resolving if formats are compatible
This allows resolving RGBA into RGBX.
This should improve HL2 Lost Coast performance.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-02 22:49:25 +02:00
Samuel Pitoiset
819836d240 nv50,nvc0: re-bind old compute state after reading MP perf counters
This might be useful to avoid breaking the current compute state when
monitoring MP perf counters because we use a compute kernel to read out
those counters. This has been initially suggested by Ilia Mirkin.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-05-02 22:30:48 +02:00
Rob Clark
dcf8c4425a nir: make lower_clamp_color pass work after lower i/o
Kinda important to work with tgsi_to_nir, which generates nir which
already has i/o lowered.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-02 14:25:38 -04:00
Eric Anholt
226bd92945 vc4: Use NIR lowering for sRGB decode.
This should get us the same decode code generated, but with a lot less
custom code in the driver.
2016-05-02 11:06:29 -07:00
Eric Anholt
4b326341f3 vc4: Just use NIR lowering for texture projection.
This means doing Newton-Raphson on the RCP, but it's probably actually a
good thing to be accurate on.
2016-05-02 11:06:29 -07:00
Eric Anholt
2f98bc100d vc4: Scalarize phi nodes as well.
This makes fewer programs with loops assertion fail, replacing them with
the rendering failure warning.
2016-05-02 11:06:29 -07:00
Eric Anholt
4a2ad8500d vc4: Add whitespace after each program stage dump.
In particular it's been hard to find the point where we switch from
dumping pre-optimization QIR and post-optimization QIR.
2016-05-02 11:06:29 -07:00
Eric Anholt
84322b2f31 vc4: Remove the CSE pass.
It's not doing anything according to shader-db now that we're using NIR.
It would have had to be reworked significantly anyway, to handle control
flow.
2016-05-02 11:06:29 -07:00
Eric Anholt
b145b731ab vc4: Emit only one FRAG_Z or FRAG_W QIR opcode.
We were generating piles of FRAG_W for interpolation, only to CSE them
away immediately.  Since this is the only thing that CSE is doing for us
any more, just avoid making the CSE work necessary.
2016-05-02 11:06:29 -07:00
Eric Anholt
e138716d8d vc4: Use the NIR cubemap normalization instead of our own.
This is one of two uses of the current QIR CSE pass according to
shader-db.  The NIR pass means that we'll end up doing Newton-Raphson on
our RCP, which we weren't doing before, but that's probably actually a
good thing.
2016-05-02 11:06:29 -07:00
Eric Anholt
3bee7581e6 vc4: Drop the support for DCE of texture instructions.
Now that we're using NIR for our optimization, there's no need for this
tricky code.
2016-05-02 11:06:29 -07:00
Nicolai Hähnle
155ce49603 radeonsi: fix PIPE_FORMAT_R11G11B10_FLOAT handling
That format has first_non_void < 0. This fixes a regression in piglit
arb_shader_image_load_store-semantics that was introduced by commit 76b8c5cc60,
while hopefully still shutting Coverity up (and failing in a more obvious way
if a similar error should re-appear).

Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-02 11:38:23 -05:00
Nicolai Hähnle
169ace5636 radeonsi: correct NULL-pointer check in si_upload_const_buffer
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-05-02 11:37:55 -05:00
Dave Airlie
cf6dadb00b softpipe: bump 3D texture limit to 2048
The GL4.1 spec bumps this to 2048, so we should do so.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-02 07:29:02 +10:00
Dave Airlie
277170eeea softpipe: allow r32 xchg on shader images.
This is part of OES_shader_image_atomic.txt.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-02 07:28:58 +10:00
Ilia Mirkin
3950aa47df softpipe: avoid leaking local_mem on machines alloc failure
Spotted by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
2016-05-01 11:19:08 -04:00
Ilia Mirkin
ad545d179b vbo: avoid leaking prim on vbo bind failure
Spotted by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
2016-05-01 11:19:08 -04:00
Edward O'Callaghan
23cf24e227 mapi/glapi: Fix dup word typo in glapi_getproc.c
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-01 16:07:29 +02:00
Emil Velikov
44f921091a isl: automake: don't explicitly EXTRA_DIST the tests folder
The file(s) within are already picked thanks to the build rule of the
respective test. No need to have the folder in EXTRA_DIST.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 14:17:30 +01:00
Timothy Arceri
f982e2434b mesa: add LOCATION_COMPONENT support to GetProgramResourceiv
From Section 7.3.1.1 (Naming Active Resources) of the OpenGL 4.5 spec:

   "For the property LOCATION_COMPONENT, a single integer indicating the first
   component of the location assigned to an active input or output variable is
   written to params. For input and output variables with a component specified
   by a layout qualifier, the specified component is written. For all other
   input and output variables, the value zero is written."

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-01 23:13:36 +10:00
Timothy Arceri
b1c872a81e glsl: add component to has_layout() helper
I don't think this will do much as it's a compiler error
to use component without location which is already in the
table but its good to be consistent.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-05-01 23:13:28 +10:00
Timothy Arceri
589053dac7 glsl: validate linking of intrastage component qualifiers
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-05-01 23:13:22 +10:00
Timothy Arceri
0317dfcd9b glsl: update explicit location matching to support component qualifier
This is needed so we don't optimise away the varying when more than
one shares the same location.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:13:15 +10:00
Timothy Arceri
0d88b15f07 glsl: cross validate varyings with a component qualifier
This change checks for component overlap, including handling overlap of
locations and components by doubles. Previously there was no validation
for assigning explicit locations to a location used by the second half
of a double.

V3: simplify handling of doubles and fix double component aliasing
detection

V2: fix component matching for matricies

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:13:10 +10:00
Timothy Arceri
94438578d2 glsl: validate and store component layout qualifier in GLSL IR
We make use of the existing IR field location_frac used for tracking
component locations.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:13:05 +10:00
Timothy Arceri
2d9936a686 glsl: allow component qualifier on varying inputs
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-05-01 23:13:00 +10:00
Timothy Arceri
daa8df590b glsl: parse component layout qualifier
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:12:52 +10:00
WuZhen
ea4c1afd05 android: enable dlopen() on all architectures
Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
Jose Fonseca
5649d6ab06 winsys/sw/xlib: use correct free function for xlib_dt->data
Analogous to previous commit.

Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
WuZhen
4f21f3f2e8 winsys/sw/dri: use correct free function for dri_sw_dt->data
align_malloc() is used to allocate dri_sw_dt->data, thus we should not
be using FREE() but align_free().

Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
[Emil Velikov: tweak commit summary/shortlog]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-01 12:31:29 +01:00
WuZhen
798f7a8596 tgsi: initialize stack allocated struct
Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
Emil Velikov
fb653641ea egl: android: do not feed invalid fourcc/pitch into the dri module
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
Rob Herring
34ddef39ce egl: android: add dma-buf fd support
Add support for creating images from Android native buffers with dma-buf
fd. As dma-buf support also requires DRI image loader extension, add
that as well.

This is based on several originally patches written by Varad Gautam.
I've collapsed them into logical changes and done a bit of reformatting.
Using dma-bufs vs. GEM handles is now a runtime decision similar to the
wayland EGL instead of being compile time selection. The dma-buf support
is also re-written to use common dri2_create_image_dma_buf function in
egl_dri2.c.

Cc: Varad Gautam <varadgautam@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:29 +01:00
Rob Herring
81a6fff4c5 egl: android: factor out back buffer handling code
In preparation to use the same code for dma-bufs, factor out the code to a
separate function.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:28 +01:00
Rob Herring
dfaccf25f5 egl: android: factor out format conversion code to a function
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:28 +01:00
Rob Herring
d45884ef05 egl: android: disable __DRI_DRI2_LOADER support on render nodes
Use of __DRI_DRI2_LOADER extension is only supported for card nodes. In
order to support dmabufs, Android will be moving to using render nodes and
we need to disable the DRI2 loader extension.

This is based on the Wayland EGL code.

Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:28 +01:00
Rob Herring
dbbf7a8e61 Android: fix build ordering of subdirectories
Different versions of make behave differently in whether $(wildcard) sorts
the results or not. The Android build now explicitly sorts
all-named-subdir-makefiles which breaks the build because src/gallium
must be included after src/mesa/drivers/dri.

The Android build system doesn't support doing "include $(call
all-named-subdir-makefiles,...)" twice, so rework things by generating
the included makefile list and including them in 2 steps.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-05-01 12:31:28 +01:00
Jamey Sharp
595d56cc86 glShaderSource must not change compile status.
OpenGL 4.5 Core Profile section 7.1, in the documentation for
CompileShader, says: "Changing the source code of a shader object with
ShaderSource does not change its compile status or the compiled shader
code."

According to Karol Herbst, the game "Divinity: Original Sin - Enhanced
Edition" depends on this odd quirk of the spec.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-05-01 18:46:24 +10:00
Emil Velikov
9fa2e57a73 gallium/radeon: nuke the final pre LLVM 3.6 codepath
Missed with commit 100796c15c "gallium/radeon: drop support for LLVM
3.5"

v2: s/LLVN/LLVM/ in shortlog (Nicolai)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-05-01 08:57:32 +01:00
Emil Velikov
7336df06ed anv: include the files in the tarball
Namely the python script, the ICD header and private headers. We could
get the system version of the ICD ones, although there is no .pc file to
easily locate and/or manage them.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:46 +01:00
Emil Velikov
9e09507516 i965: don't forget to ship brw_nir_trig_workarounds.py
Otherwise we won't be able to regenerate the source file(s).

Signed-off-by:    Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:46 +01:00
Emil Velikov
1f04caa09c isl: include all the files in the tarball
Add the missing header(s), generation scripts, README ...

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:34 +01:00
Emil Velikov
cee69ccb92 spirv: automake: add missing headers to the tarball.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:06 +01:00
Emil Velikov
dc38e6b169 automake: wire up the intel vulkan driver to make distcheck
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:06 +01:00
Emil Velikov
dfbf1289a4 anv: update .gitignore
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:05 +01:00
Emil Velikov
fcdcb829d8 anv: automake: remove no longer needed include
Thanks to last commit we can nuke it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:05 +01:00
Emil Velikov
3285461ceb anv: automake: tweak anv_entrypoint.[ch] rule
Rather than using cat + cpp feed the file(s) directly into the latter.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:05 +01:00
Emil Velikov
bc7802098e anv: tweak libvulkan_intel.so link libraries
i.e do not use -lfoo directly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:05 +01:00
Emil Velikov
9f235adf99 anv: cosmetic makefile changes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:05 +01:00
Emil Velikov
446234033d anv: place the builddir includes before the srcdir ones
Otherwise we risk picking the possibly outdated file in the source dir
over the fresh one in the builddir.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:05 +01:00
Emil Velikov
6cb814727d automake: tweak SUBDIR reorder and comment it
It should ease people with all the interaction and platforms and how
they interact (at least from a build POV) with each other.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:05 +01:00
Emil Velikov
4fcf0ba113 configure.ac: remove unused HAVE_EGL_PLATFORM_NULL conditional
Afaict the last user was based on st/egl.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:05 +01:00
Emil Velikov
9f3588eb37 automake: drop "EGL_" from HAVE_EGL_PLATFORM_WAYLAND
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:05 +01:00
Emil Velikov
5459db91e3 automake: drop "EGL_" from HAVE_EGL_PLATFORM_X11
The variable covers more than just EGL, let's try to untangle the
confusion it brings.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:05 +01:00
Emil Velikov
a56009d089 anv: get rid of VULKAN_ENTRYPOINT_CPPFLAGS variable
Add the missing include to AM_CPPFLAGS and use it throughout the
makefile.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:05 +01:00
Emil Velikov
6dc169e18f anv: factor out the X11/XCB build
Similar to earlier commit - move all the common bits into a single
place, thus improving readability and allowing us to see what's missing.

Also don't forget to add the missing bits. This commit should allows us
to build wayland only vulkan ;-)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
cbc4837b83 anv: kill of custom define HAVE_WAYLAND_PLATFORM
Vulkan API already has equivalent, so simplify things as just use it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
9bc99f5668 anv: refactor wayland build handling
Rather than having things split out in multiple places, consolidate it
and add all the missing bits. Also ensure that we use the already built
static library libwayland-drm.la.

v2 Add missing '\' in the CFLAGS.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-05-01 08:38:04 +01:00
Emil Velikov
3a2d09dd65 automake: include vulkan subdir after wayland-drm
We'll reuse the existing wayland-drm static library with next commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:04 +01:00
Emil Velikov
fe918556a2 anv: use a common variable to manage the library dependencies
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
82d0b59f02 anv: use the GENERATED_FILES variable
... rather than having duplicates files through the sources lists.

Splitting things as is, has the side effect of making things clearer and
easing a potential android build. The latter of which automatically adds
BUILT_SOURCES to the binary.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
3ee7d8b0eb anv: fold the tests' makefile
Recent commit removed the winsys defines from anv_private.h thus
breaking the tests. To fix that and avoid it in the future, merge the
tests makefile in the libvulkan one.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
f3cb0dcae1 anv: build the core vulkan only once
Introduce a static library libvulkan_common.la that is used by
libvukan_intel.la and libvulkan_test.la.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
21800d77ff anv: kill off custom CFLAGS
AM_CFLAGS already does all that we need.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
623cb3a598 anv: add missing link against the math library
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
e98cf60446 anv: split sources lists to Makefile.sources
Will allow others to reuse the lists (scons/android anyone ?) and makes
the file a lot shorter and easier to read.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
0d3e7b17c9 anv: remove custom rule to install the intel_icd.json
Autoconf already does the exact same thing as the manually written rule.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94969
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:04 +01:00
Emil Velikov
30e6f68b3b anv: tweak the LDFLAGS
Copy/paste from the rest of mesa, but namely.
 - The module should be shared only.
 - We don't need the explicit ".so", as the vulkan loader will retrieve
the full filename from the json
 - No unresolved symbols in the final binary
 - Use the linker garbage collector to slim down the final binary.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:03 +01:00
Emil Velikov
b370ec7c76 anv: tweak the %.json rule
It's used only by dev_icd.json so just call it that way. While we're
here, manually expand $< (as it might cause issue on some systems)
and drop the unneeded install_libdir substitution.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:03 +01:00
Emil Velikov
abd360ab75 anv: add a comment about dev_icd.json
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:03 +01:00
Emil Velikov
44978a91ff genxml: ship all the files needed in the tarball
v2: The xml files are not called "gen*_pack.xml" (Jason)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:38:03 +01:00
Emil Velikov
3f23a0f8c1 anv: remove description about GENX_FUNC macro
The macro has been gone since commit 1f1cf6fcb0 "anv: Get rid of
GENX_FUNC"

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-01 08:37:25 +01:00
Emil Velikov
0700cdd5aa gallium/target-helpers: remove inline_wrapper_sw_helper.h
Unused as of commit dddedbec0e "{st,targets}/nine: use static/dynamic
pipe-loader"

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:37:25 +01:00
Mark Kettenis
b8e59292e6 egl/x11: resolve "initialization from incompatible pointer type" warning
With earlier commit we've moved a few functions and changing the
argument type from _EGLDisplay * to struct dri2_egl_display *.

The latter is effectively a wrapper around the former, thus
functionality was preserved, although GCC rightfully warned us about the
misuse.

Add a simple wrapper that casts and propagates the correct type.

Fixes: 9bbf3737f9 ("egl/x11: authenticate before doing chipset id
ioctls")
Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:37:25 +01:00
Chuck Atkins
a92910ae37 glx: Refactor the configure options for glx implementation choice (v3)
Instead of cascading support for various different implementations of
GLX, all three options are now specified through the --enable-glx
option:

  --enable-glx=dri          : Enable the DRI-based GLX
  --enable-glx=xlib         : Enable the classic Xlib-based GLX
  --enable-glx=gallium-xlib : Enable the gallium Xlib-based GLX
  --enable-glx[=yes]        : Defaults to dri if DRI is enabled, else
                              gallium-xlib if gallium is enabled, else
                              xlib

This removes the --enable-xlib-glx option and fixes a bug in which both
the classic xlib-glx and gallium xlib-glx implementations were getting
built causing different versioned and conflicting libGL libraries to be
installed.

v2: Changes from various review feedback from Emil:
  a) Fixed typos
  b) Corrected help docs for new option
  c) Added appropriate a-b and r-b tags in commit msg
  d) Fixed various GLX related dependency checks.
v3: Rebased to current master and added changelog in commit msg

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94086

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:37:25 +01:00
Thomas Hindoe Paaboel Andersen
cbcd7b60f5 nir/lower_double_ops: fix indentation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-30 12:16:32 -07:00
Thomas Hindoe Paaboel Andersen
21424e019d nir/opt_dead_cf: fix indentation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-30 12:16:29 -07:00
Thomas Hindoe Paaboel Andersen
6935726197 nir/opt_dead_cf: correction of side effect check
Parenthesis are needed here as ! takes precedence over the &. The
check had the opposite effect than intended.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-30 12:16:22 -07:00
Rob Clark
663c0e5155 freedreno/ir3: use pipe_debug_callback for shader-db traces
For multi-threaded shader-db support.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:20 -04:00
Rob Clark
2578e3edcb freedreno/a4xx: add debug callback to emit
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:19 -04:00
Rob Clark
51f20dd279 freedreno/a3xx: add debug callback to emit
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:19 -04:00
Rob Clark
41d288c306 freedreno: wire up core pipe_debug_callback
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:19 -04:00
Rob Clark
e04db879f8 freedreno/ir3: handle color clamp variant ourselves
Now that there is a pass to do this in NIR, lets just use that and
manage the variants ourself, rather than letting state-tracker do it.
This way, mesa/st will precompile shaders without requiring
ST_DEBUG=precompile (which requires a debug build).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:19 -04:00
Rob Clark
64abf6d404 nir: clamp-color-output support
Handled by tgsi_emulate for glsl->tgsi case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-04-30 14:56:19 -04:00
Rob Clark
482cdc4c92 freedreno: fix indentation
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:19 -04:00
Marek Olšák
53435514c1 radeonsi: fix synchronization of shader images
This fixes the winsys->cs_is_buffer_referenced query, which is used for
synchronization before buffers are mapped.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-30 19:36:16 +02:00
Samuel Pitoiset
8f2238ccba st/glsl_to_tgsi: fix potential crash when allocating temporaries
When index - t->temps_size is greater than 4096, allocating space for
temporaries on demand will miserably crash. This can happen when a game
uses a lot of temporaries like the recent released Tomb raider.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-30 17:41:32 +02:00
Kenneth Graunke
750c38fad1 glsl: Lower vector_extracts to swizzles after lower_vector_derefs.
lower_vector_derefs can produce new vector_extract operations.
Neither i965 nor st_glsl_to_tgsi can handle them, so we'd best
convert them to swizzles.

Together with the previous patch, this fixes assertion failures in
GLideN64, as well as a new Piglit test which reproduces the issue:
spec/glsl-1.10/compiler/vector-dereference-in-dereference.frag

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-29 16:03:36 -07:00
Kenneth Graunke
1cd600dbb9 glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor.
The old visitor missed some cases.  For example, it wouldn't handle
an ir_dereference_array with a vector_extract as the index.

Rather than trying to add the missing cases, just rewrite it as an
ir_rvalue_visitor.  This makes it easy to replace any expression,
and is much less code.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-29 16:03:29 -07:00
Thomas Faller
d53cf1ea4c mesa: simplify _mesa_Lightfv
Signed-off-by: Thomas Faller <tfaller1@gmx.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-29 11:08:01 -06:00
Nicolai Hähnle
aa6f88f891 gallium/radeon: fix crash in r600_set_streamout_targets
Protect against dereferencing a gap in the targets array. This was triggered
by a test in the Khronos CTS.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-29 11:55:06 -05:00
Nicolai Hähnle
98c348d26b st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor
In optimized builds, visit(ir_expression *) experiences inlining with gcc that
leads the function to have a roughly 32KB stack frame. This is a problem given
that the function is called recursively. In non-optimized builds, the stack
frame is much smaller, hence one gets crashes that happen only in optimized
builds.

Arguably there is a compiler bug or at least severe misfeature here. In any
case, the easy thing to do for now seems to be moving the bulk of the
non-recursive code into a separate function. This is sufficient to convince my
version of gcc not to blow up the stack frame of the recursive part. Just to be
sure, add the gcc-specific noinline attribute to prevent this bug from
reoccuring if inliner heuristics change.

v2: put ATTRIBUTE_NOINLINE into macros.h

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95133
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95026
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92850
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-29 11:52:59 -05:00
Nicolai Hähnle
59af21c3e9 tgsi/text: fix parsing of memory instructions
Properly handle Target and Format parameters when present.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:56 -05:00
Nicolai Hähnle
4055babc75 tgsi/text: add str_match_name_from_array
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:53 -05:00
Nicolai Hähnle
a56edbdd8f tgsi/text: add str_match_format helper function
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:51 -05:00
Nicolai Hähnle
acb65a23a3 tgsi/build: pass Memory.Texture and .Format through tgsi_build_full_instruction
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:32 -05:00
Nicolai Hähnle
318d305f6d tgsi/dump: signal nospace when the last print exceeded the size
Previously, there was a bug where nospace wasn't signalled if it just so
happened that the very last print exceeded the available space.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:28 -05:00
Nicolai Hähnle
e08eaa5b72 tgsi/dump: shared dump_ctx initialization
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-29 11:39:21 -05:00
Emil Velikov
4b1ea6910e st/omx: don't return early in vid_enc_EncodeFrame()
Earlier commit plugged a memory leak, although it missed a pair of
brackets. Thus we unconditionally returned even in the case of no error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95203
Fixes: b87856d25d ("st/omx: Fix resource leak on OMX_ErrorNone")
Tested-by: Andy Furniss <adf.lists@gmail.com>
Acked-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
What an embarassing bug - missing brackets. Andy can you confirm that it
resolves the issue ?
2016-04-29 15:36:18 +01:00
Andres Gomez
c750029b37 glsl: Checks for interpolation into its own function.
This generalizes the validation also to be done for variables inside
interface blocks, which, for some cases, was missing.

For a discussion about the additional validation cases included see
https://lists.freedesktop.org/archives/mesa-dev/2016-March/109117.html
and Khronos bug #15671.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-04-29 08:03:00 +02:00
Jason Ekstrand
6d4a426745 nir/algebraic: Support lowering for both 64 and 32-bit ldexp
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
Jason Ekstrand
f0af5b87ec nir/opcodes: Make ldexp take an explicitly 32-bit int
There is no sense in having the double version of ldexp take a 64-bit
integer.  Instead, let's just take a 32-bit int all the time.  This also
matches what GLSL does where both variants of ldexp take a regular integer
for the exponent argument.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
Jason Ekstrand
bee40dd730 nir/opcodes: Simplify the expressions for [un]pack_double
The new expressions are more explicit in terms of where the bits go so it's
a little easier to tell what's going on.  This is the way GLSL specifies
things so it's a bit easier to verify too.  It also has the benifit that
the new expressions easily vectorize so we can constant-fold vector forms
of the _split versions correctly.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
Kenneth Graunke
2655265fcb mesa: Fix indirect draw buffer size check on 32-bit systems.
Fixes dEQP-GLES31.functional subtests:
draw_indirect.negative.command_offset_not_in_buffer_signed32_wrap
draw_indirect.negative.command_offset_not_in_buffer_unsigned32_wrap

These tests use really large values that overflow GLsizeiptr, at
which point the buffer size isn't less than "end".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95138
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2016-04-28 16:31:45 -07:00
Jason Ekstrand
70f89dd75e nir: Switch the arguments to nir_foreach_def
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_def(\([^,]*\),\s*\([^,]*\))/nir_foreach_def(\2, \1)/

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
5015260a05 nir: Switch the arguments to nir_foreach_use and friends
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_use(\([^,]*\),\s*\([^,]*\))/nir_foreach_use(\2, \1)/

and similar expressions for nir_foreach_use_safe, etc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
9464d8c498 nir: Switch the arguments to nir_foreach_function
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
e63766fb4b nir: Switch the arguments to nir_foreach_parallel_copy_entry
This matches the "foreach x in container" pattern found in many other
programming languages.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
8564916d01 nir: Switch the arguments to nir_foreach_phi_src
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_phi_src(\([^,]*\),\s*\([^,]*\))/nir_foreach_phi_src(\2, \1)/

and a similar expression for nir_foreach_phi_src_safe.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
707e72f13b nir: Switch the arguments to nir_foreach_instr
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/

and similar expressions for nir_foreach_instr_safe etc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
261d62de33 anv/lower_push_constants: fixup for nir_foreach_block()
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Jason Ekstrand
bb65764a4a anv/apply_pipeline_layout: fixup for nir_foreach_block()
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Jason Ekstrand
621cbc0c14 anv/apply_dynamic_offsets: fixup for nir_foreach_block()
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
7efff10585 i965/nir: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
3a8688fb41 nir/algebraic: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1f8c100614 nir/validate: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
a471c161b1 nir/nir_worklist: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
db35177772 nir/remove_dead_variables: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
b3aaae398e nir/split_var_copies: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
9d41a1ffeb nir/repair_ssa: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
480a182ccd nir/opt_peephole_select: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
e5f37701ab nir/phi_builder: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1ba40d834b nir/opt_cp: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
8dd7d78925 nir/opt_remove_phis: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1a8c17a59e nir/opt_undef: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
52affdd2e6 nir/opt_dead_cf: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
ddc6639f85 nir/opt_dce: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
3afb3be674 nir/opt_gcm: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
eecf96f530 nir/opt_constant_folding: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
26b4c9ee15 nir/lower_samplers: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
f4ebff89e4 nir/normalize_cubemap_coords: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
492b3554a7 nir/lower_var_copies: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
c1b37c08bf nir/move_vec_src_uses_to_dest: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
ceed12557d nir/lower_vars_to_ssa: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1557344c81 nir/lower_vec_to_movs: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
b1eada04b2 nir/lower_idiv: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
2febb88e6d nir/lower_to_source_mods: fixup for new foreeach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
c81ca60b41 nir/lower_io: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
7e909972e3 nir/lower_system_values: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
76c74de456 nir/lower_phis_to_scalar: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
b89f0bb58c nir/lower_indirect_derefs: fixup for new foreach_block()
v2 (Jason Ekstrand): Use nir_foreach_block_safe

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
e3c5bda16a nir/nir_lower_global_vars: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
480d78f55b nir/lower_atomics: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
06cf73a7ba nir/lower_load_const: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
15264133d7 nir/lower_locals_to_regs: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1c6307aab4 nir/lower_gs_intrinsics: fixup for new foreach_block()
v2 (Jason Ekstrand): Use nir_foreach_block_safe

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
3bf3100794 nir/nir: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
686f247b21 nir/lower_clip: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
e36fbcfc3f nir/lower_alu_to_scalar: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
4179a56f42 nir/liveness: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
34af78edb3 nir/inline_functions: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
b23e59e172 nir/from_ssa: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
d6a6c729ca nir/dominance: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Samuel Pitoiset
9f92a8f00a nvc0: stick compute kernel arguments into uniform_bo
Having one buffer object for input kernel arguments coming from clover
and an other one for OpenGL user uniforms is unnecessary. Using the
uniform_bo object for both GL/CL uniforms avoids to declare a new BO.

This only affects compute programs but it should not hurt anything
because the states are dirtied and data will get reuploaded.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-29 00:44:08 +02:00
Tim Rowley
124a5d4ca0 swr: remove duplicated constant update code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-28 16:16:46 -05:00
Marek Olšák
1a8c2ccb24 gallium/radeon: add the size only once in r600_context_add_resource_size
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-04-28 21:06:31 +02:00
Bas Nieuwenhuizen
8e43bc0eb6 winsys/radeon: enlarge buffer_indices_hashlist
Enlarge the buffer hashlist to prevent large numbers of misses
due to adding more buffers than can be cached in the hashlist.

Ported from winsys/amdgpu: 6373845d98

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-04-28 21:06:31 +02:00
Marek Olšák
92f6af2c4a gallium/radeon: drop support for LINEAR_GENERAL layout
Unused. All texture imports use LINEAR_ALIGNED regardless of what
the DDX does.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-04-28 20:16:56 +02:00
Marek Olšák
f564b61d33 radeonsi: rework clear_buffer flags
Changes:
- don't flush DB for fast color clears
- don't flush any caches for initial clears
- remove the flag from si_copy_buffer, always assume shader coherency

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 20:16:56 +02:00
Jason Ekstrand
d273ce5259 anv/dynamic_offsets: Fix the order of arguments to nir_build_imm 2016-04-28 11:05:56 -07:00
Jason Ekstrand
6028a67641 anv: Fix a build error caused by recent fp64 NIR changes 2016-04-28 10:13:42 -07:00
Jose Fonseca
99474dc29b nir: Try to warn when C99 extensions are used in nir headers.
Ideally we'd have nir.h being included with -Wpedantic too, but it fails
with:

src/compiler/nir/nir.h:754:20: warning: ISO C++ forbids zero-size array ‘src’ [-Wpedantic]
    nir_alu_src src[];
                    ^
In file included from src/compiler/nir/glsl_to_nir.cpp:42:0:
src/compiler/nir/nir.h:919:16: warning: ISO C++ forbids zero-size array ‘src’ [-Wpedantic]
    nir_src src[];

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-04-28 16:48:13 +01:00
Jose Fonseca
e7438009af nir: Remove spurious ; after nir_builder functions.
Makes -pedantic happy.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-04-28 16:48:12 +01:00
Jose Fonseca
caa5937ebb nir: Remove spurious ; after namespace.
Makes -pedantic happy.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-04-28 16:48:12 +01:00
Jose Fonseca
f7854d8227 nir: Avoid C99 field initializers.
As they are not standard C++ and are not supported by MSVC C++ compiler.

Just have nir_imm_double match nir_imm_float above.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2016-04-28 16:48:12 +01:00
Brian Paul
a609da60c0 gallium/util: s/Elements/ARRAY_SIZE/
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 09:04:24 -06:00
Brian Paul
f365488eaa mesa: improve comment on _mesa_check_disallowed_mapping(), return bool
The old comment was a bit terse.  Also, change the function return
type to bool.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 09:04:17 -06:00
Marek Olšák
7e7710a068 radeonsi: remove needless cache flushes at the end of CP DMA operations
not needed AFAIK

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 12:46:47 +02:00
Marek Olšák
7d49b459b6 radeonsi: remove flushes at the beginning and end of IBs done by the kernel
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 12:46:47 +02:00
Samuel Iglesias Gonsálvez
db07b46f2c nir: Add lrp lowering for doubles in opt_algebraic
Some hardware (i965 on Broadwell generation, for example) does not support
natively the execution of lrp instruction with double arguments.

Add 'lower_flrp64' flag to lower this instruction in that case.

v2:
   - Rename lower_flrp_double to lower_flrp64 (Jason)
   - Fix typo (Jason)
   - Adapt the code to define bit_size information in the opcodes.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 12:01:40 +02:00
Samuel Iglesias Gonsálvez
443600d51e nir: rename lower_flrp to lower_flrp32
A later patch will add lower_flrp64 option to NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 12:01:40 +02:00
Iago Toral Quiroga
072613b3f3 nir/lower_double_ops: lower round_even()
At least i965 hardware does not have native support for round_even() on doubles.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-28 12:01:40 +02:00
Iago Toral Quiroga
bf91df7f7f nir/lower_double_ops: lower fract()
At least i965 hardware does not have native support for fract() on doubles.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 12:01:40 +02:00
Iago Toral Quiroga
126a1ac03f nir/lower_double_ops: lower ceil()
At least i965 hardware does not have native support for ceil on doubles.

v2 (Sam):
   - Improve the lowering pass to remove one bcsel (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 12:01:36 +02:00
Iago Toral Quiroga
29541ec531 nir/lower_double_ops: lower floor()
At least i965 hardware does not have native support for floor on doubles.

v2 (Sam):
  - Improve the lowering pass to remove one bcsel (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:35 +02:00
Iago Toral Quiroga
5fab3d178b nir/lower_double_ops: lower trunc()
At least i965 hardware does not have native support for truncating doubles.

v2:
  - Simplified the implementation significantly.
  - Fixed the else branch, that was not doing what we wanted.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Connor Abbott
2ea3649c63 nir: add a pass to lower some double operations
v2: Move to compiler/nir (Iago)
v3: Use nir_imm_int() to load the constants (Sam)
v4 (Sam):
  - Undo line-wrap (Jason).
  - Fix comment (Jason).
  - Improve generated code for get_signed_inf() function (Connor).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Connor Abbott
2cf3b28884 nir/builder: add nir_imm_double()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Samuel Iglesias Gonsálvez
3a150683ce nir/builder: Add bit_size info to nir_build_imm()
v2:
- Group num_components and bit_size together (Jason)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 11:58:13 +02:00
Jakob Sinclair
76b8c5cc60 radeonsi: check if value is negative
Fixes a Coverity defect by adding checks to see if a value is negative
before using it to index an array. By checking the value first it makes
the code a bit safer but overall should not have a big impact.

CID: 1355598

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-04-28 11:33:38 +02:00
Michel Dänzer
860210ccfc clover: Fix build against clang SVN >= r267772
(Re-pushing previous fix for clang SVN r265359, which was reverted in
the meantime)

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-28 12:57:03 +09:00
Lars Hamre
32cb7d61a9 glsl: fix lowering outputs for early/nested returns
Return statements in conditional blocks were not having their
output varyings lowered correctly.

This patch fixes the following piglit tests:
/spec/glsl-1.10/execution/vs-float-main-return
/spec/glsl-1.10/execution/vs-vec2-main-return
/spec/glsl-1.10/execution/vs-vec3-main-return

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-28 11:01:51 +10:00
Connor Abbott
122d27e998 nir: rewrite nir_foreach_block and friends
Previously, these were functions which took a callback. This meant that
the per-block code had to be in a separate function, and all the data
that you wanted to pass in had to be a single void *. They walked the
control flow tree recursively, doing a depth-first search, and called
the callback in a preorder, matching the order of the original source
code. But since each node in the control flow tree has a pointer to its
parent, we can implement a "get-next" and "get-previous" method that
does the same thing that the recursive function did with no state at
all. This lets us rewrite nir_foreach_block() as a simple for loop,
which lets us greatly simplify its users in some cases. This does
require us to rewrite every user, although the transformation from the
old nir_foreach_block() to the new nir_foreach_block() is mostly
trivial.

One subtlety, though, is that the new nir_foreach_block() won't handle
the case where the current block is deleted, which the old one could.
There's a new nir_foreach_block_safe() which implements the standard
trick for solving this. Most users don't modify control flow, though, so
they won't need it. Right now, only opt_select_peephole needs it.

The old functions are reimplemented in terms of the new macros, although
they'll go away after everything is converted.

v2: keep an implementation of the old functions around
v3 (Jason Ekstrand): A small cosmetic change and a bugfix in the loop
   handling of nir_cf_node_cf_tree_last().
v4 (Jason Ekstrand): Use the _safe macro in foreach_block_reverse_call

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 15:05:40 -07:00
Connor Abbott
958300137f nir/opt_cp: use nir_block_get_following_if()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 15:05:34 -07:00
Jordan Justen
aaaa22c775 vbo: Return INVALID_OPERATION during draw with a mapped buffer
Fixes the OpenGLES 3.1 CTS:
 * ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos

Because this is triggering the error message after the normal API
validation phase, we don't have the API function name available, and
therefore we generate an error message without the draw call name:

Mesa: User error: GL_INVALID_OPERATION in draw call (vertex buffers are mapped)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95142
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 14:30:06 -07:00
Nanley Chery
28d0bc72fb anv/formats: Return proper error code for unsupported formats
Fixes some failures in dEQP-VK.api.info.image_format_properties.* and
enables the test group to execute without assert failing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 11:28:30 -07:00
Nanley Chery
5f7e8eac42 anv/device: Set the compressed texture feature flags correctly
Sampling from an ETC2 texture is supported on Bay Trail and
from Gen8 onwards. While ASTC_LDR is supported on Gen9, the
logic to handle such formats has not yet been implemented in
the driver.

Fixes dEQP-VK.api.info.format_properties.compressed_formats.

v2: Enable ETC2 for Bay Trail (Kenneth Graunke)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-27 11:28:30 -07:00
Jason Ekstrand
e0806930ad nir/algebraic: Add a bit-size validator
This commit adds a validator that ensures that all expressions passed
through nir_algebraic are 100% non-ambiguous as far as bit-sizes are
concerned.  This way it's a compile-time error rather than a hard-to-trace
C exception some time later.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
8a3e344180 nir/opt_algebraic: Fix some expressions with ambiguous bit sizes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
7e0ee3a38b nir/search: Respect the bit_size parameter on nir_search_value
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
fcc1c8a437 nir/algebraic: Add a mechanism for specifying the bit size of a value
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
cafb885e45 nir/algebraic: Use "uint" instead of "unsigned" for uint types
This is consistent with the rename done for the rest of NIR.  Currently,
"bool" is the only type specifier used in nir_opt_algebraic.py so this is
really a no-op.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
736ee0bef7 nir/algebraic: Do better error reporting of bad expressions
Previously, if an exception was encountered anywhere, nir_algebraic would
just die in a fire with no indication whatsoever as to where the actual bug
is.  This commit makes it print out the particular search-and-replace
expression that is causing problems along with the exception.  Also, it
will now report all of the errors it finds and then exit at the end like a
standard C compiler would do.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Alejandro Piñeiro
b1dcedf393 isl: move -lm at the end of tests_ldadd
The test was failing to build with "undefined reference to `roundf'" errors,
so Make check on mesa was failing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 20:14:56 +02:00
Topi Pohjolainen
aef6a6c382 i965/blorp/gen8: Fix blitting of interleaved msaa surfaces
Fixes ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample.

Current logic divides given layer of one by number of samples (four)
trashing the layer to zero. Layer adjustment is only to be used with
non-interleaved msaa surfaces where samples for particular layer are
in multiple slices.

I copy-pasted a bit of documentation from
brw_blorp.c::brw_blorp_compute_tile_offsets().

Also took the opportunity to fix the comment regarding sampling
as 2D, cube textures are the only exception.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-27 19:57:40 +03:00
Brian Paul
1d242b6882 llvmpipe: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
23c55e5c23 tgsi: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
419e386571 os: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
d902504a67 hud: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
e522a76226 gallivm: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
489df4a71a draw: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
f93802c465 softpipe: s/Elements/ARRAY_SIZE/
Try to standardize on the later, which is defined in the common util/
directory.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Nicolai Hähnle
562c4a17b7 winsys/radeon: remove use_reusable_pool parameter from buffer_create
All callers set this parameter to true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle
13acf2b243 gallium/radeon: remove use_reusable_pool parameter from r600_init_resource
All callers set it to true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle
c868974396 radeon/video: always use the reusable buffer pool
A semantic error was introduced in a past refactoring that caused the bind
parameter to be passed into the use_reusable_pool parameter of buffer_create.
Since this clearly makes no sense, and there is no clear reason why the
cache _shouldn't_ be used, just use the cache always.

Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle
8c43c06e04 radeonsi: work around an MSAA fast stencil clear problem
A piglit test (arb_texture_multisample-stencil-clear) has been sent.
This problem was discovered analyzing

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
7a215a3e27 radeonsi: expclear must be disabled on first Z/S clear
The documentation and the HW team say so.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
01a3bb5d8b radeonsi: move blend choice out of loop in si_blit_decompress_color
It does not depend on the level or layer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
450ff0f0d5 radeonsi: use level mask for early out in si_blit_decompress_color
Mostly for consistency with the other decompress functions, but note that
in the non-DCC decompress case, the function can now early-out in slightly
more (albeit probably rare) cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
0ff05b55c6 radeonsi: si_blit_decompress_depth is only used for staging
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
0b70fc2db4 radeonsi: only decompress the required ZS planes from si_blit
This happens to "fix" a rendering bug in KotOR2, because it avoids a still
not quite understood bug with MSAA fast stencil clear decompress. For the
stencil clear bug, I have sent a piglit test (arb_texture_multisample-stencil-clear).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
def53a0b3d radeonsi: decompress Z & S planes in one pass
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
dc6fc2f390 radeonsi: early out of si_blit_decompress_depth_in_place based on dirty mask
Avoid dirtying the db_render_state atom when possible.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
d14d6c3f58 radeonsi: use MIN2 instead of expanded ?: operator
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
159f182a57 radeonsi: fix brace style
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
91fb4bb2e9 gallium/util: add u_bit_consecutive for generating a consecutive range of bits
There are some undefined behavior subtleties, so having a function to match
the u_bit_scan_consecutive_range makes sense.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Tim Rowley
504df3a1d7 swr: s/Elements/ARRAY_SIZE/
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 11:07:34 -05:00
Nicolai Hähnle
836cab51c8 radeonsi: emit s_waitcnt for shader memory barriers and volatile
Turns out that this is needed after all to satisfy some strengthened
coherency tests. Depends on support in LLVM, added in r267729.

v2: updated to reflect changes to the LLVM intrinsic

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-27 10:54:05 -05:00
Tim Rowley
e7201bd31b swr: [rasterizer] warning cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:54 -05:00
Tim Rowley
24f23817d2 swr: [rasterizer core] implement legacy depth bias enable
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:45 -05:00
Tim Rowley
fa36f8ec9c swr: [rasterizer jitter] support for dumping x86 asm
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:32 -05:00
Tim Rowley
a646ffdacf swr: [rasterizer core] more backend refactoring
BackendPixelRate should be easier to read/maintain now hopefully.

Small perf bump by moving some of the pfn's to inline functions
without template params.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:21 -05:00
Tim Rowley
8e815ff72c swr: [rasterizer jitter] add mSimdInt1Ty
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:12 -05:00
Tim Rowley
4e1e0b3a32 swr: [rasterizer core] backend refactor
Lump all template args into a bundle of traits, and add some
functionality to the MSAA traits.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:40:44 -05:00
Brian Paul
43f46caf76 svga: use the SVGA3D_DEVCAP_MAX_FRAGMENT_SHADER_INSTRUCTIONS query
Instead of a hard-coded 512.  The query typically returns 65536 now.
Fall back to 512 if the query fails as we do for vertex shaders (which
should never happen).

Note that we don't actually enforce this limit in our shaders but it gets
reported via the glGetProgramivARB(GL_MAX_PROGRAM_INSTRUCTIONS_ARB) query.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-04-27 08:43:33 -06:00
Hans de Goede
b5e7907f30 nouveau: codegen: LOAD: Take src swizzle into account
The llvm TGSI backend uses pointers in registers and does things
like:

LOAD TEMP[0].y, MEMORY[0], TEMP[0]

Expecting the data at address TEMP[0].x to get loaded to
TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be
loaded instead.

This commit adds support for a swizzle suffix for the 1st source
operand, which allows using:

LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]

And actually getting the desired behavior

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-27 16:11:48 +02:00
Hans de Goede
90f45357ab nouveau: codegen: LOAD: Do not call fetchSrc(1) if the address is immediate
"off" later gets set to NULL when the address is immediate, so move the
fetchSrc(1) call to the non-immediate branch of the if-else. This brings
handleLOAD's offset handling inline with how it is done in handleSTORE.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-27 16:11:48 +02:00
Hans de Goede
1958397a58 nouveau: codegen: LOAD: Always use component 0 when getting the address
LOAD loads upto 4 components from the specified resource starting at
the passed in x value of the 2nd source operand, the y, z and w
components of the address should not be used.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-27 16:11:48 +02:00
Stefan Dirsch
7d25ed7036 dri3: Check for dummyContext to see if the glx_context is valid
According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext()
always returns a valid pointer. If no context is made current, it will
contain dummyContext. Thus a test for NULL will always fail.

https://lists.freedesktop.org/archives/mesa-dev/2016-April/113962.html

Signed-off-by: Stefan Dirsch <sndirsch@suse.de>
Reviewed-by: Egbert Eich <eich@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-27 13:03:34 +01:00
Egbert Eich
4d9b518ad2 dri2: Check for dummyContext to see if the glx_context is valid
According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext()
always returns a valid pointer. If no context is made current, it will
contain dummyContext. Thus a test for NULL will always fail.

https://bugzilla.opensuse.org/show_bug.cgi?id=962609

Tested-by: Olaf Hering <ohering@suse.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-27 13:03:11 +01:00
Timothy Arceri
6d1a59d15b glsl: move uniform block validation to link_uniform_blocks.cpp
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-27 16:17:47 +10:00
Kenneth Graunke
73ada723f0 docs: Mention that {ARB,OES}_texture_stencil8 is supported on i965/gen8+
Thanks to Thomas Helland for reminding me to do this.
2016-04-26 21:32:35 -07:00
Kenneth Graunke
fd9a7d8f30 i965: Enable ARB_texture_stencil8 and OES_texture_stencil8 on Gen8+.
Stencil texturing is required by ES 3.1.  Apparently we never actually
turned it on.  Do that now.  Also turn on the desktop extension.

Fixes nine dEQP-GLES31.functional tests:

stencil_texturing.format.stencil_index8_2d
texture.border_clamp.formats.stencil_index8.nearest_size_pot
texture.border_clamp.formats.stencil_index8.nearest_size_npot
texture.border_clamp.formats.stencil_index8.gather_size_pot
texture.border_clamp.formats.stencil_index8.gather_size_npot
texture.border_clamp.unused_channels.stencil_index8
state_query.internal_format.renderbuffer.stencil_index8_samples
state_query.internal_format.texture_2d_multisample.stencil_index8_samples
state_query.internal_format.texture_2d_multisample_array.stencil_index8_samples

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2016-04-26 21:32:35 -07:00
Kenneth Graunke
12c43a355c mesa: Try to fix CopyTex[Sub]Image of stencil textures.
ES prohibits this, but GL appears to allow it.  We at least need this
much, or else we'll crash as there's no source to read from.

This fixed crashes in the ES tests before I realized I needed to
prohibit stencil instead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2016-04-26 21:32:35 -07:00
Kenneth Graunke
027c6c1222 mesa: Disallow CopyTexSubImage on stencil formats in ES.
Fixes
- ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8
- ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2016-04-26 21:32:35 -07:00
Kenneth Graunke
1e44599a43 i965: Fix MapTextureImage for multi-slice/level stencil buffers.
We called intel_miptree_get_image_offset() to get the image offsets
for the current level/slice, but then proceeded to ignore the results
and clobber level/slice 0 every time.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94713
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2016-04-26 21:32:35 -07:00
Kenneth Graunke
361a24e140 i965: Move TCS output indirect_offset.file check out a level.
I want to add another condition.  Moving the indirect_offset.file
check out a level should make this a little easier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-26 19:59:56 -07:00
Kenneth Graunke
13195f7ef8 i965/fs: Reduce the response length of sampler messages on Skylake.
Often, we don't need a full 4 channels worth of data from the sampler.
For example, depth comparisons and red textures only return one value.
To handle this, the sampler message header contains a mask which can
be used to disable channels, and reduce the message length (in SIMD16
mode on all hardware, and SIMD8 mode on Broadwell and later).

We've never used it before, since it required setting up a message
header.  This meant trading a smaller response length for a larger
message length and additional MOVs to set it up.

However, Skylake introduces a terrific new feature: for headerless
messages, you can simply reduce the response length, and it makes
the implicit header contain an appropriate mask.  So to read only
RG, you would simply set the message length to 2 or 4 (SIMD8/16).

This means we can finally take advantage of this at no cost.

total instructions in shared programs: 9091831 -> 9073067 (-0.21%)
instructions in affected programs: 191370 -> 172606 (-9.81%)
helped: 2609
HURT: 0

total cycles in shared programs: 70868114 -> 68454752 (-3.41%)
cycles in affected programs: 35841154 -> 33427792 (-6.73%)
helped: 16357
HURT: 8188

total spills in shared programs: 3492 -> 1707 (-51.12%)
spills in affected programs: 2749 -> 964 (-64.93%)
helped: 74
HURT: 0

total fills in shared programs: 4266 -> 2647 (-37.95%)
fills in affected programs: 3029 -> 1410 (-53.45%)
helped: 74
HURT: 0

LOST:   1
GAINED: 143

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-26 19:55:04 -07:00
Jason Ekstrand
d800b7daa5 nir: Add a helper for figuring out what channels of an SSA def are read
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-26 19:55:04 -07:00
Jason Ekstrand
acc2f1fe36 i965/fs: Use inst->regs_written for rlen for texture instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-26 19:55:04 -07:00
Jason Ekstrand
c7a09c0571 i965/fs: Properly report regs_written from SAMPLEINFO
The previous behavior would only allocate one register and then write
four thus potentially stomping three innocent bystanders.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-26 19:55:04 -07:00
Jason Ekstrand
30b37e4e9b i965/blorp: Set regs_written on texturing instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-26 19:55:04 -07:00
Kenneth Graunke
0bd956b34b i965: Don't force a header for texture offsets of 0.
Calling textureOffset() with an offset of <0, 0, 0> is equivalent to
calliing texture().  We don't actually need to set up an offset,
which causes a message header to be created.

A fairly common pattern is to sample at a point with a bunch of
offsets, and average them.  It's natural to write all the lookups
as textureOffset, but use <0, 0> for the center sample.

shader-db results on Skylake:

total instructions in shared programs: 9092095 -> 9092087 (-0.00%)
instructions in affected programs: 2826 -> 2818 (-0.28%)
helped: 12
HURT: 2

total cycles in shared programs: 70870166 -> 70870144 (-0.00%)
cycles in affected programs: 15924 -> 15902 (-0.14%)
helped: 2
HURT: 0

This also helps prevent code quality regressions in a future patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by Jason Ekstrand <jason@jlekstrand.net>
2016-04-26 19:55:04 -07:00
Patrick Rudolph
fb5d38e219 r600g: fix and optimize tgsi_cmp when using ABS and NEG modifier
Some apps set NEG and ABS on the source param to test for zero.
Use ALU_OP3_CNDE insted of ALU_OP3_CNDGE and unset both modifiers.

It also removes the need for a MOV instruction, as ABS isn't
supported on op3.

Tested on AMD CAYMAN and AMD RV770.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 12:48:50 +10:00
Dave Airlie
7aa3a93656 docs: update softpipe for ARB_compute_shader
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:01:12 +10:00
Dave Airlie
e749c30ceb softpipe: add support for compute shaders. (v2)
This enables ARB_compute_shader on softpipe. I've only
tested this with piglit so far, and I hopefully plan
on integrating it with my vulkan work. I'll get to
testing it with deqp more later.

The basic premise is to create up to 1024 restartable
TGSI machines, and execute workgroups of those machines.

v1.1: free machines.
v2: deqp fixes - add samplers support, finish
atomic operations, fix load/store writemasks.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:01:03 +10:00
Dave Airlie
f78bcb7638 tgsi/exec: initialise SysSemanticToIndex array to -1
We want to use the SysSemanticToIndex to tell if we've seen
the semantics at all.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:46 +10:00
Dave Airlie
fbea4e177f tgsi/exec: implement restartable machine.
This lets us restart the machine at a PC value, and exits
the machine when we hit a barrier.

Compute shaders will then execute all the threads up to the
barrier, then restart the machines after the barrier once
all are done.

v2: comment the code a bit, change return types.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:44 +10:00
Dave Airlie
8ffa3c58d4 tgsi/exec: make inputs/outputs optional for compute shaders.
compute shaders don't need input/outputs so don't bother
allocating memory for these.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:41 +10:00
Dave Airlie
16a9dc1e49 tgsi/exec: implement load/store/atomic on MEMORY.
This implements basic load/store/atomic ops on MEMORY types
for compute shaders.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 09:00:35 +10:00
Dave Airlie
354c5f2d0f tgsi/exec: split out setting up masks to separate function
This is just a cleanup that will make later changes easier
to make.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:56:22 +10:00
Dave Airlie
6cf36a7231 tgsi: accept a starting PC value for exec machine.
This will be used later to restart barriered execution
threads in compute, for now we just want to change the API.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:56:17 +10:00
Dave Airlie
912ed84f83 tgsi: move to using vector for system values.
For compute support some of the system values are .xyz types,
so move to using a vector instead of a single channel.

[airlied: squash swizzle fix from compute series].

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:26:53 +10:00
Dave Airlie
9013d9267c tgsi/exec: fix system value handling.
a) SysSemanticToIndex needs to be indexed with the semantic name
not the decl->Declaration.Semantic.

b) doing this in run is too late, as the mappings are all setup
prior to run in the execs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-27 08:25:38 +10:00
Jason Ekstrand
4040fff81d i965/blorp: Convert state setup to C
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
71775afe6e i965/blorp: Make state setup C-safe
Previously they (very rarely) used C++isms that prevented them from being
compiled as C.  As of this commit, they can be compiled as either C or C++.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
bed74299c2 i965/blorp: Convert brw_blorp.cpp to a C file
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
0551f3dfa4 i965/blorp: Make all of brw_blorp.h accessible to C
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
b3f08b5424 i965/blorp: Turn brw_blorp_params into a C-style struct
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
33fa12c50f i965/blorp: Turn coord_transform into a C-style struct
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
b6dd8e42f0 i965/blorp: Turn blorp_surface_info into a C-style struct
This commit is mostly mechanical except that it changes where we set the
swizzle.  Previously, the blorp_surface_info constructor defaulted the
swizzle to SWIZZLE_XYZW.  Now, we memset to zero and fill out the swizzle
when we setup the rest of the struct.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
a543f741bf i965/blorp: Roll mip_info into surface_info
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
3839936497 i965/blorp: Get rid of the blorp_blit_params class
It was really just a wrapper around the function that constructed it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
8096ed7e27 i965/blorp: Remove the hiz params class
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
e35d9407dc i965/blorp: Remove the clear params classes
They didn't really add anything other than a key and extra layers of
function calls.  This commit just inlines the extra functions and gets rid
of the extra classes.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
659400cba3 i965/blorp: Remove the arguments to brw_blorp_params()
No one was using anything other than the defaults.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Jason Ekstrand
2dda4ff014 i965/blorp: Refactor to get rid of the get_wm_prog virtual function
Instead of having a virtual member function for getting the WM/PS kernel,
we simply add fields for prog_data and the kernel to brw_blorp_parms and
always make sure those get set as part of the different constructors.

v2: Use use prog_data != NULL to check for a valid program instead of a
    magic kernel offset value

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-26 14:55:22 -07:00
Tim Rowley
18d1658633 swr: autogenerate swr_context_llvm.h
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-26 16:45:26 -05:00
Laurent Carlier
12cf08fcc3 anv: honor DESTDIR when installing icd file
https://bugs.freedesktop.org/show_bug.cgi?id=94969

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:57:54 -07:00
Juha-Pekka Heikkila
ec5f7fc7bd i965/meta: initialize values to avoid random behaviour on error path
if brw_meta_stencil_blit() errored at wrong place 'target' would
be uninitialized and cause random behaviour on leaving the funtion.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:54:29 -07:00
Juha-Pekka Heikkila
51632d6f27 meta: Avoid random memory access on error
Initialize drawFb to NULL in _mesa_meta_CopyImageSubData_uncompressed()
if getting readFb fails uninitialized drawFb will cause randomness
on cleanup.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:54:02 -07:00
Grazvydas Ignotas
cea3a7e615 mesa: add tags file to gitignore
For ctags users like me.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:49:27 -07:00
Jakob Sinclair
dda50af9c4 mesa: Remove every double semi-colon
Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jakob Sinclair
e5d027ec7d glx: Remove every double semi-colon
Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jakob Sinclair
ea327dc451 gallium: Remove every double semi-colon
Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jakob Sinclair
de743a07ac egl: Remove every double semi-colon
Removes all accidental semi-colons in egl.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jakob Sinclair
e129e6eb89 gallium/r600: removing double semi-colons
Trivial change. Removing unnecessary semi-colons from the code.
I don't have push access so someone reviewing this can push it.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jakob Sinclair
12da8bb5f4 mesa/main: removing double semi-colons
Trivial change. Removing unnecessary semi-colons from the code.
I don't have push access so someone reviewing this can push it.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jakob Sinclair
09e4ac00ac glsl: removing double semi-colons
Trivial change. Removing unnecessary semi-colons from the code.
I don't have push access so someone reviewing this can push it.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-26 14:36:29 -07:00
Jose Fonseca
52c7443932 glx: Don't enclose includes inside extern "C" { }.
Ran `make check` inside src/glx to verify everything compiles and links
correctly.

https://bugs.freedesktop.org/show_bug.cgi?id=95158

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 21:28:34 +01:00
Marek Olšák
80e5fb60b4 radeonsi: add RW_BUFFERS only once in si_ce_needed_cs_space
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-26 21:37:07 +02:00
Marek Olšák
2b4b5ebfcf egl: fix make check broken by interop support 2016-04-26 21:37:07 +02:00
Samuel Pitoiset
e64ee4cf60 docs: mark ARB_compute_shader as done for nvc0
This has been merged few months ago but this should help
https://mesamatrix.net/ to update its list of supported extensions.

Please note that compute shaders are not really useful without
ARB_image_load_store and only GK104 and GK110 support it for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-04-26 21:10:10 +02:00
Samuel Pitoiset
5c429f88d9 nvc0: expose GLSL version 420 on GK110
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
a0e777f6a1 nvc0: enable ARB_shader_image_load_store on GK110
This exposes 8 images for all shader types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
2daaa5d657 gk110/ir: add emission for VSHL
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
af5925209d gk110/ir: add emission for OP_SUEAU, OP_SUBFM and OP_SUCLAMP
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
1f8900a8e0 gk110/ir: add emission for OP_SULDB and OP_SUSTx
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
fddd8523d4 gk110/ir: add emission for OP_MADSP
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
c2ce22ca46 gk110/ir: add emission for OP_PERMT
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
222d1a1bff nvc0: expose GLSL version 420 on GK104
Other chipsets will be added later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Ilia Mirkin
9e367ed480 nvc0: enable ARB_shader_image_load_store on GK104
This exposes 8 images for all shader types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
0d64d39e81 nvc0: inform users that 3D images are not fully supported
3D images are a bit more complicated to implement and will probably
requires a bunch of headaches and we don't care for now because they
do not seem to be really used by apps.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
fdbb476829 nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+
The blob sets it to 2048 and using 4096 reports an INVALID_DATA error
with RT_ARRAY_MODE when z is 4096. Suggested by Ilia Mirkin.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
6fc6d548ed nvc0/ir: check that the image format doesn't mismatch
This re-uses NVE4_SU_INFO_CALL which is not used anymore because we
don't use our lib for format conversions. While we are at it, add a
todo for image buffers because there are some robustness-related
issues to fix.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
fbeb69757c nvc0/ir: prevent out of bounds when no images are bound
Checking if the image address is not 0 should be enough to prevent
read faults. To improve robustness, make sure that the destination
value of atomic operations is correctly initialized in case the
instruction is not performed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
5ba5714483 nvc0/ir: add indirect support for images on Kepler
This fixes arb_shader_image_load_store-indexing and
arb_shader_image_load_store-max-images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
8b540db44c nvc0/ir: fix 1D arrays images for Kepler
For 1D arrays, the array index is stored in the Z component.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
e478156ed7 nvc0/ir: fix cube images for Kepler
Like 2d array images, the z-dimension needs to be clamped.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Ilia Mirkin
3ce80f924d nv50/ir: add support for SULDP -> SULDB conversion
This will allow to convert surface formats without adding an extra
call to our lib.

[hakzsam: make use of this for GK104]

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
d64ea4e48e nv50/ir: make use of OP_SUQ for surfaces query
This implements RESQ for surfaces which comes from imageSize() GLSL
bultin. As the dimensions are sticked into the driver constant buffer,
this only has to be lowered with loads.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v2)
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
7c47db359e nv50/ir: add OP_BUFQ for buffers query
TGSI RESQ allows both images and buffers but we have to make a
distinction between these two type of resources in our lowering pass.
Introducing OP_BUFQ which is a fake operand will allow to implement
OP_SUQ for surfaces.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
e09434047d nv50/ir: enable early fragment test with explicit user control
This feature can be enabled in two ways: as an optimization and by
explicit user control (with OpenGL 4.2 or ARB_shader_image_load_store).

This makes use of the recent TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL to
force early fragment tests when needed.

This fixes a bunch of
dEQP-GLES31.functional.image_load_store.early_fragment_tests.* tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
08f4faa542 nvc0/ir: fix constraints for OP_SUSTx on Kepler
Destination type is actually always 32-bits, so typeSizeof() returns 4
and no sources are condensed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
119d087758 nv50/ir: re-introduce TGSI lowering pass for images
This is loosely based on the previous lowering pass wrote by calim
four years ago. I did clean the code and fixed some issues.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
76ea143c38 nv50/ir: add support for TGSI image declarations
Old and dead resource code will be removed once images are completely
done. Based on original patch by Ilia Mirkin.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
1fb3cd2489 nvc0: add missing glMemoryBarrier bits
This fixes a bunch of subtests of
arb_shader_image_load_store-host-mem-barrier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
9bc18a48f3 nvc0: enable RGB10_A2UI format on GK104
No clue why this was not enabled by default before, maybe because
the SULDP conversion was wrong. Anyway, this helps in fixing all
rgb10_a2ui piglit tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
da8171dc75 nvc0: shift address with blocksize for image buffers
This fixes a bunch of dEQP image buffers related tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
285f2edd14 nvc0: fix address offset when images have multiple levels
This fixes arb_shader_image_load_store-level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
e28f247e24 nvc0: bind images on 3D shaders for Kepler
Similar to surfaces validation for compute shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
1eca4c51a2 nvc0: bind images on compute shaders for Kepler
Old surfaces validation code will be removed once images are completely
done for Fermi/Kepler, that explains why I only disable it for now.

This also introduces nvc0_get_surface_dims() which computes correct
dimensions regarding the given target.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
c6b3c346d1 nvc0: reserve an area for surfaces info in the driver constbuf
To process surfaces coordinates from the codegen part, and because
some information like the format is not always available (eg. when
writeonly is used), we have to stick some surfaces data in the
driver constbuf. This is especially true for OpenCL because we don't
know the format at shader compile time.

This bumps the size of each shader area from 1K to 2K.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
afa04785fa nvc0: add preliminary support for images
This implements set_shader_images() and resource invalidation for
images. As OpenGL requires at least 8 images, we are going to expose
this minimum value even if this might be raised for Kepler, but this
limit is mainly for Fermi because the hardware only accepts 8 images.

Based on original patch by Ilia Mirkin.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
c62b1b92f7 gk110/ir: add emission for (a OP b) OP c
This is pretty similar to NVC0 except that offsets have changed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-26 19:47:49 +02:00
Samuel Pitoiset
3da8528846 nvc0/ir: fix wrong emission of (a OP b) OP c
The third source must be emitted at offset 49 instead of 17 and the
not modifier is at 52 instead of 20. If you look a bit above in
emitLogicOp() you will see that the dest is emitted at 17 which
confirms that src(2) is obviously wrong.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-26 19:47:49 +02:00
Jose Fonseca
a2fe35bcdf scons: Support Clang on Windows.
- Introduce 'gcc_compat' env flag, for all compilers that define __GNUC__,
  (which includes Clang when it's not emulating MSVC.)

- Clang doesn't support whole program optimization

- Disable enumerator value warnings (not sure why Clang warns about them,
  as my understanding is that MSVC promotes enums to unsigned ints
  automatically.)

This is not enough to build with Clang + AddressSanitizer though.  More
follow up changes will be required for that.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 17:17:00 +01:00
Jose Fonseca
dcc3baf733 gallium: Include intrin.h instead of defining ourselves.
More portable, particularly when building with Clang, which implements
all MSVC intrisincs in its own intrin.h, but doesn't actually support
`#pragma instrinsic`.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 17:17:00 +01:00
Jose Fonseca
9a25c8af1b scons: Whenever possible decide what to do based on platform and not compiler.
Because compilers like GCC and Clang are effectively available everywhere
so their presence/absence is seldom conclusive.

Furthermore, all compilers we use now have stdint.h.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 17:17:00 +01:00
Jose Fonseca
c068610a7d scons: Move fallback HAVE_* definitions to headers.
These were being defined in SCons, but it's not practical:

- we actually need to include Gallium headers from external source trees, with
completely disjoint build infrastructure, and it's unsustainable to
replicate the HAVE_xxx checks or even hard-coded defines across
everywhere.

- checking compiler version via command line doesn't really work due to
  Clang essentially being like a cameleon which can fake either GCC or
  MSVC

There's no change for autoconf.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-26 17:17:00 +01:00
Juha-Pekka Heikkila
940da2ce0e nir: Add missing break into switch in construct_value()
There seemed to be missing one break in nested switchcases.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-26 17:45:56 +02:00
Bas Nieuwenhuizen
31631d8515 radeonsi: Fix memory leak in error path.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-26 15:41:19 +02:00
Oded Gabbay
514c5b5f4b radeonsi: fix build error because of missing param
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-26 13:48:43 +03:00
Oded Gabbay
965175aba3 r600g: use do_endian_swap in texture swapping function
For some texture formats we need to take "do_endian_swap" into account
when configuring their swizzling.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-26 11:00:16 +03:00
Oded Gabbay
c86c761343 r600g: use do_endian_swap in color swapping functions
For some formats we need to take "do_endian_swap" into account when
configuring swapping for color buffers.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-26 11:00:16 +03:00
Oded Gabbay
686ad477bd r600g: set endianess of 16/32-bit buffers according to do_endian_swap
This patch modifies r600_colorformat_endian_swap(), so for 16-bit and for
32-bit buffers, the endianess configuration will be determined not only
by the color/texture format, but also by the do_endian_swap parameter.

The only exception is for array formats, which are always set to not do
swapping, because for them gallium sets an alias based on the machine's
endianess.

v4:
V_0280A0_COLOR_16_16 and V_0280A0_COLOR_16_16_FLOAT should be set to
8IN16 because the bytes inside need to be swapped even for array formats.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-26 11:00:16 +03:00
Oded Gabbay
2242dbe11d r600g/radeonsi: send endian info to format translation functions
Because r600 GPUs can't do swap in their DB unit, we need to disable
endianess swapping for textures that are handled by DB.

There are four format translation functions in r600g driver:

- r600_translate_texformat
- r600_colorformat_endian_swap
- r600_translate_colorformat
- r600_translate_colorswap

This patch adds a new parameters to those functions, called
"do_endian_swap". When running in a big-endian machine, the calling
functions will check whether the texture/color is handled by DB -
"rtex->is_depth && !rtex->is_flushing_texture" - and if so, they will
send FALSE through this parameter. Otherwise, they will send TRUE.

The translation functions, in specific cases, will look at this parameter
and configure the swapping accordingly.

v4:
evergreen_init_color_surface_rat() is only used by compute and don't
handle DB surfaces, so just sent hard-coded FALSE to translation
functions when called by it.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-26 11:00:16 +03:00
Ilia Mirkin
4965c5bf72 glsl: add ability to use essl 3.20
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-25 23:40:54 -04:00
Ilia Mirkin
fa8c0ccfbc main: select ES3.2 version when all extensions are available
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-25 23:40:34 -04:00
Dave Airlie
e3e6859381 tgsi: pass a shader type to the machine create and clean up.
There was definitely bugs here mixing up the PIPE_ and TGSI_ defines,
hopefully they didn't cause any problems, since mostly it was special
cases for GEOMETRY.

This clarifies at shader machine create what type of shader this
machine will execute. This is needed also for compute shaders where
we don't want to allocate inputs/outputs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 13:05:32 +10:00
Dave Airlie
a6aae0c24d gallium/tgsi: move tgsi_exec.h header out of draw_context.h
It gets annoying that changing the tgsi exec rebuilds the state
tracker unnecessarily. Putting this include into draw_gs.h which
uses it causes a lot less rebuilds.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 13:00:57 +10:00
Roland Scheidegger
bd07e20d20 gallivm: make sampling more robust against bogus coordinates
Some cases (especially these using fract for coord wrapping) did not handle
NaNs (or Infs) correctly - the following code assumed the fract result
could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract
result was NaN - which then could produce out-of-bound offsets.

(Note that the explicit NaN behavior changes for min/max on x86 sse don't
result in actual changes in the generated jit code, but may on other
architectures. Found by looking through all the wrap functions.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955

No piglit changes.

(v2: fix min/max typo in coord_mirror, add comment)

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-26 04:55:37 +02:00
Dave Airlie
d8edc3e97c radeonsi: fix missing include for Elements.
Since u_blitter.h no longer defines this.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 09:36:23 +10:00
Samuel Pitoiset
d12c3b02ff nvc0: bump the amount of shared memory per MP on Maxwell
According to the CUDA compute capability version, GM10x can expose
64KB of shared memory while GM20x can use 96KB.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-26 00:32:25 +02:00
Dave Airlie
5b6a1aee46 r600: fix missing include for Elements macro
This got removed from u_blitter.h and we were taking it from
there, this should just move to ARRAY_SIZE eventually.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-26 08:01:01 +10:00
Samuel Pitoiset
725431a5db gm107/ir: s/invalid load/invalid store/
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-25 23:55:52 +02:00
Rob Clark
d2fcd0ce38 freedreno/a3xx: remove unused fxn
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 17:10:14 -04:00
Rob Clark
8fe2076243 freedreno/ir3: convert over to ralloc
The home-grown heap scheme (which is ultra-simple but probably not good
to always allocate and memset such a chunk of memory up front) was a
remnant of fdre (where the ir originally came from).  But since we have
ralloc in mesa, lets just use that instead.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 17:09:09 -04:00
Rob Clark
27cf3b0052 mesa/st: log some additional invalid-fbo cases
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-25 17:08:22 -04:00
Rob Clark
2c8674f5a9 freedreno: honor handle->offset
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:22 -04:00
Rob Clark
dfd23abdcc freedreno: disallow cat4 immed src
Normally this would never happen (constant-propagation in NIR would
eliminate the instruction), except it does happen for 'undef' which
we turn into immed 0.0 for bookkeeping purposes.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:21 -04:00
Rob Clark
76c6cdd36a freedreno/a4xx: add render-target formats
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:21 -04:00
Rob Clark
7add166a5c freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:21 -04:00
Rob Clark
edcc6ce75d freedreno: reduce line width for deqp further
See a7eb12d0.. but that wasn't restrictive enough.  Fixes
dEQP-GLES3.functional.rasterization.primitives.line_strip_wide, and
similar

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:21 -04:00
Rob Clark
4610e5ef28 freedreno/ir3: fix sin/cos
We seem to need range reduction to get sane results.  Fixes glmark2
jellyfish bench, and a whole bunch of
dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.*

v2: squashed in android build fixes from Rob Herring

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:21 -04:00
Kenneth Graunke
21b4bcdd05 i965: Unroll SIMD16 DDY_FINE on Sandybridge.
This fixes 10 dEQP-GLES3 subtests:
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*.

Matt noticed that our Piglit tests for this use even numbered registers,
while the failing dEQP tests use odd numbered registers.  We believe
that it works for even numbered registers, but not otherwise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-25 13:13:00 -07:00
Brian Paul
e915903c10 docs: update the instructions for getting a git account
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-25 14:10:40 -06:00
Brian Paul
ef3f00edd8 docs: update link to Intel's graphics website
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-25 14:10:40 -06:00
Jordan Justen
50b82ecd77 mesa/gles: Allow format GL_RED to be used with MESA_FORMAT_R_UNORM
If the bound framebuffer has a format of MESA_FORMAT_R_UNORM, then
IMPLEMENTATION_COLOR_READ_FORMAT will return GL_RED. This change
applies to OpenGLES contexts where additional restrictions are placed
on the formats that are allowed to be supported.

Fixes OpenGLES 3.1 CTS tests:
 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16
 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16Linear
 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32F
 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32FLinear

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-25 12:09:15 -07:00
Charmaine Lee
c4cb879f00 svga: eliminiate unnecessary constant buffer updates
Currently if the texture binding is changed, emit_fs_consts()
is triggered to update texture scaling factor for
rectangle texture or texture buffer size in the constant buffer.
But the update is only relevant if the texture binding includes
a rectangle texture or a texture buffer.

To eliminate the unnecessary constant buffer updates due to other texture
binding changes, a new flag SVGA_NEW_TEXTURE_CONSTS will be used
to trigger fragment shader constant buffer update when a rectangle texture
or a texture buffer is bound.

With this patch, the number of constant buffer updates in Lightsmark2008
reduces from hundreds per frame to about 28 per frame.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-25 12:59:29 -06:00
Charmaine Lee
686cd3c606 svga: mark the texture dirty for write transfer map only
Instead of unconditionally mark the texture subresource dirty at transfer map,
we'll set the dirty bit for write transfer only.

Tested with lightsmark2008 and glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-25 12:59:29 -06:00
Charmaine Lee
676931640f svga: fix assert with PIPE_QUERY_OCCLUSION_PREDICATE for non-vgpu10
With this patch, when running in hardware version 11, we'll use
SVGA3D_QUERYTYPE_OCCLUSION query type for PIPE_QUERY_OCCLUSION_PREDICATE
and return TRUE if samples-passed count is greater than 0.

Fixes glretrace/solidworks2012_viewport running in hardware version 11.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-25 12:59:29 -06:00
Charmaine Lee
d7a6c1a476 svga: minimize surface flush
Currently, we always do a surface flush when we try to establish
a synchronized write transfer map. But if the subresource has not
been modified, we can skip the surface flush. In other words,
we only need to do a surface flush if the to-be-mapped subresource
has been modified in this command buffer.

With this patch, lightsmark2008 shows about 15% performance improvement.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-25 12:59:29 -06:00
Frederic Devernay
23949cdf2c glapi: fix _glapi_get_proc_address() for mangled function names
In the dispatch table, all functions are stored without the "m" prefix.
Modify code so that OSMesaGetProcAddress works both with gl and mgl
prefixes. Similar to
https://lists.freedesktop.org/archives/mesa-dev/2015-September/095251.html

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94994
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-25 12:59:29 -06:00
Brian Paul
63df017fda util/blitter: use ARRAY_SIZE macro
And remove local definition of Elements() macro.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-25 12:59:29 -06:00
Brian Paul
e0184b3995 svga: s/Elements/ARRAY_SIZE/
Standardize on the later macro rather than a mix of both.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-04-25 12:59:29 -06:00
Brian Paul
77e4b41671 svga: whitespace and formatting fixes in svga_pipe_rasterizer.c 2016-04-25 12:59:29 -06:00
Brian Paul
25e0d3659f svga: whitespace and formatting fixes in svga_pipe_depthstencil.c 2016-04-25 12:59:29 -06:00
Brian Paul
595fbc8dee svga: whitespace and formatting fixes in svga_pipe_sampler.c 2016-04-25 12:59:29 -06:00
Brian Paul
1db8313168 gallium/util: initialize pipe_framebuffer_state to zeros
To silence a valgrind uninitialized memory warning.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-25 12:59:29 -06:00
Brian Paul
1e990978ee util/cache: add comments, fix formatting 2016-04-25 12:59:29 -06:00
Kenneth Graunke
4e2d22c5a7 i965: Mark URB reads as volatile.
They can be affected by URB writes.

In the upcoming scalar TCS backend, this prevents read-modify-write
cycles from being broken by CSE removing reads.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-25 11:45:15 -07:00
Kenneth Graunke
501bedffa6 i965: Make a few tessellation related functions non-static.
Also, move them to brw_shader.cpp so they're in a location for code
used by both the vec4 and fs worlds.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-25 11:44:48 -07:00
Brian Paul
464d6080c6 svga: separate HUD counters for state objects
Count depth/stencil, blend, sampler, etc. state objects separately
but just report the sum for the HUD.  This change lets us use gdb to
see the breakdown of state objects in more detail.

Also, count sampler views too.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-04-25 09:45:16 -06:00
Robert Foss
b87856d25d st/omx: Fix resource leak on OMX_ErrorNone
Avoid leaking buffer allocated for task if an error has occured.

Coverity id: 1213929
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 15:09:37 +01:00
Jonathan Gray
3c8f9ed9b7 isl: remove ffs function that conflicts with system headers
Remove a wrapper around __builtin_ffs that conflicts with system
headers on OpenBSD and perhaps elsewhere:

isl_priv.h:44: error: conflicting types for 'ffs'

v2: include strings.h to ensure prototype is found

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 15:06:46 +01:00
Grazvydas Ignotas
dc732a8ef2 gallium: use unreachable instead of asserts
Avoids warnings in release builds.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:34 +02:00
Grazvydas Ignotas
d14778656b anv: fix warnings in release build
Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings
in release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:31 +02:00
Grazvydas Ignotas
ff48375a16 isl: fix warnings in release build
Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings
in release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:28 +02:00
Grazvydas Ignotas
29d2c0e9e6 spirv: fix warning in release build
Mark variable MAYBE_UNUSED to avoid unused-but-set-variable warning in
release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:25 +02:00
Grazvydas Ignotas
cbb0d4ad75 gallium: fix warnings in release build
Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings
in release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:21 +02:00
Grazvydas Ignotas
bbeb9ab2f7 glsl: fix warning in release build
Mark variable MAYBE_UNUSED to avoid unused-but-set-variable warning in
release build.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:16 +02:00
Grazvydas Ignotas
e4fc06a2f8 util: add MAYBE_UNUSED for config dependent variables
This is mostly for variables that are only used in asserts and cause
unused-but-set-variable warnings in release builds. Could just use
UNUSED directly, but MAYBE_UNUSED should be less confusing and is
similar to what the Linux kernel has.

And yes __attribute__((unused)) can be used on variables on both GCC 4.2
(oldest supported by mesa) and clang 3.0 (just some random old version,
not sure what's the minimum for mesa).

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-25 12:23:10 +02:00
Hans de Goede
787a53988c nouveau: codegen: combineLd/St do not combine indirect loads
combineLd/St would combine, i.e. :

st  u32 # g[$r2+0x0] $r2
st  u32 # g[$r2+0x4] $r3

into:

st  u64 # g[$r2+0x0] $r2d

But this is only valid if r2 contains an 8 byte aligned address,
which is not guaranteed for compute shaders

This commit checks for src0 dim 0 not being indirect when combining
loads / stores as combining indirect loads / stores may break alignment
rules.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-25 11:45:07 +02:00
Rob Clark
0831eb94b9 freedreno/ir3: relax restriction in grouping
Currently we were two restrictive, and would insert an output move in
cases like: MOV OUT[0], IN[0].xyzw

Loosen the restriction to allow the current instruction to appear in the
neighbor list but only at it's current possition.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-24 13:40:57 -04:00
Rob Clark
36c9ea6e79 freedreno/ir3: fix small memory leak
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-24 13:40:57 -04:00
Rob Clark
610837fb98 freedreno/ir3: fix small RA bug
Normally the offset in the group would be the same, but not always.  For
example, in a sam(w) which only writes the 4th component.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-24 13:40:57 -04:00
Rob Clark
adf795432f freedreno/a4xx: better workaround for astc+srgb
This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions.  But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures.  Work around this
setting up a second texture state and using that to sample alpha
separately.

This way, srgb->linear conversion happens in hw *prior* to
interpolation.

This fixes 546 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-24 13:40:57 -04:00
Rob Clark
a148300b13 Revert "freedreno/a4xx: lower srgb in shader for astc textures"
Better workaround in the following patch.

This reverts commit 899bd63ace.
2016-04-24 13:40:57 -04:00
Rob Clark
19118e6f47 freedreno/a4xx: blend state no longer depends on fb state
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-24 13:40:57 -04:00
Marek Olšák
c0c6ca40a2 Revert "st/dri: add 32-bit RGBX/RGBA formats"
This reverts commit ccdcf91104.

It breaks most KDE apps, because DRI doesn't support the RGBA component
ordering.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95071
2016-04-24 15:16:07 +02:00
Jonathan Gray
147a2d25ad genxml: use PYTHON3
Allows the build to work when the python3 binary is not "python3".

v2: remove x bit from the script at Emil's suggestion

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 16:45:05 -07:00
Nanley Chery
710b1d2e66 i965/tex_image: Flush certain subnormal ASTC channel values
When uploading a linear, void-extent, ASTC LDR block on Skylake, we are
required to flush to zero the UNORM16 channel values that would be
denormalized. This is specifically required for the values: 1, 2, and 3.

Fixes the 14 failing tests in:
   dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.*

v2: Split out flushing function (Kristian Høgsberg)
v3: Map with READ instead of INVALIDATE (Kenneth Graunke)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 11:35:08 -07:00
Jonathan Gray
e29b3bfd6e configure.ac: search for and set PYTHON3
src/intel/genxml/gen_pack_header.py requires python3.

v2: check for python3.5 as well

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 01:06:20 -07:00
Topi Pohjolainen
f8dd07a2c3 i965/blorp: Enable for buffer resolves
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94181

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:29:15 +03:00
Topi Pohjolainen
c7cf17ae75 i965/blorp: Enable for normal color clears
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:29:15 +03:00
Topi Pohjolainen
c4ec0121a8 i965/blorp: Fix clear code for ignoring colormask for XRGB formats on Gen9+
This is equivalent of 73b01e2711
for blorp.

v2 (Ken): No need to call _mesa_format_has_color_component() now
          that the number of components is gotten from
          _mesa_base_format_component_count().

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:29:15 +03:00
Topi Pohjolainen
19948f1bf6 mesa/formats: Take luminance into account in component count
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-23 07:29:15 +03:00
Topi Pohjolainen
9e153c0692 i965/blorp: Do not trigger re-emission of base state address
In case blorp needs to configure it will be just as if render or
compute pipeline had configured it.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:28:58 +03:00
Topi Pohjolainen
84db9ca3f7 i965/blorp: Reconfigure base state address only if needed
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:09:39 +03:00
Topi Pohjolainen
234b5f23f8 i965/blorp: Use BRW_NEW_BLORP instead of trashing all state bits
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:09:39 +03:00
Kenneth Graunke
6d5ce1b043 i965: Make all atoms to track BRW_NEW_BLORP by default
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com
2016-04-23 07:09:39 +03:00
Topi Pohjolainen
65a5af6dd0 i965: Introduce state flag for blorp
In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger
re-emission of the entire 3D pipeline on the next draw.  However, there
are some packets BLORP simply leaves alone, so there's no need to
re-emit them.  Trying to reduce the set of dirty bits flagged after
BLORP runs is tricky.

Instead, we introduce a BRW_NEW_BLORP flag.  This should be set on any
atom which emits a packet that BLORP also emits.  When BLORP runs, it
will flag BRW_NEW_BLORP, causing those packets to get re-emitted.

This also makes it easy to avoid re-emitting specific atoms - we can
simply drop the BRW_NEW_BLORP flag on those.

To start, we assume that all packets need to be re-emitted.  This is the
safest approach and closest to the existing code's behavior.  Many of
these are obviously not required, and can be dropped in subsequent
patches.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:09:39 +03:00
Topi Pohjolainen
0e850452d1 i965/blorp/gen6: Use normal base state address setup
This is identical to the blorp version which only differs in case
fragment shader isn't used. In that case blorp would reset batch
buffer address to zero.
This is not really needed, and having blorp to use base state
address setup that is compatible with normal upload allows one to
skip resetting it.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:09:39 +03:00
Topi Pohjolainen
ae73e86497 i965: Remove pointers to non-existing atoms
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-23 07:09:39 +03:00
Tom Stellard
9f110a9e10 radeonsi: Implement ddx/ddy on VI using ds_bpermute
The ds_bpermute instruction allows threads to transfer data directly
to or from the vgprs of other threads.  These instructions use the LDS
hardware to transfer data, but do not read or write LDS memory.

DDX BEFORE:                        |  DDX AFTER:
                                   |
v_mbcnt_lo_u32_b32_e64 v2, -1, 0   |  v_mbcnt_lo_u32_b32_e64 v2, -1, 0
v_mbcnt_hi_u32_b32_e64 v2, -1, v2  |  v_mbcnt_hi_u32_b32_e64 v2, -1, v2
v_lshlrev_b32_e32 v4, 2, v2        |  v_and_b32_e32 v2, 60, v2
v_and_b32_e32 v2, 60, v2           |  v_lshlrev_b32_e32 v2, 2, v2
v_lshlrev_b32_e32 v3, 2, v2        |  ds_bpermute_b32 v3, v2, v0
s_mov_b32 m0, -1                   |  ds_bpermute_b32 v0, v2, v0 offset:4
ds_write_b32 v4, v0                |  s_waitcnt lgkmcnt(0)
s_waitcnt lgkmcnt(0)               |
v_or_b32_e32 v0, 1, v2             |
v_lshlrev_b32_e32 v0, 2, v0        |
ds_read_b32 v1, v3                 |
ds_read_b32 v0, v0                 |
s_waitcnt lgkmcnt(0)               |
                                   |
LDS: 1 blocks                      |  LDS: 0 blocks

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2016-04-22 23:48:43 +00:00
Tom Stellard
128267d781 radeonsi: Use llvm.amdgcn.mbcnt.* intrinsics instead of llvm.SI.tid
We're trying to move to more of the new style intrinsics with include
the correct target name, and map directly to ISA instructions.

v2:
  - Only do this with LLVM 3.8 and newer.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-22 23:48:43 +00:00
Tom Stellard
d3427412a3 radeonsi: Set range metadata on calls to llvm.SI.tid
The range metadata tells LLVM the range of expected values for this intrinsic,
so it can do some additional optimizations on the result.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-22 23:48:41 +00:00
Tom Stellard
b31422d970 radeonsi: Create a helper function for computing the thread id
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-22 23:45:34 +00:00
Nanley Chery
86cd9a134f i965: Disable KHR_texture_compression_astc_hdr on Gen9
Although Gen9 samples from most HDR ASTC surfaces of correctly,
there currently are no software workarounds to fix the incorrect
sampling that occurs in others of certain color endpoint modes.

With this change, we are no longer failing the 14 tests from:
   dEQP-GLES3.functional.texture.compressed.astc.endpoint_value_hdr_cem_15.*

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-22 16:57:38 -07:00
Tim Rowley
ec089cd987 swr: [rasterizer memory] Constify load tiles
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:49:20 -05:00
Tim Rowley
6facf4b74a swr: [rasterizer core] CompleteDrawContext changes for gcc
Add explicit inline and non-inline versions of CompleteDrawContext
to make gcc happy.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:49:04 -05:00
Tim Rowley
0487377dce swr: [rasterizer] Small cleanups
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:48:56 -05:00
Tim Rowley
2c4c3c9c71 swr: [rasterizer scripts] Knob scripts tweaks
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:48:47 -05:00
Tim Rowley
ef293ee9c0 swr: [rasterizer] Interpolation utility functions
v2: use _mm_cmpunord_ps for vIsNaN

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:48:38 -05:00
Tim Rowley
27cc5924ea swr: [rasterizer core] TemplateArgUnroller
Switch boolean template arguments to typename template arguments of type
std::integral_constant<bool, VALUE>.

This allows the template argument unroller to easily be extended to enums.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:48:29 -05:00
Tim Rowley
46a448d161 swr: [rasterizer core] Arena: make most allocated blocks the same size
Reduces sorting cost

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:48:20 -05:00
Tim Rowley
794be41f91 swr: [rasterizer core] Fix global arena allocator bug
- Plus some minor code refactoring

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:48:11 -05:00
Tim Rowley
e42f00ee39 swr: [rasterizer core] Fix thread binding for 32-bit windows
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:47:59 -05:00
Tim Rowley
cd21f90ecf swr: [rasterizer fetch] Add support for fetching non-uniform component formats
For example, R10G10B10A2_UNORM.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:47:48 -05:00
Tim Rowley
244ae7af1b swr: [rasterizer core] Use CS spill/fill size in core
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:47:02 -05:00
Tim Rowley
ee9621e2f5 swr: fix memory leaks from vs/fs compilation
v2: varient -> variant

Reviewed by: George Kyriazis <George.Kyriazis@intel.com>
2016-04-22 18:05:02 -05:00
Tim Rowley
5815c8b3d3 swr: fix clang warnings
v2: use alternate logic version in swr_check_render_cond

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-22 18:03:41 -05:00
Rob Clark
e85bef8b12 freedreno/a4xx: fix encoding of blend color state
Fixes a whole bunch of dEQP-GLES3.functional.fragment_ops.random.* (now
they all pass)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-22 15:00:34 -04:00
Rob Clark
23abc41d2b freedreno: update generated headers
Pull in RB_BLEND_* fixes.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-22 15:00:34 -04:00
Eric Anholt
79b36168e0 vc4: Make sure we recompile when sample_mask changes.
Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted
(there is also a bug with RCL tiled blits)

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-22 11:27:11 -07:00
Eric Anholt
876c647194 vc4: Fix validation of full res tile offset if used for non-MSAA.
There's no reason we couldn't do non-MSAA full resolution tile buffer
load/stores, but we would have claimed buffer overflow was being
attempted.  Nothing does this currently.
2016-04-22 11:27:11 -07:00
Eric Anholt
3fecaf0d0c vc4: Only do MSAA FB operations if the FB is MSAA.
I noticed this as a problem with ET:QW traces emitting coverage code when
the framebuffer was supposed to be single sampled.
2016-04-22 11:27:11 -07:00
Eric Anholt
1410403e1e vc4: Fix tests for format supported with nr_samples == 1.
This was a bug from the MSAA enabling.  Tests for surfaces with
nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly
fail out.

Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-22 11:27:11 -07:00
Eric Anholt
6eabdb8959 vc4: Don't try to blit from MSAA surfaces with mismatched width to dst.
I had made the previous blit fix non-MSAA only because I was thinking
about how the hardware infers stride from the RENDERING_CONFIG packet.
However, I'm also inferring the stride for both MSAA src and dst in
vc4_render_cl.c from the width argument in the ioctl.

Fixes 15 EXT_framebuffer_multisample piglit tests.
2016-04-22 11:27:11 -07:00
Kenneth Graunke
42dea145d9 i965: Disable channel expressions for scalar GS, TCS, TES.
On Broadwell, I get the following shader-db statistics:

Tessellation Control Shaders:

   total instructions in shared programs: 57327 -> 57012 (-0.55%)
   instructions in affected programs: 27334 -> 27019 (-1.15%)
   helped: 45
   HURT: 0

   total cycles in shared programs: 265692 -> 255188 (-3.95%)
   cycles in affected programs: 263122 -> 252618 (-3.99%)
   helped: 184
   HURT: 26

Tessellation Evaluation Shaders:

   total instructions in shared programs: 23236 -> 23157 (-0.34%)
   instructions in affected programs: 2791 -> 2712 (-2.83%)
   helped: 27
   HURT: 0

   total cycles in shared programs: 151858 -> 149704 (-1.42%)
   cycles in affected programs: 151858 -> 149704 (-1.42%)
   helped: 101
   HURT: 114

Geometry Shaders:

   Orbital Explorer goes from 6442 -> 6356 instructions.
   Two Shadow of Mordor shaders increase by a single instruction.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-22 10:26:30 -07:00
Topi Pohjolainen
1883613a24 i965/blorp: Add support for 2x msaa
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-22 17:02:29 +03:00
Topi Pohjolainen
125a7fdf32 i965/blorp: Add support for encoding/decoding interleaved 2x msaa
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-22 17:01:29 +03:00
Samuel Iglesias Gonsálvez
f70cacc4bd i965: don't lower mod() in glsl ir
NIR will lower it in nir_opt_algebraic.

No change in shader-db.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-22 13:44:28 +02:00
Timothy Arceri
72b5d00c9c glsl: fix cross validation for explicit locations on structs and arrays
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-22 20:59:57 +10:00
Nicolai Hähnle
39e9cf6cb1 radeonsi: implement TGSI_SEMANTIC_HELPER_INVOCATION
Depends on LLVM support introduced in r267102.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 23:14:04 -05:00
Ilia Mirkin
2bac561787 swr: ignore generated files in rasterizer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-04-22 00:07:25 -04:00
Ilia Mirkin
88ca4a43a2 nvc0: fix retrieving query results into buffer for timestamps
The timestamps are stored in a funny place, and even though they are a
64-bit result, are not stored with is64bit. Account for that when
retrieving the query result into a resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
2016-04-22 00:06:49 -04:00
Jason Ekstrand
541e6c0500 i965/surface_state: Use libisl functions for image format lowering
This lets us delete some redundant code and keep all of the
image_load_store format lowering logic in one place: libisl.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
e53cabe730 i965/fs_surface_builder: Use isl instead of mesa for format info
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
1831fa104c i965/fs_surface_builder: Add a helper for converting GL to ISL formats
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
24bb75049b i965/fs_surface_builder: Explicitly handle FORMAT_NONE in num_image_coordinates
Previously, we were relying on has_matching_typed_format returning true for
MESA_FORMAT_NONE which, in turn, relied on _mesa_get_format_bytes returning
1 for MESA_FORMAT_NONE.  When we switch to ISL, this behaviour will no
longer be something we can rely on.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
f310c02b94 i965/fs_surface_builder: Take a GL format enum instead of mesa_format
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
2980507a19 isl/format: Add a get_num_channels helper
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
3415cf5f2f isl/format: Add more isl_format_has_type_channel functions
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
a4c04dd410 isl/format: Break the guts of has_[us]int_channel into a helper
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
ca8c5993bf anv/image: Use the has_matching_typed_storage_image_format helper from isl
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
65bd8317e2 isl: Add a helper for determining when a typed load/store can be used
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
90576ac963 isl: Take a devinfo in lower_storage_image_format instead of an isl_device
We want to call this function from the shader compiler and having a full
isl_device available at that point isn't practical.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
37f6f21b1f isl: Don't use designated initializers in the header
C++ doesn't support designated initializers and g++ in particular doesn't
handle them when the struct gets complicated, i.e. has a union.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
2785840586 isl: Include c99_compat.h
We need the restrict keyword in isl.h

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Jason Ekstrand
ef5dca2034 i965: Add a dependency on libisl
To avoid build issues, ensure that you're running `make' at the top level
and/or you've executed `make clean' beforehand.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-21 20:44:27 -07:00
Nicolai Hähnle
fe3b1e1448 radeon: handle query buffer allocation and mapping failures
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94984
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 22:33:12 -05:00
Nicolai Hähnle
b222580578 radeon: wire end_query return value to sw/hw_end
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 22:33:07 -05:00
Nicolai Hähnle
71f33a6f69 st/mesa: check return value of begin/end_query
They can only indicate out of memory conditions, since the other error
conditions are caught earlier.

v2: fix error message in EndQuery

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-04-21 22:33:03 -05:00
Nicolai Hähnle
32214e0c68 gallium: add bool return to pipe_context::end_query
Even when begin_query succeeds, there can still be failures in query handling.
For example for radeon, additional buffers may have to be allocated when
queries span multiple command buffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 22:32:50 -05:00
Ben Widawsky
6a0d036483 i965: Always use Y-tiled buffers on SKL+
Starting with Skylake, the display engine is capable of scanning out from
Y-tiled buffers. As such, we can and should use Y-tiling for better efficiency.
This also has the added benefit of being able to fast clear the winsys buffer.

Note that the buffer allocation done for mipmaps will already never allocate an
X-tiled buffer for GEN9.

This has an almost universal positive impact on benchmarks, some improving by as
much as 20%.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 20:14:58 -07:00
Marek Olšák
c3b88cc2c1 softpipe: fix a warning due to an incorrect enum comparison
no change in behavior, because both are defined the same

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák
c9e5a7df61 gallium: remove helpers converting to/from TGSI_PROCESSOR_*
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák
af249a7da9 gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák
fb523cb6ad gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-22 01:30:39 +02:00
Marek Olšák
ed23335a31 gallium: use enums in p_shader_tokens.h (v2)
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1)
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Acked-by: Jose Fonseca <jfonseca@vmware.com> (v1)

v2: name enums
2016-04-22 01:30:36 +02:00
Marek Olšák
0135bd44c2 gallium: use enums in p_defines.h (v2)
and remove number assignments which are consecutive

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1)
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Acked-by: Jose Fonseca <jfonseca@vmware.com> (v1)

v2: name enums
2016-04-22 01:30:34 +02:00
Marek Olšák
8cfc4cf76d radeonsi: remove the shader parameter from si_set_ring_buffer
not used anymore

this is a follow-up to the RW buffer cleanup.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-22 01:14:14 +02:00
Marek Olšák
3cbd8cfc7a radeonsi: decrease GS copy shader user SGPRs to 2
const buffers are no longer used since the clip plane const buffer was
moved to RW buffers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:14 +02:00
Marek Olšák
3acaefb1bb radeonsi: shorten slot masks to 32 bits
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:14 +02:00
Marek Olšák
0954d5e982 radeonsi: clean up shader resource limit definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:14 +02:00
Marek Olšák
3138a28ff2 radeonsi: move default tess level constant buffer to RW buffers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:14 +02:00
Marek Olšák
302bec24bd radeonsi: move sample positions constant buffer to RW buffers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Marek Olšák
860b658b97 radeonsi: move clip plane constant buffer to RW buffers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Marek Olšák
698821bda3 radeonsi: rework polygon stippling to use constant buffer instead of texture
add it to the RW_BUFFERS descriptor array

now the slot masks don't have to have 64 bits

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Marek Olšák
bb1e647ada radeonsi: generalize si_set_constant_buffer
this will be used in the next commit

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Marek Olšák
36261c29cd radeonsi: make RW buffer descriptor array global, not per shader stage
v2: also simplify invalidation of RW buffer bindings (squashed)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Marek Olšák
1378487fb4 radeonsi: rename and rearrange RW buffer slots
- use an enum
- use a unique slot number regardless of the shader stage
  (the per-stage slots will go away for RW buffers)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:13 +02:00
Roland Scheidegger
4ff8cbb0d8 gallivm: fix bogus argument order to lp_build_sample_mipmap function
Screwed up since 0753b135f6.

(Only an issue with different min/mag filters, and then only in some cases,
which is probably why it went unnoticed for quite a while.
The effect should have simply been nearest mip filter instead of linear, iff
min was nearest, mag was linear, and all pixels hit the mignifying path.)

Fixes a bunch of dEQP failures.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-21 23:57:24 +02:00
Kenneth Graunke
73b01e2711 i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+.
In commit cda886a485, Neil made us stop
advertising RGBX formats on Gen9+, as the hardware apparently no longer
has working fast clear support for those formats.  Instead, we just
fall back to RGBA formats, and use SCS to override alpha to 1.0.

This is fine, but had one unintended side effect: it made us fall back
to slow clears when the color mask disables alpha.  Normally, we ignore
the color mask for non-existent channels.  This includes alpha for XRGB
formats as writing garbage to the X channel is harmless.  But, now that
we use RGBA, we think there's a real alpha channel, and can't do the
optimization.

To hack around this, check if _BaseFormat is GL_RGB and ignore alpha.

Improves WebGL Aquarium performance on Skylake GT3e by about 50%
by letting it use repclears instead of slow clears.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-21 12:01:49 -07:00
Iago Toral Quiroga
bdaa0e12a2 i965/blorp: Improve precission of blitting coordinates when clipping
We do this in two steps: first we clip the dst rect and adjust the src
rect accordingly. Then we do it the other way around. In both passes
the adjustment part involves multiplying by a scale factor that can lead
to a small precision loss. This is breaking a few dEQP tests.

Specifically, the problem happens when we need to clip the same coordinate
twice. For example, if srcX0 and dstX0 need both to be clipped we want to
avoid the situation where we clip srcX0 first, then adjust dstX0 accordingly
but then we realize that the resulting dstX0 still needs to be clipped, so
we clip dstX0 and adjust srcX0 again. Each of these two passes can lead
to precission loss. What we want to do here is detect the rect that leads
to the largest clip (accounting for the scale factor involved), clip that
rect and adjust the other one. With this we ensure that the adjusted
coordinate does not need to be clipped again and we can skip a second pass,
improving precision.

Fixes the following 4 dEQP tests:
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_linear

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2016-04-21 10:43:39 -07:00
Bas Nieuwenhuizen
38f4cee3ff radeonsi: Add config parameter to si_shader_apply_scratch_relocs.
shader->config is not updated for compute kernels.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-21 19:36:19 +02:00
Matt Turner
1bc983cd64 glsl: Relax GLSL 1.10 float suffix error to a warning.
Float suffixes are allowed in all subsequent GLSL specifications, and
it's obvious what the user meant if they specify one. Accept it with a
warning to avoid breaking applications, like Planeshift (although it
looks like between 0.6.1 and 0.6.3 they might have removed the suffixes
from their shaders).

Reviewed-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:33:08 -07:00
Matt Turner
33565d6764 i965/fs: Readd opt_drop_redundant_mov_to_flags().
This reverts commit b449366587.

I removed the pass thinking that it was now not useful, but that was not
true. I believe I ran shader-db on HSW and saw no results, but HSW does
not use the unlit centroid workaround code and as a result does not emit
redundant MOV_DISPATCH_TO_FLAGS instructions.

On IVB, the shader-db results are:

total instructions in shared programs: 6650806 -> 6646303 (-0.07%)
instructions in affected programs: 106893 -> 102390 (-4.21%)
helped: 793

total cycles in shared programs: 56195538 -> 56103720 (-0.16%)
cycles in affected programs: 873048 -> 781230 (-10.52%)
helped: 553
HURT: 209

On SNB, the shader-db results are:

total instructions in shared programs: 7173074 -> 7168541 (-0.06%)
instructions in affected programs: 119757 -> 115224 (-3.79%)
helped: 799

total cycles in shared programs: 98128032 -> 98072938 (-0.06%)
cycles in affected programs: 1437104 -> 1382010 (-3.83%)
helped: 454
HURT: 237

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-21 10:32:40 -07:00
Topi Pohjolainen
0020ca3c92 i965/blorp: Do not emit pma stall on gen9+
This was left out from the original gen8 upload introduction.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 20:18:51 +03:00
Tim Rowley
81c1c481ed swr: add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT to get_param
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-21 11:32:09 -05:00
Emil Velikov
9dcb3dfb23 i965: automake: remove gratuitous "+" during variable assignment
There is not initial assignment, thus appending to it does not work.

Fixes: b27c85c4c0 "i965: add build rule for brw_nir_trig_workarounds.c"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 16:48:34 +01:00
Rob Herring
1ba203a085 gbm: add GBM_FORMAT_XBGR8888 format support
Add GBM_FORMAT_XBGR8888/__DRI_IMAGE_FORMAT_XBGR8888 format support which
is needed for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-21 14:45:56 +01:00
Rob Herring
ccdcf91104 st/dri: add 32-bit RGBX/RGBA formats
Add support for 32-bit RGBX/RGBA formats which are preferred for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-21 14:45:53 +01:00
Rob Herring
3b69076435 dri/common: add MESA_FORMAT_R8G8B8{A8, X8}_UNORM formats as supported configs
Add MESA_FORMAT_R8G8B8A8_UNORM and MESA_FORMAT_R8G8B8X8_UNORM formats as
these are the preferred formats for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-21 14:45:21 +01:00
Rob Herring
b27c85c4c0 i965: add build rule for brw_nir_trig_workarounds.c on Android
Commit bfd17c76c1 ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a
generated file brw_nir_trig_workarounds.c which broke the Android build.
Add the necessary makefiles to the Android build.

Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:43:26 +01:00
Rob Herring
30239ba056 glsl: android: add back missing generated glcpp include path
Commit 4db8f15a25 ("glsl: move the android build scripts a level up")
dropped a generated include path for glcpp. Add it back adjusting for the
new location.

Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:43:21 +01:00
Jonathan Gray
28e3ae344b loader: add a libdrm case for loader_get_device_name_for_fd
Use dev_node_from_fd() with HAVE_LIBDRM to provide an implmentation
of loader_get_device_name_for_fd() for non-linux systems that
use libdrm but don't have udev or sysfs.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:41:41 +01:00
Jonathan Gray
5d09394fb1 i965/tiled_memcpy: don't unconditionally use __builtin_bswap32
Use the defines Mesa configure sets to indicate presence of the bswap32
builtins.  This lets i965 work on OpenBSD again after the changes that
were made in 0a5d8d9af4.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-21 14:41:41 +01:00
Jonathan Gray
9bbf3737f9 egl/x11: authenticate before doing chipset id ioctls
For systems without udev or sysfs that use drm ioctls in the loader
drm authentication must take place earlier or the loader will fail
"MESA-LOADER: failed to get param for i915".

Patch from Mark Kettenis.

Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mark Kettenis <kettenis@openbsd.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
[Emil Velikov: remove gratuitous white-space]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:40:44 +01:00
Bas Nieuwenhuizen
4abe051a3f gallium/radeon: Silence possibly uninitialized variable warning.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 13:40:47 +02:00
Bas Nieuwenhuizen
51d1551241 winsys/amdgpu: Silence possibly uninitialized variable warning.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 13:40:42 +02:00
Bas Nieuwenhuizen
4d13c7c879 radeonsi: Enable loading into CE RAM.
We need to enable a bit in the CONTEXT_CONTROL packet for the
loads to work.

v2: Style issues.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-21 12:50:58 +02:00
Bas Nieuwenhuizen
f45f54e14a radeonsi: Use defines for CONTEXT_CONTROL instead of magic values.
v2: Use field names provided by Nicolai.
v3: Updated to use CONTEXT_CONTROL prefix.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 12:50:58 +02:00
Thomas Hindoe Paaboel Andersen
d4a21a0de0 winsys/amdgpu: fix preamble IB size
The missing break caused the IB size to be overwritten with
the size of IB_CONST.

This was introduced in: 7201230582

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 12:14:50 +02:00
Topi Pohjolainen
935ce14a44 i965/blorp: Reduce the urb size requirement for vertex buffer
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
26fdb7e51e i965/blorp: Reduce the size of vertex buffer
Previously the vertex buffer consisted of eight floats per vertex
of which six where constants. These can be as easily provided by
vertex fetcher as it is capable of filling vertex elements with
constant one and zero. This reduces the size of the vertex buffer
from 3 * 8 * 4 = 96 to 3 * 2 * 4 = 24 bytes.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
0ae360f098 i965/blorp: Do not tricker urb re-configuration unnecessarily
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
69dfb7b2b7 i965/blorp: Skip re-emitting urb config whenever possible
Otherwise clearing with blorp will regress performance in some
synthetic test cases.

v2: Used vsize >= 2 instead of vsize > 0, and updated the comment.
    Review by Ken in one of the earlier patches revealed this.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
7644e8ab68 i965/blorp: Prepare to switch from compute pipeline
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
aa322f8ae5 i965/blorp: Skip uploading state/options not needed for clears
In case there is no source it means the program does a simple
clear or a resolve. In such case there is no need to program
sampling state or enable pixel kill in fragment shader.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
87d333f2fe i965/blorp: Re-introduce clear programs
This partially reverts 2f28a0dc23

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
69c364f2dc i965/meta: Move check for srgb into is_color_fast_clear_compatible()
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
8a696e75d8 i965/meta: Expose check for fast clear compatibility
Also add the additional render format check to the same utility.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
a848ad6806 i965/meta: Expose fast clear value setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
fb14a2fc78 i965/meta: Expose non-fast clear rectangle calculation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
9d79235e4e i965/meta: Expose resolve clear rectangle calculation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
2757d723da i965/meta: Expose fast clear rectangle calculation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
3ef957e783 i965: Declare input to mcs alignment calculation constant
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
c40b1efa70 i965/blorp: Switch the order of render and texture targets
On gen8 color resolving won't work anymore if the target isn't
the first entry in the binding table.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
0d062d79c3 i965/blorp: Reduce scope for generator and its inputs
Generator is only needed for getting the assembly.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
4c3de6b2d6 i965/blorp: Add support for disabling color blending
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
da5a477ce4 i965/blorp: Add support for setting fast clear operation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
7de72f728b i965/blorp: Enable blits on gen8
v2 (Ken): Moved switch cases for gen8/9 in texel_fetch() to
          earlier patch adding gen8/9 sampling support.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
f7ab4e0cc4 i965/blorp: Prepare stencil sampling for gen8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
708453952b i965/blorp: Add check for supported sample numbers
v2 (Ken): Fix the condition on using meta for stencil blits:
          use_blorp -> !use_blorp

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
9e4d19372b i965/blorp: Add support for sampling 3D textures
This patch adds additional MOV instruction for all blorp programs
that use SHADER_OPCODE_TXF. Alternative is to augment blorp program
key to tell if z-coordinate is needed, add condition to the blorp
blit compiler and to produce a variant with and without the MOV.
This seems a little overkill.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
6b33d63d77 i965/blorp: Add support for source swizzle
In order to support cases where gen9 uses RGBA format to back client
requested RGB, one needs to have means to force alpha channel to one
when user requested RGB surface is used as blit source.

v2 (Ken): Use helper for constructing the swizzle (this should be
          changed to use brw_get_texture_swizzle() as a follow-up).
          Also calculate the swizzle for CopyTexSubImage.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
52e7008a5a i965/blorp: Pipeline upload support for gen8
v2 (Ken): Drop GEN8_RASTER_FRONT_WINDING_CCW in raster state
          Add emission of pma stall.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
2fda441371 i965/gen8: Expose pma stall emission
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:19:30 +03:00
Topi Pohjolainen
8b2332e3d1 i965: Allow texture surface state setup to be used by blorp
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:42:10 +03:00
Topi Pohjolainen
0ad83d222b i965/blorp: Prepare sampling for gen9
v2 (Ken): Added switch cases for gen8/9 in texel_fetch(). These
          were wrongly introduced in blit-enabling patch.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:41:40 +03:00
Topi Pohjolainen
328ab6c268 i965/blorp: Prepare render target write for gen8
v2 (Ken): Use payload directly instead of retyping it into vec8.
          Drop the implied header, it isn't used for gen6+ anyway.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:40:33 +03:00
Topi Pohjolainen
135f00e666 i965/blorp/gen6: Prepare vertex buffer setup logic for gen8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:37:06 +03:00
Topi Pohjolainen
395abb9c3b i965/blorp/gen7: Expose state setup applicable to gen8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:36:53 +03:00
Topi Pohjolainen
ede09e672a i965/blorp: Use 8k chunk size for urb allocation
Previously, we hardcoded "VS URB Starting Address" to 2 (in 8kB chunks),
which meant VS URB data would start at an offset of 16kB.

However, on Haswell GT3 and Gen8+, we allocate the first 32kB for the
push constant region.  This means that the PS push constant and VS URB
data regions overlap, which can lead to corruption.

v2 (Ken): Better description of the change, and do not change vs_size
          from 2 to 1.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:36:26 +03:00
Topi Pohjolainen
e04b3cdf33 i965/blorp/gen7: Prepare re-using for gen8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:36:14 +03:00
Topi Pohjolainen
f1ddfa8512 i965/blorp: Let compiler calculate the vertex buffer size
Currently the size is sizeof(float) times too large. One reserves
GEN6_BLORP_VBO_SIZE many floats whereas GEN6_BLORP_VBO_SIZE stands
for the size of vertex buffer in bytes.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:35:58 +03:00
Topi Pohjolainen
4c526370ca i965/gen8: Expose state base address setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:35:45 +03:00
Topi Pohjolainen
9949103756 i965/gen8: Expose surface state helpers
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:35:34 +03:00
Topi Pohjolainen
4f1d9f2879 i965/gen9: Use correct size for DS_STATE
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:32:12 +03:00
Roland Scheidegger
0295db2a8b glsl: add forgotten textureOffset function for sampler2DArrayShadow
This was part of EXT_gpu_shader4 - as such it should have been supported
by glsl 130.
It was however forgotten, and not added until glsl 430 - with the wrong
syntax no less (glsl 430 mentions it was overlooked).
glsl 440 (but revision 8 only) fixed this finally for good.
At least nvidia supports this with just version glsl version 1.30 as well
(the spec doesn't explicitly say it should be supported retroactively),
so just add this to the other glsl 130 textureOffset functions.

Passes a (hacked) piglit tex-miplevel-selection test (2DArrayShadow
textureOffset -auto) with llvmpipe.

v2: fix up comment (by Ian), add testing to commit message.

Reviewed-by: Dave Airlie <airlied@gmail.com>
2016-04-21 02:38:46 +02:00
Kenneth Graunke
d8c8f4203f i965: Fix interpolateAtSample() on single sampled buffers.
Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests:
- interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer
- interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo
- interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
447d3eec6a i965: Fix gl_SampleMaskIn[] in per-sample shading mode.
The coverage mask is not sufficient - in per-sample mode, we also need
to AND with a mask representing the samples being processed by the
current fragment shader invocation.

Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests:

sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8}
sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8}
sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8}
sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8}

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
66a725570c i965: Only enable oMask output when there's a multisample FBO.
The ARB_sample_shading specification says that setting gl_SampleMask
bits to 0 means that the corresponding sample "should be considered
uncovered for the purposes of multisample fragment operations
(Section 4.1.3)."

The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment
Operations") specifies:

"No changes to the fragment alpha or coverage values are made at this
 step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS
 is not one."

oMask output alters coverage masks and can kill pixels.  We need to
disable it in the above case, which conveniently corresponds to
key->multisample_fbo being false.

Khronos bug #12188 also spells this out clearly:
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188

Fixes two Piglit tests:
tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0
tests/spec/arb_sample_shading/builtin-gl-sample-mask 0

Fixes 21 ES3 conformance tests:
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7

Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests:
sample_mask.discard_half_per_pixel.default_framebuffer
sample_mask.discard_half_per_pixel.singlesample_rbo
sample_mask.discard_half_per_pixel.singlesample_texture
sample_mask.discard_half_per_sample.default_framebuffer
sample_mask.discard_half_per_sample.singlesample_rbo
sample_mask.discard_half_per_sample.singlesample_texture
sample_mask.discard_half_per_two_samples.default_framebuffer
sample_mask.discard_half_per_two_samples.singlesample_rbo
sample_mask.discard_half_per_two_samples.singlesample_texture

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
81407531e0 i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo.
I'm going to need a key entry meaning "we have a multisample FBO,
and multisampling is enabled" in an upcoming patch.  This is basically
wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID
system value is read.

The only use of wm_key->compute_sample_id is in emit_sampleid_setup(),
which is only called when handling the SAMPLE_ID system value.  So we
can just eliminate the check and generalize the field.

v2: Also update the Vulkan driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
de0a46a040 i965: Delete now dead persample_2x FS program key flag.
This was only used by the old gl_SampleID calculations.  The new code
doesn't need to handle 2x specially.

v2: Delete it from the Vulkan driver, too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
57118a19da i965: Simplify gl_SampleID setup on Gen8+.
On Gen7+, the thread payload provides the sample ID - we can read it
in two instructions, without any elaborate calculations.  We don't even
need a state dependency - this will properly produce zero in the
non-MSAA case.  Unfortunately, we need the state flag anyway, so we
may as well continue to use it to produce a single MOV 0 instead of
SHR/AND.

For some reason, the sample ID field is always zero on Gen7/7.5, so
we can't use this yet.  However, it works fine on Gen8+.  So, land the
code and use it where it's working, and leave a TODO for later.

v2: Fix register types in the comment (caught by Matt Turner!).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
528255b0b1 i965: Flip key->compute_sample_id check.
This just moves the simple case first.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Bas Nieuwenhuizen
43ed1f73f8 st/mesa: Use correct size for compute CAPs.
Some CAPs are stored as 64-bit value while Mesa stores
the related constant as 32-bit value.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-04-21 00:27:01 +02:00
Kenneth Graunke
60a17d0718 i965: Properly handle integer types in opt_vector_float().
Previously, opt_vector_float() always interpreted MOV sources as
floating point, and always created a MOV with a F-type destination.

This meant that we could mess up sequences of integer loads, such as:

   mov vgrf6.0.x:D, 0D
   mov vgrf6.0.y:D, 1D
   mov vgrf6.0.z:D, 2D
   mov vgrf6.0.w:D, 3D

Here, integer 0/1/2/3 become approximately 0.0f, so we generated:

   mov vgrf6.0:F, [0F, 0F, 0F, 0F]

which is clearly wrong.  We can properly handle this by converting
integer values to float (rather than bitcasting), and emitting a type
converting MOV:

   mov vgrf6.0:D, [0F, 1F, 2F, 3F]

To do this, see first see if the integer values (converted to float)
are representable.  If so, we use a D-type MOV.  If not, we then try
the floating point values and an F-type MOV.  We make zero not impose
type restrictions.  This is important because 0D would imply a D-type
MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
where we want to use an F-type MOV.

Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
recently became visible due to changes in opt_vector_float() which
made it optimize more cases, but it was a pre-existing bug.

Apparently it also manages to turn more integer loads into VFs,
producing the following shader-db statistics on Haswell:

total instructions in shared programs: 7084195 -> 7082191 (-0.03%)
instructions in affected programs: 246027 -> 244023 (-0.81%)
helped: 1937

total cycles in shared programs: 65669642 -> 65651968 (-0.03%)
cycles in affected programs: 531064 -> 513390 (-3.33%)
helped: 1177

v2: Handle the type of zero better.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 15:05:13 -07:00
Kenneth Graunke
1aa28f3509 i965: Make opt_vector_float() only handle non-type-conversion MOVs.
We don't handle this properly - we'd have to perform the type conversion
before trying to convert the value to a VF.

While we could do that, it doesn't seem particularly useful - most
vector loads should be consistently typed (all float or all integer).

As a special case, we do allow type-converting MOVs of integer 0, as
it's represented the same regardless of the type.  I believe this case
does actually come up.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 15:05:13 -07:00
Kenneth Graunke
2a25a5142b i965: Fold vectorize_mov() back into the one caller.
After the previous patch, this helper is only called in one place.
So, just fold it back in - there are a lot of parameters here and
not much code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 15:05:13 -07:00
Kenneth Graunke
9967561158 i965: Rework opt_vector_float() control flow.
This reworks opt_vector_float() so that there's only one place that
flushes out any accumulated state and emits a VF.

v2: Don't break the sequence for non-representable numbers - just skip
    recording their values.  Only break it for non-MOVs or register
    changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 15:05:13 -07:00
Jason Ekstrand
50018522d2 anv: s/anv_batch_emit_blk/anv_batch_emit/
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
0a45395902 anv: Remove the old emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
86c52bc757 anv/gen7_pipeline: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
744e133431 anv/gen7_cmd_buffer: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
cae2f14947 anv/device: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
932c353592 anv/state: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
9e9f3f4e71 anv/gen8_pipeline: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
dba3727bea anv/genX_pipeline: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
a48f8340d9 anv/gen8_cmd_buffer: Use the new emit macro
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
8a6ced83e9 anv/cmd_buffer: Use the new emit macro for quaries
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
db25e1eec5 anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLE
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
deb13870d8 anv/cmd_buffer: Use the new emit macro for compute shader dispatch
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
06fc7fa684 anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANT
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
a71ded0e18 anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
56453eeaff anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and STATE_BASE_ADDRESS
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
1d4d6852b4 anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commands
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Jason Ekstrand
64ad2d3bcd anv: Add a new block-based batch emit macro
This new macro uses a for loop to create an actual code block in which to
place the macro setup code.  One advantage of this is that you syntatically
use braces instead of parentheses.  Another is that the code in the block
doesn't even get executed if anv_batch_emit_dwords fails.

Acked-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-20 14:54:09 -07:00
Samuel Pitoiset
d30768025a gk110/ir: make use of IMUL32I for all immediates
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-20 22:55:36 +02:00
Samuel Pitoiset
17a37c78fc gk110/ir: do not overwrite def value with zero for EXCH ops
This is only valid for other atomic operations (including CAS). This
fixes an invalid opcode error from dmesg. While we are it, make sure
to initialize global addr to 0 for other atomic operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-20 22:55:33 +02:00
Marcin Ślusarz
3caf2e89aa anv: fix build without Wayland platform
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-20 11:12:10 -07:00
Laurent Carlier
6c952d8ac7 anv: fix building on i686 with -mcpu=generic
mcpu=generic doesn't enable sse2, and anvil definitly needs it

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-20 10:48:11 -07:00
Jason Ekstrand
2ef7aef322 spirv: Trivially handle the NonWriteable decoration
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-20 10:33:23 -07:00
Connor Abbott
b6dc940ec2 nir: rename nir_foreach_block*() to nir_foreach_block*_call()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-20 09:47:05 -07:00
Samuel Pitoiset
7143068296 nvc0: avoid tex read fault from compute shaders on GK110
After some investigation, it seems like that disabling the UNK02C4
command avoid a read fault with texelFetch() from a compute shader.

I have no clue on what this method actually does, but this avoid the
GPU to hang with basic-texelFetch.shader_test without introducing any
compute-related regressions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-20 18:28:47 +02:00
Jason Ekstrand
87a4fb516e i965/vec4: Always split uniforms in array_access_to_pull_constants
Normally, we split uniforms at the end but in Vulkan, we bail because we
don't want pull constants.  However, we still need them split because
pack_uniforms relies on it.

I really don't like this patch not because it doesn't work (it does) but
because now that we're using MOV_INDIRECT, uniform numbers and sizes don't
really matter anymore.  In the FS backend, uniform splitting and packing is
handled all at once (actual re-assignment of locations happens later) and
we really should do it that way in vec4 eventually as well.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
2016-04-20 09:15:01 -07:00
Jason Ekstrand
b3f43822c7 i965/vec4: Use the correct offset for the swizzle shift in push constants
This was actually caught by Ken in review the first time around but somehow
didn't get fixed before the patches were pushed. :-(

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
2016-04-20 09:15:01 -07:00
Jason Ekstrand
9f16e170fe i965/vec4: Use nir_intrinsic_base in the load_uniform implementation
We shouldn't be reading the const_index directly

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
2016-04-20 09:15:01 -07:00
Jason Ekstrand
f63a95080f anv/apply_dynamic_offsets: Provide a range on the load_uniform
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
2016-04-20 09:14:58 -07:00
Jason Ekstrand
35b758c378 anv/lower_push_constants: Stop treating scalar specially
All of the code that did something special based on vec4 vs. scalar is
bogus.  In the backend, everything is now in units of bytes and the vec4
backend can handle full std140 packing so we don't need to do anything
special anymore.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
2016-04-20 09:14:47 -07:00
Tim Rowley
3bbe8a09ea swr: fix resource backed constant buffers
Code was using an incorrect address for the base pointer.

v2: use swr_resource_data() utility function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Markus Wick <markus@selfnet.de>
2016-04-20 09:57:55 -05:00
Hans de Goede
2ac2ecdd6c nouveau: codegen: Add support for OpenCL global memory buffers
Add support for OpenCL global memory buffers, note this has only
been tested with regular load and stores and likely needs more work
for e.g. atomic ops.

Tested with piglet on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-04-20 13:46:03 +02:00
Hans de Goede
61d52a5fb9 nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers
Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only
apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for
OpenCL global buffers.

This commits changes the buffer code to use FILE_MEMORY_BUFFER at the
ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL
for use with OpenCL global buffers.

Note that after lowering buffer accesses use the FILE_MEMORY_GLOBAL
register file.

Tested with piglet on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-04-20 13:46:03 +02:00
Jose Fonseca
f02f4d09ce scons: Build dri_common_interop.c. 2016-04-20 12:41:24 +01:00
Marek Olšák
4fa3d35cc5 st/dri: implement the GL interop DRI extension (v2.2)
v2: - set interop_version
    - simplify the offset_after macro
v2.1: - use version numbers, remove offset_after
      - set "out_driver_data_written"
v2.2: - set buf_offset & buf_size for GL_ARRAY_BUFFER too
      - add whandle.offset to buf_offset
      - disable the minmax cache for GL_TEXTURE_BUFFER
2016-04-20 12:18:47 +02:00
Marek Olšák
37d3a26bd6 glx: implement GLX part of interop interface (v2)
v2: - use const
2016-04-20 12:18:47 +02:00
Marek Olšák
b6eda70843 egl: implement EGL part of interop interface (v2)
v2: - use const
2016-04-20 12:18:47 +02:00
Marek Olšák
5e9ed261ed dri_interface: add interface for GL interop with other APIs (v2)
v2: - use const
2016-04-20 12:18:47 +02:00
Marek Olšák
6eeb729490 include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v4.2)
v2: - use "enum" to define stuff
v3: - more comments, define MESA_GLINTEROP_UNSUPPORTED
v4: - add mesa_glinterop_device_info::interop_version
    - more comments
    - remove #define MESA_GLINTEROP_VERSION
    - use const for "in"
v4.1: - use version numbers for structures
      - add "out_driver_data_written"
v4.2: - buf_offset & buf_size affect GL_ARRAY_BUFFER too, this is required
        for sharing suballocations within a larger buffer
2016-04-20 12:15:41 +02:00
Nicolas Dufresne
8093990ef4 st/dri: Fix RGB565 EGLImage creation
When creating egl images we do a bytes to pixel conversion by deviding
by 4 regardless of the pixel format. This does not work for RGB565. In
this patch, we avoid useless conversion and use proper API when the
conversion cannot be avoided.

Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-04-20 17:55:30 +09:00
Nicolas Dufresne
4463f38766 st/dri: Factor out DRI2 to PIPE_FORMAT conversion
This code is already duplicated twice and will be useful again. This
will also help when adding formats.

Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-04-20 17:34:03 +09:00
Rob Clark
899bd63ace freedreno/a4xx: lower srgb in shader for astc textures
This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions.  But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures.  Work around this
by doing the srgb->linear conversion in the shader instead.

This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*

(The remaining fails seem to be a bug w/ ASTC + linear filtering, also
possibly a420.0 specific.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 17:14:04 -04:00
Rob Clark
eddfc97709 nir/lower-tex: add srgb->linear lowering
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-19 17:13:50 -04:00
Rob Clark
eb00a0fc58 nir/builder: const'ify swiz param
No need for it not to be const, and lets caller declare it const if
desired.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-19 17:13:36 -04:00
Rob Clark
52ccc6349f nir/lower-tex: make options a local var
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:12:49 -04:00
Rob Clark
d4ff42bd0a freedreno: cleanup fd_set_sampler_views
The separate FS/VS entrypoints are no longer used since a3ed98f.  So
just inline them.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:11:47 -04:00
Russell King
fadfaa82c6 tgsi/lowering: improved lowering for LRP
Provide an improved lowering for LRP, which can be implemented in two
MAD instructions with a bit of rearranging of the equation, rather
than the literal implementation of two multiplies, an add and a
subtract.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King
67da7dd98a tgsi/lowering: improved lowering for XPD
Improve XPD lowering to consume less instructions by using the
MAD instruction to perform the multiply and subtraction together.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King
65460cf4c8 tgsi/lowering: add support for lowering TRUNC
Add support for lowering TRUNC using the following sequence:

	FRC tmpA, |src|
	SUB tmpA, |src|, tmpA
	CMP dst, -tmpA, tmpA

Note that this is incompatible with FRC lowering.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Russell King
23e870a888 tgsi/lowering: add support for lowering FLR and CEIL
Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD
instructions for GPUs that support FRC but not FLR or CEIL.  Since
these uses FRC, it is invalid to ask for FLR or CEIL to be lowered
along with FRC, so add an assert to catch this invalid configuration.

We also need to deal with FLR instructions emitted by the lowering
code.  Fix these up with the FRC+SUB equivalent when FLR lowering is
enabled.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 16:04:44 -04:00
Bas Nieuwenhuizen
464cef5b06 radeonsi: enable TGSI support cap for compute shaders
v2: Use chip_class instead of family.

v3: Check kernel version for SI.

v4: Preemptively allow amdgpu winsys for SI.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen
1f32d5d59f radeonsi: Consider input SGPR count for compute shader SGPR count.
si_shader_create corrects the SGPR count with si_fix_num_sgprs. We then
recompute the rsrc1 register to use the new SGPR count.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen
6c833ba1ab radeonsi: Add CE synchronization for compute dispatches.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen
e0b729c544 mesa/st: enable compute shaders if images are also supported
v2: Also depend on atomic counters.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:32 +02:00
Bas Nieuwenhuizen
41d79bcbfa radeonsi: clean up compute flush
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:32 +02:00
Bas Nieuwenhuizen
7a92c08428 radeonsi: do not do two full flushes on every compute dispatch
v2: Add more CS_PARTIAL_FLUSH events.

Essentially every place with waits on finishing for pixel shaders
also has a write after read hazard with compute shaders.

Invalidating L2 waits implicitly on pixel and compute shaders,
so, we don't need a CS_PARTIAL_FLUSH for switching FBO.

v3: Add CS_PARTIAL_FLUSH events even if we already have INV_GLOBAL_L2.

According to Marek the INV_GLOBAL_L2 events don't wait for compute
shaders to finish, so wait for them explicitly.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
e764ee13ae radeonsi: split setting graphics and compute descriptors
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
061ce9399a radeonsi: split texture decompression for compute shaders
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
e56514f631 radeonsi: update predicate condition for compute dispatches
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
c3083d841e radeonsi: implement TGSI compute dispatch
v2: - Use radeon_set_sh_reg_seq.
    - Set predicate bit for conditional rendering.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
1349dd16ff radeonsi: only emit compute shader state when switching shaders
v2: - Do check if anything changed earlier
    - Use emitted_program instead of emitted_bo to prevent
      shaders with shader->bo = NULL confusing the check
    - Use radeon_set_sh_reg*

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
ba1f66a73d radeonsi: rework compute scratch buffer
Instead of having a scratch buffer per program, have one per
context.

Also removed the per kernel wave count calculations, but
that only helped if the total number of waves in the dispatch
was smaller than sctx->scratch_waves.

v2: Fix style issue.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
107f4d3538 radeonsi: do per cs setup for compute shaders once per cs
Also removes PKT3_CONTEXT_CONTROL as that is already being done
by si_begin_new_cs, when emitting init_config.

v2: - Use radeon_set_sh_reg_seq.
    - Also set COMPUTE_STATIC_THREAD_MGMT_SE2 / SE3 for CIK+

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
52d3584dec radeonsi: don't pass scratch buffer to user SGPRs
As far as I can see we use relocations for clover too.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
422a19f76f radeonsi: split input upload off from si_launch_grid
Also uses a dynamically allocated buffer using u_upload_alloc.
The old buffer per program approach required serializing all
dispatches of the same program.

v2: - Clarified commit message.
    - Use radeon_set_sh_reg_seq.
    - Also upload input buffer for clover kernels, even when
      input_size is 0, as it contains grid parameters.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
898298efc9 radeonsi: implement TGSI compute shader creation
v2: Moved scratch_enabled initialization after compile.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
85fd7817ee radeonsi: update shader count for compute shaders
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
da88c2a8e8 radeonsi: set maximum work group size based on block size
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
b082147b78 radeonsi: implement shared atomics
v2: - Use single region
    - Use get_memory_ptr

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
8acf3e501b radeonsi: implement shared memory load/store
v2: - Use single region
    - Combine address calculation

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen
84a6761ae3 radeonsi: add shared memory
Declares the shared memory as a global variable so that
LLVM is aware of it and it does not conflict with passes
like AMDGPUPromoteAlloca.

v2: - Use ctx->i8.
    - Dropped null-check for declare_memory_region.
    - Changed memory region array to single region.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
753a3e472b radeonsi: lower compute shader arguments
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
008d977d01 radeonsi: Use CE for all descriptors.
v2: Load previous list for new CS instead of re-emitting
    all descriptors.

v3: Do radeon_add_to_buffer_list in si_ce_upload.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
0b6c463dac gallium/util: Add u_bit_scan_consecutive_range64.
For use by radeonsi.

v2: Make sure that it works for all 64 bits set.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
058b54c624 radeonsi: Replace list_dirty with a mask.
We can then upload only the dirty ones with the constant engine.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
aabc7d61d6 radeonsi: Add CE uploader.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
0d7ddd6819 radeonsi: Allocate chunks of CE ram.
v2: Use 32 byte alignment.

v3: Don't allocate CE space for vertex buffer descriptors.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
86c71ff989 radeonsi: Add CE synchronization.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
fe1ef23b66 radeonsi: Add CE packet definitions.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
8fee75d606 radeonsi: Create CE IB.
Based on work by Marek Olšák.

v2: Add preamble IB.

Leaves the load packet in the space calculation as the
radeon winsys might not be able to support a premable.

The added space calculation may look expensive, but
is converted to a constant with (at least) -O2 and -O3.

v3: - Fix code style.
    - Remove needed space for vertex buffer descriptors.
    - Fail when the preamble cannot be created.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen
7201230582 winsys/amdgpu: Enlarge const IB size.
Necessary to prevent performance regressions due to extra flushing.

Probably should enlarge it even further when also updating
uniforms through the CE, but this seems large enough for now.

v2: Add preamble IB.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Marek Olšák
7997b5f005 winsys/amdgpu: Add support for const IB.
v2: Use the correct IB to update request (Bas Nieuwenhuizen)
v3: Add preamble IB. (Bas Nieuwenhuizen)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-19 18:10:30 +02:00
Marek Olšák
e78170f388 winsys/amdgpu: split IB data into a new structure in preparation for CE
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-19 18:10:30 +02:00
Marek Olšák
f4b77c764a gallium/radeon: move ring_type into winsyses
Not used by drivers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-19 18:10:30 +02:00
Jose Fonseca
1d2ac7a7ca llvmpipe: Call LLVMShutdown before exiting.
So that LLVM frees its globals.

Trivial.
2016-04-19 12:10:09 +01:00
Jose Fonseca
524042fa35 llvmpipe: Avoid LLVMGetGlobalContext in tests.
Trivial.
2016-04-19 12:10:02 +01:00
Jose Fonseca
bb9e8c5090 llvmpipe: Skip false exp2 failure in lp_test_arit due to buggy MSVCRT.
64bits MSVCRT's exp2f(-inf) returns -inf instead of 0.  Tested with
MSVC 2013's CRT.  (I haven't tried 2015 yet.)

Also this does not happen with MinGW.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:53 +01:00
Jose Fonseca
ee9876be1d llvmpipe: Test more vector lengths.
All power of two of up native vector length.

There is actually a bug in lp_build_round for v2, whereby it doesn't
round to nearest.  Fixing is left to the future, but the test is now
able to expect it to fail.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:44 +01:00
Jose Fonseca
932b71f17d gallivm: Avoid llvm::sys::getProcessTriple().
Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3
onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway,

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:37 +01:00
Jose Fonseca
b5ca689cee gallivm: Remove lp_get_module_id.
Just keep a copy of the module_name in gallivm.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:26 +01:00
Jose Fonseca
969ba8bfa7 gallivm: Fix MCJIT with LLVM 3.3.
One needs to call setJITMemoryManager for LLVM 3.3, instead of
setMCJITMemoryManager.

This regressed in commits 065256df/75ad4fe7 when trying to make the
code to build with LLVM 3.6.

Tested MCJIT with LLVM 3.3 to 3.6.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:17 +01:00
Jose Fonseca
cf4105740f gallivm: Make MCJIT a runtime option.
On the LLVM versions that support it, so we can easily switch between
MCJIT/old-jit for testing.

The new option is GALLIVM_MCJIT.

Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes
segfault, both on Linux and Windows.  I'm almost certain this used to
work, so there probably is a regression somewhere.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:14 +01:00
Jose Fonseca
7d2151b6ea scons: Show the unit test full path.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:11 +01:00
Jose Fonseca
2211f8d559 gallivm: Use LLVMSetTarget.
Instead of LLVM C++ interfaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:31:00 +01:00
Jose Fonseca
9aa23b11e4 gallivm: Use LLVMPrintValueToString where available.
And llvm::raw_string_ostream where not (LLVM 3.3).

Thereby eliminating yet another dependency on unstable LLVM interfaces.

As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which
was disabled, probably due to C++ issues.)

Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:28:37 +01:00
Jose Fonseca
f6621cd3be gallium/tests: Update UTIL_FORMAT_MAX_* defines.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-19 11:28:16 +01:00
Jose Fonseca
121a0cedc8 Revert "nv50/ra: isinf() is in namespace std since C++11."
This reverts commit f525db6358.

It was superseeded by commit 649704f1f7.
2016-04-19 11:22:45 +01:00
Eric Anholt
802b9292aa vc4: Fix fbo-generatemipmap-formats for NPOT.
Single-sampled texture miplevels > 1 are stored in POT-aligned areas, but
we only get one value to control the stride of the src and dst for single
sampled buffers.  A RCL tile blit from level != 1 to level == 0 would
therefore load from the wrong stride.
2016-04-18 16:55:36 -07:00
Eric Anholt
2402bb6095 vc4: Remove unused "immediates" field
This was for TGSI, which we no longer have to deal with.
2016-04-18 16:48:45 -07:00
Ben Widawsky
2408899cb2 i965: Define miptree map functions static (trivial)
They were already declared as such. It was changed here:
commit 31f0967fb5
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Wed Sep 2 14:43:18 2015 -0700

    i965: Make intel_miptree_map_raw static

Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-18 16:12:13 -07:00
Matt Turner
b1d9353cb5 glsl: Properly handle ldexp(0.0f, non-zero-exp). 2016-04-18 15:48:54 -07:00
Dave Airlie
3a26ef23e7 gallivm: convert size query to using a set of parameters.
This isn't currently that easy to expand, so fix it up
before expanding it later to include dynamic samplers.

[airlied: use some local variables (Roland)]

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-19 07:33:39 +10:00
Tim Rowley
3227c10270 swr: dereference cbuf/zbuf/views on context destroy
Fixes resource memory leaks.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-18 15:52:26 -05:00
Rob Clark
77a9107bf2 freedreno/ir3: fix grouping issue w/ reverse swizzles
When we have something like:

   MOV OUT[n], IN[m].wzyx

the existing grouping code was missing a potential conflict.  Due to
input needing to be sequential scalar regs, we have:

 IN:  x <-> y <-> z <-> w

which would be grouped to:

 OUT: w <-> z2 <-> y2 <-> x  (where the 2 denotes a copy/mov)

but that can't actually work.  We need to realize that x and w are
already in the same chain, not just that they aren't both already in
new chain being built.

With this fixed, we probably no longer need the hack from f68f6c0.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-18 15:41:32 -04:00
Marek Olšák
ed66c75784 radeonsi: use enums in si_shader.h
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
0c52caf7b7 gallium/radeon: use enums in r600_query.h
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
dd9ca77cb9 radeonsi: always use PFP_SYNC_ME when doing flushes and waits
This is typically used by the closed driver before SURFACE_SYNC.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
1db5678688 radeonsi: don't do VS/PS partial flushes if SURFACE_SYNC waits too
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
58494b42b5 radeonsi: add safety assertions for meta cache flushes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
78f58a4e6f radeonsi: don't use ACQUIRE_MEM on the graphics ring
It's only required on the compute ring. This matches the closed driver.

The compute flag is removed to prevent confusion and Bas's compute shader
patches remove it in the whole function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
3faecdd4e1 radeonsi: remove TODO and correct a comment in si_emit_cache_flush
Yes, that flag is really needed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
28c2573b4f radeonsi: don't flush CB/DB caches for performance counters
I'm not sure about this. This will make the engines go idle, but the caches
will be unflushed. This should match app behavior without performance
counters, which can be a good thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák
97c328b2a3 gallium/radeon: don't flush CB/DB caches for timestamp queries
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák
6dc21b1962 gallium/util: fix undefined shift to the last bit in u_bit_scan
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák
9434aa8103 gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffff
The second ffs returns 0, yielding count == -1.

v2: change 1 to 1u

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-18 19:51:24 +02:00
Marek Olšák
e50e1f86b0 gallium/radeon: fix Nine with its slightly shifted viewports
just need to do the calculation in floating-point and then round things
properly

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-04-18 19:51:24 +02:00
Erik Faye-Lund
ee5b35142a docs: correct name for GL_OES_primitive_bounding_box
When this extension was added, an underscore were mistakenly replaced
by a space. Let's correct this, so it's a tad easier to grep for this
extension.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
2016-04-18 10:48:57 -07:00
Kenneth Graunke
c092f9b96a meta: Don't botch color masks when changing drawbuffers.
Color clears should respect each drawbuffer's color mask state.

Previously, we tried to leave the color mask untouched.  However,
_mesa_meta_drawbuffers_from_bitfield() ended up rebinding all the
color drawbuffers in a different order, so we ended up pairing
drawbuffers with the wrong color mask state.

The new _mesa_meta_drawbuffers_and_colormask() function does the
same job as the old _mesa_meta_drawbuffers_from_bitfield(), but
also rearranges the color mask state to match the new drawbuffer
configuration.

This code was largely ripped off from Gallium's st_Clear code.

This fixes ES31-CTS.draw_buffers_indexed.color_masks, which binds
up to 8 drawbuffers, sets color masks for each, and then calls
glClearBufferfv to clear each buffer individually.  ClearBuffer
causes us to rebind only one drawbuffer, at which point we used
ctx->Color.ColorMask[0] (draw buffer 0's state) for everything.

We could probably delete _mesa_meta_drawbuffers_from_bitfield(),
but I'd rather not think about the i965 fast clear code.  Topi is
rewriting a bunch of that soon anyway, so let's delete it then.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94847
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-18 10:39:31 -07:00
Kenneth Graunke
a33f94ba8c meta: Don't smash ColorMask when using MESA_META_COLOR_MASK save bit.
This allows meta operations to inspect the existing color mask, and
then do their own smashing.

BlitFramebuffer and Clear already override the color mask, so this
was also redundant.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-18 10:39:26 -07:00
Eric Anholt
48fe53bbb9 vc4: Add support for rendering to cube map surfaces.
We need to fix up the offset to point at the face of the cube.  Fixes
piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-18 10:10:44 -07:00
Eric Anholt
21a9ed6207 vc4: Don't flush on read-only access of buffers read by the CL.
Fixes piglit mixed-immediate-and-vbo, and may significantly improve
performance of applications that store a 4-byte IB in the same VBO as
vertex data.
2016-04-18 10:10:44 -07:00
Eric Anholt
9e8a8b0c8b vc4: Sanity check that flushes don't happen between state emit and draw.
Catches the cause of failure in
arb_vertex_buffer_object-mixed-immediate-and-vbo, I've had this class of
failure before, and it probably won't be the last time.
2016-04-18 10:10:44 -07:00
Eric Anholt
56b14adf85 vc4: Sanity check strides for imported BOs.
If we're going to sample from or render to them at some particular size,
we'd better make sure that they actually are that size.  Causes some tests
under simulation to generate appropriate error messages instead of
failures.
2016-04-18 10:10:44 -07:00
Pierre Moreau
649704f1f7 math: Import isinf and others to global namespace
Starting from C++11, several math functions, like isinf, moved into the std
namespace. Since cmath undefines those functions before redefining them inside
the namespace, and glibc 2.23 defines the C variants as macros, the C variants
in global namespace are not accessible any longer.

v2: Move the fix outside of Nouveau, as suggested by Jose Fonseca, since anyone
    might need it when GCC switches to C++14 by default with GCC 6.0.

v3:
*   Put the code directly inside c99_math.h rather than creating a new header
    file, as asked by Jose Fonseca;
*   Guard the code behind glibc version checks, as only glibc > =2.23 defines
    isinf & co. as functions, as suggested by Jose Fonseca.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-18 11:10:25 +01:00
Oded Gabbay
d3c98c73dc r600g: Move R600_BIG_ENDIAN to r600_pipe_common.h
I need to do this so I could use R600_BIG_ENDIAN in files which include
r600_pipe_common.h but not r600_pipe.h

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-18 09:50:08 +03:00
Oded Gabbay
72d0d2ba59 r600g: fix code indentation
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-18 09:50:08 +03:00
Emil Velikov
a998e49259 docs: add news item and link release notes for 11.1.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-17 23:32:41 +01:00
Emil Velikov
50eeb5fb16 docs: add sha256 checksums for 11.1.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 596c6504b3)
2016-04-17 23:32:41 +01:00
Emil Velikov
c1bf47ada2 docs: add release notes for 11.1.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ca2fbf6f8f)
2016-04-17 23:32:41 +01:00
Roland Scheidegger
d11111a551 gallivm: don't use vector selects with llvm 3.7
llvm 3.7 sometimes simply miscompiles vector selects.
See https://bugs.freedesktop.org/show_bug.cgi?id=94972

This was fixed in llvm r249669
(https://llvm.org/bugs/show_bug.cgi?id=24532).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-18 00:23:34 +02:00
Dave Airlie
b3616f1326 nir: only dereference undef after NULL check. (v2)
Pointed out by coverity.

v2: nuke line, Jason pointed out the constructor does it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-18 07:37:48 +10:00
Emil Velikov
96b4cfe834 docs: update the sha256 checksums for 11.2.1
Turns out the previous tarballs got corrupted during upload which I
carelessly forgot to check prior to deleting the local ones.
Lesson learned - double check before removing the local ones.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 79b0e13913)
2016-04-17 19:32:20 +01:00
Emil Velikov
2197581816 docs: add news item and link release notes for 11.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-17 18:36:59 +01:00
Emil Velikov
03a234c1d1 docs: add sha256 checksums for 11.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit c65835d812)
2016-04-17 18:36:59 +01:00
Emil Velikov
c15f457958 docs: add release notes for 11.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 21e6440e82)
2016-04-17 18:36:59 +01:00
Jason Ekstrand
f30f6e2625 i965/fs: Don't allow OOB array access of images
We have had a guard against OOB array access of images on IVB for a long
time, but it can actually cause hangs on any GPU generation.  This can
happen due to getting an untyped SURFACE_STATE for a typed message.  We
didn't used to hit this with the piglit test on anything other than IVB
because the OOB in the test would cause us to go past the top of the pull
constant UBO and we would get a surface index of 0 which is was always a
valid surface.  Now that we're pushing small arrays, we can end up grabbing
garbage from the GRF and going to some random index which causes a hang.
The solution is to just do the bounds check on all hardware.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2016-04-15 22:47:33 -07:00
Jason Ekstrand
93db828e42 anv/device: Images are only enabled in scalar stages
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-15 16:40:56 -07:00
Marek Olšák
c1a2fe7fd1 gallium/radeon: handle vertex shaders that disable clipping & viewport
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-16 00:21:15 +02:00
Nanley Chery
696d8ff5a1 mesa/texstore: Use Driver.CompressedTexSubImage in the default CompressedTexImage
Enable drivers to use their own implementation of this method instead of
the mesa default. Since the drivers that currently overwrite
dd_function_table::CompressedTexSubImage also overwrite
::CompressedTexImage, there should be no behavioral change.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-15 15:06:27 -07:00
Jason Ekstrand
5ec4ecce44 anv: Advertise vertexPipelineStoresAndAtomics based on scalar stages
Previously, we just looked at the hardware generation but this meant that
if you did INTEL_DEBUG=vec4 on BDW or SKL, you would have advertised but
non-working features.
2016-04-15 14:53:16 -07:00
Jason Ekstrand
0166ad6ced i965/vec4: Support full std140 layout for push constants
Up until now, we have been able to assume that all push constants are
vec4-aligned because this is what the GL driver gives us.  In Vulkan, we
need to be able to support full std140 because we get the layout from the
client.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 14:04:38 -07:00
Jason Ekstrand
a112391d52 i965/vec4: Handle MOV_INDIRECT in pack_uniform_registers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 14:04:38 -07:00
Jason Ekstrand
aaac8a1890 i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 14:04:38 -07:00
Jason Ekstrand
61ee5e62a2 i965/vec4: Use can_do_writemask in can_reswizzle
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 14:04:38 -07:00
Jason Ekstrand
75b68f9114 i965/vec4: Move can_do_writemask to vec4_instruction
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 14:04:37 -07:00
Chad Versace
4a80890177 util: Fix warning of invalid return value
_mesa_libgcrypt_init() returns NULL, but its return type is void.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2016-04-15 15:00:58 -07:00
Jason Ekstrand
cab30cc5f9 Merge branch 'vulkan' 2016-04-15 13:52:34 -07:00
Roland Scheidegger
64d3ae09b7 llvmpipe: (trivial) initialize src1_alpha var to NULL
The blend code would do a conditional assignment based on it, causing valgrind
to complain. Since that variable was actually unused in this case, this
doesn't fix anything but the warning.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-15 22:51:28 +02:00
Jason Ekstrand
d8b85c96d1 Merge remote-tracking branch 'public/master' into vulkan 2016-04-15 13:35:16 -07:00
Jason Ekstrand
1a100d4f28 configure: Add support for the Intel Vulkan driver
This adds a --with-vulkan-drivers option with one driver, "intel".  In the
future, we may add more drivers to this list.

v2: Don't enable any drivers by default.  This should prevent this patch
    from breaking anyone's build.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 13:29:29 -07:00
Jason Ekstrand
ce7e82fb6f i965/surface_formats: Update some formats for more recent gens
The surface format table hasn't entirely been kept up-to-date.  This commit
marks a couple more compressed formats as sampleable on gen8+ and adds the
A4B4G4R4 format as renderable on gen9.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-15 13:29:29 -07:00
Jason Ekstrand
7dac4a2889 util/list: Add list splicing functions
This adds functions for splicing one list into another.  These have
more-or-less the same API as the kernel list splicing functions.  The
implementation, however, was stolen from the Wayland list implementation.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2016-04-15 13:29:09 -07:00
Jason Ekstrand
17a181bfa6 Remove the Intel Vulkan readme 2016-04-15 13:17:08 -07:00
Tim Rowley
082f6d75ae gallium/swr: confine c++11 flag to swr driver
On the philosophy that a driver shouldn't change the compile flags
for the entire tree, take the clove approach of moving the c++11 flag
to the swr driver directory.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-15 14:43:01 -05:00
Tim Rowley
ee72fec9cf gallium/swr: allow swr use as a swrast dri driver
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-15 14:21:50 -05:00
Eric Anholt
f6d21bcd6b vc4: Fix subimage accesses to LT textures.
This code started out like the T case, iterating over utile offsets, but I
had partially switched it to iterating over pixel offsets.  I hadn't
caught this before because it's unusual to do piecemeal uploads to small
textures.

Fixes bad text rendering in QT5 apps, which use a 256x16 glyph cache.
Also fixes 6 piglit tests related to glTexSubImage() and
glGetTexSubImage().

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-15 11:57:17 -07:00
Mark Janes
ade3108bb5 util: Fix race condition on libgcrypt initialization
Fixes intermittent Vulkan CTS failures within the test groups:
dEQP-VK.api.object_management.multithreaded_per_thread_device
dEQP-VK.api.object_management.multithreaded_per_thread_resources
dEQP-VK.api.object_management.multithreaded_shared_resources

Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-15 10:24:40 -07:00
Jason Ekstrand
8403e6de9f i965: Default to scalar GS 2016-04-15 09:54:42 -07:00
Jason Ekstrand
17d9a2b011 i965/surface_formats: Mark A4B4G4R4_UNORM as SKL+ only
This is what is indicated by the bspec.
2016-04-15 09:53:55 -07:00
Jason Ekstrand
d7189bdeee Revert "i965/fs: Properly write-mask spills"
This reverts commit 9c0109a1f6.
2016-04-15 09:53:55 -07:00
Jason Ekstrand
c3362453f9 Revert "i965/fs: Feel free to spill partial reads/writes"
This reverts commit 2434ceabf4.
2016-04-15 09:53:55 -07:00
Jason Ekstrand
2d5bd66e4f configure: Add support for detecting valgrind headers
We have several places where the Vulkan driver explicitly hooks into
valgrind when it's available.  We need to be able to detect it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-15 09:41:15 -07:00
Eduardo Lima Mitev
7e4628da48 nir/print: Fix printing variable mode
nir_variable_mode is currently a bitflag enum, while
nir_print::print_var_decl() assumes is still a numbered list.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-15 16:41:41 +02:00
John Sheu
f8752e0d95 xlib: remove MESA_GLX_VISUAL_HACK
This removes a hack introduced in 1999 in the first version of
fakeglx.c, with the comment:

  /* XXX revisit this after 3.0 is finished. */

Mesa 4.0 was released in 2001.  It is now 2016, and Mesa 11.0 was
released last year.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-04-15 07:46:00 +02:00
John Sheu
8a9c0f1025 xlib: fix leaks of returned values from XGetVisualInfo
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-04-15 07:45:46 +02:00
John Sheu
781232e0ac xlib: fix memory leak of and remove vishandle from XMesaVisualInfo
The vishandle member of XMesaVisualInfo is used to support the
comparison of XVisualInfo instances by pointer value, in
find_glx_visual().  The comparison however will always be false, as in
every case the comparison is made, the VisualInfo instance being
compared to is a new allocation passed in through a GLX API call.

In addition, the XVisualInfo instance pointed to by vishandle is itself
never freed, causing a memory leak.  Since vishandle is essentially
useless, we just remove it and thereby also fix the leak.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-04-15 07:45:28 +02:00
John Sheu
fe9d8cd79e xlib: do not cache return value of glXChooseVisual/glXGetVisualFromFBConfig
The returned XVisualInfo from glXChooseVisual/glXGetVisualFromFBConfig
is being cached in XMesaVisual.vishandle (and unconditionally
overwritten on subsequent calls).  However, these entry points are
specified to return XVisualInfo instances to be owned by the caller and
freed with XFree(), so the return values should not be retained.

With this change, XMesaVisual.vishandle is essentially unused and will
be removed in a subsequent change.

v2: update commit message

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-04-15 07:44:34 +02:00
Jason Ekstrand
76fa7b16f4 Merge remote-tracking branch 'public/master' into vulkan 2016-04-14 18:30:52 -07:00
Jason Ekstrand
547032c56a main/mtypes: Remove the "set" parameter from gl_uniform_block
This is a left-over from the early days of the Vulkan driver
2016-04-14 18:27:09 -07:00
Jason Ekstrand
f0bbb34e49 Revert "i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT"
This reverts commit 4115648a6b.  This commit
was half-baked and probably never should have been committed.  We'll add
this back in properly later when we need it.
2016-04-14 18:22:08 -07:00
Jason Ekstrand
eeff133158 i965: Expose the surface format table
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 18:07:48 -07:00
Jason Ekstrand
d7cddbd6d6 nir/lower_io: Add UBOs and SSBOs to get_io_offset_src
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 18:07:40 -07:00
Jason Ekstrand
c825e29a82 nir/intrinsics: Add a vulkan_resource_index intrinsic
This is used to facilitate the Vulkan binding model where each resource is
described by a (descriptor set, binding, array index) tuple.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-14 17:20:05 -07:00
Jason Ekstrand
1e0012e3e4 nir: Add a descriptor_set field to nir_variable
This is needed for supporting the Vulkan binding model

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-14 17:20:05 -07:00
Chad Versace
7a835b3fd9 dri: Fix robust context creation via EGL attribute
driCreateContextAttribs() emits an error if bit
__DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context.  But,
EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of
robust ES contexts. One requests a robust ES context by setting the
EGL_CONTEXT_OPENGL_ROBUST_ACCESS *attribute*, which Mesa's EGL layer
translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS *bit*.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-14 17:38:41 -07:00
Jason Ekstrand
5567ae0547 Merge remote-tracking branch 'public/master' into vulkan 2016-04-14 17:14:28 -07:00
Leo Liu
8f4340c5e6 radeon/uvd: fix tonga feedback buffer size
This only applies to tonga

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-14 19:33:44 -04:00
Jason Ekstrand
f1d29099b4 i965: Push everything if pull_param == NULL
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 16:00:18 -07:00
Jason Ekstrand
963513bb24 i965/fs: Push small uniform arrays
Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations.  The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing.  Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up.  To do this,
we use an algorithm similar to the on in split_virtual_grfs.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
71f8039f72 i965/fs: Rename demote_pull_constants to lower_constant_loads
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
8e76f664be i965/vec4: Get rid of the uniform_size array
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
056849772f i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants
This commit moves us to an instruction based model rather than a
register-based model for indirects.  This is more accurate anyway as we
have to emit instructions to resolve the reladdr.  It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.

One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants.  This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
479e38ad63 i965/fs: Get rid of the param_size array
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
30874216cb i965/fs: Stop relying on param_size in assign_constant_locations
Now that we have MOV_INDIRECT opcodes, we have all of the size information
we need directly in the opcode.  With a little restructuring of the
algorithm used in assign_constant_locations we don't need param_size
anymore.  The big thing to watch out for now, however, is that you can have
two ranges overlap where neither contains the other.  In order to deal with
this, we make the first pass just flag what needs pulling and handle
assigning pull constant locations until later.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
275855f315 i965/fs: Get rid of reladdr
We aren't using it anymore.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
3c93cdfaf5 i965/fs: Use MOV_INDIRECT for all indirect uniform loads
Instead of using reladdr, this commit changes the FS backend to emit a
MOV_INDIRECT whenever we need an indirect uniform load.  We also have to
rework some of the other bits of the backend to handle this new form of
uniform load.  The obvious change is that demote_pull_constants now acts
more like a lowering pass when it hits a MOV_INDIRECT.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
63101177f3 nir: Add another index to load_uniform to specify the range read
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
27bd8ac6f3 i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware
While we're at it, we also add support for the possibility that the
indirect is, in fact, a constant.  This shouldn't happen in the common case
(if it does, that means NIR failed to constant-fold something), but it's
possible so we should handle it.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
889e6054b7 i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr
The subnr field is in bytes so we don't need to multiply by type_sz.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
7e08a13009 i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions
It should work fine without it and the visitor can set it if it wants.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
40a8fe04dc i965/fs: Add support for doing MOV_INDIRECT on uniforms
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand
48cc8c284a anv: Install the installable ICD 2016-04-14 15:15:00 -07:00
Jason Ekstrand
e40b867145 anv/intel_icd: Don't provide an absolute path
The driver will be installed to $(libdir)/libvulkan_intel.so and just
providing a driver name is enough for the loader.  This also ensures that
multi-arch systems work ok.
2016-04-14 15:15:00 -07:00
Jason Ekstrand
ca16373a2b configure: Add initial support for enabling Vulkan drivers 2016-04-14 15:15:00 -07:00
Jason Ekstrand
e61c812f76 anv/pipeline: Use the right mask for lower_indirect_derefs 2016-04-14 15:13:29 -07:00
Ben Widawsky
a8975a91cc i965: Make intel_get_param return an int
This will fix the spurious error message: "Failed to query GPU properties."
that was unintentionally added in cc01b63d73.

This patch changes the function to return an int so that the caller is able to
do stuff based on the return value.

The equivalent of this patch was in the original series that fixed up the
warning, but I dropped it at the last moment. It is required to make the desired
behavior of not warning when trying to query GPU properties from the kernel
unless there is something the user can do about it.

v2: Use strerror (Jason)
Make EINVAL check similar in all places (Ian)

NOTE: Broadwell appears to actually have some issue where the kernel returns
ENODEV when it shouldn't be. I will investigate this separately.

Reported-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-04-14 15:13:22 -07:00
Brian Paul
aed975d5c5 st/mesa: fix sampler view leak in st_DrawAtlasBitmaps()
I neglected to free the sampler view which was created earlier in the
function.  So for each glCallLists() command that used the bitmap atlas
to draw text, we'd leak a sampler view object.

Also, check for st_create_texture_sampler_view() failure and record
GL_OUT_OF_MEMORY.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-04-14 15:32:18 -06:00
Nicolai Hähnle
a17911ceb1 gallium/radeon: handle failure when mapping staging buffer
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-14 16:29:23 -05:00
Nicolai Hähnle
8bd0f0df50 radeonsi: mark ssbo and images descriptor pointers dirty at beginning of CS
Without this, we were getting non-deterministic VM faults under high pressure.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-14 16:29:23 -05:00
Jason Ekstrand
cb372b39ea i965/vec4: Use UD rather than D for uniform indirects
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 14:25:01 -07:00
Jason Ekstrand
240d16ea94 i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD
Reveiewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 14:24:57 -07:00
Samuel Pitoiset
bb4cdee9a4 nvc0: do not break the universe on GK110+
I removed that return 0 by mistake. Ooops.

Fixes: 6e23fd4 ("nvc0: allow to use compute support on GM200")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-14 21:57:21 +02:00
Samuel Pitoiset
6e23fd420d nvc0: allow to use compute support on GM200
This works like a charm but please not that NVF0_COMPUTE have to be set
because compute support is still not enabled by default on GK110+. This
will require more testing to make sure it won't break the 3D state.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-14 21:01:51 +02:00
Jason Ekstrand
34b5db17d9 i965: remove pointless diff with the master branch 2016-04-14 10:39:54 -07:00
Jason Ekstrand
769b5614f8 nir/opt_algebraic: Remove the encoding line
This is an unneeded diff between the vulkan and master branches
2016-04-14 10:35:40 -07:00
Jason Ekstrand
c34be07230 spirv: Move to compiler/
While it does rely on NIR, it's not really part of the NIR core.  At the
moment, it still builds as part of libnir but that can be changed later if
desired.
2016-04-14 10:28:47 -07:00
Jason Ekstrand
bfa3a38280 nir: Remove some pointless delta between vulkan and master 2016-04-14 10:24:33 -07:00
Jose Fonseca
ffcc00ce30 scons: Build NIR.
Emil Velikov:
 - Attribute the src/{glsl,compiler}/nir move
 - Flesh out to separate SConscript

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:59 +01:00
Jose Fonseca
feb6732e80 nir: Use _snprintf on Windows.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:37 +01:00
Jose Fonseca
ba0c0e3940 nir: Avoid structure initalization expressions.
Not supported by MSVC, and completely unnecessary -- inline functions
work just as well.

NIR_SRC_INIT/NIR_DEST_INIT could and probably should be replaced by the
inline functions.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:37 +01:00
Jose Fonseca
8f96524f13 nir: Remove unistd.h include.
It doesn't seem needed, and is not available on MSVC.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:31 +01:00
Jose Fonseca
f8e2f1fba5 nir: Avoid empty {} struct initializer.
Not supported by MSVC and consistent through NIR.

[Emil Velikov: rebase]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:33:52 +01:00
Emil Velikov
bb949e262c gallium/swr: fold the almost identical Makefiles
Rather than having two almost identical Makefiles, with various VPATH
hacks just fold them, using COMMON_* variables and actually getting
things buildable/shipable.

v2: whitespace fixes, remove Makefile.sources-arch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-04-14 16:30:57 +01:00
Tim Rowley
aee976703d install-gallium-links.mk: handle multiple libraries
Need to prevent bash from interpreting whitespace between libraries
as a command line.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-14 16:30:57 +01:00
Marek Olšák
112291964e radeonsi: don't overwrite the scratch offset in shader prologs
Prologs only look at num_input_sgprs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák
ffe44d0283 radeonsi: fold num_user_sgprs where it is possible
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák
51c4034f9b radeonsi: fix SGPRS calculation once more
This fixes GS piglit failures after adding SI_PARAM_SHADER_BUFFERS,
which bumped NUM_USER_SGPRS and uncovered this bug on SI.

If this was fixed in LLVM, these workarounds wouldn't be needed.

LLVM would have to look at the calling convention to know how many SGPR
inputs are declared, and add VCC and the scratch wave offset (which is
enabled even if we spill SGPRs but not VGPRs, oh well).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák
aaf5be4a29 radeonsi: disable hw ETC2 on Polaris
not supported by hw directly, but it's still fully supported by the driver

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-14 16:58:59 +02:00
Emil Velikov
4358cfc4ad doxygen: remove git rebase fallouts
Should never have been (git) added in the first place.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-14 09:49:09 +01:00
Jose Fonseca
8fcacb4f90 appveyor: Run unit tests.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Jose Fonseca
50ddf03ada scons: Add a "check" target to run all unit tests.
Except:
- u_cache_test -- too long
- translate_test -- unreliable (it's probably testing corner cases that
  translate module doesn't care about.)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Jose Fonseca
9ae0e8ee3c test/unit: Make translate_test invoke translate_create by default.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Jose Fonseca
f8a51034bd test/unit: Make pipe_barrier_test actually check correct bahavior.
So it can run unattended.

Also make it silent by default.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Jason Ekstrand
12f88ba32a Merge remote-tracking branch 'public/master' into vulkan 2016-04-13 20:25:39 -07:00
Michel Dänzer
171a570f38 clover: Fix build against LLVM SVN >= r266163
createInternalizePass now takes a callback instead of a StringSet.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-04-14 11:53:41 +09:00
Nanley Chery
79fbec30fc anv: Remove default scissor and viewport concepts
Users should never provide a scissor or viewport count of 0 because
they are required to set such state in a graphics pipeline. This
behavior was previously only used in Meta, which actually just
disables those hardware operations at pipeline creation time.

Kristian noticed that the current assignment of viewport count
reduces the number of viewport uploads, so it is not removed.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 18:02:38 -07:00
Nanley Chery
1949e502bc anv: Replace ::disable_scissor with ::use_rectlists
Meta currently uses screenspace RECTLIST primitives that lie within
the framebuffer rectangle. Since this behavior shouldn't change in the
future, disable the scissor operation whenever rectlists are used.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 18:00:41 -07:00
Nanley Chery
9f72466e9f anv: Delete anv_graphics_pipeline_create_info::disable_viewport
There are no users of this field.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 18:00:41 -07:00
Nanley Chery
cff0f6b027 gen{7,8}_pipeline: Always set ViewportXYClipTestEnable
For the following reasons, there is no behavioural change with this
commit: the ViewportXYClipTest function of the CLIP stage will continue
to be enabled outside of Meta (where disable_viewport is always false),
and the CLIP stage is turned off within Meta, so this function will
continue to be disabled in that case.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 18:00:41 -07:00
Nanley Chery
992bbed98d gen{7,8}_pipeline: Apply 3DPRIM_RECTLIST restrictions
According to 3D Primitives Overview in the Bspec, when the RECTLIST
primitive is in use, the CLIP stage should be disabled or set to have
a different Clip Mode, and Viewport Mapping must be disabled:

   Clipping: Must not require clipping or rely on the CLIP unit’s
   ClipTest logic to determine if clipping is required. Either the CLIP
   unit should be DISABLED, or the CLIP unit’s Clip Mode should be set
   to a value other than CLIPMODE_NORMAL.

   Viewport Mapping must be DISABLED (as is typical with the use of
   screen-space coordinates).

We swap out ::disable_viewport for ::use_rectlist, because we currently
always use the RECTLIST primitive when we disable viewport mapping, and
we'll likely continue to use this primitive.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 17:53:38 -07:00
Nanley Chery
88d1c19c9d anv_cmd_buffer: Don't make the initial state dirty
Avoid excessive state emission. Relevant state for an action command
will get set by the user:

From Chapter 5. Command Buffers,
 When a command buffer begins recording, all state in that command
 buffer is undefined.
 [...]
 Whenever the state of a command buffer is undefined, the application
 must set all relevant state on the command buffer before any state
 dependent commands such as draws and dispatches are recorded, otherwise
 the behavior of executing that command buffer is undefined.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 17:52:24 -07:00
Nanley Chery
9fae6ee026 anv/meta: Don't set the dynamic state for disabled operations
CmdSet* functions dirty the CommandBuffer's dynamic state. This causes
the new state to be emitted when CmdDraw is called. Since we don't need
the state that would be emitted, don't call the CmdSet* functions.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 17:52:20 -07:00
Nanley Chery
76b0ba087c anv/clear: Disable the scissor operation
Since the scissor rectangle always matches that of the framebuffer,
this operation isn't needed.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-04-13 17:45:18 -07:00
Jason Ekstrand
b63a98b121 nir/dead_variables: Configurably work with any variable mode
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-13 15:45:10 -07:00
Kenneth Graunke
505a8fbdf8 i965: Switch to NIR for ldexp lowering.
The old GLSL IR based lowering doesn't quite work right in all cases,
and fails several dEQP-GLES31 and Vulkan CTS tests.  Jason's new
approach in NIR passes all the tests.  There's not likely to be a ton
of advantage to lowering early in GLSL IR anyway, so...switch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:33 -07:00
Jason Ekstrand
4455bfa9a0 nir/algebraic: Add lowering for ldexp
The algorithm used is different from both the naive suggestion from the
GLSL spec and the one used in GLSL IR today.  Unfortunately, the GLSL IR
implementation that we have today doesn't handle denormals (for those that
care) or the case where the float source is +-inf.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:19 -07:00
Jason Ekstrand
765dd65349 i965: Implement the new imod and irem opcodes
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:08 -07:00
Jason Ekstrand
745b3d295e nir: Add more modulus opcodes
These are all needed for SPIR-V

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:00 -07:00
Jason Ekstrand
d880c6f9f5 i965/vec4: Inline get_pull_constant_offset
It's not really doing enough anymore to justify a helper function.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reveiewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-04-13 15:39:20 -07:00
Jason Ekstrand
dd616cab01 nir/lower_io: Allow for a full bitmask of modes
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 12:44:10 -07:00
Jason Ekstrand
2caaf0ac5e nir/lower_indirect: nir_variable_mode is now a bitfield
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 12:44:07 -07:00
Jason Ekstrand
ffa0e12e15 nir: Convert nir_variable_mode to a bitfield
There are several passes where we need to specify some set of variable
modes that the pass needs top operate on.  This lets us easily do that.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 12:40:12 -07:00
George Kyriazis
f69a61b1aa gallium/swr: Make flat shading tris work.
- Incorporate flatshade flag into the shader generation
- Use provoking vertex (vc) in shader when flat shading.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-13 13:46:37 -05:00
Rob Clark
c53a12fedc Revert "freedreno/a4xx: better occlusion/sample counting"
This reverts commit 62fa868728.

dEQP-GLES3.functional.occlusion_query.* was unhappy about that change.
Still not really sure *what* the other slots in the sample results
buffer are.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:40 -04:00
Rob Clark
46e9bbc918 freedreno/a4xx: rasterizer_discard support
This one is slightly annoying, since trying to write RBRC from draw
would clobber values set in the tiling/gmem code.  We could do command-
stream patching for RBRC, as is done on a3xx.  Although since it seems
to be a rarely used feature, it is easier just to do RMW to set/clear
the bit.

Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles
and related tests.

a3xx still needs the same feature, although there it probably makes more
sense to take advantage of the existing cmdstream patching which is
required for RBRC for other reasons.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:21 -04:00
Rob Clark
216225ce57 freedreno/ir3: fix array textures on a4xx
Seems like a4xx needs offset added to array index for all arrays,
whereas a3xx only for cubemap arrays.  Fixes a whole swath of dEQP fails
(roughly *sampler2darray*).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:14 -04:00
Rob Clark
7e93b26b5d freedreno: fix stream-out offset handling for lines/tris
We need to increment offset by # of vertices, not by # of prims.  Fixes
a bunch of dEQP fails involving prims other than points.  For example,
dEQP-GLES3.functional.transform_feedback.position.lines_separate

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:02 -04:00
Rob Clark
6ca6e80f61 freedreno: fix handling for stream-out offsets
If changed && append, we shouldn't be resetting the internal offset back
to zero.  This fixes issues w/ sequences like:

   glBeginTransformFeedback()
   glDraw()
   glPauseTransformFeedback()
   glDraw()
   glResumeTransformFeedback()
   glDraw()
   glEndTransformFeedback()

Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3
and related tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:54 -04:00
Rob Clark
0a4b0fc315 freedreno: fix prims-emitted query
This should only count when TF is not paused.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:47 -04:00
Rob Clark
a7eb12d089 freedreno: fix max-line-width
dEQP noticed that we were advertising completely bogus values.  The
actual maximum is 127.0f.

*But* we have to use an artifically low maximum to work around a bug
in the dEQP test, which gets confused when the max line width is too
large and lines start going off-screen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:31 -04:00
Rob Clark
6bf462a1ab freedreno: add flag to enable dEQP hacks
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:24 -04:00
Rob Clark
f68f6c0246 freedreno/ir3: hack to avoid getting stuck in a loop
There are still some edge cases which result in a neighbor-loop.  Which
needs to be fixed, but this hack at least makes deqp tests finish.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:13 -04:00
Rob Clark
dd70945e09 freedreno/ir3: use (ss) instead of (sy) for ldlv
Fixes a bunch of flat-varying fail on a4xx (where we need to use ldlv to
read the un-interpolated varying).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:05 -04:00
Rob Clark
b35ad6e701 freedreno/ir3: cleanup double cmps.s from frontend
Since we cannot mov into a predicate register, the frontend uses a
'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x.  It does this
since it has no way to know that the source cond instruction (ie.
for a kill, br, etc) will only be used to write the predicate reg.
Detect this, and re-write the instruction writing p0.x to skip the
original cmps.[sfu].  (It is done like this, rather than re-writing
the dest of the first cmps.[sfu] in case the first cmps.[sfu]
actually has other users.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:14:41 -04:00
Matt Turner
9bac27dbf9 glsl: Rename "vertex_input_slots" -> "is_vertex_input"
vertex_input_slots would be an appropriate name for an integer, but not
a bool.

Also remove a cond ? true : false from a count_attribute_slots() call
site, noticed during the rename.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 11:00:21 -07:00
Jose Fonseca
9586468c03 gallivm: Workaround LLVM PR 27332.
The credit for finding and isolating this bug goes to Vinson and Roland.

The buggy LLVM versions were found by doing

  opt -instcombine llvm-pr27332.ll > /dev/null

where llvm-pr27332.ll is the IR from
https://llvm.org/bugs/show_bug.cgi?id=27332#c3

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-13 16:42:55 +01:00
Marek Olšák
dd0a296895 gallium/radeon: move a comment to the correct place
trivial
2016-04-13 17:31:03 +02:00
Nicolai Hähnle
9e9a2bb44a radeonsi: gate PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT by LLVM version
Otherwise we incorrectly claim ARB_ssbo support even with older LLVM versions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94917
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-13 10:06:22 -05:00
Elie TOURNIER
f04565c876 doxygen: Generate Doxygen for NIR
Now, one can do the following to generate and read the nir Doxygen:
cd $MESA_TOP/doxygen
make
firefox nir/index.html

Update v2:
Correct TAGFILES in nir.doxy

Signed-off-by: Elie TOURNIER <tournier.elie@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>

[Emil Velikov] v3: Rebase.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:33 +01:00
Elie TOURNIER
3157df58d0 doxygen: update glsl link
Signed-off-by: Elie TOURNIER <tournier.elie@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
Tested-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:30 +01:00
Rhys Kidd
0e9fc1228a doxygen: Remove deprecated settings in common.doxy
These Doxygen features are deprecated, as reported by Doxygen 1.8.9.1

Warning: Tag `USE_WINDOWS_ENCODING' at line 66 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Warning: Tag `DETAILS_AT_TOP' at line 157 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Warning: Tag `HTML_ALIGN_MEMBERS' at line 616 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Warning: Tag `XML_SCHEMA' at line 848 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Warning: Tag `XML_DTD' at line 854 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Warning: Tag `MAX_DOT_GRAPH_WIDTH' at line 1115 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Warning: Tag `MAX_DOT_GRAPH_HEIGHT' at line 1123 of file `common.doxy' has become obsolete.
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:26 +01:00
Rhys Kidd
3d18ab72bf doxygen: Fix typo in doxygen/tnl.doxy
TAGFILE relative folder should match .tag file

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:23 +01:00
Rhys Kidd
4ba409a364 doxygen: Correct TAGFILE linkage of main
core.doxy was renamed to main.doxy, along with output folder in
the below 2004 commit.

Correct the other modules' TAGFILE linkage to find the main folder.

  commit 3ef972f538
  Author: Brian Paul <brian.paul@tungstengraphics.com>
  Date:   Sun May 16 22:07:02 2004 +0000

      Replaced 'core' with 'main'.
      Other minor updates.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:19 +01:00
Rhys Kidd
7703a3e3d0 doxygen: Update .gitignore
The last of these output directories was removed in 2007.

  commit c2e0570831
  Author: Jerome Glisse <glisse@freedesktop.org>
  Date:   Fri Feb 16 23:18:56 2007 +0100

      Update doxygen doc to reflet vbo changes.

      Update doxygen doc, array_cache no longuer exist,
      new shiny vbo modules is there. Tested on unix,
      but i think i didn't broke that bat :).

  commit 3ef972f538
  Author: Brian Paul <brian.paul@tungstengraphics.com>
  Date:   Sun May 16 22:07:02 2004 +0000

      Replaced 'core' with 'main'.
      Other minor updates.

  commit 69db632a9d
  Author: Jose Fonseca <j_r_fonseca@yahoo.co.uk>
  Date:   Thu May 1 23:32:54 2003 +0000

      Move the Doxygen configuration files into the usual places and integrate with the build system.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:15 +01:00
Rhys Kidd
ced18f4d60 doxygen: Remove references to miniglx
miniglx was removed in February 2010. Clean up remaining
unnecessary doxygen references.

  commit a9e3669683
  Author: Kristian Høgsberg <krh@bitplanet.net>
  Date:   Thu Feb 25 16:17:04 2010 -0500

      Remove remaining miniglx references

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:12 +01:00
Rhys Kidd
29b805b929 doxygen: Fix doxygen/gbm.doxy TAGFILES
There has never been a doxygen/gbm_setup output folder.

Appears to have been a copy-paste error from original commit
in 245341f406.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:08 +01:00
Rhys Kidd
684e7a4a14 doxygen: Correct TAGFILE relative paths
Per Doxygen documentation, to combine external documentation (stored in
a *.tag file) with a project the TAGFILES option should be set in the
configuration file.

  A tag file typically only contains a relative location of the
  documentation from the point where doxygen was run. So when
  you include a tag file in other project you have to specify
  where the external documentation is located in relation this
  project.

  You can do this in the configuration file by assigning the
  (relative) location to the tag files specified after the
  TAGFILES configuration option.

  If you use a relative path it should be relative with respect
  to the directory where the HTML output of your project is
  generated; so a relative path from the HTML output directory
  of a project to the HTML output of the other project that is
  linked to.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:44:04 +01:00
Rhys Kidd
f066fb529b doxygen: Fix doxygen/glapi.doxy
The src/mesa/glapi folder was relocated in the below commit.
Amend the doxygen/glapi.doxy INPUT setting accordingly.

Whilst here, in addition this change also avoids a bug in the
consolidated Doxygen output caused by doxygen/glapi.doxy inadvertently
overwriting doxygen/swrast.tag via its GENERATE_TAGFILE setting.

This bug depended upon the specific order each *.tag was built.

   commit 296adbd545
   Author: Chia-I Wu <olv@lunarg.com>
   Date:   Mon Apr 26 12:56:44 2010 +0800

       glapi: Move to src/mapi/.

       Move glapi to src/mapi/{glapi,es1api,es2api}.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:43:58 +01:00
Rhys Kidd
cf3bc91c06 doxygen: Remove src/mesa/shader/ references
Mesa has not had a src/mesa/shader/ folder since Mesa 7.9 removed it
in October 2010, as part of a revised GLSL compiler written by Intel.

Remove doxygen/shader.doxy and consequential changes made throughout.

In addition to removing an unnecessary Doxygen doxyfile, this change also
avoids a bug in the consolidated Doxygen output caused by
doxygen/shader.doxy inadvertently overwriting doxygen/swrast.tag via its
GENERATE_TAGFILE setting.

This bug depended upon the specific order each *.tag was built.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-13 13:43:54 +01:00
Marek Olšák
04f15e491f gallium/radeon: add an env variable to force a level of aniso filtering
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-13 12:42:28 +02:00
Jose Fonseca
cc5d8b678e llvmpipe: Test rounding of x.5.
Leverage nearbyintif function, which should be available on all C99
implementations.

Trivial.
2016-04-13 11:13:05 +01:00
Roland Scheidegger
cb438d8b3e gallivm: use llvm.nearbyint instead of llvm.round.
We used to use sse roundps intrinsic directly, but switched to use the llvm
intrinsics for rounding with e4f01da15d.
However, llvm semantics follows standard math lib round function which is
specced to do roundNearestAwayFromZero but we really want roundNearestEven
(moreoever, using round generates atrocious code since the cpu can't do it
directly and it results in scalar calls to libm __roundf).
So, use llvm.nearbyint instead, which does exactly the right thing, and even
has the advantage of being available with llvm 3.3 too. (I've verified it
actually generates a roundps instruction with llvm 3.3.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-13 11:13:03 +01:00
Pierre Moreau
f525db6358 nv50/ra: isinf() is in namespace std since C++11.
This fixes a compile error while building Nouveau with C++11 enabled (and
glibc >= 2.23). This happens if SWR is enabled, as it forces C++11.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Jose Fonseca <jfonseca@vmware.com>

https://bugs.freedesktop.org/show_bug.cgi?id=94907
2016-04-13 07:41:13 +01:00
Jose Fonseca
fa46848e51 scons: Allow building with Address Sanitizer.
libasan is never linked to shared objects (which doesn't go well with
-z,defs).  It must either be linked to the main executable, or (more
practically for OpenGL drivers) be pre-loaded via LD_PRELOAD.

Otherwise works.

I didn't find anything with llvmpipe.  I suspect the fact that the
JIT compiled code isn't instrumented means there are lots of errors it
can't catch.

But for non-JIT drivers, the Address/Leak Sanitizers seem like a faster
alternative to Valgrind.

Usage (Ubuntu 15.10):

   scons asan=1 libgl-xlib
   export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib
   LD_PRELOAD=libasan.so.2 any-opengl-application

Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-04-13 06:54:32 +01:00
Kenneth Graunke
d1c89f6005 mesa: Change an error code in glSamplerParameterI[iu]v().
This is supposed to be INVALID_OPERATION in ES.  We already did this
for the fv/iv variants, but not Iiv/Iuv, which are new in ES 3.2 (or
extensions).

Fixes:
ES31-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-12 20:30:32 -07:00
Jose Fonseca
46bfcd61f5 softpipe: Free tgsi.image elements on context destruction.
Courtesy of address sanitizer.

[airlied: free buffers as well]
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-13 13:21:37 +10:00
Edward O'Callaghan
5a3d928e2c softpipe: Enable ARB_framebuffer_no_attachments
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-13 13:21:37 +10:00
Eric Anholt
3b63301d9f vc4: Work around hardware limits on the number of verts in a single draw.
Fixes rendering failures in glmark2's refract and
bump:render-mode=high-poly demos, and partially in its terrain demo.
2016-04-12 19:10:51 -07:00
Thomas Hindoe Paaboel Andersen
6d6525a377 softpipe: avoid buffer overflow
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-13 11:51:35 +10:00
Thomas Hindoe Paaboel Andersen
b89708f95f tgsi: fix buffer overflow
Increase r to four channels as rgba is written to it
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-13 11:51:34 +10:00
Tim Rowley
b9294bc345 swr: handle pci cap requests
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-04-12 20:18:00 -05:00
Tim Rowley
b19d214b23 swr: support samplers in vertex shaders
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-04-12 20:18:00 -05:00
Nicolai Hähnle
10cfd7a604 radeonsi: enable GLSL 4.20 and therefore OpenGL 4.2
This is the last necessary bit for OpenGL 4.2 support. All driver-specific
functionality has already been implemented as part of extensions.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 20:13:49 -05:00
Iurie Salomov
047e3264f6 va: check null context in vlVaDestroyContext
Signed-off-by: Iurie Salomov <iurcic@gmail.com>
Reviewed-by: Julien Isorce <j.isorce@samsung.com>
2016-04-13 00:52:53 +01:00
Jason Ekstrand
8f3b516f2e nir/clone: Copy bit size when cloning registers
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-12 16:41:58 -07:00
Marek Olšák
8e70a58af3 radeonsi: fix a critical SI hang since PIPELINESTAT_START/STOP was added
For some reason unknown to me, SI hangs if the event is written after
CONTEXT_CONTROL.
2016-04-13 01:05:15 +02:00
Kenneth Graunke
95d622e16d glsl: Don't copy propagate or tree graft precise values.
This is kind of a hack.  We currently track precise requirements
by decorating ir_variables.  Propagating or grafting the RHS of an
assignment to a precise value into some other expression tree can
lose those decorations.

In the long run, it might be better to replace these ir_variable
decorations with an "exact" decoration on ir_expression nodes,
similar to what NIR does.

In the short run, this is probably good enough.  It preserves
enough information for glsl_to_nir to generate "exact" decorations,
and NIR will then handle optimizing these expressions reasonably.

Fixes ES31-CTS.gpu_shader5.precise_qualifier.

v2: Drop invariant handling, as it shouldn't be necessary (caught
    by Jason Ekstrand).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-12 15:57:48 -07:00
Mark Janes
9e351e077b util: Fix race condition on libgcrypt initialization
Fixes intermittent Vulkan CTS failures within the test groups:
dEQP-VK.api.object_management.multithreaded_per_thread_device
dEQP-VK.api.object_management.multithreaded_per_thread_resources
dEQP-VK.api.object_management.multithreaded_shared_resources

Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-12 15:38:43 -07:00
Kristian Høgsberg Kristensen
8ec971a997 i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typo
Copy and paste error in commit eafeb8db66:

    i965/tiled_memcpy: Unroll bytes==64 case.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-12 15:32:43 -07:00
Kristian Høgsberg Kristensen
1af0f0151c glsl/linker: Recurse on struct fields when adding shader variables
ARB_program_interface_query requires that we add struct fields
recursively down to basic types.

Fixes 52 struct test cases in dEQP-GLES31.functional.program_interface_query.*

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen
778fd46aa4 glsl/linker: Pass name and type through to create_shader_variable()
No functional change here, but this now lets us recurse throught structs
in add_shader_variable().

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen
09f0121593 glsl/linker: Pass absolute location to add_shader_variable()
This lets us pass in the absolution location of a variable instead of
computing it in add_shader_variable() based on variable location and
bias. This is in preparation for recursing into struct variables.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen
8ab6aae4dc glsl/linker: Add add_shader_variable() helper
This consolidates the combination of create_shader_variable() and
add_program_resource() into a new helper function. No functional
difference, but we'll expand add_shader_variable() in the next few
commits.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 14:38:26 -07:00
Matt Turner
eafeb8db66 i965/tiled_memcpy: Unroll bytes==64 case.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 14:37:05 -07:00
Roland Scheidegger
0e605d9b3a i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle.
The existing code uses SSSE3, and because it isn't compiled in a
separate file compiled with that, it is usually not used (that, of
course, could be fixed...), whereas SSE2 is always present with 64-bit
builds.  This should be pretty much as fast as the pshufb version,
albeit those code paths aren't really used on chips without llc in any
case.

v2: fix andnot argument order, add comments
v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments
v4: [mattst88] Rebase

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-12 14:37:01 -07:00
Matt Turner
fc88b4babf i965/tiled_memcpy: Move SSSE3 code back into inline functions.
This will make adding SSE2 code a lot cleaner.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 14:36:59 -07:00
Matt Turner
0a5d8d9af4 i965/tiled_memcpy: Optimize RGBA -> BGRA swizzle.
Replaces four byte loads and four byte stores with a load, bswap,
rotate, store; or a movbe, rotate, store.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 14:36:56 -07:00
Nicolai Hähnle
a191e6b719 radeonsi: fix bounds check in si_create_vertex_elements
This was triggered by
dEQP-GLES3.functional.vertex_array_objects.all_attributes

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-12 16:32:46 -05:00
Nicolai Hähnle
4285a97cea docs: mark atomic counters and SSBOs as done for radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:51 -05:00
Nicolai Hähnle
bfd11c5996 radeonsi: enable shader buffer pipe caps
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:48 -05:00
Nicolai Hähnle
4e81843b13 radeonsi: add shader buffer support to TGSI_OPCODE_RESQ
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:45 -05:00
Nicolai Hähnle
01109282ce radeonsi: add shader buffer support to TGSI_OPCODE_STORE
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:43 -05:00
Nicolai Hähnle
745014c502 radeonsi: add shader buffer support to TGSI_OPCODE_LOAD
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:41 -05:00
Nicolai Hähnle
68bc25c931 radeonsi: add shader buffer support to TGSI_OPCODE_ATOM*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:38 -05:00
Nicolai Hähnle
c6f5d000db radeonsi: add offset parameter to buffer_append_args
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:35 -05:00
Nicolai Hähnle
c565466eea radeonsi: adjust buffer_append_args to take a 128 bit resource
Move the buffer resource extraction code out into its own function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:32 -05:00
Nicolai Hähnle
e88018ffe5 radeonsi: preload shader buffers in shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:29 -05:00
Nicolai Hähnle
c495c0ad37 radeonsi: implement set_shader_buffers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:26 -05:00
Nicolai Hähnle
73c8b85b64 radeonsi: move resetting of constant buffers into a separate function
This will be re-used for shader buffers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 16:30:04 -05:00
Haixia Shi
35ade36c88 dri/i965: fix incorrect rgbFormat in intelCreateBuffer().
It is incorrect to assume that pixel format is always in BGR byte order.
We need to check bitmask parameters (such as |redMask|) to determine whether
the RGB or BGR byte order is requested.

v2: reformat code to stay within 80 character per line limit.
v3: just fix the byte order problem first and investigate SRGB later.
v4: rebased on top of the GLES3 sRGB workaround fix.
v5: rebased on top of the GLES3 sRGB workaround fix v2.

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 14:06:45 -07:00
Kenneth Graunke
e303e88a9c glsl: Reject illegal qualifiers on atomic counter uniforms.
This fixes

dEQP-GLES31.functional.uniform_location.negative.atomic_fragment
dEQP-GLES31.functional.uniform_location.negative.atomic_vertex

Both of which have lines like

layout(location = 3, binding = 0, offset = 0) uniform atomic_uint uni0;

The ARB_explicit_uniform_location spec makes a very tangential mention
regarding atomic counters, but location isn't something that makes sense
with them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-12 14:06:42 -07:00
Kenneth Graunke
929e44099f glsl: Add a method to print error messages for illegal qualifiers.
Suggested by Timothy Arceri a while back on mesa-dev:
https://lists.freedesktop.org/archives/mesa-dev/2016-February/107735.html

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2016-04-12 14:06:42 -07:00
John Sheu
7f08547248 xlib: fix memory leak on Display close
The XMesaVisual instances freed in the visuals table on display close
are being freed with a free() call, instead of XMesaDestroyVisual(),
causing a memory leak.

Signed-off-by: John Sheu <sheu@google.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-12 13:56:41 -06:00
Jakob Sinclair
d04bb14d04 st/mesa: Replace GLvoid with void
GLvoid was used before in OpenGL but it has changed to just using void.
All GLvoids in mesa's state tracker has been changed to void in this patch.

Tested this with piglit and no problems were found. No compiler warnings.

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-12 13:37:16 -06:00
Bas Nieuwenhuizen
126da23d70 radeonsi: Mark ARB_robust_buffer_access_behavior as supported.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-12 20:53:10 +02:00
Bas Nieuwenhuizen
70dcd841f7 gallium: Add capability for ARB_robust_buffer_access_behavior.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 20:53:06 +02:00
Bas Nieuwenhuizen
285dc05055 mesa: Expose the ARB_robust_buffer_access_behavior extension.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-12 20:40:26 +02:00
Miklós Máté
aad8707b28 main: rework the compatibility check of visuals in glXMakeCurrent
Now it follows the compatibility criteria listed in section 2.1 of
the GLX 1.4 specification.
This is needed for post-process effects in SW:KotOR.

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-04-12 19:48:01 +02:00
Tim Rowley
df37b06276 swr: [rasterizer core] warning cleanup
Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
06c59dc417 swr: [rasterizer] Put in rudimentary garbage collection for the global arena allocator
- Check for unused blocks every few frames or every 64K draws
- Delete data unused since the last check if total unused data is > 20MB

Doesn't seem to cause a perf degridation

Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
b990483de2 swr: [rasterizer core] Put DRAW_CONTEXT on a diet
No need for 256 pointers per DC.

Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
a939a58881 swr: [rasterizer core] Add experimental support for hyper-threaded front-end
Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
9a8146d0ff swr: [rasterizer] Avoid segv in thread creation on machines with non-consecutive NUMA topology.
Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
2c71fd4bf8 swr: [rasterizer core] Replace all naked OSALIGN macro uses with OSALIGNSIMD / OSALIGNLINE
Future proofing

Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
32a8653ad2 swr: [rasterizer] Ensure correct alignment of stack variables used as vectors
Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
e1871c4459 swr: [rasterizer core] Quantize depth to depth buffer precision prior to depth test/write.
Fixes z-fighting issues.

Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
2a19aca05f swr: [rasterizer common] win32 build fixups
Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:05 -05:00
Tim Rowley
c25244f2f7 swr: [rasterizer core] Affinitize thread scratch space to numa node of worker
Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:04 -05:00
Tim Rowley
f89f6d562a swr: [rasterizer] Misc fixes identified by static code analysis
No perf loss detected

Acked-by: Brian Paul <brianp@vmware.com>
2016-04-12 11:52:04 -05:00
Brian Paul
6c01478213 st/mesa: fix memleak in glDrawPixels cache code
If the glDrawPixels size changed, we leaked the previously cached
texture, if there was one.  This patch fixes the reference counting,
adds a refcount assertion check, and better handles potential malloc()
failures.

Tested with a modified version of the drawpix Mesa demo which changed
the image size for each glDrawPixels call.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-04-12 10:44:45 -06:00
Jose Fonseca
b5105e67a8 gallium: Use STATIC_ASSERT whenever possible.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 16:56:15 +01:00
Jose Fonseca
b025c23cfe softpipe: Use STATIC_ASSERT whenever possible.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 16:56:15 +01:00
Jose Fonseca
2f13d7543f svga: Use STATIC_ASSERT whenever possible.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 16:56:15 +01:00
Jose Fonseca
7279098dc5 mesa: Use STATIC_ASSERT whenever possible.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-12 16:56:15 +01:00
Marek Olšák
686b018ab3 r600g: use common scissor and viewport code
It's the same as radeonsi. This adds guard band support to r600g.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:13:25 +02:00
Marek Olšák
87a5b07f90 gallium/radeon: add R600/Evergreen/Cayman support to common viewport code
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:13:25 +02:00
Marek Olšák
2ca5566ed7 radeonsi: move scissor and viewport states into gallium/radeon
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:13:24 +02:00
Marek Olšák
db00f6cc9c radeonsi: use guard band clipping
Guard band clipping speeds up rasterization for primitives that are
partially off-screen.  This change in particular results in small
framerate improvements in a wide range of games.

Started by Grigori Goronzy <greg@chown.ath.cx>.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:12:14 +02:00
Marek Olšák
cb21f8a97c radeonsi: compute scissor from viewport in set_viewport_states
and clamp it right before emitting. This is a prerequisite for computing
the guard band.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:49 +02:00
Marek Olšák
5b6a0b7fc0 gallium/radeon: set GTT WC on tiled textures
Just for consistency. This should have no effect, because OpenGL textures
always go to VRAM.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
5a4b74d1ba gallium/radeon: relax requirements on VRAM placements on APUs
This makes Tonga with vramlimit=128 2x faster in Heaven.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
a57309f807 winsys/amdgpu: remove hack for low VRAM configuration
A better solution will be used.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
b36f19bf98 r600g: disable aniso filtering for non-mipmap textures on EG
this is the default behavior of the closed driver when running on VI

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
3bc2d967c4 r600g: clean up aniso state translation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
b0d4469519 radeonsi: disable aniso filtering for non-mipmap textures on SI-CI
The closed driver does this, but it looks at base_level and last_level
and uses a conditional assignment, which LLVM can't generate on SGPRs.

That led me to invent this solution that abuses the image descriptor.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
ddd33431c5 radeonsi: clean up aniso state translation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
f7420ef5b4 radeonsi: enable some sampler fields to match the closed driver
copied from the Vulkan driver

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
1a98be001f gallium/radeon: fix maximum texture anisotropy setup
We were overdoing it for non-power-of-two values.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
2d7be5d37e gallium/radeon: never choose a linear tiling for DB surfaces
Just for consistency. This is actually not a problem, because both addrlib
and radeon check and fix this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
b7878146c4 gallium/radeon: removing dead code for sharing stencil buffers
This is a remnant of the times when the DDX was allocating depth-stencil
buffers for windows. Now, st/dri allocates them and doesn't share them.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
73aeebd772 radeonsi: allow clearing buffers >= 4 GB
Only CMASK and DCC clears can use this, because only textures can be so
large.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
1dd8832e04 gallium/radeon: allow allocating textures >= 4 GB
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:48 +02:00
Marek Olšák
0689741e51 winsys/radeon: fix printing allocation failures
print as unsigned instead of signed

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
0ba0933f48 winsys/amdgpu: add support for 64-bit buffer sizes
v2: fail in radeon_winsys_bo_create if size > 32 bits

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
7e78b5ed38 pb_buffer: switch pb_buffer::size to 64 bits
being able to allocate more than 4 GB may be useful

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
e241a63512 gallium/radeon: remove R600_QUERY_HW_FLAG_TIMER
not used anymore

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
0222351fc1 gallium/radeon: merge timer and non-timer query lists
All of them are paused only between IBs.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
7347c068d8 r600g: don't manually stop queries for blitter
r600_set_active_query_state does it better.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
12fee5b93e r600g: add pausing pipeline & streamout queries into set_active_query_state
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
e90fe60b72 r600g: implement set_active_query_state for pausing occlusion queries
Use ZPASS_INCREMENT_DISABLE everywhere.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
5248676f87 r600g: simplify r600_set_occlusion_query_state
The caller does the same checking.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
b82893f93a gallium/radeon: move pipeline stat context flags to common code
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
aa79a3269f r600g: fix typo in r600 register definitions
Acked-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
a4c288d8e1 gallium/radeon: unify checking streamout enable state
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:47 +02:00
Marek Olšák
466aa57185 radeonsi: fix mask checking when emitting scissors and viewports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
2016-04-12 14:29:46 +02:00
Marek Olšák
f3eebb84eb radeonsi: implement and rely on set_active_query_state
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:46 +02:00
Marek Olšák
e599b8f384 gallium: pause queries for all meta ops
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:46 +02:00
Marek Olšák
26171bd67e gallium: add pipe_context::set_active_query_state for pausing queries
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:46 +02:00
Bas Nieuwenhuizen
fc67375379 radeonsi: Synchronize a streamout write after read hazard.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-12 13:55:38 +02:00
Hans de Goede
dccdb655a1 nv30: Add missing PIPE_SHADER_CAP_INTEGERS to get_shader_param()
Add missing PIPE_SHADER_CAP_INTEGERS for frag shaders to
nv30_screen_get_shader_param().

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-04-12 11:41:12 +02:00
Haixia Shi
b0e3ba61b5 dri/i965: extend GLES3 sRGB workaround to cover all formats
It is incorrect to assume BGRA byte order for the GLES3 sRGB workaround.

v2: use _mesa_get_srgb_format_linear to handle all formats

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 02:06:12 -07:00
Eduardo Lima Mitev
ea8a65f503 i965: Add autogenerated 'brw_nir_trig_workarounds.c' to gitignore
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 10:44:19 +02:00
Rhys Kidd
703c1e69d8 glsl: Update hash table comments in constant propagation
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-12 01:29:19 -07:00
Dave Airlie
afa8707ba9 softpipe: add SSBO/shader atomics support.
This adds support for the features requires for ARB_shader_storage_buffer_object
and ARB_shader_atomic_counters, ARB_shader_atomic_counter_ops.

[airlied: some cleanups applied]
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:16:13 +10:00
Dave Airlie
c2aeeca455 draw: add support for passing buffers to vs/gs shaders.
Like the image code, but for shader buffers this time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:36 +10:00
Dave Airlie
081a958bcd tgsi: add support for buffer/atomic operations to tgsi_exec.
This adds support for doing load/store/atomic operations on
buffer objects.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:33 +10:00
Dave Airlie
9c7a0d188a tgsi: set nonhelpermask for vertex shaders
For atomic operations we really need to avoid executing unnecessary shaders, so for some
tests that just draw a single point we only want one vertex to get processed not 4,

this fixes a number of the atomic counters tests.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-12 14:15:16 +10:00
Ian Romanick
193a5cee6a nir: Fix typo in comment
Trivial.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-11 19:24:19 -07:00
Markus Wick
18c8b927e2 nir: Merge redudant integer clamping.
Dolphin uses them a lot. Range tracking would be better in the long term,
but this two lines works fine for now.

Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 18:48:50 -07:00
Kenneth Graunke
bfd17c76c1 i965: Port INTEL_PRECISE_TRIG=1 to NIR.
This makes the extra multiply visible to NIR's algebraic optimizations
(for constant reassociation) as well as constant folding.  This means
that when the result of sin/cos are multiplied by an constant, we can
eliminate the extra multiply altogether, reducing the cost of the
workaround.

It also means we only have to implement it one place, rather than in
both backends.

This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion,
which has a ton of sin() calls, but always multiplies them by an
immediate constant.  The extra multiply gets folded away.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 18:44:17 -07:00
Kenneth Graunke
b0dffdc616 i965: Pass brw_compiler into brw_preprocess_nir() instead of is_scalar.
I want to be able to read other fields.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 18:44:12 -07:00
Kenneth Graunke
808d26c771 nir: Silence unused "options" warning in algebraic passes.
Some passes may not refer to options->..., at which point the compiler
will warn about an unused variable.  Just cast to void unconditionally
to shut it up.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 18:44:08 -07:00
Kenneth Graunke
5886cd79a0 nir: Do basic constant reassociation.
Many shaders contain expression trees of the form:

    const_1 * (value * const_2)

Reorganizing these to

    (const_1 * const_2) * value

will allow constant folding to combine the constants.  Sometimes, these
constants are 2 and 0.5, so we can remove a multiply altogether.  Other
times, it can create more immediate constants, which can actually hurt.

Finding a good balance here is tricky.  While much more could be done,
this simple patch seems to have a lot of positive benefit while having
a low downside.

shader-db results on Broadwell:

total instructions in shared programs: 8963768 -> 8961369 (-0.03%)
instructions in affected programs: 438318 -> 435919 (-0.55%)
helped: 1502
HURT: 245

total cycles in shared programs: 71527354 -> 71421516 (-0.15%)
cycles in affected programs: 11541788 -> 11435950 (-0.92%)
helped: 3445
HURT: 1224

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 18:43:55 -07:00
Boyuan Zhang
1c7ba7f156 radeon/uvd: alignment fix for decode message buffer
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-04-11 19:30:47 -04:00
Brian Paul
704d203d5f st/mesa: replace _mesa_sysval_to_semantic table with function
Instead of using an array indexed by SYSTEM_VALUE_x, just use a
switch statement.  This fixes a regression caused by inserting new
SYSTEM_VALUE_ enums but not updating the mapping to TGSI semantics.

v2: fix a few switch statement mistakes for compute-related enums

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-11 17:04:13 -06:00
Jason Ekstrand
a9e6213edd nir/lower_system_values: Add support for several computed values
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-11 13:53:03 -07:00
Jason Ekstrand
39103145ff glsl/shader_enums: Add the other two compute builtins
These weren't added before because they are actually calculated values that
are computed from other inputs.  However, in order to handle them in
nir_lower_system_values, it's nice for them to have a cannonical locaiton.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-11 13:53:00 -07:00
Jason Ekstrand
22836dbefa glsl/shader_enums: Add an enum for Vulkan InstanceIndex
In Vulkan, you have InstanceIndex which begins at the base instance value
rather than the zero-based InstanceID of GL.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-11 13:52:51 -07:00
Emil Velikov
581c8016f8 mesa: add missing header to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-11 19:08:23 +01:00
Emil Velikov
5e010a72c9 drivers/softpipe: add missing header to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-11 19:08:23 +01:00
Emil Velikov
c69ab885d7 mesa: automake: update and reuse X86_SSE41_FILES list
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 19:08:23 +01:00
Emil Velikov
28da0d6922 compiler: android: flesh out nir into separate makefile
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 19:08:23 +01:00
Emil Velikov
8d51500b2d compiler: automake: flesh out NIR into separate makefile.
Analogous to previous commit - improved readability at the expense of
an extra file.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 19:08:23 +01:00
Emil Velikov
9324afc0e9 compiler: automake: split out glsl into separate makefile
Preserve the functionality while keeping the files smaller and
more readable.

v2: Do not include Makefile.sources from the GLSL makefile (silences
automake warnings)

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-04-11 19:08:23 +01:00
Emil Velikov
3d67780b80 compiler: remove {glsl,nir}/Makefile.sources
No longer used as of last commit.

v2: Rebase.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-04-11 19:08:23 +01:00
Emil Velikov
c481c8f7f1 configure.ac: update the path of the generated files
... in order to determine if we need bison/flex. Failing to locate the
files will lead to mandating bison/flex even when building from a
release tarball.

CC: "11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 19:08:23 +01:00
Emil Velikov
4db8f15a25 glsl: move the android build scripts a level up
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 19:08:23 +01:00
Emil Velikov
abf7088eb7 glsl: move the scons build script a level up
It will allow us to remove the duplicate glsl/Makefile.sources.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 19:08:23 +01:00
Emil Velikov
594e868555 Part revert "gallium/auxiliary: don't build NIR sources with MSVC2008 flags"
This reverts commit 41c7912d04 but leaves
out the pragma [that inspired the original commit].

Building mesa requires MSVC2013 or later, thus we no longer need this.

v2: Use correct include path (src/glsl/nir -> src/compiler/nir)

Conflicts:
	src/gallium/auxiliary/Makefile.am

Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-04-11 19:08:23 +01:00
Nicolai Hähnle
590a37dc05 GL3: ARB_shader_image_load_store/size is done for radeonsi also in GLES
Trivial.
2016-04-11 12:48:10 -05:00
Brian Paul
05aec42d3d docs: fix Coverity URL 2016-04-11 09:10:39 -06:00
Oded Gabbay
d97f5d60f5 tgsi/doc: fix spelling error
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-11 11:43:43 +03:00
Jason Ekstrand
3aa1a5ee88 nir/lower_system_values: Simplify the computation of LocalInvocationIndex 2016-04-10 23:43:38 -07:00
Connor Abbott
a89c474157 nir: add a pass for lowering (un)pack_double_2x32
v2: Undo unintended change to the signature of
    nir_normalize_cubemap_coords (Iago).

v3: Move to compiler/nir (Iago)

v4: Remove Authors from copyright header (Michael Schellenberger)

v5 (Sam):
- Use nir_channel() and nir_ssa_for_alu_src() helpers (Jason)
- Inline lower_double_pack_instr() code into lower_double_pack_block()
  (Jason).
- Initialize nir_builder at lower_double_pack_impl() (Jason).

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott
663e6421df nir: add split versions of (un)pack_double_2x32
v2 (Sam):
- Use uint64 instead of float64 for sources and destinations. (Connor)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott
b093808d26 nir: don't try to scalarize unpack_double_2x32
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott
9e31e0a21b nir: add support for (un)pack_double_2x32
v2 (Sam):
- Use uint64 instead of float64 for sources and destinations. (Connor)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga
d5d6260329 nir: add i2d and u2d opcodes
v2:
- Assert supports_int and don't fallback to nir_fmov (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga
b16d06252e nir: add d2i, d2u, d2b opcodes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott
a4bce07dc6 nir: add support for d2f and f2d
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga
fab5d4cd95 nir/glsl_to_nir: set bit_size on ssbo_load result
v2 (Sam):
- Add missing bit_size assignment when ssbo_load destination is a boolean.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Samuel Iglesias Gonsálvez
a741378cb5 nir/glsl_to_nir: add bit-size info to add_instr()
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:28:01 +02:00
Connor Abbott
4b37c64f3b nir/split_var_copies: handle doubles
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott
106a1b5501 nir/instr_set: handle 64-bit bit-sizes
v2: Revert spurious change in nir_opt_cse.c (Iago)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott
f2ccb63be1 nir: handle doubles in nir_deref_get_const_initializer_load()
v2 (Sam):
- Use proper bitsize value when calling to nir_load_const_instr_create()
  (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott
41c2541fc7 nir/print: add support for printing doubles and bitsize
v2:
- Squash the printing doubles related patches into one patch (Sam).

v3:
- Print using PRIx64 format: long is 32-bit on some 32-bit platforms but long
long is basically always 64-bit (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott
f5551f8a8b nir/glsl_to_nir: support doubles
v2:
- Don't set sized types to the destination of texture related opcodes.
  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Iago Toral Quiroga
8e69782e3e nir/lower_load_const_to_scalar: support doubles and multiple bit sizes
v2 (Sam):
- Add assert to detect bitsizes differents than 32 and 64 (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Iago Toral Quiroga
12f628adcb nir/lower_to_source_mods: Handle different bit sizes
v2 (Sam):
- Use helper to get base type from nir_alu_type.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez
3663a2397e nir: add bit_size info to nir_load_const_instr_create()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Connor Abbott
a5b17ae745 nir/lower_vec: adapt to different bit sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez
e3edaec739 nir: add bit_size info to nir_ssa_undef_instr_create()
v2:
- Make the users to give the right bit_sizes as arguments (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Connor Abbott
41a39e3384 nir/locals_to_regs: adapt to different bit sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Connor Abbott
40d1b671a9 nir/from_ssa: adapt to different bit sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Timothy Arceri
4979cec820 i965: fix struct type in comment
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-11 14:03:09 +10:00
Jason Ekstrand
7d58cfa366 nir: Add a pass for gathering various bits of shader info
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-10 20:43:47 -07:00
Ilia Mirkin
875543e270 i965: enable OES_texture_buffer on gen7+
It will only end up getting exposed on gen8+ since it requires GL ES
3.1, but it should be ready to go on gen7 when support for GL ES 3.1 is
completed there.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-10 20:24:26 -07:00
Dave Airlie
6f5f818b6d docs: add some missing softpipe entries.
I just forgot these when I added this stuff.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-11 13:14:48 +10:00
Kenneth Graunke
26c56e24e7 glsl: Don't remove XFB-only varyings.
Consider the case of linking a program with both a vertex and fragment
shader.  The VS may compute output varyings that are intended for
transform feedback, and not read by the fragment shader.

In this case, var->data.is_unmatched_generic_inout will be true,
but we still cannot eliminate the varyings.  We need to also check
!var->data.is_xfb_only.

Fixes failures in ES31-CTS.gpu_shader5.fma_precision_*, which happen
to use transform feedback in a way we apparently hadn't seen before.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-10 19:03:06 -07:00
Kenneth Graunke
ce84a92df5 i965/disasm: Decode per-slot offsets.
We just never bothered to decode this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-04-09 21:10:32 -07:00
Kenneth Graunke
20c8f36508 i965/disasm: Decode "channel mask present" bit correctly.
Bit 15 means "interleave" for most messages, but for SIMD8 messages it
means "use channel masks".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-04-09 21:10:20 -07:00
Kenneth Graunke
b790232524 i965/disasm: Simplify the URB opcode printing with ?:.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-04-09 21:10:11 -07:00
Ilia Mirkin
9b5bd20eb2 glsl: allow usage of the keyword buffer before GLSL 430 / ESSL 310
The GLSL 4.20 and ESSL 3.00 specs don't list 'buffer' as a reserved
keyword. Make the parser ignore it unless GLSL 4.30 / ESSL 3.10 are
used, or ARB_shader_storage_buffer_objects is enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-04-09 20:41:54 -04:00
Jason Ekstrand
bff7a8c4f3 anv/pipeline: Set up flat enables correctly 2016-04-09 17:06:59 -07:00
Jason Ekstrand
1275c7c744 genxml: Fix the name of a 3DSTATE_SF/SBE field on gen6-7.5 2016-04-09 17:02:21 -07:00
Jason Ekstrand
aa6f9a4e1e genxml: Break output detail of 3DSTATE_SF on gen7 into a struct
This makes it work like 3DSTATE_SBE[_SWIZ] on gen7+
2016-04-09 17:00:22 -07:00
Jason Ekstrand
ddae342618 genxml: Fix up MOCS in RENDER_SURFACE_STATE on gen6 to match gen7 2016-04-09 16:59:04 -07:00
Ilia Mirkin
cdb6fa91fa nvc0: handle the case where there are no framebuffer attachments
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-09 14:55:44 -04:00
Ilia Mirkin
59ca92137b nv50,nvc0: support sending string markers down into the command stream
This should hopefully make it a little easier to debug with GL
applications like glretrace and looking at command streams.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-09 14:55:43 -04:00
Ilia Mirkin
f9480d7918 nv50,nvc0: add invalidate_resource support for buffer resources
Provide a callback to reallocate the underlying storage of a resource so
that it is not bound to any existing fences.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-09 14:55:43 -04:00
Eric Anholt
30b818d5eb vc4: Move FRAG_X/Y/REV_FLAG to a QFILE like VPM or TLB color writes.
This gives us one less set of special instruction generation cases, and
instead just the case for returning the correct register to read.
2016-04-08 18:41:46 -07:00
Eric Anholt
f029932cac vc4: Allow TLB Z/color/stencil writes from any ALU operation in QIR.
This lets us write the Z directly from the FTOI for computed Z, and may
let us coalesce color writes in the future.

No change in my shader-db, but clearly drops an instruction in piglit's
early-z test.
2016-04-08 18:41:46 -07:00
Eric Anholt
44d7b8ad12 vc4: Add a helper function for the construction of qregs.
The separate declaration of the struct is not helping clarity, and I was
going to be writing a whole lot more of these in the upcoming patches.
2016-04-08 18:41:45 -07:00
Eric Anholt
114c8b38d3 vc4: Add missing scheduling dependency for MS color writes. 2016-04-08 18:41:45 -07:00
Eric Anholt
483c172989 vc4: Drop the multi_instruction distinction for QIR instructions.
It wasn't correctly flagged everywhere, and QPU generation now handles the
only remaining case that was paying attention to it.

No change on shader-db.
2016-04-08 18:41:45 -07:00
Eric Anholt
a8b525f8c4 vc4: Handle SF on instructions that write r4.
Normal SFU writes couldn't have SF because they were marked as
multi_instruction, but tex_result and tlb_color_read weren't.  This ended
up not being a problem according to anything in shader-db, but it seems
possible.
2016-04-08 18:41:45 -07:00
Eric Anholt
e46b48963a vc4: Allow multi-instruction QIR nodes to get VPM optimization.
There used to be multi-instruction operations that would use src[] twice,
which is why we couldn't do some optimizations on them.  This is no longer
the case.

total instructions in shared programs: 77973 -> 77969 (-0.01%)
instructions in affected programs:     84 -> 80 (-4.76%)
total estimated cycles in shared programs: 234165 -> 234157 (-0.00%)
estimated cycles in affected programs:     92 -> 84 (-8.70%)
2016-04-08 18:41:45 -07:00
Eric Anholt
99a759a4a3 vc4: Switch to using NIR_PASS macros.
This gets us better validation of our NIR transformations.
2016-04-08 18:41:45 -07:00
Eric Anholt
7030eadbed vc4: Handle nir_intrinsic_load_user_clip_plane as a vec4.
I liked having all my NIR be scalar, but nir_validate() complains that the
intrinsic writes 4 components but the destination we set up was only 1
component.  I could generate a new scalar variant, but it's a lot easier
to just leave it as a vec4.  This doesn't hurt codegen since we GC unused
uniforms, and UCP dot products use all the components anyway.
2016-04-08 18:40:55 -07:00
Rhys Kidd
40e77741cf vc4: Emit a warning and proceed for handling loops in NIR.
We don't really suppor control flow yet, but it's a lot nicer to render
something and warn on stderr than to crash.

Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04

Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested

No piglit regressions.

v2 (Eric): Add stronger stderr warning.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-08 18:28:43 -07:00
Rhys Kidd
2450b219e5 vc4: Add a stub for NIR->QIR of control flow function nodes
We shouldn't have any NIR functions present since all GLSL functions get
inlined, but this would be a more informative error if it does happen.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-08 18:28:43 -07:00
Rhys Kidd
e5997778bc vc4: Add better debug of NIR->QIR control flow graph failure
Ensure NIR control flow graph nodes that are unhandled in QIR
are reported with sufficient verbosity to aid debugging.

This improves piglit outputs, amongst other tools.

There are no other remaining uses of assert(0) as a blunt tool
within vc4.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-08 18:28:43 -07:00
Rhys Kidd
e529dd179f vc4: Remove unused include from vc4_program.c
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-08 18:28:43 -07:00
Lars Hamre
e25c24c638 glsl: handle unsigned int wraparound in link_shaders()
v2: change check_explicit_uniform_locations() to return an
    unsigned 0 (Timothy Arceri)

We were storing the int result of check_explicit_uniform_locations()
in num_explicit_uniform_locs as an unsigned int which caused it to
be 4294967295 when a -1 was returned.

This in turn would cause the following error during linking:
error: count of uniform locations > MAX_UNIFORM_LOCATIONS(4294967295 > 98304)

Results from running piglit tests/all with this patch
and when ARB_explicit_uniform_location disabled:

changes:     178
fixes:       176
regressions: 2

The two regressions are for the following tests:
glean@glsl1-matrix column check (1)
glean@glsl1-matrix column check (2)
which regress from FAIL to CRASH.

The regressions are acceptable because the tests are currently failing due to
the aforementioned linker error.

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-09 11:06:04 +10:00
Jason Ekstrand
d4a28ae52a anv/meta: Make clflushes conditional on !devinfo->has_llc 2016-04-08 17:07:49 -07:00
Jason Ekstrand
c226e72a39 anv/formats: Advertise blit support for stencil
Thanks to advances in the blit code, we can do this now.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:59:29 -07:00
Jason Ekstrand
e3312644cb anv/blit2d: Add support for W-tiled destinations
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-08 15:59:26 -07:00
Jason Ekstrand
0a6842c1bd isl/surface_state: Set the correct pitch for W-tiled surfaces
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:52 -07:00
Jason Ekstrand
2e827816fa anv/blit2d: Add another passthrough varying to the VS
We need the VS to provide some setup data for other stages.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:49 -07:00
Jason Ekstrand
b377c1d08e anv/image: Remove the offset parameter from image_view_init
The only place we were using this was in meta_blit2d which always creates a
new image anyway so we can just use the image offset.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:45 -07:00
Jason Ekstrand
f9a2570a06 anv/blit2d: Add a bind_dst helper function
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:42 -07:00
Jason Ekstrand
15a9468d85 anv/blit2d: Simplify create_iview
Now it just creates the image and view.  The caller is responsible for
handling the offset calculations.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:40 -07:00
Jason Ekstrand
b8f3909b73 nir/gather_info: Handle discard_if
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:36 -07:00
Jason Ekstrand
819d0e1a7c anv/meta2d: Add support for blitting from W-tiled sources on gen7
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-08 15:58:03 -07:00
Jason Ekstrand
b0a5ca5cfc isl: Remove surf_get_intratile_offset_el
The intratile offset may not be a multiple of the element size so this
calculation is invalid.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:01 -07:00
Jason Ekstrand
b37502b983 isl: Rework the get_intratile_offset function
The old function tried to work in elements which isn't, strictly speaking,
a valid thing to do.  In the case of a non-power-of-two format, there is no
guarantee that the x offset into the tile is a multiple of the format
block size.  This commit refactors it to work entirely in terms of a tiling
(not a surface) and bytes/rows.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:58 -07:00
Jason Ekstrand
4caba94086 anv/image: Expose the guts of CreateBufferView for meta
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:55 -07:00
Jason Ekstrand
4ee80e8816 anv/blit2d: Refactor in preparation for different src/dst types
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:52 -07:00
Jason Ekstrand
85b9a007ac anv/blit2d: Add layouts for using a texel buffer source
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:49 -07:00
Jason Ekstrand
28eb02e345 anv/blit2d: Rename the descriptor set and pipeline layouts
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:47 -07:00
Jason Ekstrand
00e70868ee anv/blit2d: Enhance teardown and clean up init error paths
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:45 -07:00
Jason Ekstrand
43fbdd7156 anv/blit2d: Factor binding the source image into a helper
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:43 -07:00
Jason Ekstrand
5187ab05b8 anv/blit2d: Inline meta_emit_blit2d
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:41 -07:00
Jason Ekstrand
b0a6cfb9b4 anv/blit2d: Pass the source pitch into the shader
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:39 -07:00
Jason Ekstrand
e466164c87 anv/blit2d: Break the texelfetch portion of shader building into a helper
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:37 -07:00
Jason Ekstrand
afada45590 anv/blit2d: Fix whitespace
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:35 -07:00
Jason Ekstrand
9553fd2c97 anv/blit2d: Fix a NIR writemask
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:32 -07:00
Jason Ekstrand
b38a0d64ba anv/meta2d: Don't declare an array sampler in the fragment shader
With the new blit framework we aren't using array textures and, from
talking with Nanley, we don't think it's going to be useful in the future
either.  Just get rid of it for now.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:57:28 -07:00
Jason Ekstrand
dd6f720046 anv/blit2d: Remove the tex_dim parameter from copy_fragment_shader
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:56:52 -07:00
Jason Ekstrand
6cc7aec5b0 i965/tiled_memcopy: Get rid of the direction parameter to get_memcpy
Now that we can use the much simpler rgba8_copy function, we don't need to
hand different functions out based on direction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-08 12:09:20 -07:00
Jason Ekstrand
d2b32656e1 i965/tiled_memcpy: Rework the RGBA -> BGRA mem_copy functions
This splits the two copy functions into three: One for unaligned copies,
one for aligned sources, and one for aligned destinations.  Thanks to the
previous commit, we are now guaranteed that the aligned ones will *only*
operate on aligned memory so they should be safe.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-08 12:09:15 -07:00
Jason Ekstrand
f6f54a29ca i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions
Each of the [de]tiling functions has three mem_copy calls:

 1) Left edge to tile boundary
 2) Tile boundary to tile boundary in a loop
 3) Tile boundary to right edge

Copies 2 and 3 start at a tile edge so the pointer to tiled memory is
guaranteed to be at least 16-byte aligned.  Copy 1, on the other hand,
starts at some arbitrary place in the tile so it doesn't have any such
alignment guarantees.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-04-08 12:08:51 -07:00
Ben Widawsky
e5295b5fb4 i965: Check eu/subslices are > 0
Now that the check is restricted to gen8+, we should always get back a non-zero
positive value for the EU and subslice counts.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-08 11:52:29 -07:00
Ben Widawsky
cc01b63d73 i965: Fix eu/subslice warning
Older gen platforms do not actually return a value for sublice and eu total
(IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until
we have the actual GPU generation to avoid useless warnings when running on
older platforms with older kernels.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-08 11:52:29 -07:00
Ben Widawsky
4213b00e30 i965: Extract SSEU configuration info
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-08 11:51:01 -07:00
Brian Paul
4420f189b6 st/mesa: fix glReadBuffer() assertion failure
If the first call in a GL app is glReadPixels(GL_FRONT) we'd fail the
assert(st->ctx->FragmentProgram._Current) at st_atom_shader.c:114 in
update_fp().

This is because we were calling st_validate_state() without first
updating Mesa state with _mesa_update_state().

The regression came from commit 83b589301f "st/mesa: fix
frontbuffer glReadPixels regressions".

The new piglit gl-1.0-simple-readbuffer test exercises this.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-08 09:49:05 -06:00
Thomas Hindoe Paaboel Andersen
b9855dcdf7 st/va: avoid dereference after free in vlVaDestroyImage
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Julien Isorce <j.isorce@samsung.com>
2016-04-08 06:57:17 +01:00
Jason Ekstrand
e26a978773 Merge remote-tracking branch 'public/master' into vulkan 2016-04-07 16:56:34 -07:00
Jason Ekstrand
15895bf777 i965/fs: Use the scale helper in surface_builder
As requested by Curro
2016-04-07 16:49:09 -07:00
Marek Olšák
1cd19ebc4a radeonsi: do per-pixel clipping based on viewport states
In other words, vport scissors are derived from viewport states.
If the scissor test is enabled, the intersection of both is used.

The guard band will disable clipping, so we have to clip per-pixel.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-08 00:23:05 +02:00
Samuel Pitoiset
059308db84 nv50/ir: do not try to attach JOIN ops to ATOM
This might result in an INVALID_OPCODE dmesg error in case a join is
attached to an atomic operation.

Spotted with arb_shader_image_load_store-host-mem-barrier on GK104.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2016-04-07 23:10:26 +02:00
Nicolai Hähnle
2abe4f8d7d radeonsi: raise number of samplers per shader to 32
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94835
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 13:15:06 -05:00
Nicolai Hähnle
9d2693f58a radeonsi: expand the compressed color and depth texture masks to 64 bits
This is in preparation of raising the number of exposed sampler views to 32
bits, which will raise the total number of sampler views to 33 for the
polygon stipple texture. That texture should never be compressed (and it's
certainly not a depth texture), but this approach seems cleaner to me than
special-casing the last slot in all affected code paths.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 13:15:06 -05:00
Nicolai Hähnle
f270067ef9 radeonsi: replace magic 16 by SI_NUM_USER_SAMPLERS
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 13:15:06 -05:00
Nicolai Hähnle
f09036f6c0 gallium: raise PIPE_MAX_SAMPLERS to 32
The previous value of 18 was motivated by having drivers that want to expose
16 samplers but also use some additional samplers for internal use. Raising
the value even higher isn't going to hurt that case.

On the other hand, some drivers actually use PIPE_MAX_SAMPLERS as the number
of samplers they expose externally, so raising this number above 32 is fragile
(because several places in the code use bitfields, and tracking down and
widening all of them is prone to miss some case).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 13:15:05 -05:00
Nicolai Hähnle
84c4d069ac st/glsl_to_tgsi: make samplers_used an uint32_t (v2)
It is used as a bitfield, so it seems cleaner to keep it unsigned.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:05 -05:00
Nicolai Hähnle
4bfcc86bf9 tgsi/scan: add an assert for the size of the samplers_declared bitfield
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-07 13:15:05 -05:00
Nicolai Hähnle
cc39879989 draw/aaline: stronger guard against no free samplers (v2)
Line anti-aliasing will fail when there is no free sampler available. Make
the corresponding guard more robust in preparation of raising
PIPE_MAX_SAMPLERS to 32.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:05 -05:00
Nicolai Hähnle
040f5cb09e util/pstipple: stronger guard against no free samplers (v2)
When hasFixedUnit is false, polygon stippling will fail when there is no free
sampler available. Make the corresponding guard more robust in preparation
of raising PIPE_MAX_SAMPLERS to 32.

The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.

v2: add an assert for bitfield size and use 1u << idx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-07 13:15:02 -05:00
Brian Paul
b7e67b2337 svga: new SVGA_MSAA env var to disable/enable MSAA pixel formats
On by default.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-07 11:42:43 -06:00
Brian Paul
9f443af449 svga: add some trivial null pointer checks
These small mallocs will probably never fail, but static analysis tools
may complain about the missing checks.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-07 11:42:43 -06:00
Samuel Pitoiset
60cf2fa477 trace: add missing set_shader_images()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-07 18:52:27 +02:00
Marek Olšák
5fac4887d8 radeonsi: disable perfect ZPASS counts for PIPE_QUERY_OCCLUSION_PREDICATE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-07 13:58:01 +02:00
Marek Olšák
baa0b3f4cc radeonsi: don't use the real barrier instruction in tess ctrl shaders
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-07 13:58:01 +02:00
Michel Dänzer
715e97e342 Revert "clover: Fix build against clang SVN >= r265359"
This reverts commit 0daab9878d.

The corresponding clang change was reverted.

Trivial.
2016-04-07 17:03:09 +09:00
Jason Ekstrand
05db680248 nir/types: Add a wrapper for count_attribute_slots
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-07 09:44:11 +02:00
Kristian Høgsberg Kristensen
068935844c genxml: Add GEN6 genxml
Not used yet, but let's put it here for now.
2016-04-06 21:08:34 -07:00
Dave Airlie
828d84c8e2 r600: use radeon_emit in a few more places in evergreen_compute
This is just a cleanup of the code.

Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:39:26 +01:00
Dave Airlie
0c40b6f96c r600: make compute global buffer functions static.
This moves things around so that the global buffer handling
functions in evergreen_compute.c are static.

Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:39:22 +01:00
Dave Airlie
a5d247dda0 r600: make two compute functions static.
These aren't used outside evergreen_compute.c

Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:39:17 +01:00
Dave Airlie
41558efa87 r600: using pipe_grid_info more in evergreen_compute.
No reason to pull the pieces apart here, also make
one of the functions static as it's unused outside this.

Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:39:13 +01:00
Dave Airlie
a6e17d7d69 r600: in evergreen_compute use ctx consistently instead of ctx_
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:39:09 +01:00
Dave Airlie
aeb2be3a2f r600: use rctx consistently in evergreen_compute.c
Another step towards cleaning this up.

Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:39:05 +01:00
Dave Airlie
0560c82ff6 r600: cleanup whitespace in evergreen_compute.c
This aligns the code with the style of the rest of the driver.

Makes editing it a lot less painful.

Acked-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 04:38:51 +01:00
Edward O'Callaghan
6fc3e7c988 GL3.txt: Mark ARB_framebuffer_no_attachments as done
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:59 +10:00
Edward O'Callaghan
ea310f2b38 r600g: Enable ARB_framebuffer_no_attachments
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:59 +10:00
Edward O'Callaghan
483a686f80 radeonsi: Enable ARB_framebuffer_no_attachments
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:59 +10:00
Edward O'Callaghan
1156cad405 radeonsi: Improve assert info out of si_set_framebuffer_state()
Lets give the developer a little hand if we are going to assert
on a zero literal at the end of a branch.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Edward O'Callaghan
bb1bd0ddd7 radeonsi: Allow 16 samples MSAA mode for PIPE_FORMAT_NONE
For ARB_framebuffer_no_attachment; A is_format_supported() query
with 'PIPE_FORMAT_NONE' passed implies a query of the number of
samples supported from the framebuffer with no attachment.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Edward O'Callaghan
63f2b2f2c0 softpipe: Set samples and layers in set_framebuffer_state() cb
Carries across the number of samples and layers state in the
'softpipe_set_framebuffer_state()' callback. This state is
part of 'ARB_framebuffer_no_attachments' support.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Edward O'Callaghan
c6a514d7df mesa/st: Update framebuffer state with no.of samples,layers
Handle the case of ARB_framebuffer_no_attachment.
Also, kill off a dead debug printf() call while we are here.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Edward O'Callaghan
7ff28d2af0 gallium/trace: Dump no.of samples and layers in fb state
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Edward O'Callaghan
0b7075fed7 gallium: Put no.of {samples,layers} into pipe_framebuffer_state
Here we store the number of samples and layers directly in the
pipe_framebuffer_state so that in the case of
ARB_framebuffer_no_attachment we may make use of them directly.

Further, we adjust various gallium/auxiliary helper functions
accordingly.

V2:
  Convert branches in util_framebuffer_get_num_layers() and
  util_framebuffer_get_num_samples() to their canonical form.

V3:
  'git stash pop' the typo fix of 'cbufs' which should be
  'nr_cbufs' that was missing in V2, woops! Thanks Marek for
  pointing this out yet again.

V4:
  Squash in the following patch:

  'gallium/util: Ensure util_framebuffer_get_num_samples() is valid'

   Upon context creation, internal driver structures are malloc()'ed
   and memset() to zero them. This results in a invalid number of
   samples 'by default'. Handle this in the simplest way to avoid
   elaborate and probably equally sub-optimial solutions.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 12:03:58 +10:00
Edward O'Callaghan
b512b5fd36 mesa/st: Set _NumSamples in update_framebuffer_state()
Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported
with a framebuffer using no attachment.

V.2:
 Rewrite MSAA mode loop to be more general.
V.3:
 Move comment to right place after loop was rewritten.
V.4: [airlied]
 remove unneeded variable, and assert, and unneeded pipe assignment

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-07 12:02:06 +10:00
Edward O'Callaghan
2016e9ffda gallium: Obtain ARB_framebuffer_no_attachment constants
Set default values for the constants required in
ARB_framebuffer_no_attachments and obtained the number
of layers from ``PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS``.

We also obtain the MaxFramebufferSamples value using
a query back to the driver for PIPE_FORMAT_NONE.

V.1:
 Merge if branch predicates into one branch.
 Move const init into st_init_limits()

[airlied: whitespace fixup]
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 11:56:44 +10:00
Edward O'Callaghan
4bc9130fba gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT
Add PIPE_CAP to determine if the GL extension
'GL_ARB_framebuffer_no_attachments' shall be
supported.

The driver is required to support 'PIPE_FORMAT_NONE'
via its 'is_format_supported()' callback in order
to determine the MSAA modes the hardware supports so
that values requested from the application using
'GL_ARB_framebuffer_no_attachments' may be quantized
to what the hardware expects.

V.2:
 Fix doc for a more detailed description of the PIPE_CAP
 and the corresponding GL constant.

V.3:
 Renamed and repurposed once again.

V.4:
 Remove CAP from cap_mapping array.

[airlied: fix damaged whitespace]

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 11:56:44 +10:00
Edward O'Callaghan
85f79f0c75 mesa/st: Use _mesa_geometric_ functions appropriately
Change references to gl_framebuffer::Width, Height, MaxNumLayers
and Visual::samples to use the _mesa_geometric_ convenience functions
for those places where the geometry of the gl_framebuffer is needed.
This is in contrast to the geometry of the intersection of the
attachments of the gl_framebuffer.

This patch paves the way to enable GL_ARB_framebuffer_no_attachements
for all gallium drivers.

V.2:
 Remove itermeditate variable state.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 11:56:35 +10:00
Edward O'Callaghan
b40375a21c mesa: Add comment to framebuffer_parameteri()
V.2:
 Change 'N.B.,' to 'NOTE:'.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-07 11:55:33 +10:00
Jason Ekstrand
c62db279b6 i965/sf_state: Pull flat_enables out of prog_data
Previously, we were walking over the shader source to figure out which
inputs should be marked flat.  Now, we can just pull it out of prog_data.
This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it
also means that it will get properly cached.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-06 18:08:56 -07:00
Jason Ekstrand
e61cc87c75 i965/fs: Add a flat_inputs field to prog_data
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-06 18:08:56 -07:00
Jason Ekstrand
5c5a9b7bf6 brw/device_info: Add a helper for getting a device name
This is needed by the Vulkan driver

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-06 18:08:56 -07:00
Jason Ekstrand
a241ab43b5 i965/fs_surface_builder: Mask signed integers after conversion
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-04-06 18:08:56 -07:00
Jason Ekstrand
3921b64e63 i965/fs: Make the repclear shader support either a uniform or a flat input
In the Vulkan driver we use a single flat input instead of a uniform
because setting up push constants is more disruptive to the pipeline than
setting up another vertex input.  This uses the number of uniforms as a key
to keep it working for the GL driver.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-06 18:08:50 -07:00
Jason Ekstrand
061969f9dd i965: Move get_hw_prim_for_gl_prim to brw_util.c
It's used by brw_compile_gs in brw_vec4_gs_visitor.cpp so it needs to be in
a file that's linked into libi965_compiler.la.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-06 18:08:47 -07:00
Bas Nieuwenhuizen
3393358115 radeonsi: set shader calling conventions
Note that old mesa + new LLVM or new mesa + old LLVM breaks
with this change and the corresponding LLVM change (D18559).

For LLVM version <= 3.8 we use the old method, but we can't detect
people using a post 3.8 svn version that is still too old.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-06 21:54:35 +02:00
Marek Olšák
0293d72fa5 drirc: add a workaround for blackness in Warsow
Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
2016-04-06 12:53:40 +02:00
Ilia Mirkin
2e123e1a25 glsl: use has_shader_storage_buffer_objects helper
Replaces open-coded logic with existing helper.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-05 20:27:32 -04:00
Timothy Arceri
5d39f03806 glsl: remove remaining tabs in link_uniform_blocks.cpp
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-06 09:56:33 +10:00
Timothy Arceri
7ef57aa685 mesa: remove unused IsShaderStorage field
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-06 09:56:28 +10:00
Timothy Arceri
f1293b2f9b glsl: fully split apart buffer block arrays
With this change we create the UBO and SSBO arrays separately from the
beginning rather than putting them into a combined array and splitting
it apart later.

A bug is with UBO and SSBO stage reference querying is also fixed as
we now use the block index to lookup the references in the separate arrays
not the combined buffer block array.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-06 09:56:24 +10:00
Rob Clark
506b561ba7 freedreno/ir3: insert extra move into phi
We had an implicit assumption that the phi src was assigned in it's
source (pred) block leading into the phi.  But this is not true with
NIR, so we can't just ignore the source block specified in the
nir_phi_src.  Insert an extra mov in the source block.  If it is not
required the CP pass will take it back out again.

Fixes:

  ./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test
  ./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test

and probably others.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-05 15:04:43 -04:00
Rob Clark
f9cdbf4405 freedreno/ir3: eliminate unnecessary absneg's
The frontend inserts (abs) and (neg)'s to convert between NIR boolean
(~0/0) and native boolean (1/0).  So we'd end up with things like:

   cmps.s.ge r1.x, ...
   absneg.s r1.x, (neg)r1.x
   absneg.s r1.x, (abs)r1.x
   sel.b32 r2.x, r0.x, r1.x, r0.y

The (neg) already gets collapsed due to the following (abs).  Now by
realizing that r1.x comes from a cmps.s instruction, we can drop the
(abs) as well.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-05 15:04:25 -04:00
Michel Dänzer
0daab9878d clover: Fix build against clang SVN >= r265359
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-05 17:00:58 +00:00
Bas Nieuwenhuizen
799789ba99 radeonsi: use bounded indexing for samplers
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-05 19:19:18 +02:00
Bas Nieuwenhuizen
713353db18 radeonsi: use bounded indexing for constant buffers
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-05 19:19:07 +02:00
Marek Olšák
a64dbdf612 gallium/radeon: allow multiple exports of the same texture with different usage
Instead of failing an assertion, disable DCC and CMASK on the first export
that needs it, and merge the external usage flags.

v2: clear the EXPLICIT_FLUSH flag if it's not set; whitespace fixes

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-04-05 15:32:40 +02:00
Marek Olšák
25f96d2b97 docs/relnotes: document EGL_KHR_reusable_sync 2016-04-05 15:32:40 +02:00
Dongwon Kim
70299474f5 egl: add EGL_KHR_reusable_sync to egl_dri
This patch enables an EGL extension, EGL_KHR_reusable_sync.
This new extension basically provides a way for multiple APIs or
threads to be excuted synchronously via a "reusable sync"
primitive shared by those threads/API calls.

This was implemented based on the specification at

https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_reusable_sync.txt

v2
- use thread functions defined in C11/threads.h instead of
  using direct pthread calls
- make the timeout set with reference to CLOCK_MONOTONIC
- cleaned up the way expiration time is calculated
- (bug fix) in dri2_client_wait_sync, case EGL_SYNC_CL_EVENT_KHR
  has been added.
- (bug fix) in dri2_destroy_sync, return from cond_broadcast
  call is now stored in 'err' intead of 'ret' to prevent 'ret'
  from being reset to 'EGL_FALSE' even in successful case
- corrected minor syntax problems

v3
- dri2_egl_unref_sync now became 'void' type. No more error check
  is needed for this function call as a result.
- (bug fix) resolved issue with duplicated unlocking of display in
  eglClientWaitSync when type of sync is "EGL_KHR_REUSABLE_SYNC"

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-04-05 15:24:57 +02:00
Rob Clark
3e13572826 freedreno/ir3: deal with duplicate phi sources
Otherwise we end up with funny things like:

  mov.f32f32 r0.x, r1.y
  mov.f32f32 r0.x, r1.y

(It doesn't happen as much after fixing the problem w/ CP into phi src,
but it can still happen since we aren't too clever about generating phi
sources in the first place.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
f8feb97ba5 freedreno/ir3: fix silly brain-fart in RA
We want to consider all the vars, not 1/32nd of them, when extending
live-ranges.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
8e451c2d06 freedreno/ir3: don't cp into phi's
The block defining a phi source might not have been executed.  If we
allow copy propagation, we could end up pointing to a src instruction in
the wrong block.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
383b6e87f9 freedreno/ir3: we can't store immediate values
Fixes some transform-feedback piglits, like:

bin/ext_transform_feedback-nonflat-integral

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
d47fb856af freedreno/ir3: add dumping for use/def/live-in/live-out
Turned out to be useful to debug an issue in RA.  Let's keep it.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
38ae05a340 freedreno/ir3: drop unused instr category arg
No longer used, so drop the extra arg to ir3_instr_create()

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
19739e4fb9 freedreno/ir3: remove ir3_instruction::category
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Rob Clark
70735643f4 freedreno/ir3: encode instruction category in opc_t
Been on my TODO list for a while.  If nothing else this will make gdb
properly grok the opc_t enum.

This first step preserves ir3_instruction::category (with an added
assert that category matches what is encoded in opc_t).  Next step is
to drop the category field (and arg to ir3_instr_create()), but that
is split into next commit for bisectability and so that we can run
piglit in the intermediate state to flush out any problems.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-04 20:18:18 -04:00
Jason Ekstrand
5ea3647f89 i965/fs: Move the code for load/store_shared to emit_cs_intrinsic
They are compute-shader only and that's where the code for doing atomics on
shared variables lives so it seemes to make sense.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-04-04 15:56:50 -07:00
Jason Ekstrand
80c72a8ea7 i965/nir: Provide a default LOD for buffer textures
Our hardware requires an LOD for all texelFetch commands even if they are
on buffer textures.  GLSL IR gives us an LOD of 0 in that case, but the LOD
is really rather meaningless.  This commit allows other NIR producers to be
more lazy and not provide one at all.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-04-04 15:56:39 -07:00
Jason Ekstrand
e5c833db5a i965/compiler: Remove a redundant declaration of brw_compiler_create 2016-04-04 14:51:35 -07:00
Kenneth Graunke
3babb7b0a4 nir: Use PRIi64 and PRIu64 instead of %ld and %lu.
%ld and %lu aren't the right format specifiers for int64_t and uint64_t
on 32-bit (x86) systems.  They're %zu on Linux and %Iu on Windows.

Use the standard C99 macros in hopes that they work everywhere.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-04 14:38:48 -07:00
Kenneth Graunke
da5d08707b i965: Fix invalid pointer read in dead_control_flow_eliminate().
There may not be a previous block.  In this case, there's no real work
to do, so just continue on to the next one.

v2: Update for bblock->prev() API change.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-04 14:34:40 -07:00
Kenneth Graunke
9486614938 i965: Make bblock_t::next and friends return NULL at sentinels.
The bblock_t::prev/prev_const/next/next_const API returns bblock_t
pointers, rather than exec_nodes.  So it's a bit surprising.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-04 14:34:16 -07:00
Kenneth Graunke
5509d43a11 glsl: Lower variable indexing of system value arrays unconditionally.
lower_variable_index_to_cond_assign() did not handle system values.
gl_SampleMaskIn[] is a system value, and also an array.  Accessing it
with a variable index would trigger an unreachable() assert.

Rather than adding a new EmitNoIndirectSystemValues flag, we simply
lower unconditionally.  There is exactly one case where this occurs,
and for all current drivers, lowering produces optimal code.  Even
for future drivers with 32x MSAA, it produces reasonable code.

Fixes Piglit's new samplemaskin-indirect test.  Also fixes many ES31-CTS
tests when OES_sample_variables is enabled.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-04 14:29:21 -07:00
Jason Ekstrand
db35a851ad i965/defines: Unconditionally define primitives 2016-04-04 14:25:36 -07:00
Jason Ekstrand
6a04968784 Merge remote-tracking branch 'public/master' into vulkan 2016-04-04 13:58:05 -07:00
Jason Ekstrand
88ef2476dc i965/peephole_ffma: Only match a mul+add if none of the ops are exact
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-04 13:48:10 -07:00
Jason Ekstrand
eb93d6dec8 nir/search: Don't match inexact expressions with exact subexpressions
In the first pass of implementing exact handling, I made a mistake with
search-and-replace.  In particular, we only reallly handled exact/inexact
on the root of the tree.  Instead, we need to check every node in the tree
for an exact/inexact match.  As an example of this, consider the following
GLSL code

precise float a = b + c;
if (a < 0) {
   do_stuff();
}

In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results.  The solution is to simply bail if any of the
values are exact when matching an inexact expression.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-04 13:48:10 -07:00
Jason Ekstrand
fe247bbe92 nir: Stop double-printing function arguments 2016-04-04 12:10:20 -07:00
Jason Ekstrand
cb317b8d07 glsl: Stop force-enabling compute shaders
This isn't needed since we no longer use the GLSL compiler in Vulkan.
2016-04-04 12:09:12 -07:00
Jason Ekstrand
4d040a4ad3 glsl/standalone: Get rid of the unneeded _mesa_error_no_memory stub
This hasn't been needed since we stopped using the GLSL compiler in the
Vulkan driver and it was tripping up scons.  Removing it fixes the scons
build.
2016-04-04 12:07:51 -07:00
Kenneth Graunke
65fbc43d54 i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range.
The SIN and COS instructions on Intel hardware can produce values
slightly outside of the [-1.0, 1.0] range for a small set of values.
Obviously, this can break everyone's expectations about trig functions.

According to an internal presentation, the COS instruction can produce
a value up to 1.000027 for inputs in the range (0.08296, 0.09888).  One
suggested workaround is to multiply by 0.99997, scaling down the
amplitude slightly.  Apparently this also minimizes the error function,
reducing the maximum error from 0.00006 to about 0.00003.

When enabled, fixes 16 dEQP precision tests

   dEQP-GLES31.functional.shaders.builtin_functions.precision.
   {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}.

at the cost of making every sin and cos call more expensive (about
twice the number of cycles on recent hardware).  Enabling this
option has been shown to reduce GPUTest Volplosion performance by
about 10%.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-04 11:35:16 -07:00
Jason Ekstrand
8c8157bf6f Remove more spirv2nir remnants 2016-04-04 11:24:48 -07:00
Kenneth Graunke
3aa51e02d6 i965: Allow 8x MSAA on >= 64bpp formats on Gen8+.
See commit 3b0279a69 - this restriction is documented in the "Surface
Format" field of RENDER_SURFACE_STATE.

Looking at newer documentation, this restriction appears to exist on
Haswell, but no longer applies on Gen8+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-04-04 10:41:29 -07:00
Brian Paul
1eeec7ec41 docs: remove stray 'TBD' in 11.2.0 relnotes file 2016-04-04 10:33:11 -06:00
Emil Velikov
35132c413c docs: add news item and link release notes for 11.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-04 12:57:56 +01:00
Emil Velikov
dc4923d41f docs: add sha256 checksums for 11.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit e7fb889dcc)
2016-04-04 12:55:55 +01:00
Emil Velikov
7dc11ed0b2 docs: Update 11.2.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ff9ddb9eb1)
2016-04-04 12:55:54 +01:00
Dave Airlie
f9b8b48bed mesa/get: fix MAX_GEOMETRY_SHADER_STORAGE_BLOCKS
this was returning the fragment shader value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-04 10:52:25 +01:00
Ilia Mirkin
4bc3b1ca48 nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-04 00:32:48 -04:00
Ilia Mirkin
dab40d8083 docs: add note about GL_EXT_base_instance, sort entries
Trivial.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-03 21:18:17 -04:00
Ilia Mirkin
d76e1cd2dd mesa: expose EXT_base_instance in ES3 contexts
This extension is identical to ARB_base_instance. Reuse the same
entrypoints.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 20:40:55 -04:00
Ilia Mirkin
807e2c27ac mesa: expose EXT_polygon_offset_clamp in ES contexts
The extension spec was extended to also support ES. This functionality
is provided all the way back to ES 1.0.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 20:40:55 -04:00
Kenneth Graunke
40628886ca glsl: Print "precise" on ir_variable nodes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-03 17:33:38 -07:00
Jose Fonseca
7ad49daca6 gallivm: Introduce lp_format_intrinsic.
For adding .v4f32 like suffixes to intrinsics, taking special care for
scalar case, which was being often neglected.

This fixes invalid IR when doing mipmap filtering on SSE2 (the only
case where we'd use intrinsics with scalars.)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-04 00:06:09 +01:00
Ilia Mirkin
7af12a8dc6 glsl: make *sampler2DMSArray available in ESSL 3.20
Also avoid double-adding the *sampler2DMS types when the array ext is
enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 18:06:52 -04:00
Ilia Mirkin
aebb0e0186 glsl: make ssbo predicate return true when in a GLSL 430 or ESSL 310 shader
I can't tell whether this actually matters, but we're creating function
signatures with this predicate, so it should probably match when SSBO's
are available.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 18:06:49 -04:00
Ilia Mirkin
87906cbc37 glsl: allow conservative depth qualifiers in GLSL 420
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 18:06:35 -04:00
Ilia Mirkin
d50ffb5e46 mesa: add always-false-for-now enables for GL 4.3, 4.4, 4.5.
As the relevant extensions get implemented, the lines should be
uncommented. I believe this is (almost) everything needed for those GL
versions though.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 18:01:15 -04:00
Ilia Mirkin
9abbc49712 glsl: add ARB_ES3_1_compatibility support
Oddly a bunch of the features it adds are actually from ESSL 3.20. But
the spec is quite clear, oh well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 18:01:15 -04:00
Ilia Mirkin
1708e24f65 mesa: add ES3_1_compatibility extension enable
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-03 18:01:15 -04:00
Jose Fonseca
a293f57e13 gallivm: Use llvm.fabs.
Exactly the same code.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:09:09 +01:00
Jose Fonseca
e4f01da15d gallivm: Prefer backend agnostic intrinsic for rounding.
We could unconditionally use these instrinsics, but performance with SSE2
would suck, as LLVM falls back to calling libm.

lp_test_arit.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:09:07 +01:00
Jose Fonseca
324451e73f gallivm: Add debug option to force SSE2.
For simulating less capable machines.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 22:08:57 +01:00
Jose Fonseca
5fa31a4aba llvmpipe: Test abs.
Trivial.
2016-04-03 11:17:20 +01:00
Jose Fonseca
522ebe701d llvmpipe: Build lp_test_arit on MSVC too.
It builds fine now.  Probably due to C99 support.

Trivial.
2016-04-03 11:17:20 +01:00
Jose Fonseca
b284f1f7f9 gallivm: Fix performance regressions due to vector selects.
LLVM often can't determine the mask elements are all ones/zeros, and
there doesn't seem to be a good way to hint that.

Thanks to Roland Scheidegger for spotting and analyzing the issue.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Jose Fonseca
11c4e5b45c gallivm: Remove lp_build_load_volatile.
No longer needed.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Jose Fonseca
bcfb86b09d gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards.
Only provide a fallback for LLVM 3.3.

One less dependency on LLVM C++ interface.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-03 09:51:27 +01:00
Timothy Arceri
6d54096fa6 mesa: remove unrequired else
The if always returns so no need for an else.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-03 09:55:19 +10:00
Ilia Mirkin
d64134ecae gm107/ir: add OP_SELP emission, used in DSQRT lowering
The current DSQRT lowering code emits an OP_SELP, so we have to handle
its emission. This will eventually go away, but no harm supporting this
op.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-02 19:27:51 -04:00
Ilia Mirkin
3610b1466d nv50/ir: we can't load local memory directly into an output
This fixes piglit tests like

tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test

and related ones.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-02 18:10:20 -04:00
Christian Schmidbauer
2a529a8ac8 st/nine: specify WINAPI only for i386 and amd64
Currently mesa fails building with the x32 abi as ms_abi is not defined
in such a case.

The patch uses ms_abi only for amd64 targets and stdcall only for i386
targets to be sure that those are defined.

This patch additionally checks for __GNUC__ to guarantee that
__attribute__ is available.

CC: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Schmidbauer <ch.schmidbauer@gmail.com>
Acked-by: Axel Davy <axel.davy@ens.fr>
2016-04-02 23:30:40 +02:00
Samuel Pitoiset
0852c5703b nv50/ir: fix envyas variants when building the code lib
nvc0 and nve4 have been respectively replaced by gf100 and gk104.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-02 20:00:57 +02:00
Brian Paul
36d8fed798 svga: remove unused svga_compile_key::texture_msaa field
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-02 08:05:20 -06:00
Brian Paul
b283c76342 svga: check TXF instruction's target to determine MSAA
Rather than the currently bound texture.  This goes along with the
earlier patch to get away from examining bound textures and sampler
views during shader translation.

Fixes VMware bug 1632739.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-02 08:05:20 -06:00
Brian Paul
ef10b5427a tgsi: add simple tgsi_is_msaa_target() helper
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-02 08:05:20 -06:00
Timothy Arceri
070e5a7405 glsl: rename var and simplify if
is_ubo_var is true for both UBOs and SSBOs

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Timothy Arceri
0fbd073dc2 glsl: store ubo or ssbo index in block index
Previously we store the buffer block index i.e the index of a combined
ubo/ssbo list.

Fixes several dEQP-GLES31.functional tests:
- program_interface_query.uniform.block_index.block_array
- program_interface_query.uniform.block_index.named_block
- program_interface_query.uniform.block_index.unnamed_block
- program_interface_query.uniform.random.10
- program_interface_query.uniform.random.15
- program_interface_query.uniform.random.22
- program_interface_query.uniform.random.24
- program_interface_query.uniform.random.26
- program_interface_query.uniform.random.28
- program_interface_query.uniform.random.3
- program_interface_query.uniform.random.31
- program_interface_query.uniform.random.38
- program_interface_query.uniform.random.5

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94116
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Timothy Arceri
1265e1c4e1 glsl: store stage reference in gl_uniform_block
This allows us to simplify the code and drop InterfaceBlockStageIndex
which is a per stage array of integers the size of all blocks in the
program combined including duplicates across stages. Adding a stage
ref per block will use less memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Timothy Arceri
d8855d66f4 glsl: simplify buffer block resource limit checking
This changes the code to use the buffer counts stored for each stage
rather than counting from scratch. It also moves the checks outside
of the for loop which means we now just get a single link error
message if we go over the max rather than X error messages where X
is the number we have exceeded the max by.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Timothy Arceri
0082b33a78 glsl: simplify SSBO resources check
We already have a count of active SSBOs per stage so use it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Timothy Arceri
3e74bf5b9d glsl: split buffer block arrays earlier
This will allow us to use them when checking resources in a
following patch and clean up a bunch of code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Timothy Arceri
0163881528 glsl: only set buffer block binding once during initialisation
Since 8683d54d2b there is now a single instance of the buffer
block information that needs to be updated rather than one instance
for each stage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-02 17:10:56 +11:00
Kenneth Graunke
94ed482c19 glsl: Fix prorgram interface query locations biasing for SSO.
With SSO, the GL_PROGRAM_INPUT and GL_PROGRAM_OUTPUT interfaces refer to
the first and last shader stage linked into a program.  This may not be
the vertex and fragment shader stages.

So, subtracting VERT_ATTRIB_GENERIC0 and FRAG_RESULT_DATA0 is bogus.
We need to subtract VERT_ATTRIB_GENERIC0 for VS inputs,
FRAG_RESULT_DATA0 for FS outputs, and VARYING_SLOT_VAR0 for other cases.

Note that built-in variables get a location of -1.

Fixes 4 dEQP-GLES31.functional.program_interface_query tests:
- program_input.location.separable_fragment.var_explicit_location
- program_input.location.separable_fragment.var_array_explicit_location
- program_output.location.separable_vertex.var_array_explicit_location
- program_output.location.separable_vertex.var_array_explicit_location

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 22:05:20 -07:00
Kenneth Graunke
c123294dfe glsl: Return -1 for program interface query locations in many cases.
We were recording locations for all variables, even ones without an
explicit location set.  Implement the rules from the spec, and record
-1 in the resource list accordngly.  Make program_resource_location
stop doing math on negative values.  Remove hacks that are no longer
necessary now that we've stopped doing that.

Fixes 4 dEQP-GLES31.functional.program_interface_query tests:
- program_input.location.separable_fragment.var
- program_input.location.separable_fragment.var_array
- program_output.location.separable_vertex.var_array
- program_output.location.separable_vertex.var_array

v2: Delete more code

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 22:05:20 -07:00
Kenneth Graunke
9fe211bec4 glsl: Consolidate gl_VertexIDMESA -> gl_VertexID query hacks.
A program will either have gl_VertexID or gl_VertexIDMESA (the lowered
zero-based version), not both.  Just spoof it in the resource list so
the hacks are done in a single place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 22:05:20 -07:00
Kenneth Graunke
013f25c3b3 glsl: Clean up some leftover cruft.
stages is always 1 << stage now.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 22:05:20 -07:00
Kenneth Graunke
98c22c0403 glsl: Add all system variables to the input resource list.
System values are just built-in input variables that we've opted to
special-case out of convenience.  We need to consider all inputs,
regardless of how we've classified them.

Unfortunately, there's one exception: we shouldn't add gl_BaseVertex
unless ARB_shader_draw_parameters is enabled, because it doesn't
actually exist in the language, and shouldn't be counted in the
GL_ACTIVE_RESOURCES query.

Fixes dEQP-GLES31.functional.program_interface_query.program_input.
resource_list.compute.empty, which expects gl_NumWorkGroups to appear
in the resource list.

v2: Delete more code

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 22:05:18 -07:00
Kenneth Graunke
6e8b9d5bdd glsl: Delete hack for VS system values.
This makes no sense.  If the stage being considered is the vertex
shader, then we'll add inputs and system values appropriately.

If we're not considering the vertex shader, then we absolutely should
not do anything with it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 21:58:25 -07:00
Kenneth Graunke
47daf17da0 glsl: Make add_interface_variables only consider the appropriate stage.
add_interface_variables() is supposed to add variables for the inputs of
the first shader stage linked into a program, and the outputs of the
last shader stage linked into a program.

From the ARB_program_interface_query specification:

"* PROGRAM_INPUT corresponds to the set of active input variables used by
   the first shader stage of <program>.  If <program> includes multiple
   shader stages, input variables from any shader stage other than the
   first will not be enumerated.

 * PROGRAM_OUTPUT corresponds to the set of active output variables
   (section 2.14.11) used by the last shader stage of <program>.  If
   <program> includes multiple shader stages, output variables from any
   shader stage other than the last will not be enumerated."

Previously, we used build_stageref here, which walks over all linked
shaders in the program.  This meant that internal varyings would be
visible.  We don't actually need any of build_stageref's code: we
already explicitly skip packed varyings, handle modes, and the name
comparisons just do a fuzzy string comparison of name with itself.

Fixes two tests: dEQP-GLES31.functional.program_interface_query.
program_{input,output}.referenced_by.referenced_by_vertex_fragment.

These tests have a VS and FS linked together into a single program.
Both stages have an input called "shaderInput".  But the FS input
should not be visible because it isn't the first stage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 21:58:25 -07:00
Kenneth Graunke
998ef1ad71 glsl: Clarify "mask" variable in add_interface_variables().
This is a bitfield of which stages refer to a variable.  It is not used
to mask off bits.  In fact, it's used to contribute additional bits.

Rename it and tidy a bit of the logic.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 21:58:25 -07:00
Kenneth Graunke
356c99b4e7 glsl: Pass stage to add_interface_variables().
add_interface_variables is supposed to add variables from either the
first or last stage of a linked shader.  But it has no way of knowing
the stage it's being asked to process, which makes it impossible to
produce correct stagerefs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-01 21:58:25 -07:00
Kenneth Graunke
2c5afe1fa9 glsl: Make vertex ID lowering declare gl_BaseVertex as hidden.
If the GL_ARB_shader_draw_parameters extension is enabled, we'll already
have a gl_BaseVertex variable.  It will have var->how_declared set to
ir_var_declared_implicitly, and will appear in the program resource
list.

If not, we make one for internal use.  We don't want it to be listed
in the program resource list, as the application won't be expecting
it.  Marking it hidden will properly exclude it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 21:58:22 -07:00
Kenneth Graunke
33df1c2935 glsl: Exclude ir_var_hidden variables from the program resource list.
We occasionally generate variables internally that we want to exclude
from the program resource list, as applications won't be expecting them
to be present.

The next patch will make use of this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 21:56:43 -07:00
Kenneth Graunke
15cd3ebede mesa: Make _mesa_choose_tex_format() handle stencil textures.
This is necessary for ARB_texture_stencil8 support on classic drivers.
Presumably Gallium works because it implements its own ChooseTexFormat.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-01 19:04:28 -07:00
Jordan Justen
ef1b397b07 glsl: Don't require matching centroid qualifiers
Note: This patch appears to violate older OpenGL and OpenGLES specs.

The OpenGLES GLSL 3.1 and OpenGL GLSL 4.3 specifications both remove
the requirement for the output and input centroid qualifiers to match.

The deqp
dEQP-GLES3.functional.shaders.linkage.varying.rules.differing_interpolation_2
test wants the newer OpenGLES 3.1 specification behavior, even for
OpenGLES 3.0. This patch simply removes the checking in all cases.

The OpenGLES 3.0 conformance test suite doesn't appear to require the
older ("must match") spec behavior.

For reference, here are the relavent spec citations:

  The OpenGL 4.2 spec says: "the last active shader stage output
  variables and fragment shader input variables of the same name must
  match in type and qualification (other than out matching to in)"

  The OpenGL 4.3 spec says: "interpolation qualification (e.g., flat)
  and auxiliary qualification (e.g. centroid) may differ."

  The OpenGLES GLSL 3.00.4 specification says: "The output of the
  vertex shader and the input of the fragment shader form an
  interface. For this interface, vertex shader output variables and
  fragment shader input variables of the same name must match in type
  and qualification (other than precision and out matching to in)."

  The OpenGLES GLSL 3.10 Specification says: "interpolation
  qualification (e.g., flat) and auxiliary qualification (e.g.
  centroid) may differ"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7819
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-01 18:06:19 -07:00
Bas Nieuwenhuizen
1a5c8c24b5 gallium: distinguish between shader IR in get_compute_param
For radeonsi, native and TGSI use different compilers and this results
in different limits for different IR's.

The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE
and MAX_THREADS_PER_BLOCK params, but I added a few others as shader
related that seemed like they would also typically depend on the
compiler.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-02 01:51:13 +02:00
Bas Nieuwenhuizen
be5899dcf9 gallium: add global buffer memory barrier bit
Currently radeonsi synchronizes after every dispatch and Clover
does nothing to synchronize. This is overzealous, especially with
GL compute, so add a barrier for global buffers.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-02 01:51:06 +02:00
Bas Nieuwenhuizen
01f993a21f gallium: add threads per block TGSI property
The value 0 for unknown has been chosen to so that
drivers using tgsi_scan_shader do not need to detect
missing properties if they zero-initialize the struct.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-02 01:50:59 +02:00
Bas Nieuwenhuizen
ea8f4a6b13 gallium: add compute shader IR type
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-02 01:49:57 +02:00
Timothy Arceri
5ea825f556 glsl: remove tabs and fix some other style issues in glcpp-parse.y
Note there are still tabs left in the parser rules.

Acked-by: Dave Airlie <airlied@redhat.com>
2016-04-02 10:32:01 +11:00
Jason Ekstrand
cc1320220f nir/gather_info: Add an assert for supported stages 2016-04-01 15:44:43 -07:00
Jason Ekstrand
ebb0bcc11d nir: Move variable_get_io_mask back into gather_info
It used to be in nir_gather_info.c until I moved it out to nir.h so it
could be re-used with some linking code that never got merged.  We'll move
it back out if and when we have real code to share it with.
2016-04-01 15:39:48 -07:00
Jason Ekstrand
95106f6bfb Merge remote-tracking branch 'public/master' into vulkan 2016-04-01 15:16:21 -07:00
Jason Ekstrand
14c46954c9 i965: Add an implemnetation of nir_op_fquantize2f16
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-01 13:52:56 -07:00
Jason Ekstrand
de60e250f5 nir: Add an opcode for stomping a 32-bit value to 16-bit precision
This correlates directly to the SPIR-V opcode OpQuantizeToF16

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-01 13:52:28 -07:00
Samuel Pitoiset
60e1c6a7fc nvc0: enable compute shaders on GK104 and GM107+
Compute support on GK110 is still unstable for weird reasons, but
this can be fixed later as the NVF0_COMPUTE envvar prevent using
compute.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
71f327aa21 nvc0: bump the maximum number of UBOs for compute on Kepler
The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS)
per compute program must be at least 12.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
839a469166 nvc0/ir: do not lower shared+atomics on GM107+
For Maxwell, the ATOMS instruction can be used to perform atomic
operations on shared memory instead of this load/store lowering pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
543fb95473 nvc0/ir: add atomics support on shared memory for Kepler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
275019d7db nvc0/ir: fix wrong pred emission for ld lock on GK104
This fixes 84b9b8f (nvc0/ir: add missing emission of locked load
predicate).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
4f58b78c30 nvc0/ir: add support for compute UBOs on Kepler
Make sure to avoid out of bounds access in presence of indirect
array indexing by loading the size from the driver constant buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
3b246a71d7 nvc0: add indirect compute support on Kepler
The grid size is stored as three 32-bits integers in the indirect
buffer but the launch descriptor uses a 32-bits integer for both
griddim_y and griddim_z like this (z << 16) | y. To make it work,
the 16 high bits of griddim_y are overwritten by griddim_z.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
7797d5f7d9 nvc0: reduce likelihood of collision for real buffers on Kepler
Reduce likelihood of collision with real buffers by placing the
hole at the top of the 4G area. This fixes some indirect draw+compute
tests with large buffers.

Suggested by Ilia Mirkin.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
e2e8085fac nvc0: store ubo info to the driver constbuf on Kepler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
12aa047c98 nvc0: bind user uniforms for compute on Kepler
Uniform buffer objects will be sticked to the driver constant buffer
like buffers because the launch descriptor only allows 8 CBs.

Input kernel parameters for OpenCL are still uploaded to screen->parm
which is bound on c0, but this will be changed later with a new series.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
1828d90a00 nvc0: bind shader buffers for compute on Kepler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Samuel Pitoiset
debd910512 nvc0: bind driver cb for compute on c7[] for Kepler
Instead of using the screen->parm buffer object which will be removed,
upload auxiliary constants to uniform_bo to be consistent regarding
what we already do for Fermi.

This breaks surfaces support (for compute only) but this will be
properly re-introduced later for ARB_shader_image_load_store.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-01 22:26:24 +02:00
Jose Fonseca
f72de6f386 gallivm: Prevent disassembly debug output from being truncated.
By using os_log_message directly, as _debug_vprintf truncates messages
to 4K.

Also cleanup the disassemble interface.

Spotted by Roland.

Trivial.
2016-04-01 21:22:42 +01:00
Rob Clark
972054f5bf compiler: random comment fixup
Just noticed this in passing..  gl_shader_stage already has tess so this
comment no longer applies.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-01 12:34:40 -04:00
Brian Paul
58557b345c docs: minor updates to license.html file
Mesa demos are no longer part of the main Mesa tree/tarball.
Add Gallium and GLX code to list of major components.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-01 09:50:08 -06:00
Mauro Rossi
e09d04cd56 radeonsi: use util_strchrnul() to fix android build error
Android Bionic does not support strchrnul() string function,
gallium auxiliary util/u_string.h provides util_strchrnul()

This change avoids the following building error:

external/mesa/src/gallium/drivers/radeonsi/si_shader.c:3863: error:
undefined reference to 'strchrnul'
collect2: error: ld returned 1 exit status

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-01 13:56:57 +01:00
Rob Herring
952720ccee egl: android: enable EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID
Set EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID config
attributes to true for Android. These are required in Marshmallow.

The implementation of EGL_RECORDABLE_ANDROID support has 2 options in
the definition of the extension. Android implements the 2nd option
which is the encoder must support RGB input. The requested input format
is RGB888, so setting the attribute on all the native Android visual
formats should be sufficient.

Similarly, setting EGL_FRAMEBUFFER_TARGET_ANDROID for all configs with
a EGL_NATIVE_VISUAL_ID should be sufficient. Most likely, the HWC should
support the same set of formats the underlying DRM driver supports.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-01 13:45:13 +01:00
Rob Herring
e21e81aa18 egl: Add EGL_RECORDABLE_ANDROID attribute
This is used by Android to select an eglconfig compatible with screen
recording.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Herring <robh@kernel.org>
[Emil Velikov: add the _eglIsConfigAttribValid check]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-01 13:45:08 +01:00
Rob Herring
8975527f58 egl: Add EGL_FRAMEBUFFER_TARGET_ANDROID attribute
This is used by Android to select an eglconfig compatible with HWComposer.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Herring <robh@kernel.org>
[Emil Velikov: add the _eglIsConfigAttribValid check]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-01 13:44:25 +01:00
Rob Herring
2d9e0f24e1 Android: fix x86 gallium builds
Builds with gallium enabled fail on x86 with linker error:

external/mesa3d/src/mesa/vbo/vbo_exec_array.c:127: error: undefined reference to '_mesa_uint_array_min_max'

The problem is sse_minmax.c is not included in the libmesa_st_mesa
library. Since the SSE4.1 files are needed for both libmesa_st_mesa
and libmesa_dricore, move SSE4.1 files into a separate static library
that can be used by both.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-01 13:44:22 +01:00
Jose Fonseca
cdf7c6b83d gallivm: Use vector selects on LLVM 3.3+.
This is an old patch I had around.

Vector selects seem to work well from LLVM 3.3.  Using them should
improve code quality, as it might make constant propagation pass more
effective.

Tested lp_test_*

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-01 09:05:19 +01:00
Alejandro Piñeiro
cd7d631c71 glsl: do not raise unitialized variable warnings on builtins/reserved GL variables
Needed because not all the built-in variables are marked as system
values, so they still have the mode ir_var_auto. Right now it fixes
raising the warning when gl_GlobalInvocationID and
gl_LocalInvocationIndex are used.

v2: use is_gl_identifier instead of filtering for some names (Ilia
    Mirkin)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-01 09:54:09 +02:00
Ilia Mirkin
df03be196a nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supported
vdpau has recently come to rely on this, so make sure to check it
properly.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-31 21:53:11 -04:00
Ilia Mirkin
e0e1683087 mesa: add GL_OES/EXT_draw_buffers_indexed support
This is the same ext as ARB_draw_buffers_blend (plus some core
functionality that already exists). Add the alias entrypoints.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 21:12:49 -04:00
Kenneth Graunke
a57320a9ba i965: Use brw->urb.min_vs_urb_entries instead of 32 for BLORP.
Haswell GT2 and GT3 have a minimum of 64 entries.  Hardcoding 32
is not legal.

v2: Delete stale comment (caught by Alejandro).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-31 16:45:07 -07:00
Kenneth Graunke
58d4751fa0 i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6.
According to the Sandybridge PRM's description of the resinfo message,
the .z value returned will be Depth == 0 ? 0 : Depth + 1.  The earlier
PRMs have the same table.

This means we return 0 for array textures with a single slice, when
we ought to return 1.  Just override it to max(depth, 1).

Fixes 10 dEQP-GLES3.functional tests on Sandybridge:
shaders.texture_functions.texturesize.sampler2darray_fixed_vertex
shaders.texture_functions.texturesize.sampler2darray_fixed_fragment
shaders.texture_functions.texturesize.sampler2darray_float_vertex
shaders.texture_functions.texturesize.sampler2darray_float_fragment
shaders.texture_functions.texturesize.isampler2darray_vertex
shaders.texture_functions.texturesize.isampler2darray_fragment
shaders.texture_functions.texturesize.usampler2darray_vertex
shaders.texture_functions.texturesize.usampler2darray_fragment
shaders.texture_functions.texturesize.sampler2darrayshadow_vertex
shaders.texture_functions.texturesize.sampler2darrayshadow_fragment

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-31 15:23:49 -07:00
Ian Romanick
08ff5f4d1f nir: Simplify a bcsel to logical-or
Oddly, this did not affect the shader where I first noticed the pattern.
That particular shader doesn't get its if-statement converted to a bcsel
because there are two assignments in the else-statement.  This led to me
submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747.

shader-db results:

Sandy Bridge
total instructions in shared programs: 8467384 -> 8467069 (-0.00%)
instructions in affected programs: 36594 -> 36279 (-0.86%)
helped: 46
HURT: 0

total cycles in shared programs: 117573448 -> 117568518 (-0.00%)
cycles in affected programs: 339114 -> 334184 (-1.45%)
helped: 46
HURT: 0

Ivy Bridge / Haswell / Broadwell / Skylake:
total instructions in shared programs: 7774258 -> 7773999 (-0.00%)
instructions in affected programs: 30874 -> 30615 (-0.84%)
helped: 46
HURT: 0

total cycles in shared programs: 65739190 -> 65734530 (-0.01%)
cycles in affected programs: 180380 -> 175720 (-2.58%)
helped: 45
HURT: 1

No change on G45 or Ironlake.

I also tried these expressions, but none of them affected any shaders in
shader-db:

   (('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)),
   (('bcsel', a, 'b@bool', False),    ('iand', a, b)),
   (('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)),

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-31 14:59:36 -07:00
Ian Romanick
cdea12bf03 ptn: Fix all users of ptn_swizzle
None of the callers actually wanted what it did.  In ptn_xpd, you only
ever want a vec3 swizzle.  In ptn_tex, you want a swizzle that matches
the number of required texture coordinates.

shader-db results:

G45:
total instructions in shared programs: 4011240 -> 4010911 (-0.01%)
instructions in affected programs: 59232 -> 58903 (-0.56%)
helped: 114
HURT: 0

total cycles in shared programs: 84314194 -> 84313220 (-0.00%)
cycles in affected programs: 779150 -> 778176 (-0.13%)
helped: 110
HURT: 13

Ironlake:
total instructions in shared programs: 6397262 -> 6396605 (-0.01%)
instructions in affected programs: 117402 -> 116745 (-0.56%)
helped: 227
HURT: 0

total cycles in shared programs: 128889798 -> 128888524 (-0.00%)
cycles in affected programs: 1214644 -> 1213370 (-0.10%)
helped: 179
HURT: 44

Sandy Bridge:
total instructions in shared programs: 8467391 -> 8467384 (-0.00%)
instructions in affected programs: 3107 -> 3100 (-0.23%)
helped: 10
HURT: 6

total cycles in shared programs: 117580120 -> 117573448 (-0.01%)
cycles in affected programs: 103158 -> 96486 (-6.47%)
helped: 84
HURT: 11

Ivy Bridge:
total instructions in shared programs: 7774255 -> 7774258 (0.00%)
instructions in affected programs: 1677 -> 1680 (0.18%)
helped: 8
HURT: 6

total cycles in shared programs: 65743828 -> 65739190 (-0.01%)
cycles in affected programs: 89312 -> 84674 (-5.19%)
helped: 78
HURT: 23

Haswell:
total instructions in shared programs: 7107172 -> 7107150 (-0.00%)
instructions in affected programs: 2048 -> 2026 (-1.07%)
helped: 16
HURT: 0

total cycles in shared programs: 64653636 -> 64647486 (-0.01%)
cycles in affected programs: 86836 -> 80686 (-7.08%)
helped: 85
HURT: 17

Broadwell and Skylake:
total instructions in shared programs: 8447529 -> 8447507 (-0.00%)
instructions in affected programs: 2038 -> 2016 (-1.08%)
helped: 16
HURT: 0

total cycles in shared programs: 66418670 -> 66413416 (-0.01%)
cycles in affected programs: 90110 -> 84856 (-5.83%)
helped: 83
HURT: 20

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-31 14:59:36 -07:00
Ian Romanick
8bb9c6ff7f ptn: Silence unused parameter warning
The KIL instruction doesn't have a destination, so ptn_kil never uses
dest.

program/prog_to_nir.c: In function ‘ptn_kil’:
program/prog_to_nir.c:547:38: warning: unused parameter ‘dest’ [-Wunused-parameter]
 ptn_kil(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src)
                                      ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-31 14:59:36 -07:00
Samuel Pitoiset
d22eca5f90 tgsi: silence compiler warning in fetch_sampler_unit()
The unit variable can be used uninitialized.

Fixes: 24e77cb09 ("tgsi: handle indirect sampler arrays. (v2)")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-01 07:16:24 +10:00
Samuel Pitoiset
05902a6686 tgsi: fix out of bounds access in exec_atomop()
The number of channels must be 4 for all RGBA components.

Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-04-01 07:15:16 +10:00
Brian Paul
9076e04934 tgsi: split tgsi_util_get_texture_coord_dim() function into two
It was kind of overloaded, returning two different things.  Now get
the index of the shadow reference src register with a new
tgsi_util_get_shadow_ref_src_index() function.

To verify the new code, I added some temp/debug code which looped
over all TGSI_TEXTURE_x values, calling the old function and new and
checking that the returned indexes matched.

Also tested piglit "shadow" tests with softpipe/llvmpipe.
No testing of ilo and radeonsi changes.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:48:00 -06:00
Brian Paul
9d7cd43988 tgsi: skip texture query opcodes when examining texture targets
Should fix the assertion in piglit
spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the
TXQ instruction specifies a 2D target but the sampler view was
declared as SHADOW2D.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-31 09:47:40 -06:00
Pierre Moreau
f96a403bc3 nv50/ir: Check for valid insn instead of def size
This fixes a null pointer dereference during the register allocation pass,
if a function had arguments.

Functions arguments get a definition from the function itself, a definition
which is therefore not linked to any instruction. If a value ends up having
a definition but no linked instruction, the register allocation pass doesn't
need to consider whether that value is generated by an instruction that
can only handle "short" registers (on nv50).

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-31 10:30:29 -04:00
Ilia Mirkin
a94d8d51d7 mesa: add GL_EXT_copy_image support
The extension is identical to GL_OES_copy_image. But dEQP has tests that
want the EXT variant.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
ebdb534548 mesa: add GL_OES_copy_image support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
571f538a62 mesa: remove duplicate MAX_GEOMETRY_SHADER_INVOCATIONS entry
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
2c7f5fe296 st/mesa: add ES sample-shading support
We require the full ARB_gpu_shader5 for now, but in the future some
other CAP could get exposed to indicate that only the multisample-related
behavior of ARB_gpu_shader5 is available.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
3002296cb6 mesa: add GL_OES_shader_multisample_interpolation support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
411a88accc mesa: add GL_OES_sample_shading support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
5283e81015 glsl: add GL_OES_sample_variables support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
6a8ca859f9 mesa: add OES_sample_variables to extension table, add enable bit
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-30 22:57:17 -04:00
Ilia Mirkin
903640c2ac glsl: add gl_MaxSamples, new in GL 4.5 / GL ES 3.2
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-30 22:57:17 -04:00
Matt Turner
4fea98991c i965: Don't add barrier deps for FB write messages.
Ken did this earlier, and this is just me reimplementing his patch a
little differently.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:30 -07:00
Matt Turner
3495265158 i965: Add and use is_scheduling_barrier() function. 2016-03-30 19:54:30 -07:00
Matt Turner
b4e223cfbf i965: Remove NOP insertion kludge in scheduler.
Instead of removing every instruction in add_insts_from_block(), just
move the instruction to its scheduled location. This is a step towards
doing both bottom-up and top-down scheduling without conflicts.

Note that this patch changes cycle counts for programs because it begins
including control flow instructions in the estimates.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:30 -07:00
Matt Turner
a607f4aa57 i965: Assert that an instruction is not inserted around itself.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:30 -07:00
Matt Turner
7b208a7312 i965: Relax restriction on scheduling last instruction.
I think when this code was written, basic blocks were always ended by a
control flow instruction or an end-of-thread message. That's no longer
the case, and removing this restriction actually helps things:

   instructions in affected programs: 7267 -> 7244 (-0.32%)
   helped: 4

   total cycles in shared programs: 66559580 -> 66431900 (-0.19%)
   cycles in affected programs: 28310152 -> 28182472 (-0.45%)
   helped: 9577
   HURT: 879

   GAINED: 2

The addition of the is_control_flow() checks is not a functional change,
since the add_insts_from_block() does not put them in the list of
instructions to schedule. I plan to change this in a later patch.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:30 -07:00
Matt Turner
f60750968c i965/vec4/tcs: Set conditional mod on TCS_OPCODE_SRC0_010_IS_ZERO.
Missing this causes an assertion failure in the scheduler with the next
patch.

Additionally, this gives cmod propagation enough information to optimize
code better.

total instructions in shared programs: 7112991 -> 7112852 (-0.00%)
instructions in affected programs: 25704 -> 25565 (-0.54%)
helped: 139

total cycles in shared programs: 64812898 -> 64810674 (-0.00%)
cycles in affected programs: 127224 -> 125000 (-1.75%)
helped: 139

Acked-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:30 -07:00
Matt Turner
436bdd7403 Revert "i965: Don't add barrier deps for FB write messages."
This reverts commit d0e1d6b7e2.

The change in the vec4 code is a mistake -- there's never an
FS_OPCODE_FB_WRITE in vec4 code.

The change in the fs code had the (harmless) effect of not recognizing
an FB_WRITE as a scheduling barrier even if it was marked EOT --
harmless because the scheduler marked the last instruction of a block as
a barrier, something I'm changing in the following patches.

This will be reimplemented later in the series.
2016-03-30 19:54:30 -07:00
Matt Turner
0d253ce34a i965: Simplify full scheduling-barrier conditions.
All of these were simply code for "architecture register file" (and in
the case of destinations, "not the null register").

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:30 -07:00
Matt Turner
65bc94022b i965: Remove incorrect cycle estimates.
These printed the cycle count the last basic block (sched.time is set
per basic block!). We have accurate, full program, data printed
elsewhere.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-30 19:54:29 -07:00
Dave Airlie
10b189f985 st/mesa: fix fallout from xfb changes.
Failed to update state tracker with new buffer interface.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:36:55 +10:00
Matt Turner
05ee6627d6 nir: Fix typo from commit 6702f1acde. 2016-03-30 19:18:35 -07:00
Timothy Arceri
b273958c74 docs: mark xfb_* qualifiers as DONE
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:53:08 +11:00
Timothy Arceri
c5704bb350 mesa: add query support for GL_TRANSFORM_FEEDBACK_BUFFER interface
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:53:02 +11:00
Timothy Arceri
7234be0338 glsl: add transform feedback buffers to resource list
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:57 +11:00
Timothy Arceri
9e317271d7 mesa: add support to query GL_TRANSFORM_FEEDBACK_BUFFER_INDEX
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:47 +11:00
Timothy Arceri
51142e7705 mesa: add support to query GL_OFFSET for GL_TRANSFORM_FEEDBACK_VARYING
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:43 +11:00
Timothy Arceri
047139e8a0 mesa: rename tranform feeback varying macro XFB to XFV
A latter patch will use XFB for buffers.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:39 +11:00
Timothy Arceri
b77c909878 glsl: always enable transform feedback mode when xfb_stride defined
This enables in shader defined transform feedback mode even if the
only place xfb_stride is defined is on the global out.

We don't worry about xfb_buffer since Issue 22 c) in the spec says:

   "If the shader has an "xfb_buffer" qualifier identifying a buffer,
    but doesn't declare "xfb_offset" on anything associated with it,
    what happens?

    ...

    variables not qualified with "xfb_offset" are not captured, which
    makes the associated "xfb_buffer" qualifier irrelevant."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:34 +11:00
Timothy Arceri
c95e92b14d glsl: handle varyings that are not written to but have an xfb_offset
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:29 +11:00
Timothy Arceri
d5c09d40b9 glsl: when lowering named interface set assigned flag
This will be used when checking if xfb should attempt to capture
a varying.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:22 +11:00
Timothy Arceri
a2fbc5ed44 glsl: reset current stream tracker
When we move to the next buffer we need to reset the stream
so that we don't generate an error message about streams not
matching.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:17 +11:00
Timothy Arceri
f2a3c87a00 glsl: generate link error when implicit stride is to large
This moves the check until after we have done the stride
calculation and applies it to the xfb_* qualifiers.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:11 +11:00
Timothy Arceri
2fab85aaea glsl: add xfb_stride link time validation
From the ARB_enhanced_layous spec:

   "It is a compile-time or link-time error to have any *xfb_offset*
    that overflows *xfb_stride*, whether stated on declarations before
    or after the *xfb_stride*, or in different compilation units.

    ...

    When no *xfb_stride* is specified for a buffer, the stride of a
    buffer will be the smallest needed to hold the variable placed at
    the highest offset, including any required padding."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:05 +11:00
Timothy Arceri
8120e869b1 glsl: validate global out xfb_stride qualifiers and set stride on empty buffers
Here we use the built-in validation in
ast_layout_expression::process_qualifier_constant() to check for mismatching
global out strides on buffers in a single shader.

From the ARB_enhanced_layouts spec:

   "While *xfb_stride* can be declared multiple times for the same buffer,
   it is a compile-time or link-time error to have different values
   specified for the stride for the same buffer."

For intrastage validation a new helper link_xfb_stride_layout_qualifiers()
is created. We also take this opportunity to make sure stride is at least
a multiple of 4, we will validate doubles at a later stage.

From the ARB_enhanced_layouts spec:

   "If the buffer is capturing any double-typed outputs, the stride must
   be a multiple of 8, otherwise it must be a multiple of 4, or a
   compile-time or link-time error results."

Finally we update store_tfeedback_info() to apply the strides to
LinkedTransformFeedback and update the buffers bitmask to mark any global
buffers with a stride as active. For example a shader with:

layout (xfb_buffer = 0, xfb_offset = 0)  out vec4 gs_fs;
layout (xfb_buffer = 1, xfb_stride = 64) out;

Is expected to have a buffer bound to both 0 and 1.

From the ARB_enhanced_layouts spec:

   "A binding point requires a bound buffer object if and only if its
   associated stride in the program object used for transform feedback
   primitive capture is non-zero."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:52:00 +11:00
Timothy Arceri
cf039a309a mesa: split transform feedback buffer into its own struct
This will be used in a following patch to implement interface
query support for TRANSFORM_FEEDBACK_BUFFER.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:52 +11:00
Timothy Arceri
258299d87a glsl: use bitmask of active xfb buffer indices
This allows us to print the correct binding point when not all
buffers declared in the shader are bound.

For example if we use a single buffer:

layout(xfb_buffer=2, offset=0) out vec4 v;

We now print '2' when the buffer is not bound rather than '0'.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:47 +11:00
Timothy Arceri
99cb5151ed glsl: sort xfb varyings in offset/buffer order
The existing transform feedback code expects to receive the list
of varyings in increasing buffer order.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:38 +11:00
Timothy Arceri
0c66460fc6 glsl: basic linking support for xfb qualifiers
This adds the initial infrastructure for enabling transform feedback
mode via in shader qualifiers and adds initial buffer support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:33 +11:00
Timothy Arceri
4305a60173 glsl: add xfb helpers and fields to the tfeedback_decl class
We also apply any array/struct offsets.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:27 +11:00
Timothy Arceri
0822517936 glsl: add helper to process xfb qualifiers during linking
This function checks for any xfb_* qualifiers which will enable
transform feedback mode and cause any API defined xfb varyings
to be ignored.

It also counts the number of varyings that have a xfb_offset
qualifier and finally it calls the create_xfb_varying_names()
helper to generate the names of varyings to be caputured.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:21 +11:00
Timothy Arceri
707fd3972f glsl: add helper to generate xfb varying names
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:17 +11:00
Timothy Arceri
8b6f8fe503 glsl: add helper for counting varyings
This will be used to get a count of the number of varying name
strings we are required to generate for use with the query api.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:06 +11:00
Timothy Arceri
ba7a7d4c39 glsl: add xfb qualifier lowering support for named blocks
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:51:01 +11:00
Timothy Arceri
4a873ef049 glsl: add xfb qualifiers to has_layout helper
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:54 +11:00
Timothy Arceri
598790e856 glsl: apply xfb_stride to implicit offsets for ifc block members
When we have an interface block like:

layout (xfb_buffer = 0, xfb_offset = 0) out Block {
                             vec4 var1;
    layout (xfb_stride = 32) vec4 var2;
                             vec4 var3;
};

We take into account the stride of var2 when calculating the offset
for var3.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:49 +11:00
Timothy Arceri
04a72e6e57 glsl: add xfb_stride compile time rules
From the ARB_enhanced_layouts spec:

   "The *xfb_stride* qualifier specifies how many bytes are consumed
   by each captured vertex.  It applies to the transform feedback
   buffer for that declaration, whether it is inherited or explicitly
   declared. It can be applied to variables, blocks, block members,
   or just the qualifier out.  If the buffer is capturing any
   double-typed outputs, the stride must be a multiple of 8, otherwise
   it must be a multiple of 4, or a compile-time or link-time error
   results.

   ...

   The resulting stride (implicit or explicit) must be less than or
   equal to the implementation-dependent constant
   gl_MaxTransformFeedbackInterleavedComponents."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:44 +11:00
Timothy Arceri
edddad0eee glsl: add xfb_offset compile time rules
We also copy the qualifier values to the IR in this step.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:39 +11:00
Timothy Arceri
f6a8c7ef21 glsl: add xfb_buffer compile time rules
Also copies the qualifier values to GLSL IR.

From the ARB_enhanced_layouts spec:

    "The *xfb_buffer* qualifier can be applied to the qualifier out,
    to output variables, to output blocks, and to output block
    members.  Shaders in the transform  feedback capturing mode have
    an initial global default of

        layout(xfb_buffer = 0) out;

    This default can be changed by declaring a different buffer with
    xfb_buffer on the interface qualifier out.  This is the only way
    the global default can be changed.  When a variable or output block
    is declared without an  xfb_buffer qualifier, it inherits the global
    default buffer.  When a variable or output block is declared with an
    xfb_buffer qualifier, it has that declared buffer.  All members of a
    block inherit the block's buffer.  A  member is allowed to declare
    an xfb_buffer, but it must match the buffer inherited from its
    block, or a compile-time error results.

    The *xfb_buffer* qualifier follows the same conventions, behavior,
    defaults, and inheritance rules as the qualifier stream, and the
    examples for stream apply here as well.  This includes a block's
    inheritance of the current global default buffer, a block member's
    inheritance of  the block's buffer, and the requirement that any
    *xfb_buffer* declared on a block member must match the buffer
    inherited from the block.

    ...

    It is a compile-time error to specify an *xfb_buffer* that is
    greater than  the implementation-dependent constant
    gl_MaxTransformFeedbackBuffers."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:34 +11:00
Timothy Arceri
04d2f770c8 glsl: add field to track if xfb_buffer is an explicit or implicit value
Since any of the xfb_* qualifiers trigger the shader to be in
transform feedback mode we need an extra field to track if
the xfb_buffer on interface members was set explicitly since
xfb_buffer will always have a default value.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:29 +11:00
Timothy Arceri
733f1b2a55 glsl: add xfb_* qualifiers to glsl_struct_field
These will be used to hold qualifier values for interface and
struct members.

Support is added to the struct/interface constructors to copy these
fields upon creation.

We also update record_compare() to ensure we don't reuse a glsl_type
with the wrong xfb_* qualifier values.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:19 +11:00
Timothy Arceri
2dbcecb7a9 glsl: add IR fields for transform feedback layout qualifiers
Adds xfb_buffer/stride fields and adds comment to offset field
which is reused for xfb_offset.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:13 +11:00
Timothy Arceri
5c2516fc33 glsl: add validation for out layout qualifiers
This adds validation for all qualifiers as allowed by the
table in Section 4.4 (Layout Qualifiers) of the GLSL 4.5 spec.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:08 +11:00
Timothy Arceri
7b407fecec glsl: relax stage restrictions on layout defaults for outputs
The new xfb_buffer and xfb_stride global qualifiers are allowed in
geom, tess and vertex stages.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:04 +11:00
Timothy Arceri
c9afd94af6 glsl: parse new transform feedback layout qualifiers
We reuse the existing offset field for holding the xfb_offset
expression but create a new flag as to avoid hitting the rules
for the offset qualifier for UBOs.

xfb_buffer qualifiers require extra processing when merging as
they can be applied to global out defaults. We just apply the
same rules as we do for the stream qualifier as the spec says:

   "The *xfb_buffer* qualifier follows the same conventions,
    behavior, defaults, and inheritance rules as the qualifier
    stream, and the examples for stream apply here as well."

For xfb_stride we push everything into a global out field for
later processing as xfb_stride applies to the entire buffer.
We still need to have a separate field to store per variable
strides because they can still effect implicit offsets
e.g. when applied to block members with implicit offsets.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:50:00 +11:00
Timothy Arceri
13f6c788eb glsl: move process_qualifier_constant() to ast_type.cpp
We will make use of this function being here in the following patch.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:49:55 +11:00
Timothy Arceri
52caeee7e7 glsl: add transform feedback built-in constants
These are new built-ins added by ARB_enhanced_layouts.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:49:51 +11:00
Timothy Arceri
8765a9e0fe glsl: generate named interface block names correctly
Firstly this updates the named interface lowering pass to store the
interface without the arrays removed.

Note we need to remove the arrays in the interface/varying matching
code to not regress things but in future this should be fixed
futher as it would seem we currently successfully match interface
blocks with differnt array sizes.

Since we now know if the interface was an array we can reduce the
IR flags from_named_ifc_block_array and from_named_ifc_block_nonarray
to just from_named_ifc_block.

Next rather than having a different code path for named interface
blocks in program_resource_visitor we just make use of the one used
by UBOs this allows us to now handle arrays of arrays correctly.

Finally we add a new param to the recursion function
named_ifc_member this is because we only want to process a single
member at a time. Note that this is also the glsl_struct_field
from the original ifc type before lowering rather than the type
from the lowered variable. This fixes a bug in Mesa where we would
generate the names like WithInstArray[0].g[0][0] when it should be
WithInstArray[0].g[0] for the following interface.

   out WithInstArray {
      float g[3];
   } instArray[2];

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:49:47 +11:00
Timothy Arceri
7ebc3deaad glsl: Fix segfault when lhs is error_type in TCS
It seems expected that both lhs and rhs could be of type error_type
in this code however the TCS case wasn't expecting it.

Fixes segfault in an enhanced layouts GL CTS test.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-31 12:49:42 +11:00
Dave Airlie
c9367c13ca docs: update softpipe status for shader_image_load_store.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:30 +10:00
Dave Airlie
eb9ad9faa3 softpipe: add image support to softpipe (v3)
This adds support for ARB_shader_image_load_store to softpipe.

v2: add RESQ support (Ilia)
v3: constify, cleanup internals, add some comments (Brian).

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:16 +10:00
Dave Airlie
0d1f679ded draw: add support for passing images to vs/gs shaders.
This just adds support for passing through images to the
tgsi execution stage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:11 +10:00
Dave Airlie
22d1296013 tgsi: add support for image operations to tgsi_exec. (v2.1)
This adds support for load/store/atomic operations on images
along with image tracking support.

v2: add RESQ support. (Ilia)
v2.1: constify interface (Brian)
split get_image_coord_dim (Brian)

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:14:05 +10:00
Dave Airlie
493eab7679 softpipe: add support for explicit early depth testing
ARB_shader_image_load_store adds support for explicit early
depth testing. However we need to make sure we don't overwrite
values using the shader written values in this case.

This fixes early depth testing in softpipe to conform with
those requirements.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:54 +10:00
Dave Airlie
827393b76f tgsi: introduce NonHelperMask
This is a mask of which of the current 2x2 grid are non-helper
invocations. This allows us to mask off the helper invocations
later for the image operations.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:50 +10:00
Dave Airlie
ca180c09bb tgsi_exec: handle execmask when doing indirect lookups
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:46 +10:00
Dave Airlie
1ff4cc0535 tgsi_exec: add support for up to 3 address registers (v2)
v2: be consistent with other definitions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-31 09:13:08 +10:00
Matt Turner
6702f1acde nir: Propagate negates up multiplication chains.
total instructions in shared programs: 7112159 -> 7088092 (-0.34%)
instructions in affected programs: 1374915 -> 1350848 (-1.75%)
helped: 7392
HURT: 621

GAINED: 2
LOST:   2
2016-03-30 13:12:34 -07:00
Matt Turner
a74fc3fe8a i965: Don't inline intel_batchbuffer_require_space().
It's called by the inline intel_batchbuffer_begin() function which
itself is used in BEGIN_BATCH. So in sequence of code emitting multiple
packets, we have inlined this ~200 byte function multiple times. Making
it an out-of-line function presumably improved icache usage.

Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on
Ivybridge.

Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2016-03-30 13:12:34 -07:00
Christian König
1faca438bd r600: ignore PIPE_BIND_LINEAR in *_is_format_supported
Similar to radeonsi linear layout should work for all not compressed
or depth/stencil formats. Fixes issues with VDPAU on r600.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2016-03-30 20:00:27 +02:00
Thomas Hindoe Paaboel Andersen
9a73f5728e st/vdpau: correct null check
The null check of result was the wrong way around. Also, move memset
and dereference of result after the null check.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-03-30 20:00:27 +02:00
Brian Paul
4541a78502 docs: remove docs/COPYING which contains GPL license
There hasn't been GPL code in Mesa for a long time now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-30 11:38:51 -06:00
Samuel Pitoiset
bb37886f75 glsl: add missing types for buffer images
Type of GLSL_SAMPLER_DIM_BUF can be sampler or image.

Spotted while trying to run dEQP tests related to
ARB_shader_image_load_store.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-30 19:01:33 +02:00
Lars Hamre
6773128bbf glsl: invalidate float suffixes for GLSL 1.10 and GLSL ES 1.00
Float suffixes are not allowed in GLSL 1.10 nor GLSL ES 1.00.

Fixes the following piglit tests:
tests/spec/glsl-1.10/compiler/literals/invalid-float-suffix-capital-f.vert
tests/spec/glsl-1.10/compiler/literals/invalid-float-suffix-f.vert`

v2: modify error message
v3: parse the float instead of returning an ERROR_TOK
v4: (by Ken) Change to is_version(120, 300) to avoid breaking ES3
    shaders; update commit message accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81585
Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-29 21:26:34 -07:00
Jason Ekstrand
cf2257069c nir/spirv: Set a default number of invocations for geometry shaders
The SPIR-V spec says geometry shaders are supposed to have one invocation
by default.  The execution mode is only required if there are multiple
invocations.
2016-03-29 20:30:27 -07:00
Roland Scheidegger
2d3b8aefda tgsi: (trivial) only verify target for is_tex instructions
d3d10 state tracker does not encode (valid) target (only offsets are
really used from the texture bits), since that information always comes
from the sview dcl, and not the instruction (note the meaning of target
is actually slightly different between gl and d3d10 in any case, because
d3d10 target does never include shadow bit).
Also move the msaa sampler identification as well - would need to set that
on the sview not sampler, so while this does not fix it make it at least
obvious it won't work with sample instructions.
2016-03-30 04:26:54 +02:00
Ilia Mirkin
553e37aa33 mesa: allow mutable buffer textures to back GL ES images
Since there is no way to create immutable texture buffers in GL ES,
mutable buffer textures are allowed to back images. See issue 7 of the
GL_OES_texture_buffer specification.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-29 21:41:03 -04:00
Brian Paul
513384d7e8 mesa: make _mesa_prepare_mipmap_level() static
No longer called from any other file.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-29 18:13:46 -06:00
Brian Paul
ed39de90f1 meta: use _mesa_prepare_mipmap_levels()
The prepare_mipmap_level() wrapper for _mesa_prepare_mipmap_level() is
not needed.  It only served to undo the GL_TEXTURE_1D_ARRAY height/depth
change was was made before the call to prepare_mipmap_level()

Said another way, regardless of how the meta code manipulates the height/
depth dims for GL_TEXTURE_1D_ARRAY, the gl_texture_image dimensions are
correctly set up by _mesa_prepare_mipmap_levels().

Tested by plugging _mesa_meta_GenerateMipmap() into the swrast driver
and testing with piglit.

v2 (idr): Early out of the mipmap generation loop with dstImage is NULL.
This can occur for immutable textures that have a limited range of
levels or in the presense of memory allocation failures.  Fixes
arb_texture_view-mipgen on Intel platforms.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-29 18:13:46 -06:00
Brian Paul
bab0752a80 docs: add HTTP link for Mesa downloads
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92628
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-29 18:13:46 -06:00
Brian Paul
5c85c3be26 tgsi: simplify tgsi_shader_info::is_msaa_sampler checking
We assert that fullinst->Instruction.Texture != 0 above so no need to
check it in the conditional.  We also have the fullinst->Texture.Texture
value in a local variable, so use it.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:46 -06:00
Brian Paul
86e1768c13 tgsi: collect texture sampler target info in tgsi_scan_shader()
Texture sample instructions specify a sampler unit and texture target
such as "1D", "2D", "CUBE", etc.  Sampler view declarations also specify
the sampler unit and texture target.

This patch checks that the texture instructions agree with the declarations
and collects the texture target type for each sampler unit.

v2: only compare instruction's texture target to the sampler view declaration
target if the instruction is a TEX instruction, not a SAMPLE instruction.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:46 -06:00
Brian Paul
6775268b61 gallium/docs: s/gven/given/ 2016-03-29 18:13:46 -06:00
Brian Paul
75b713455c xlib: add support for GLX_ARB_create_context
This adds the glXCreateContextAttribsARB() function for the xlib/swrast
driver.  This allows more piglit tests to run with this driver.

For example, without this patch we get:
$ bin/fbo-generatemipmap-1d -auto
piglit: error: waffle_config_choose failed due to WAFFLE_ERROR_UNSUPPORTED_
ON_PLATFORM: GLX_ARB_create_context is required in order to request an OpenGL
version not equal to the default value 1.0
piglit: error: Failed to create waffle_config for OpenGL 2.0 Compatibility Context
piglit: info: Failed to create any GL context
PIGLIT: {"result": "skip" }

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:45 -06:00
Brian Paul
d8d029f22b st/mesa: simplify st_generate_mipmap()
The whole st_generate_mipmap() function was overly complicated.  Now
we just call the new _mesa_prepare_mipmap_levels() function to prepare
the texture mipmap memory, then call the generate function which fills
in the texture images.

This fixes a failed assertion in llvmpipe/softpipe which is hit with the
new piglit generatemipmap-base-change test.  Also fixes some device errors
(format mismatches) with the VMware svga driver.

v2: fix a comment typo, per Sinclair

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-29 18:13:45 -06:00
Brian Paul
105fe52784 mesa: new _mesa_prepare_mipmap_levels() function for mipmap generation
Simplifies the loops in generate_mipmap_uncompressed() and
generate_mipmap_compressed().  Will be used in the state tracker too.
Could probably be used in the meta code.  If so, some additional
clean-ups can be done after that.

v2: use unsigned types instead of GLuint, per Ian

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-29 18:13:45 -06:00
Kenneth Graunke
d4a5a61d44 i965: Don't use CUBE wrap modes for integer formats on IVB/BYT.
There is no linear filtering for integer formats, so we should always
be using CLAMP_TO_EDGE mode.

Fixes 46 dEQP cases on Ivybridge (which were likely broken by commit
0faf26e6a0).

This workaround doesn't appear to be necessary on any other hardware;
I haven't found any documentation mentioning errata in this area.

v2: Only apply on Ivybridge/Baytrail to avoid regressing GLES3.1 tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
2016-03-29 15:43:18 -07:00
Kenneth Graunke
f8c69fbb54 Revert "i965: Set address rounding bits for GL_NEAREST filtering as well."
This reverts commit 60d6a8989a.

It's pretty sketchy, and apparently regressed a bunch of dEQP tests
on Sandybridge.
2016-03-29 15:35:07 -07:00
Rovanion Luckey
7087e0ab27 gallium: Format code in pb_buffer_fenced.c according to style guide.
This is a tiny housekeeping patch which does the following:

  * Replaced tabs with three spaces.
  * Formatted oneline and multiline code comments. Some doxygen
    comments weren't marked as such and some code comments were marked
    as doxygen comments.
  * Spaces between if- and while-statements and their parenthesis.

According to the mesa coding style guidelines.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-29 13:44:11 -06:00
Charmaine Lee
2d8df0306b svga: emit sampler declarations in the helper function for non vgpu10
With commit dc9ecf58c0,
we are now getting the sampler target from the sampler view
declaration. But since a sampler view declaration can be defined
after a sampler declaration, we need to emit the
sampler declarations in the pre-helpers function, otherwise,
the sampler target might not have defined yet for the sampler declaration.

Fixes viewperf maya-03 and various gl trace regressions in hwv11.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-29 13:35:09 -06:00
Brian Paul
96e0894106 svga: avoid freeing non-malloced memory
svga_shader_expand() will fall back to using non-malloced memory for
emit.buf if malloc fails. We should check if the memory is malloced
before freeing it in the error path of svga_tgsi_vgpu9_translate.

Original patch by Thomas Hindoe Paaboel Andersen <phomes@gmail.com>.
Remove trivial svga_destroy_shader_emitter() function, by BrianP.

Signed-off-by: Brian Paul <brianp@vmware.com>
2016-03-29 13:35:08 -06:00
Samuel Pitoiset
9d57c84994 nvc0/ir: move load/store lowering pass to handleLDST()
Having all this code in a big switch is not really a good pratice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-29 19:55:51 +02:00
Christian König
cc68dc2b5e st/mesa: implement new DMA-buf based VDPAU interop v2
Avoid using internal structures from another API.

v2: rebase and moved includes so they don't cause problem when VDPAU isn't installed.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-29 17:29:22 +02:00
Christian König
bdeb22b7b6 st/vdpau: implement the new DMA-buf based interop v2
That should allow us to get away from passing internal structures around.

v2: rebased

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-29 17:29:18 +02:00
Christian König
0042aa508e st/vdpau: move FormatRGBAToPipe into the interop
We are going to need that in the Mesa state tracker as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-29 17:29:14 +02:00
Christian König
faba96bc60 st/vdpau: add new interop interface
Use DMA-buf for the VDPAU interop interface instead of using
internal structures.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-29 17:29:10 +02:00
Christian König
d180de3532 st/vdpau: use linear layout for output surfaces
Works around a bug in radeonsi and tiling is actually
not very beneficial in this use case.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-29 17:28:43 +02:00
Christian König
7eb5e5b8b4 radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported v2
Linear layout should work for all not compressed or depth/stencil formats.

v2: restrict it a bit more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-29 17:28:35 +02:00
Ilia Mirkin
9286cbdd1e st/mesa: enable OES_texture_buffer when all components available
OES_texture_buffer combines bits from a number of desktop extensions.
When they're all available, turn it on.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-29 10:15:21 -04:00
Adam Jackson
5e1aec6db0 glapi/glx: Mark the indirect swapped dispatch functions _X_COLD
A modest size savings:

   text	   data	    bss	    dec	    hex	filename
 264143	  15608	    232	 279983	  445af libglx.so.before
 254303	  15608	    232	 270143	  41f3f libglx.so.after

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-29 10:10:57 -04:00
Adam Jackson
ea0f62e45e glapi/glx: Sync some additional error checking from xserver
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-29 10:10:57 -04:00
Jordan Justen
f56f538ce4 anv/gen7: Fix command parser version test with indirect dispatch
Caught-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-28 22:30:33 -07:00
Alejandro Piñeiro
dcd41ca87a glsl: raise warning when using uninitialized variables
v2:
 * Take into account out varyings too (Timothy Arceri)
 * Fix style (Timothy Arceri)
 * Use a new ast_expression variable, instead of an
   ast_expression::hir new parameter (Timothy Arceri)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-29 07:28:57 +02:00
Alejandro Piñeiro
8568d02498 glsl: add is_lhs bool on ast_expression
Useful to know if a expression is the recipient of an assignment
or not, that would be used to (for example) raise warnings of
"use of uninitialized variable" without getting a false positive
when assigning first a variable.

By default the value is false, and it is assigned to true on
the following cases:
 * The lhs assignments subexpression
 * At ast_array_index, on the array itself.
 * While handling the method on an array, to avoid the warning
   calling array.length
 * When computed the cached test expression at test_to_hir, to
   avoid a duplicate warning on the test expression of a switch.

set_is_lhs setter is added, because in some cases (like ast_field_selection)
the value need to be propagated on the expression tree. To avoid doing the
propatagion if not needed, it skips if no primary_expression.identifier is
available.

v2: use a new bool on ast_expression, instead of a new parameter
    on ast_expression::hir (Timothy Arceri)

v3: fix style and some typos on comments, initialize is_lhs default value
    on constructor, to avoid a c++11 feature (Ian Romanick)

v4: some tweaks on comments (Timothy Arceri)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-29 07:28:57 +02:00
Jason Ekstrand
35e2e96b30 nir: Add a helper for getting the current block from a cursor
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
be98c47528 nir/lower_out_to_temp: Add an "entrypoint" parameter
Previously, the pass assumed that the entrypoint would be whatever function
happened to have the name "main".  We really shouldn't trust in the
function names.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
31a5bec93f nir/lower_out_to_temp: Steal the output's constant initializer
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
38de85f9a5 nir: Add a helper for getting the unique function in a shader
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
49be812be6 nir/sweep: Sweep function parameters
They are no longer in the list of local variables so we need to explicitly
sweep them.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
1be4c61c95 nir/builder: Add a helper for creating undefs
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
6a2479d618 nir/builder: Add a helper for storing to variable derefs
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
77e2ac1da7 nir/builder: Add a helper for building fdot instructions
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
da422663a6 nir: Add a variable_foreach_safe helper
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand
731870fbe3 nir/Makefile: Fix alphabetization
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Ilia Mirkin
b4c0c514b1 mesa: add OES_texture_buffer and EXT_texture_buffer support
Allow ES 3.1 contexts to access the texture buffer functionality.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-28 20:29:29 -04:00
Ilia Mirkin
720670a615 glsl: add OES_texture_buffer and EXT_texture_buffer support
Expose the samplerBuffer/imageBuffer types, and allow the various
functions to operate on them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-28 20:20:49 -04:00
Ilia Mirkin
74b76c08a3 mesa: add OES_texture_buffer and EXT_texture_buffer extension to table
We need to add a new bit since the GL ES exts require functionality from
a combination of texture buffer extensions as well as images (for
imageBuffer) support. Additionally, not all GPUs support all the texture
buffer functionality (e.g. rgb32 isn't supported by nv50).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-28 20:19:14 -04:00
Ilia Mirkin
659beca666 mesa: properly return GetTexLevelParameter queries for buffer textures
This fixes all failures with dEQP tests in this area. While
ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co
should not be supported, GL 3.1 reverses this decision and allows all of
these queries there.

Conversely, there is no text that forbids the buffer-specific queries
from being used with non-buffer images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-28 20:18:46 -04:00
Kenneth Graunke
4ed4a2af86 glsl: Delete initialized field from uniform storage test.
Timothy deleted this field.  Fixes "make check".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-28 17:02:00 -07:00
Jordan Justen
8dbfa265a4 anv/gen7: DispatchIndirect requires cmd parser 5
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-28 17:01:35 -07:00
Jordan Justen
1a3adae84a anv/gen7: Save kernel command parser version
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-28 17:01:35 -07:00
Jordan Justen
f60683b32a anv: Invalidate state cache before L3 partitioning set-up.
Port 10d84ba9f0 to anv.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-28 17:01:35 -07:00
Jordan Justen
5879cb0251 anv: Fix cache pollution race during L3 partitioning set-up.
Port 0aa4f99f56 to anv.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-28 17:01:35 -07:00
Timothy Arceri
86d87d1047 mesa: remove initialized field from uniform storage
The only place this was used was in a gallium debug function that
had to be manually enabled.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-29 09:59:03 +11:00
Samuel Pitoiset
b8b3af2932 nvc0: use a different offset for buffers and surfaces
To not overwrite buffers and surfaces information, we need to use
a different offset in the driver constant buffer. Currently, OP_SUQ
is only supported for buffers but this will be slightly updated for
images support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-29 00:47:28 +02:00
Kenneth Graunke
60d6a8989a i965: Set address rounding bits for GL_NEAREST filtering as well.
Yuanhan Liu decided these were useful for linear filtering in
commit 76669381 (circa 2011).  Prior to that, we never set them;
it seems he tried to preserve that behavior for nearest filtering.

It turns out they're useful for nearest filtering, too: setting
these fixes the following dEQP-GLES3 tests:

functional.fbo.blit.rect.nearest_consistency_mag
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_min
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y

Apparently, BLORP has always set these bits unconditionally.

However, setting them unconditionally appears to regress tests using
texture projection, 3D samplers, integer formats, and vertex shaders,
all in combination, such as:

functional.shaders.texture_functions.textureprojlod.isampler3d_vertex

Setting them on Gen4-5 appears to regress Piglit's
tests/spec/arb_sampler_objects/framebufferblit.

Honestly, it looks like the real problem here is a lack of precision.
I'm just hacking around problems here (as embarassing as it is).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-28 15:28:58 -07:00
Kenneth Graunke
0faf26e6a0 i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering.
When using seamless cube map mode and NEAREST filtering, we explicitly
overrode the wrap modes to CLAMP_TO_EDGE.  This was to implement the
following spec text:

   "If NEAREST filtering is done within a miplevel, always apply apply
    wrap mode CLAMP_TO_EDGE."

However, textureGather() ignores the sampler's filtering mode, and
instead returns the four pixels that would be blended by LINEAR
filtering.  This implies that we should do proper seamless filtering,
and include pixels from adjacent cube faces.

It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE
overrides.  Normal cube map sampling works by first selecting the
face, and then nearest filtering fetches the closest texel.  If the
nearest texel was on a different face, then that face would have been
chosen.  So it should always be within the face anyway, which
effectively performs CLAMP_TO_EDGE.

Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-28 15:25:04 -07:00
Kenneth Graunke
72473658c5 i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs.
Our driver uses the brw_render_cache mechanism to track buffers we've
rendered to and are about to sample from.

Previously, we did a single PIPE_CONTROL with the following bits set:
- Render Target Flush
- Depth Cache Flush
- Texture Cache Invalidate
- VF Cache Invalidate
- Instruction Cache Invalidate
- CS Stall

This combined both "top of pipe" invalidations and "bottom of pipe"
flushes, which isn't how the hardware is intended to be programmed.

The "top of pipe" invalidations may happen right away, without any
guarantees that rendering using those caches has completed.  That
rendering may continue altering the caches.  The "bottom of pipe"
flushes do wait for the rendering to complete.  The CS stall also
prevents further work from happening until data is flushed out.

What we wanted to do was wait for rendering complete, flush the new
data out of the render and depth caches, wait, then invalidate any
stale data in read-only caches.  We can accomplish this by doing the
"bottom of pipe" flushes with a CS stall, then the "top of pipe"
flushes as a second PIPE_CONTROL.  The flushes will wait until the
rendering is complete, and the CS stall will prevent the second
PIPE_CONTROL with the invalidations from executing until the first
is done.

Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo
subtests on Braswell and Skylake.  These tests hit the meta PBO
texture upload path, which binds the PBO as a texture and samples
from it, while rendering to the destination texture.  The tests
then sample from the texture.

For now, we leave Gen4-5 alone.  It probably needs work too, but
apparently it hasn't even been setting the (G45+) TC invalidation
bit at all...

v2: Add Sandybridge post-sync non-zero workaround, for safety.

Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-28 15:23:56 -07:00
Kenneth Graunke
de505f7d7b i965: Whack UAV bit when FS discards and there are no color writes.
dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no
framebuffer attachments, using a shader that discards based on
gl_FragCoord.  It uses occlusion queries to inspect whether pixels
are rendered or not.

Unfortunately, the hardware is not dispatching any pixel shaders,
so discards never happen, and the full quad of pixels increments
PS_DEPTH_COUNT, making the occlusion query results bogus.

To understand why, we have to delve into the WM_INT internal
signalling mechanism's formulas.

The "WM_INT::Pixel Shader Kill Pixel" signal is defined as:

    3DSTATE_WM::ForceKillPixel == ON ||
    (3DSTATE_WM::ForceKillPixel != Off &&
     !WM_INT::WM_HZ_OP &&
     3DSTATE_WM::EDSC_Mode != PREPS &&
     (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) &&
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     (3DSTATE_PS_EXTRA::PixelShaderKillsPixels ||
      3DSTATE_PS_EXTRA:: oMask Present to RenderTarget ||
      3DSTATE_PS_BLEND::AlphaToCoverageEnable ||
      3DSTATE_PS_BLEND::AlphaTestEnable ||
      3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable))

Because there is no depth or stencil buffer, writes to those buffers
are disabled.  So the highlighted condition is false, making the whole
"Kill Pixel" condition false.  This then feeds into the following
"WM_INT::ThreadDispatchEnable" condition:

    3DSTATE_WM::ForceThreadDispatch != OFF &&
    !WM_INT::WM_HZ_OP &&
    3DSTATE_PS_EXTRA::PixelShaderValid &&
    (3DSTATE_PS_EXTRA::PixelShaderHasUAV ||
     WM_INT::Pixel Shader Kill Pixel ||
     WM_INT::RTIndependentRasterizationEnable ||
     (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT &&
      3DSTATE_PS_BLEND::HasWriteableRT) ||
     (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF &&
      (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) ||
     (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) ||
     (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable ||
                                     WM_INT::Depth Write Enable ||
                                     WM_INT::Stencil Test Enable)))

Given that there's no depth/stencil testing, no writeable render target,
and the hardware thinks kill pixel doesn't happen, all of these
conditions are false.  We have to whack some bit to make PS invocations
happen.  There are many options.

Curro suggested using the UAV bit.  There's some precedence in doing
that - we set it for fragment shaders that do SSBO/image/atomic writes
when no color buffer writes are enabled.  We can simply include discard
here too.

Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests.

v2: Add a comment suggested and written by Jason Ekstrand.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-28 14:36:47 -07:00
Jason Ekstrand
433cf90650 nir/spirv: Remove the NoContraction hack
NIR now just handles this for us by not fusing if the multiply is marked as
exact.
2016-03-28 13:07:39 -07:00
Jason Ekstrand
5d9afb65a6 i965/peephole_ffma: Only match a mul+add if none of the ops are exact 2016-03-28 13:07:39 -07:00
Jason Ekstrand
035f66025b nir/search: Don't match inexact expressions with exact subexpressions
In the first pass of implementing exact handling, I made a mistake with
search-and-replace.  In particular, we only reallly handled exact/inexact
on the root of the tree.  Instead, we need to check every node in the tree
for an exact/inexact match.  As an example of this, consider the following
GLSL code

precise float a = b + c;
if (a < 0) {
   do_stuff();
}

In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results.  The solution is to simply bail if any of the
values are exact when matching an inexact expression.
2016-03-28 13:07:39 -07:00
Rhys Kidd
668b6ddfc5 vc4: Remove unused include from vc4_nir_lower_txf_ms.c
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-03-28 11:51:11 -07:00
Adam Jackson
2b8492d63e glapi/glx: Treat xserver generated targets as .PHONY
Meaning, always rebuild them when asked instead of bothering to look at
timestamps (and then wondering why nothing happened when you said make).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:37:12 -04:00
Adam Jackson
c2f0bc2537 glapi/glx: Thunk non-ABI calls through GetProcAddress
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:37:12 -04:00
Adam Jackson
ce3f0b23d1 glapi/glx: Emit direct GL calls instead of dispatch lookup
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:28:51 -04:00
Adam Jackson
c0a9cbea4d glx: Unbreak generating some of the xorg glx headers
Broken by:

    commit 9ace0b5422
    Author: Dylan Baker <baker.dylan.c@gmail.com>
    Date:   Wed May 20 15:49:11 2015 -0700

	glapi: glX_proto_size.py: use argparse instead of getopt

Which changed most, but not all, callers to use --header-tag instead of
-h.

Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:28:36 -04:00
Bas Nieuwenhuizen
dd5f0950e4 mesa/st: Fix NULL access if no fragment shader is bound
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-28 18:02:07 +02:00
Rob Clark
b4c72b792c freedreno/ir3: fix for load_front_face intrinsic
Seems like trying to widen in the same instruction as the add.s does a
non-sign-extending widen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-28 10:19:53 -04:00
Rob Clark
3ca034cada freedreno/ir3: fix compiler warn
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-28 10:19:09 -04:00
Ilia Mirkin
b9f1affb2e nvc0: make sure to disable fetches from previously-set VBOs when blitting
We disable the vertex attributes, but also disable the VBO fetch details
as well, just in case. Not known to fix anything.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-28 08:36:34 -04:00
Ilia Mirkin
41100b6b44 nvc0: disable primitive restart and index bias during blits
Back in the dawn of time, we used to do immediate uploads for the vertex
data, and all was well. However Maxwell dropped support for immediate
vertex data, so we started feeding in a VBO (in all cases). But we
forgot to disable some things that apply in such cases, specifically
primitive restart and index bias. The latter was causing WoW and other
Blizzard games trouble as they use a pattern where they draw with a base
vertex (aka index bias), followed by texture uploads (aka blits,
internally).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <nouveau@karolherbst.de>
2016-03-28 08:35:38 -04:00
Ilia Mirkin
f667d15561 nvc0/ir: fix picking of coordinates from tex instruction for textureGrad
On Fermi, there's an argument in front of the coords that combines array
and indirect handle, while on Kepler the array and the indirect handle
are separate (and in front of the coords). We were previously only
accounting for the array bit of it, if there were an indirect access it
wouldn't be counted in the formula.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-28 08:35:38 -04:00
Ilia Mirkin
6711f159d9 nv50/ir: saturate depth writes
Apparently there's no post-FS clamping logic, so we have to do this by
hand. The depth will never be outside of the 0..1 range, even on
floating point zeta buffers, so this should be safe.

Fixes dEQP-GLES3.functional.fbo.depth.*clamp.* which tests writing
invalid values on various zeta buffer formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-28 08:35:38 -04:00
Marek Olšák
6262d6125a gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2)
v2: move the nr_cbufs check above the loop

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
2016-03-28 00:46:23 +02:00
Marek Olšák
21c479256a st/mesa: only minify height if target != 1D array in st_finalize_texture
The st_texture_object documentation says:
  "the number of 1D array layers will be in height0"

We can't minify that.

Spotted by luck. No app is known to hit this issue.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-28 00:44:45 +02:00
Miklós Máté
50d653c2bb mesa: optimize out the realloc from glCopyTexImagexD()
v2: comment about the purpose of the code
v3: also compare texFormat,
 add a perf debug message,
 formatting fixes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté
baab345b19 st/mesa: fix handling the fallback texture
This fixes crash when post-processing is enabled in SW:KotOR.

v2: fix const-ness
v3: move assignment into the if() block

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté
920fbecf57 st/mesa: enable GL_ATI_fragment_shader
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté
dee274477f st/mesa: implement GL_ATI_fragment_shader
v2: fix arithmetic for special opcodes,
 fix fog state, cleanup
v3: simplify handling of special opcodes,
 fix rebinding with different textargets or fog equation,
 lots of formatting fixes
v4: adapt to the compile early, fix later architecture,
 formatting fixes

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté
d71c1e9e54 program: add ATI_fragment_shader to shader stages list
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté
e2d5a6fac5 mesa: optionally associate a gl_program to ATI_fragment_shader
the state tracker will use it

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Edward O'Callaghan
11bd53933e gallium/p_context.h: Make comment more readable
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:03:04 +02:00
Edward O'Callaghan
2df141087a mesa/st: Remove GLSLVersion clamping
While here, remove itermediate glsl_feature_level variable.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:00:36 +02:00
Edward O'Callaghan
ca22d2f1fd radeon/r600: Fix return type in failure branch
Commit `d4e847ea` introduced a warning about making an
integer from a pointer without a cast, fix it here.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:00:35 +02:00
Edward O'Callaghan
1fb05a9a0c radeon/r600_query.c: Minor style fix
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:00:35 +02:00
Dave Airlie
fc3b000fef virgl: drop next shader property for now.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-26 17:50:32 +10:00
Jason Ekstrand
6d658e9bd5 i965: Allow mul+add fusing again 2016-03-25 21:35:41 -07:00
Jason Ekstrand
fbb9e1f008 spirv/alu: Add support for the NoContraction decoration 2016-03-25 21:35:41 -07:00
Jason Ekstrand
00fa795cd3 spirv/glsl: Add a helper for converting glsl opcodes into nir opcodes
This is similar to the way that regular ALU operations are handled.
2016-03-25 21:35:41 -07:00
Jason Ekstrand
98522c1853 nir/spirv: Get rid of the spirv2nir helper binary
This was useful once upon a time but now that we have a real Vulkan driver
to run our SPIR-V binaries through, there's really no point.
2016-03-25 21:35:41 -07:00
Nanley Chery
0e82896a11 anv/blit2d: Add a function to create an ImageView
This function differs from the open-coded implementation in that the
ImageView's width is determined by the caller and is not unconditionally
set to match the number of texels within the surface's pitch.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-25 17:33:50 -07:00
Nanley Chery
4eab37d6cd anv/image: Enable specifying a surface's minimum pitch
This is required to create multiple, horizontally adjacent, max-width
images from one blit2d surface. This is also required for more accurate
width specification of surfaces within a larger surface (which is seen
as the smaller surface's enclosing region).

Note that anv_image_create_info::stride has been unused since commit,
b369389640 .

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-25 17:33:40 -07:00
Timothy Arceri
8683d54d2b glsl: reduce buffer block duplication
This reduces some of the craziness required for handling buffer
blocks. The problem is each shader stage holds its own information
about a block in memory, we were copying that information to a
program wide list but the per stage information remained meaning
when a binding was updated we needed to update all versions of it.

This changes the per stage blocks to instead point to a single
version of the block information in the program list.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-26 09:26:30 +11:00
Jason Ekstrand
38250a9ca3 i965/vec4: Get rid of a stray predicate inverse in opquantizef16
This fixes 30 opquantize CTS tests on HSW
2016-03-25 14:37:37 -07:00
Jason Ekstrand
13bad493b4 nir/algebraic: Get rid of a redundant copy of fdiv lowering 2016-03-25 14:04:05 -07:00
Jason Ekstrand
08fe89864b nir/algebraic: Add better lowering of ldexp 2016-03-25 14:04:05 -07:00
Jason Ekstrand
b75d770963 nir/builder: Simplify nir_ssa_undef a bit 2016-03-25 14:04:05 -07:00
Jason Ekstrand
ab31951bef nir/spirv: Use the nir_ssa_undef helper from nir_builder 2016-03-25 14:04:05 -07:00
Jason Ekstrand
d2eee52a65 nir/builder: Add a bit size field to nir_ssa_undef 2016-03-25 14:04:05 -07:00
Jason Ekstrand
b50f7f0011 nir: Add a better comment for INTRINSIC_RANGE 2016-03-25 14:04:05 -07:00
Jason Ekstrand
add8c837b5 nir/glsl: Stop carying a pointer to the nir_shader in the visitor 2016-03-25 14:04:05 -07:00
Brian Paul
a8e5edaadf st/xa: emit sampler view declarations in shaders
Fixes recent regressions with the VMware gallium driver.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2016-03-25 14:53:59 -06:00
Tim Rowley
74a04840e5 swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32) 2016-03-25 14:45:40 -05:00
Tim Rowley
93c1a2dedf swr: [rasterizer core] NUMA optimizations...
- Affinitize hot-tile memory to specific NUMA nodes.
- Only do BE work for macrotiles assoicated with the numa node
2016-03-25 14:45:40 -05:00
Tim Rowley
090be2e434 swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage. 2016-03-25 14:45:40 -05:00
Tim Rowley
0767e820fd swr: [rasterizer core] Fix Compute workitem retirement 2016-03-25 14:45:40 -05:00
Tim Rowley
813e89c0cc swr: [rasterizer core] Cleanup state ring arena after last draw that references it completes
Rather than waiting for the API thread to re-use it.
2016-03-25 14:45:40 -05:00
Tim Rowley
83822d7ed5 swr: [rasterizer jitter] add missing include for llvm jitevents 2016-03-25 14:45:40 -05:00
Tim Rowley
51549912d1 swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB).
With global allocator this doesn't seem to affect performance at all.
Overall memory consumption drops by up to 85%.
2016-03-25 14:45:40 -05:00
Tim Rowley
ed5b953919 swr: [rasterizer core] One last pass at Arena optimizations 2016-03-25 14:45:40 -05:00
Tim Rowley
ee6be9e92d swr: [rasterizer core] CachedArena optimizations
Reduce list traversal during Alloc and Free.

Add ability to have multiple lists based on alloc size (not used for now)
2016-03-25 14:45:39 -05:00
Tim Rowley
68314b6769 swr: [rasterizer jitter] support llvm-svn 2016-03-25 14:45:39 -05:00
Tim Rowley
ec9d4c4b37 swr: [rasterizer core] Globally cache allocated arena blocks for fast re-allocation. 2016-03-25 14:45:39 -05:00
Tim Rowley
12ce9d9aa1 swr: [rasterizer] more arena work 2016-03-25 14:45:39 -05:00
Tim Rowley
4893224e28 swr: [rasterizer core] Add clipping against user clip distances in the NullPS backend. 2016-03-25 14:45:39 -05:00
Tim Rowley
700a5b06e0 swr: [rasterizer core] Arena optimizations - preparing for global allocator. 2016-03-25 14:45:39 -05:00
Tim Rowley
5899076b6b swr: [rasterizer core] Reset DrawContext arena at end of draw rather than upon reclaim of DC
Keeps overall memory consumption lower.
Also, remove unused knobs.
2016-03-25 14:45:39 -05:00
Tim Rowley
7390418441 swr: [rasterizer core] Add clipping of user clip planes in clipper. 2016-03-25 14:45:39 -05:00
Tim Rowley
4b4547a721 swr: [rasterizer] Reduce max in-flight draws to 96 (by default) 2016-03-25 14:45:39 -05:00
Tim Rowley
9111d63228 swr: [rasterizer] Fix run-time check asserts
One innocuous (uninitialized variable), and one not so innocuous
(stack corruption).
2016-03-25 14:45:39 -05:00
Tim Rowley
257db3610a swr: [rasterizer jitter] signed immediate builder 2016-03-25 14:45:39 -05:00
Tim Rowley
b958aea78a swr: [rasterizer common] changes for cygwin 2016-03-25 14:45:39 -05:00
Tim Rowley
e1222ade00 swr: [rasterizer] code styling and update copyrights 2016-03-25 14:45:14 -05:00
Tim Rowley
c75314ec67 swr: [rasterizer core] Guard against enquing work to invalid hot tiles 2016-03-25 14:43:15 -05:00
Tim Rowley
fee56fda6f swr: [rasterizer] Stop setting viewport size to larger than hottile array
Guard against enquing work to invalid tiles
2016-03-25 14:43:14 -05:00
Tim Rowley
e374d2d24b swr: [rasterizer] Discard work + misc fixes 2016-03-25 14:43:14 -05:00
Tim Rowley
542d7dec7b swr: [rasterizer] remove use of BYTE type 2016-03-25 14:43:14 -05:00
Tim Rowley
be4c558d01 swr: [rasterizer core] Fix crash that can occur when switching contexts 2016-03-25 14:43:14 -05:00
Tim Rowley
51a11658d9 swr: [rasterizer] remove unused knob 2016-03-25 14:43:14 -05:00
Tim Rowley
61beaa2279 swr: [rasterizer core] subcontext rework 2016-03-25 14:43:14 -05:00
Tim Rowley
0c18900cfb swr: [rasterizer common] add _simd_s[rl]lv_epi32 2016-03-25 14:43:14 -05:00
Tim Rowley
bef222db22 swr: [rasterizer core] Alleviate potential stack overflow for 32bit builds
Move large stack allocations in the GS and clipper into thread local storage.
2016-03-25 14:43:14 -05:00
Tim Rowley
3132f731f8 swr: [rasterizer] remove use of UCHAR and UINT64 types 2016-03-25 14:43:14 -05:00
Tim Rowley
643857f596 swr: [rasterizer] remove use of FLOAT type 2016-03-25 14:43:14 -05:00
Tim Rowley
3252fe3705 swr: [rasterizer] Fix Coverity issues reported by Mesa developers. 2016-03-25 14:43:14 -05:00
Tim Rowley
45d52673c2 swr: [rasterizer] add debug/perf category to knobs 2016-03-25 14:43:13 -05:00
Tim Rowley
1da9c8a970 swr: [rasterizer core] don't assume linux is 64-bit 2016-03-25 14:43:13 -05:00
Tim Rowley
49678803f7 swr: [rasterizer common] remove old unused win32 types 2016-03-25 14:43:13 -05:00
Tim Rowley
aca5513184 swr: [rasterizer jitter] vpermps support 2016-03-25 14:43:13 -05:00
Tim Rowley
bfb954189e swr: [rasterizer] Add rdtsc buckets support for shaders
Pass pointer to core buckets mgr back to sim layer.

Add support for RDTSC_START/RDTSC_STOP macros in the builder.

Each unique shader now has a unique bucket associated with it,
enabling more detailed reporting at the shader level. Currently
due to some llvm issue with thread local storage, 64bit runs require
single threaded mode.
2016-03-25 14:43:13 -05:00
Tim Rowley
abd4aa68cc swr: [rasterizer core] backend reorganization 2016-03-25 14:43:13 -05:00
Tim Rowley
13303f3320 swr: [rasterizer core] store blend output in temporary instead of PS output.
Fixes additive blend problem with MSAA
2016-03-25 14:26:17 -05:00
Tim Rowley
3f4fba3772 swr: [rasterizer core] Move InitializeHotTiles and corresponding clear code out of threads.cpp. 2016-03-25 14:26:17 -05:00
Tim Rowley
bdd690dc36 swr: [rasterizer jitter] Cleanup use of types inside of Builder.
Also, cached the simd width since we don't have to keep querying
the JitManager for it.
2016-03-25 14:26:17 -05:00
Tim Rowley
7ead4959a5 swr: [rasterizer jitter] Fix type mismatch on select args for SCATTERPS 2016-03-25 14:26:17 -05:00
Tim Rowley
136988b42b swr: [rasterizer core] fix rasterizing multisampling with scissor enabled
We were not evaluating the scissor edge equations at sample positions.
2016-03-25 14:26:17 -05:00
Tim Rowley
45f0ce168c swr: [rasterizer core] RingBuffer class for DC/DS
Use head/tail ring buffer indices for thread synchronization.

1. SwrWaitForIdle loops until ring is empty. (head == tail)
2. GetDrawContext waits until ring is not full. (head - tail) == Ring Size
3. Draw enqueues by incrementing head.
4. Last worker thread to move past a DC dequeues by incrementing tail.

Todo: To reduce contention we can cache the tail in the API thread. For
example, if you know you have 64 free entries in the ring then you don't
need to keep checking the tail until you used those 64 entries.
2016-03-25 14:26:17 -05:00
Tim Rowley
dd0f9eed8c swr: [rasterizer] switch assert uses to SWR_ASSERT 2016-03-25 14:26:16 -05:00
Tim Rowley
45a4afa634 swr: [rasterizer core] Split all RECT_LIST draws into 1 RECT per draw
Needed until proper RECT_LIST PrimAssembly code is written.
2016-03-25 14:26:16 -05:00
Tim Rowley
3a25185990 swr: [rasterizer] Add string knob type 2016-03-25 14:26:16 -05:00
Jordan Justen
8f3c236674 anv: Use genxml register support for L3 Cache config
The programming of the L3 Cache registers should match the previous
manually packed LRI values.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-25 00:19:18 -07:00
Jordan Justen
7a03fb9ccb genxml: Add L3 Cache Control register definitions
Based on intel_reg.h (5912da45a6)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 23:49:53 -07:00
Jordan Justen
d353ba8f5f anv: Add genxml register support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 23:49:53 -07:00
Jordan Justen
b332013a56 genxml: Add register support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 23:46:59 -07:00
Sonny Jiang
f00c840578 radeonsi: add Polaris PCI IDs
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (Polaris10)
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (Polaris11)
2016-03-24 23:08:12 -04:00
Sonny Jiang
f87ed903fb radeon/vce: disable two pipe mode for Polaris11
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-24 23:08:04 -04:00
Sonny Jiang
0c5477465f radeon/vce: add Polaris11 VCE firmware support
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
2016-03-24 23:07:53 -04:00
Sonny Jiang
42e442d888 radeonsi: add support for Polaris (v2)
v2: Polaris chips should be defined after Stoney

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> (v1)
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Signed-off-by: Leo Liu <leo.liu@amd.com> (v2 diff)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2 diff)
2016-03-24 23:07:32 -04:00
Sonny Jiang
f5e24b19e8 winsys/amdgpu: addrlib - add Polaris support (v2)
v2: fix indentation as noted by Michel

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-24 23:06:39 -04:00
Jason Ekstrand
2c3f95d6aa Merge remote-tracking branch 'public/master' into vulkan 2016-03-24 17:30:14 -07:00
Kenneth Graunke
511ce2925b mesa: Check glReadBuffer enums against the ES3 table.
From the ES 3.2 spec, section 16.1.1 (Selecting Buffers for Reading):

   "An INVALID_ENUM error is generated if src is not BACK or one of
    the values from table 15.5."

Table 15.5 contains NONE and COLOR_ATTACHMENTi.

Mesa properly returned INVALID_ENUM for unknown enums, but it decided
what was known by using read_buffer_enum_to_index, which handles all
enums in every API.  So enums that were valid in GL were making it
past the "valid enum" check.  Such targets would then be classified
as unsupported, and we'd raise INVALID_OPERATION, but that's technically
the wrong error code.

Fixes dEQP-GLES31's
functional.debug.negative_coverage.get_error.buffer.read_buffer

v2: Only call read_buffer_enuM_to_index when required (Eduardo).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-24 16:52:08 -07:00
Nanley Chery
a5dc3c0f02 anv: Sanitize Image extents and offsets
Prepare Image extents and offsets for internal consumption by assigning
the default values implicitly defned by the spec. Fixes textures on
several Vulkan demos in which the VkImageCopy depth is set to zero when
copying a 2D image.

v2 (Jason Ekstrand):
   Replace "prep" with "sanitize"
   Make function static inline
   Pass structs instead of pointers

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2016-03-24 16:15:00 -07:00
Jason Ekstrand
22b343a8ec nir: Add a pass to inline functions
This commit adds a new NIR pass that lowers all function calls away by
inlining the functions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
debf23ec68 nir/builder: Add helpers for easily inserting copy_var intrinsics
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
79dec93ead nir: Add return lowering pass
This commit adds a NIR pass for lowering away returns in functions.  If the
return is in a loop, it is lowered to a break.  If it is not in a loop,
it's lowered away by moving/deleting code as needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
8d61d72524 nir: Add a cursor helper for getting a cursor after any phi nodes
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
18b0166749 nir/builder: Add a helper for inserting jump instructions
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
97b663481c nir/cf: Make extracting or re-inserting nothing a no-op
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
7022a673cd nir: Add a function for comparing cursors
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
124f229ece nir/cf: Handle relinking top-level blocks
This can happen if a function ends in a return instruction and you remove
the return.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
364212f1ed nir: Add a pass to repair SSA form
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
ea98d415e4 nir/vars_to_ssa: Use the new nir_phi_builder helper
The efficiency should be approximately the same.  We do a little more work
per phi node because we have to sort the predecessors.  However, we no
longer have to walk the blocks a second time to pop things off the stack.
The bigger advantage, however, is that we can now re-use the phi placement
and per-block SSA value tracking in other passes.

As a side-benifit, the phi builder actually handles unreachable blocks
correctly.  The original vars_to_ssa code, because of the way it iterated
the blocks and added phi sources, didn't add sources corresponding to
predecessors of unreachable blocks.  The new strategy employed by the phi
builder creates a phi source for each predecessor and should correctly
handle unreachable blocks by setting those sources to SSA undefs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
42ddfc611f nir/dominance: Handle unreachable blocks
Previously, nir_dominance.c didn't properly handle unreachable blocks.
This can happen if, for instance, you have something like this:

loop {
   if (...) {
      break;
   } else {
      break;
   }
}

In this case, the block right after the if statement will be unreachable.
This commit makes two changes to handle this.  First, it removes an assert
and allows block->imm_dom to be null if the block is unreachable.  Second,
it properly skips unreachable blocks in calc_dom_frontier_cb.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
e4dc82cfcf nir: Add a phi node placement helper
Right now, we have phi placement code in two places and there are other
places where it would be nice to be able to do this analysis.  Instead of
repeating it all over the place, this commit adds a helper for placing all
of the needed phi nodes for a value.

v2: Add better documentation

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand
9a41d94731 util/bitset: Allow iterating over const bitsets
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Rob Clark
61c7d20e4f ttn: remove stray global from header
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-24 16:04:54 -04:00
Samuel Pitoiset
b9c70fcdad nv50/ir: silence unhandled TGSI_PROPERTY_NEXT_SHADER info
radeonsi uses this property to make the best decision about which
shader to compile, but this is not currently used by our codegen.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-24 18:53:24 +01:00
Kenneth Graunke
d1bb1df87e mesa: Handle negative length in glPushDebugGroup().
The KHR_debug spec doesn't actually say we should handle this, but that
is most likely an oversight - it says to check against strlen and
generate errors if length is negative.  It appears they just forgot to
explicitly spell out that we should then proceed to actually handle it.

Fixes crashes from uncaught std::string exceptions in many
dEQP-GLES31.functional.debug.error_filters.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-24 10:47:50 -07:00
Kenneth Graunke
028459a00d mesa: Make glDebugMessageInsert deal with negative length for all types.
From the KHR_debug spec, section 5.5.5 (Externally Generated Messages):

   "If <length> is negative, it is implied that <buf> contains a null
    terminated string. The error INVALID_VALUE will be generated if the
    number of characters in <buf>, excluding the null terminator when
    <length> is negative, is not less than the value of
    MAX_DEBUG_MESSAGE_LENGTH."

This indicates that length should be set to strlen for all types, not
just GL_DEBUG_TYPE_MARKER.  We want it to be after validate_length()
so we still generate appropriate errors.

Fixes crashes from uncaught std::string exceptions in many
dEQP-GLES31.functional.debug.error_filters.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-24 10:47:45 -07:00
Kenneth Graunke
412e686da9 mesa: Include null terminator in GL_DEBUG_NEXT_LOGGED_MESSAGE_LENGTH.
From the KHR_debug spec:
"Applications can query the number of messages currently in the log by
 obtaining the value of DEBUG_LOGGED_MESSAGES, and the string length
 (including its null terminator) of the oldest message in the log
 through the value of DEBUG_NEXT_LOGGED_MESSAGE_LENGTH."

Because we weren't including the null terminator, many dEQP tests
called glGetDebugMessageLog with a bufSize parameter that was 1 too
small, and unable to contain the message, so we skipped returning it,
failing many cases.

Fixes 298 dEQP-GLES31.functional.debug.* tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-24 10:47:29 -07:00
Nicolai Hähnle
6b763c026d st/mesa: use RGBA instead of BGRA for SRGB_ALPHA
This fixes a regression introduced by commit a8eea696 "st/mesa: honour sized
internal formats in st_choose_format (v2)".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94657
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94671
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-24 12:23:31 -05:00
Nicolai Hähnle
7880b81d39 radeonsi: silence a coverity warning
The following Coverity warning

5378     	tmpl.fetch_args = atomic_fetch_args;
5379     	tmpl.emit = atomic_emit;
>>>     CID 1357115:  Uninitialized variables  (UNINIT)
>>>     Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized.
5380     	bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl;
5381     	bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add";

... is a false positive, but what the hell. This change should "fix" it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-24 12:23:14 -05:00
Bas Nieuwenhuizen
f96309753b mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.
This removes any dependency on driver validation of the number of
framebuffer samples.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2016-03-24 08:36:43 -06:00
Rob Clark
0bea0e7141 nir: fix dangling ssadef->name ptrs
In many places, the convention is to pass an existing ssadef name ptr
when construction/initializing a new nir_ssa_def.  But that goes badly
(as noticed by garbage in nir_print output) when the original string
gets freed.

Just use ralloc_strdup() instead, and add ralloc_free() in the two
places that would care (not that the strings wouldn't eventually get
freed anyways).

Also fixup the nir_search code which was directly setting ssadef->name
to use the parent instruction as memctx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-24 08:30:04 -04:00
Jason Ekstrand
4e060d80ff glsl: Add propagate_invariance to the other makefile
This fixes the scons build
2016-03-23 21:12:44 -07:00
Jason Ekstrand
a984e44abd nir/glsl: Propagate invariant into NIR alu ops
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:07 -07:00
Jason Ekstrand
028d6ecfe0 glsl/rebalance_tree: Don't handle invariant or precise trees
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:07 -07:00
Jason Ekstrand
b2209b2333 glsl/opt_algebraic: Don't handle invariant or precise trees
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:07 -07:00
Jason Ekstrand
89b604922d glsl: Add a pass to propagate the "invariant" and "precise" qualifiers
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Jason Ekstrand
91d6272c2b nir/alu_to_scalar: Propagate the "exact" bit
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Jason Ekstrand
865e83b9ec i965/peephole_ffma: Don't fuse exact adds
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Jason Ekstrand
5f39e3e165 nir/cse: Properly handle nir_ssa_def.exact
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Jason Ekstrand
0dbda153aa nir/algebraic: Flag inexact optimizations
Many of our optimizations, while great for cutting shaders down to size,
aren't really precision-safe.  This commit tries to flag all of the
inexact floating-point optimizations so they don't get run on values that
are flagged "exact".  It's a bit conservative and maybe flags some safe
optimizations as unsafe but that's better than missing one.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:02 -07:00
Jason Ekstrand
ed3a029e80 nir/algebraic: Fix fmin detection to match the spec
The previous transformation got the arguments to fmin backwards.  When NaNs
are involved, the GLSL min/max aren't commutative so it matters.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:00 -07:00
Jason Ekstrand
89545b1314 nir/algebraic: Get rid of an invlid fxor optimization
The fxor opcode is required to return 1.0f or 0.0f but the input variable
may not be 1.0f or 0.0f.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:27:58 -07:00
Jason Ekstrand
3a7cb6534c nir/algebraic: Allow for flagging operations as being inexact
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:27:55 -07:00
Jason Ekstrand
a6f25fa7d7 nir/search: Propagate exactness into newly created expressions
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:27:52 -07:00
Jason Ekstrand
ded3133d47 nir/builder: Add a flag for setting exact
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:26:34 -07:00
Jason Ekstrand
4ff89377d9 nir: Add an "exact" bit to nir_alu_instr
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:26:34 -07:00
Jason Ekstrand
f849f53990 nir/clone: Export nir_variable_clone
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-03-23 15:26:11 -07:00
Jason Ekstrand
5fe8959912 nir/clone: Expose nir_constant_clone
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-03-23 15:26:08 -07:00
Jason Ekstrand
c4c373f156 nir: Fix whitespace
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-03-23 15:25:53 -07:00
Brian Paul
9a6da49371 docs: use latest libDRM version
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-03-23 12:56:32 -06:00
Lars Hamre
43c6f3f82f compiler/glsl: allow sequence op as a const expr in gles 1.0
Allow the sequence operator to be a constant expression in GLSL ES
versions prior to GLSL ES 3.0

Fixes the following piglit test:
/all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert

This is similar to the logic from process_initializer() which performs
the same check for constant variable initialization with sequence
operators.

v2: Fixed regression pointed out by Eduardo Lima Mitev

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-23 18:13:26 +01:00
Nicolai Hähnle
c4931ae174 radeonsi: fix out-of-bounds indexing of shader images
Results are undefined but may not crash. Without this change, out-of-bounds
indexing can lead to VM faults and GPU hangs.

Constant buffers, samplers, and possibly others will eventually need similar
treatment to support GL_ARB_robust_buffer_access_behavior.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-23 11:49:53 -05:00
Nicolai Hähnle
a8f5d11426 radeonsi: cache flush/invalidation for missing PIPE_BARRIER_*_BUFFER bits (v2)
This fixes arb_shader_image_load_store-host-mem-barrier.

v2: flush TC L2 for index buffers on <= CIK (Marek)

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-23 11:48:19 -05:00
Nicolai Hähnle
fc94bc2986 st/mesa: add missing MemoryBarrier bits and some explanations
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-23 11:48:15 -05:00
Nicolai Hähnle
b15b1faefd gallium: add PIPE_BARRIER_STREAMOUT_BUFFER
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-23 11:48:02 -05:00
Marek Olšák
b8ec205515 radeonsi: fix 2D array MSAA failures since image support landed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-23 12:14:15 +01:00
Jason Ekstrand
9881eab197 i965/fs: Don't constant-fold RCP
No shader-db changes on Broadwell

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 16:46:15 -07:00
Jason Ekstrand
01425c45b3 i965: Remove the RCP+RSQ algebraic optimizations
NIR already has this optimization and it can do much better than the little
peephole in the backend.

No shader-db change on Haswell or Broadwell.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 16:46:15 -07:00
Jason Ekstrand
20417b2cb0 anv/device: Advertise version 1.0.5
Nothing substantial has changed since 1.0.2
2016-03-22 16:21:23 -07:00
Jason Ekstrand
204d937ac2 anv/device: Ignore the patch portion of the requested API version
Fixes dEQP-VK.api.device_init.create_instance_name_version

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94661
2016-03-22 16:20:45 -07:00
Jason Ekstrand
4844723405 anv: Don't assert-fail if someone asks for a non-existent entrypoint 2016-03-22 16:11:53 -07:00
Jason Ekstrand
8dd86e8aa7 Update to the latest Vulkan header from Khronos 2016-03-22 16:06:53 -07:00
Ian Romanick
d7a25a9def nir: Don't abs slt and friends
No shader-db changes, but this is symmetric with the previous commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:48:02 -07:00
Ian Romanick
2bb006af68 nir: Don't abs the result of b2f or b2i
In the results below, 2 SIMD16 shaders in Trine are lost.

G4X
total instructions in shared programs: 4012279 -> 4011108 (-0.03%)
instructions in affected programs: 116776 -> 115605 (-1.00%)
helped: 339
HURT: 0

total cycles in shared programs: 84315862 -> 84313584 (-0.00%)
cycles in affected programs: 1767232 -> 1764954 (-0.13%)
helped: 274
HURT: 81

Ironlake
total instructions in shared programs: 6399073 -> 6396998 (-0.03%)
instructions in affected programs: 218050 -> 215975 (-0.95%)
helped: 600
HURT: 0

total cycles in shared programs: 128892088 -> 128888810 (-0.00%)
cycles in affected programs: 2867452 -> 2864174 (-0.11%)
helped: 422
HURT: 137

Sandy Bridge
total instructions in shared programs: 8462174 -> 8460759 (-0.02%)
instructions in affected programs: 178529 -> 177114 (-0.79%)
helped: 596
HURT: 0

total cycles in shared programs: 117542276 -> 117534098 (-0.01%)
cycles in affected programs: 1239166 -> 1230988 (-0.66%)
helped: 369
HURT: 150

Ivy Bridge
total instructions in shared programs: 7775131 -> 7773410 (-0.02%)
instructions in affected programs: 162903 -> 161182 (-1.06%)
helped: 590
HURT: 0

total cycles in shared programs: 65759882 -> 65747268 (-0.02%)
cycles in affected programs: 1004354 -> 991740 (-1.26%)
helped: 467
HURT: 141

Haswell
total instructions in shared programs: 7107786 -> 7106327 (-0.02%)
instructions in affected programs: 140954 -> 139495 (-1.04%)
helped: 590
HURT: 0

total cycles in shared programs: 64668028 -> 64655322 (-0.02%)
cycles in affected programs: 967080 -> 954374 (-1.31%)
helped: 452
HURT: 149

LOST:   2
GAINED: 0

Broadwell
total instructions in shared programs: 8980029 -> 8978287 (-0.02%)
instructions in affected programs: 197232 -> 195490 (-0.88%)
helped: 715
HURT: 0

total cycles in shared programs: 70070448 -> 70055970 (-0.02%)
cycles in affected programs: 975724 -> 961246 (-1.48%)
helped: 471
HURT: 111

LOST:   2
GAINED: 0

Skylake
total instructions in shared programs: 9115178 -> 9113436 (-0.02%)
instructions in affected programs: 203012 -> 201270 (-0.86%)
helped: 715
HURT: 0

total cycles in shared programs: 68848660 -> 68834004 (-0.02%)
cycles in affected programs: 993888 -> 979232 (-1.47%)
helped: 473
HURT: 116

LOST:   2
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:48:02 -07:00
Ian Romanick
348e5a71d8 nir: Simplify 0 < fabs(a)
Sandy Bridge / Ivy Bridge / Haswell
total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
instructions in affected programs: 564 -> 558 (-1.06%)
helped: 6
HURT: 0

total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
cycles in affected programs: 9768 -> 9582 (-1.90%)
helped: 12
HURT: 0

Broadwell / Skylake
total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
instructions in affected programs: 626 -> 619 (-1.12%)
helped: 7
HURT: 0

total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
cycles in affected programs: 9378 -> 9192 (-1.98%)
helped: 12
HURT: 0

G45 and Ironlake showed no change.

v2: Modify the comments to look more like a proof.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:47:56 -07:00
Ian Romanick
564a8b8a26 nir: Simplify 0 >= b2f(a)
This also prevented some regressions with other patches in my local
tree.

Broadwell / Skylake
total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
instructions in affected programs: 45 -> 43 (-4.44%)
helped: 1
HURT: 0

total cycles in shared programs: 70077904 -> 70077900 (-0.00%)
cycles in affected programs: 122 -> 118 (-3.28%)
helped: 1
HURT: 0

No changes on earlier platforms.

v2: Modify the comments to look more like a proof.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:44:57 -07:00
Ian Romanick
bf0d60aa11 nir: Simplify i2b with negated or abs operand
This enables removing ssa_201 and ssa_202 in sequences like:

                 vec1 ssa_200 = flt ssa_199, ssa_194
                 vec1 ssa_201 = b2i ssa_200
                 vec1 ssa_202 = i2b -ssa_201

shader-db results:

Sandy Bridge
total instructions in shared programs: 8462257 -> 8462180 (-0.00%)
instructions in affected programs: 3846 -> 3769 (-2.00%)
helped: 35
HURT: 0

total cycles in shared programs: 117542934 -> 117542462 (-0.00%)
cycles in affected programs: 20072 -> 19600 (-2.35%)
helped: 20
HURT: 1

Ivy Bridge
total instructions in shared programs: 7775252 -> 7775137 (-0.00%)
instructions in affected programs: 3645 -> 3530 (-3.16%)
helped: 35
HURT: 0

total cycles in shared programs: 65760522 -> 65760068 (-0.00%)
cycles in affected programs: 21082 -> 20628 (-2.15%)
helped: 25
HURT: 2

Haswell
total instructions in shared programs: 7108666 -> 7108589 (-0.00%)
instructions in affected programs: 3253 -> 3176 (-2.37%)
helped: 35
HURT: 0

total cycles in shared programs: 64675726 -> 64675272 (-0.00%)
cycles in affected programs: 21034 -> 20580 (-2.16%)
helped: 26
HURT: 1

Broadwell / Skylake
total instructions in shared programs: 8980912 -> 8980835 (-0.00%)
instructions in affected programs: 3223 -> 3146 (-2.39%)
helped: 35
HURT: 0

total cycles in shared programs: 70077926 -> 70077904 (-0.00%)
cycles in affected programs: 21886 -> 21864 (-0.10%)
helped: 21
HURT: 6

G45 and Ironlake showed no change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:43:28 -07:00
Ian Romanick
a4079f1cb2 nir: Lower flrp with Boolean interpolator to bcsel
On Intel platforms that don't set lower_flrp, using bcsel instead of
flrp seems to be a small amount worse.  On those platforms, the use of
flrp, bcsel, and multiply of b2f is still an active area of research.
In review, Matt suggested this is because bcsel turns into CMP+SEL, and
because of the flag register we can't schedule instructions well.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
instructions in affected programs: 161556 -> 157297 (-2.64%)
helped: 1077
HURT: 1

total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
cycles in affected programs: 4174570 -> 4162136 (-0.30%)
helped: 926
HURT: 53

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:42:42 -07:00
Ian Romanick
9442db4f89 i965: Have NIR lower flrp on pre-GEN6 vec4 backend
Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
By doing it in NIR, we have the opportunity for NIR to do additional
optimization of the expanded code.

This also enables optimizations added by the next commit.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
instructions in affected programs: 447686 -> 439823 (-1.76%)
helped: 2623
HURT: 0

total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
cycles in affected programs: 16964960 -> 16917410 (-0.28%)
helped: 2556
HURT: 41

Unsurprisingly, no changes on later platforms.

v2: Formatting and comment changes suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:42:42 -07:00
Brian Paul
18c5fa1122 swrast: fix discarded const warning in s_texture.c
Signed-off-by: Brian Paul <brianp@vmware.com>
2016-03-22 08:35:27 -06:00
Marc-André Lureau
530593da65 i965: fix invalid memory write
I noticed some heap corruption running virgl tests, and valgrind
helped me to track it down to the following error:

==29272== Invalid write of size 4
==29272==    at 0x90283D4: push_loop_stack (brw_eu_emit.c:1307)
==29272==    by 0x9029A7D: brw_DO (brw_eu_emit.c:1750)
==29272==    by 0x90554B0: fs_generator::generate_code(cfg_t const*, int) (brw_fs_generator.cpp:1999)
==29272==    by 0x904491F: brw_compile_fs (brw_fs.cpp:5685)
==29272==    by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272==    by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272==    by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272==    by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272==    by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)
==29272==    by 0x8C84325: _mesa_link_program (shaderapi.c:1042)
==29272==    by 0x8C851D7: _mesa_LinkProgram (shaderapi.c:1515)
==29272==    by 0x4E4B8E8: add_shader_program (vrend_renderer.c:880)
==29272==  Address 0xf2f3cb0 is 0 bytes after a block of size 112 alloc'd
==29272==    at 0x4C2AA98: calloc (vg_replace_malloc.c:711)
==29272==    by 0x8ED11F7: ralloc_size (ralloc.c:113)
==29272==    by 0x8ED1282: rzalloc_size (ralloc.c:134)
==29272==    by 0x8ED14C0: rzalloc_array_size (ralloc.c:196)
==29272==    by 0x9019C7B: brw_init_codegen (brw_eu.c:291)
==29272==    by 0x904F565: fs_generator::fs_generator(brw_compiler const*, void*, void*, void const*, brw_stage_prog_data*, unsigned int, bool, gl_shader_stage) (brw_fs_generator.cpp:124)
==29272==    by 0x9044883: brw_compile_fs (brw_fs.cpp:5675)
==29272==    by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272==    by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272==    by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272==    by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272==    by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)

if_depth_in_loop is an array of size p->loop_stack_array_size, and
push_loop_stack() will access if_depth_in_loop[p->loop_stack_depth+1],
thus the condition to grow the array should be
p->loop_stack_array_size <= (p->loop_stack_depth + 1) (it's currently
off by 2...)

This can be reproduced by running the following test with virgl test
server:
LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe bin/shader_runner
./tests/shaders/glsl-fs-unroll-explosion.shader_test -auto

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-21 20:50:07 -07:00
Dave Airlie
53afbc980a tgsi: drop unused set_exec/kill_mask interfaces.
These don't get used and haven't been in git history from what I can
see, so drop them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-22 13:07:05 +10:00
Dave Airlie
1e8435ce0c docs/relnotes: update ARB_internalformat_query2 status.
Signed-off-by: Dave Airlie <Airlied@redhat.com>
2016-03-22 09:54:08 +10:00
Dave Airlie
ee7c8b9804 st/mesa: add support for internalformat query2.
Add code to handle GL_INTERNALFORMAT_PREFERRED.
Add code to deal with GL_RENDERBUFFER being passes into ChooseTextureFormat.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-22 09:49:08 +10:00
Jason Ekstrand
869e393eb3 anv/batch_chain: Fall back to growing batches when chaining isn't available 2016-03-21 15:29:30 -07:00
Anuj Phogat
4ba47f7b2a i965: Fix assert conditions for src/dst x/y offsets
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-21 14:55:18 -07:00
Anuj Phogat
65cd2f8443 swrast: Move assert for 'slice' in to check_map_teximage
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-03-21 14:55:18 -07:00
xavier
fce0b55ccb r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.
Previously it was doing this transformation for a Trine 3 shader:
     MUL     R6.x.12,    R13.x.23, 0.5|3f000000
-    MULADD     R4.x.12,    -R6.x.12, 2|40000000, 1|3f800000
+    MULADD     R4.x.12,    -R13.x.23, -1|bf800000, 1|3f800000

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-22 07:43:13 +10:00
Samuel Pitoiset
9efd8b590f nvc0: make sure to delete samplers used by compute shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-21 22:04:18 +01:00
Kenneth Graunke
4b0a5b21ae i965/blorp: Make BlitFramebuffer() do sRGB encoding in ES 3.x.
According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer
is supposed to perform sRGB decoding and encoding whenever sRGB formats
are in use.  The ES 3.0 specification is completely clear, and has
always stated this.

However, the GL specification has changed behavior in 4.1, 4.2, and
4.4.  The original behavior stated that no sRGB encoding should occur.
The 4.4 behavior matches ES 3.0's wording.  However, implementing the
new behavior appears to break applications such as Left 4 Dead 2.

This patch changes Meta to apply the ES 3.x rules in ES 3.x, but
leaves OpenGL alone for now, to avoid breaking applications.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-21 13:55:32 -07:00
Kenneth Graunke
8679bb7c9e i965/blorp: Refactor sRGB encoding/decoding.
Because the rules for sRGB are so insane, we change brw_blorp_miptrees
to take decode_srgb and encode_srgb flags, which control linearization
of the source and destination separately.

This should make it easy to implement whatever crazy combination of
rules people throw at us.  For now, it should be equivalent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-21 13:54:29 -07:00
Kenneth Graunke
eee8a53906 meta: Make BlitFramebuffer() do sRGB encoding in ES 3.x.
According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer
is supposed to perform sRGB decoding and encoding whenever sRGB formats
are in use.  The ES 3.0 specification is completely clear, and has
always stated this.

However, the GL specification has changed behavior in 4.1, 4.2, and
4.4.  The original behavior stated that no sRGB encoding should occur.
The 4.4 behavior matches ES 3.0's wording.  However, implementing the
new behavior appears to break applications such as Left 4 Dead 2.

This patch changes Meta to apply the ES 3.x rules in ES 3.x, but
leaves OpenGL alone for now, to avoid breaking applications.

Meta implements several other functions in terms of BlitFramebuffer,
and many of those explicitly do not perform sRGB encoding.  So, this
patch explicitly disables sRGB encoding in those other functions,
preserving the existing (correct) behavior.

If you're from the future and are reading this, hi!  Welcome to
the "fun" of debugging sRGB problems!  Best of luck!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-21 13:53:44 -07:00
Nicolai Hähnle
b74784638d docs: mark GL_ARB_shader_image_load_store/_size as done for radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:26 -05:00
Edward O'Callaghan
5219eb15e1 radeonsi: Set PIPE_SHADER_CAP_MAX_SHADER_IMAGES
This enables ARB_shader_image_load_store and ARB_shader_image_size.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
[allow the same number of images for all shader stages and require LLVM 3.9]

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:26 -05:00
Nicolai Hähnle
6f942ac5ee radeonsi: disable early Z if the fragment shader writes to memory
Empirically, both the EXEC_ON_* flags and LATE_Z are necessary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
79762e877c tgsi/scan: add writes_memory to flag presence of stores or atomics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
e9d935ed0e radeonsi: force the DCC enable bit off in image descriptors for writing (v2)
This avoids a lockup at least on Tonga.

v2: only force DCC off on VI+ (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
43f5ce1d20 radeonsi: implement MemoryBarrier (v2)
v2: invalidate both constant and VMEM/TC L1 for constant buffers (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
97352aa50a radeonsi: implement volatile memory access
Prevent loads from being re-ordered or coalesced.

Atomics don't need special handling by definition, and stores don't need
special handling because LLVM is unable to detect dead image or buffer
stores.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
5a61b428f4 radeonsi: implement coherent memory access (v2)
v2: set glc=1 for volatile also on buffers

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
d6fa650454 radeonsi: Lower TGSI_OPCODE_MEMBAR down to LLVM op
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
f7a85a8a0a radeonsi: Lower TGSI_OPCODE_ATOM* down to LLVM op
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:24 -05:00
Nicolai Hähnle
bfcefcb3c7 radeonsi: Lower TGSI_OPCODE_STORE down to LLVM op
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:24 -05:00
Nicolai Hähnle
1e82dedeca radeonsi: Lower TGSI_OPCODE_LOAD down to LLVM op (v3)
v2: new signature style for buffer intrinsics (offsets)
v3: new signature style for llvm.amdgcn.buffer.load.format (overloaded return)

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2016-03-21 15:34:24 -05:00
Nicolai Hähnle
136686a51d radeonsi: extract the LLVM type name construction into its own function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle
02bd0cd7b1 radeonsi: Lower TGSI_OPCODE_RESQ down to LLVM op
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle
75539197c7 radeonsi: extract TXQ buffer size computation into its own function
This will allow it to be reused for RESQ.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle
515fb2c09c radeonsi: decompress shader images
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle
f61566b77a radeonsi: update shader image descriptor for invalidated buffer
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle
e85cf35a65 radeonsi: implement set_shader_images (v2)
Whether DCC is disabled depends on the access flags with which the image
is bound: image_load supports DCC, but store and atomic don't.

v2: remove an unnecessary masking of images->desc.enabled_mask

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle
b1b7268f01 gallium/radeon: make r600_texture_disable_dcc externally accessible
We will need it in radeonsi for shader images.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Nicolai Hähnle
457f9c6b25 tgsi/scan: track which shader images are really buffers
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Nicolai Hähnle
fa096a14af tgsi/scan: add images_writemask
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Nicolai Hähnle
1379544081 st/mesa: translate additional flags in MemoryBarrier
Re-order flags in the order in which they appear in the OpenGL spec in the
description of MemoryBarrier().

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Nicolai Hähnle
96cd908fd3 gallium: add additional PIPE_BARRIER_* bits
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:22 -05:00
Brian Paul
86caa67aef svga: add svga_winsys_context::pipe_debug_callback pointer
The svga winsys modules can use this to send debug messages to the
state tracker and Mesa.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2016-03-21 13:37:40 -06:00
Charmaine Lee
f8aaf0094d svga: Fix the index buffer rebind regression
The index buffer handle saved in the hw_state structure could
be invalid after the index buffer is destroyed. Instead of
rebinding the index buffer using the saved index buffer handle,
we will reset the index buffer handle in the hw_state structure
to force resending of the index buffer.

Fixes bug 1593320

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-21 13:37:40 -06:00
Charmaine Lee
47856e5945 svga: rebind stream output targets
To ensure stream output target surfaces are available for the draw commands,
we need to rebind the current stream output targets at the first draw in the
command buffer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-21 13:37:40 -06:00
Charmaine Lee
47cfc83440 svga: rebind index buffer
Similar to other resources, current index buffer needs to be
rebound at the first draw of the current command buffer to make
sure the buffer is available for the draw command.

Fixes bug 1587263.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-21 13:37:40 -06:00
Brian Paul
299f8ca0a7 svga: minor formatting fix, comment addition
To sync with our internal tree.

Signed-off-by: Brian Paul <brianp@vmware.com>
2016-03-21 13:37:25 -06:00
Charmaine Lee
b45b47c5c9 svga: optimize constant buffer uploads
When a constant buffer slot is allocated in the upload buffer,
the allocated slot size is always in multiple of 256. But the actual buffer
size might not be in multiple of 256. This causes a gap between
the ending offset of a slot and the starting offset of the next slot.
The gap will prevent the two slots to be updated in a single update command.
In order to maximize the chance of merging the contiguous dirty ranges,
when a slot is to be allocated in the constant upload buffer,
specify a buffer size in multiple of 256.

There is about 10% performance improvement with Lightsmark2008 and
30% with Cinebench R11.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2016-03-21 12:58:25 -06:00
Charmaine Lee
0a1d91ef97 svga: add a few more resource updates HUD query
This patch adds the following HUD queries:
.num-resource-updates  -- number of resource update. Commands include
                          UPDATE_SUBRESOURCE, UPDATE_GB_IMAGE.
.num-buffer-uploads    -- number of buffer uploads.
.num-const-buf-updates -- number of set constant buffer.
.num-const-updates     -- number of set shader constant.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2016-03-21 12:58:25 -06:00
Charmaine Lee
79e343b36a svga: add new num-readbacks HUD query
To find out how many image readback command is issued.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2016-03-21 12:58:25 -06:00
Brian Paul
dc9ecf58c0 svga: use shader sampler view declarations
Previously, we looked at the bound textures (via the pipe_sampler_views)
to determine texture dimensions (1D/2D/3D/etc) and datatype (float vs.
int).  But this could fail in out of memory conditions.  If we failed to
allocate a texture and didn't create a pipe_sampler_view, we'd default
to using 0 (PIPE_BUFFER) as the texture type.  This led to device errors
because of inconsistent shader code.

This change relies on all TGSI shaders having an SVIEW declaration for
each SAMP declaration.  The previous patch series does that.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
b56b853ab3 gallium/tests: declare sampler views in shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
38e831ca3d gallium/util: declare sampler view in util_make_fs_blit_msaa_depthstencil()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
e7b5a844e3 postprocess: declare sampler views in shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
5a9f2a2d89 hud: add sampler view declaration in text fragment shader
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
b3daaefadb st/mesa: emit sampler view decls in drawpixels code
v2: support both TGSI_TEXTURE_2D and _RECT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
0f0a23d4d8 st/mesa: emit sampler view declaration in bitmap shader
In June 2015, Rob Clark started updating the tgsi utility code to emit
SVIEW declarations in various shaders (for polygon stipple, blitting,
etc).  These patches do the same for the Mesa state tracker.

The VMware driver will use this.

v2: support both TGSI_TEXTURE_2D and _RECT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
72eb5a3cfe st/mesa: emit sampler view declarations for ARB vert/frag programs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
eda81fa357 st/mesa: use correct TGSI texture target in drawpix fragment shader
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
83b5b3d66e st/mesa: use correct TGSI texture target in bitmap fragment shader
Depending on the driver's support for NPOT textures, we might use
a RECT texture instead of 2D texture.  We should propogate that info
to the fragment shader's TEX instruction.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Brian Paul
63e020d734 gallium/tgsi: pass TGSI tex target to tgsi_transform_tex_inst()
Instead of hard-coded 2D tex target in tgsi_transform_tex_2d_inst()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-21 11:59:25 -06:00
Nicolai Hähnle
a8b315b827 st/mesa: use the texture view's format for render-to-texture
Aside from the bug below, it fixes a simplistic test I've written locally,
and I see no regression in Piglit for radeonsi.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94595
Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 11:28:38 -05:00
Hans de Goede
dcf8a4d281 gallium: Remove unused TGSI_RESOURCE_ defines
These magic file-index defines where only ever used in the nouveau code
and that no longer uses them.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2016-03-21 12:20:58 +01:00
Hans de Goede
9b4c8f6629 nouveau: codegen: Do not silently fail in handeLOAD / handleSTORE / handleATOM
handeLOAD / handleSTORE / handleATOM can only handle TGSI_FILE_BUFFER
and TGSI_FILE_MEMORY. Make things fail explictly when another
register-file is used in these functions.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:48 +01:00
Hans de Goede
86e4440361 nouveau: codegen: Disable more old resource handling code
Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled /
commented out some of the old resource handling code, but not all of it.

Effectively all of it is dead already, if we ever enter the old code
paths in handeLOAD / handleSTORE / handleATOM we will get an exception
due to trying to access the now always zero-sized resources vector.

Disable all the dead code.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:40 +01:00
Hans de Goede
71e315475c nouveau: codegen: gk110: Make emitSTORE offset handling identical to emitLOAD
Make the store offset handling in CodeEmitterGK110::emitSTORE identical
to the one in CodeEmitterGK110::emitLOAD handling.

This is just a cleanup, it does not cause any functional changes.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-21 12:20:38 +01:00
Hans de Goede
c783ad0e24 nouveau: codegen: Slightly refactor Source::scanInstruction() dst handling
Use the dst temp variable which was used in the TGSI_FILE_OUTPUT
case everywhere. This makes the code somewhat easier to reads
and helps avoiding going over 80 chars with upcoming changes.

This also brings the dst handling more in line with the src
handling.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-21 12:20:32 +01:00
Hans de Goede
54cdde5eff nouveau: codegen: Add support for clover / OpenCL kernel input parameters
Add support for clover / OpenCL kernel input parameters.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:28 +01:00
Hans de Goede
3788e1bf74 tgsi: Add support for global / private / input MEMORY
Extend the MEMORY file support to differentiate between global, private
and shared memory, as well as "input" memory.

"MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
special memory type is added for this, since the actual storage of these
(e.g. UBO-s) may differ per implementation. The uploading of kernel
parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers
to use an access mechanism for parameter reads which matches with the
upload method.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:24 +01:00
Hans de Goede
43ddec2f43 tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text
When support for decl.Atomic and .Shared was added, tgsi_build_declaration
was not updated to propagate these properly.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
2016-03-21 12:20:19 +01:00
Iago Toral Quiroga
8f45691cda doc: document spilling options accepted by INTEL_DEBUG
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-21 08:16:49 +01:00
Hans de Goede
b72156c8e0 tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memory
tgsi_default_instruction_memory / tgsi_build_instruction_memory were
returning uninitialized memory for tgsi_instruction_memory.Texture and
tgsi_instruction_memory.Format. Note 0 means not set, and thus is a
correct default initializer for these.

Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory")
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-20 18:01:53 -04:00
Ilia Mirkin
bbbdcdcf75 st/mesa: report correct precision information for low/medium/high ints
When we have native integers, these have full precision. Whether they're
low/medium/high isn't piped through the TGSI yet, but eventually those
might have differing precisions. For now they're just 32-bit ints.

Fixes the following dEQP tests:

  dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int
  dEQP-GLES3.functional.state_query.shader.precision_fragment_highp_int

which expected highp ints to have full 32-bit precision, not the default
23-bit float precision.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-20 17:51:08 -04:00
Nishanth Peethambaran
eeb117a09d st/omx/dec: Correct the timestamping
Attach the timestamp to the dpb buffer and use that timestamp
while pushing buffer from dpb list to the omx client.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-20 15:01:28 -04:00
Nishanth Peethambaran
46de6bbb77 st/omx: Remove trailing spaces
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-20 15:01:28 -04:00
Ilia Mirkin
7d98bfedd7 nv50/ir: fix indirect texturing for non-array textures on nvc0
If a layer parameter is provided, we want to flip it to position 0 (and
combine it with any indirect params). However if the target is not an
array, there is no layer, so we have to shift all of the arguments down
by one to make room for it.

This fixes situations where there were non-coordinate parameters, such
as bias, lod, depth compare, explicit derivatives. Instead of adding a
new parameter at the front for the indirect reference, we would swap one
of those in its place.

Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.uniform.compute.*shadow

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-20 14:14:32 -04:00
Ilia Mirkin
adb40a7399 st/mesa: only minify depth for 3d targets
We make sure that that image depth matches the level's depth before
copying it into place. However we should only be minifying the first
level's depth for 3d textures - array textures have the same depth for
all levels.

This fixes tests such as
dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.* and I
suspect account for a number of other odd situations I've run into where
level > 0 of array textures was messed up.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-20 14:14:32 -04:00
Ilia Mirkin
6eeb284e4f nv50/ir: normalize cube coordinates after derivatives have been computed
In "manual" derivative mode (always used on nv50 and sometimes on nvc0
but always for cube), the idea is that using the quadop instruction, we
set up the "other" quads to have values such that the derivatives work
out, and then run the texture instruction as if nothing were strange. It
pulls values from the other lanes, and does its magic.

However cube coordinates have to be normalized - one of the 3 coords has
to be 1, to determine which is the major axis, to say which face is
being sampled. We were normalizing the coordinates first, and then
adding the derivatives. This is wrong for two reasons:

- the coordinates got normalized by a scaling factor but the
  derivatives didn't
- the result of the addition didn't end up normalized

To resolve this, we flip the logic around to normalize *after* the
per-lane coordinates are set up.

This fixes a bunch of textureGrad cube dEQP tests.

NOTE: nv50 cube arrays with explicit derivatives are still broken, to be
resolved at a later date.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-20 14:14:32 -04:00
Marek Olšák
ea2bff1d11 gallium/radeon: remove remnants of R600 TGSI->LLVM
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-20 00:57:05 +01:00
Marek Olšák
4e5dc69af1 r600g: flatten if (1) statement after removal of TGSI->LLVM
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-20 00:57:05 +01:00
Marek Olšák
20a09897a6 r600g: remove TGSI->LLVM translation
It was useful for testing and as a prototype for radeonsi bringup,
but it's not used anymore and doesn't support OpenGL 3.3 even.

v2: try to fix OpenCL build

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2016-03-20 00:57:02 +01:00
Marek Olšák
8140154ae9 gallium/radeon: remove old CS tracing
Cons:
- it was only integrated in r600g
- it doesn't work with GPUVM
- it records buffer contents at the end of IBs instead of at the beginning,
  so the replay isn't exact
- it lacks an IB parser and user-friendliness

A better solution is apitrace in combination with gallium/ddebug, which
has a complete IB parser and can pinpoint hanging CP packets.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-20 00:56:35 +01:00
Marek Olšák
a73a657def radeonsi: process TGSI property NEXT_SHADER
This allows compiling the main shader part as ES or LS.

If we get the correct hint, non-separable GLSL shaders no longer have to be
compiled as VS first, followed by LS or ES compiled on demand.

The result is that fewer shaders are compiled by piglit, but it doesn't
improve piglit running time.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-19 23:20:01 +01:00
Marek Olšák
2bdd7a46a9 st/mesa: set TGSI property NEXT_SHADER
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-19 23:20:01 +01:00
Marek Olšák
fbe6e92899 gallium: add TGSI property NEXT_SHADER
Radeonsi needs to know which shader stage will execute after a shader
in order to make the best decision about which shader variant to compile
first.

This is only set for VS and TES, because we don't need it elsewhere.

VS has 3 variants:
- next shader is FS
- next shader is GS
- next shader is TCS

TES has 2 variants:
- next shader is FS
- next shader is GS

Currently, radeonsi always assumes the next shader is FS, which is suboptimal,
since st/mesa always knows which shader is next if the GLSL program is not
a "separate shader".

By default, ureg always sets "next shader is FS".

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-19 23:20:01 +01:00
Pierre Moreau
9184d9a0bb nvc0/ir: Use double constant in handleSQRT
Fixes: a100d89d09 (nv50,nvc0: Fix invalid constant.)
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-19 15:59:52 -04:00
Kenneth Graunke
789e096594 mesa: Disallow GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME on winsys FBO.
Fixes:
dEQP-GLES3.functional.negative_api.state.get_framebuffer_attachment_parameteriv

Apparently, GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is not allowed when
GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is GL_FRAMEBUFFER_DEFAULT, and
is expected to result in a GL_INVALID_ENUM error.

No GL specification actually defines what GL_FRAMEBUFFER_DEFAULT means.
It probably means the window system FBO.  It also doesn't mention the
behavior of any queries for that type.  Various ARB folks seem fairly
confused about it too.  For now, just do something vaguely like what
dEQP expects.

I think we probably need to check the visual bits against 0 for the
attachment, but we haven't been doing that thusfar, and given how
confusingly this is specified, I can't imagine anyone relying on it.

v2: Improve comments, move error condition above the
    _mesa_get_fb0_attachment call, add forgotten "return"
    (all suggested/caught by Jordan Justen).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-19 12:58:15 -07:00
Ilia Mirkin
d2445b0083 nv50/ir: force-enable derivatives on TXD ops
This matters especially in vertex shaders, where derivatives are
disabled by default. This fixes textureGrad in vertex shaders on nv50.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-19 13:09:49 -04:00
Ilia Mirkin
d1b85dbffa nv50: reset TFB bufctx when we no longer hold a reference to the buffers
This fix is analogous to commit ff085d014.

This fixes some use-after-free situations in dEQP when an xfb state is
removed, and then a clear is triggered, which only does a partial
validation. It would attempt to read the no-longer-valid buffers,
resulting in crashes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-19 13:09:49 -04:00
Samuel Pitoiset
902bbda81b nvc0: avoid using magic numbers for the uniform_bo offsets
Instead make use of constants to improve readability.

The first 32 bytes of the driver constant buffer are unknown... This
doesn't seem to be used in the codegen part, but if the texBindBase
offset is shifted from 0x20 to 0x00, this breaks the universe for
really weird reasons. This sounds like to be related to textures.

Anyway, name this NVC0_CB_AUX_UNK_INFO and add a todo should be
enough for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-19 18:01:08 +01:00
Samuel Pitoiset
26cc411db8 nv50/ir: make use of auxCBSlot instead of magic numbers
This avoids using magic numbers for the driver constbuf slot which
is always 15 except for compute shaders on gk104+ where the slot 0
is used.

For gk104+, some special compute-related values like the thread
index are uploaded to screen->parm which is currently bound on c0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 18:01:04 +01:00
Samuel Pitoiset
d86933e6f4 nv50,nvc0: replace resInfoCBSlot by auxCBSlot
Having two different variables for the driver constant buffer slot
is confusing and really useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 18:00:59 +01:00
Samuel Pitoiset
e05492fd7f nv50/ir: fix compilation warning in handleSharedATOM()
In release build mode only, op may be used uninitialized because
the assertion has been removed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-19 17:01:17 +01:00
Vinson Lee
a100d89d09 nv50,nvc0: Fix invalid constant.
Fix clang build error.

  CXX      codegen/nv50_ir_lowering_nvc0.lo
codegen/nv50_ir_lowering_nvc0.cpp:1783:42: error: invalid suffix 'd' on floating constant
      Value *zero = bld.loadImm(NULL, 0.0d);
                                         ^

Fixes: c1e4a6bfbf ("nv50,nvc0: handle SQRT lowering inside the driver")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-18 20:38:41 -07:00
Kenneth Graunke
46610238e0 mesa: Do proper format error checks for GenerateMipmap in ES 3.x.
According to the OpenGL ES 3.2 spec's description of GenerateMipmap:

"An INVALID_OPERATION error is generated if the levelbase array was not
 specified with an unsized internal format from table 8.3 or a sized
 internal format that is both color-renderable and texture-filterable
 according to table 8.10."

Similar text exists in the ES 3.0 specification as well.

Our existing rules are pretty close, but miss a few things.  The
OpenGL specification actually doesn't have any text about internal
format checking - our existing code comes from a Khronos bug report.
The ES 3.x spec provides a clearer description.

Fixes dEQP-GLES3.functional.negative_api.texture.generatemipmap and
dEQP-GLES2.functional.negative_api.texture.generatemipmap_zero_level
_array_compressed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:43:47 -07:00
Kenneth Graunke
f1b0573510 mesa: Add color renderable/texture filterable format info for ES 3.x.
OpenGL ES 3.x contains a table of sized internal formats and their
required properties.  In particular, each format is marked as
"Color Renderable" or "Texture Filterable".

This patch introduces two functions that can be used to query the
information from that table.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:43:23 -07:00
Kenneth Graunke
88d28aa4d9 i965: Stop XY clipping point and line primitives.
Wide points and lines are not supposed to be clipped by the viewport.
Rather, they should be rendered, and any fragments outside of the
viewport should be discarded.

The traditional use case for this behavior is rendering moving wide
point particles.  When the center of the point approaches the viewport
edge, clipping would make it pop out of view early.

Fixes:
- dEQP-GLES2.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
- dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center
- dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:42:51 -07:00
Kenneth Graunke
0de64ab788 i965: Scissor to the viewport when rendering points/lines.
We're about to start allowing wide points/lines whose vertices are
outside the viewport past the clipper.  This scissoring hack ensures
that any fragments generated are still restricted to the viewport.

It is not necessary on Gen8+ as those platforms already discard
fragments which are outside the viewport.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:42:30 -07:00
Kenneth Graunke
d000a4989f i965: Include the viewport in the scissor rectangle.
We'll need to use scissoring to restrict fragments to the viewport
soon.  It seems harmless to include it generally, so let's do that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:42:15 -07:00
Kenneth Graunke
47be5a64c7 i965: Introduce an is_drawing_lines() helper.
Similar to is_drawing_points().

v2: Account for isoline tessellation output topology.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:41:59 -07:00
Kenneth Graunke
757674e8d0 i965: Move is_drawing_points to brw_state.h.
I need to use this in multiple source files.

v2: Rebase on TES output domain fix.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 18:41:25 -07:00
Jason Ekstrand
ecfb074276 anv/allocator: Make the bo_pool dynamically sized 2016-03-18 17:25:58 -07:00
Kenneth Graunke
5b2d8c2273 i965: Fix gl_TessLevelOuter[] for isolines.
Thanks to James Legg for finding this!

From the ARB_tessellation_shader spec:
"The number of isolines generated is derived from the first outer
 tessellation level; the number of segments in each isoline is
 derived from the second outer tessellation level."

According to the PRM, "TF.LineDensity determines # lines" while
"TF.LineDetail determines # segments".  Line Density is stored at
DWord 6, while Line Detail is at DWord 7.  So, they're not reversed
like they are for triangles and quads.

Fixes Piglit's spec/arb_tessellation_shader/execution/isoline,
and about 24 dEQP isoline tests (with GL_EXT_tessellation_shader
hacked on - it's not normally enabled).

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94524
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-18 16:45:23 -07:00
Kenneth Graunke
24298b7e2f i965: Decode non-normalized coordinates bit in SAMPLER_STATE.
We weren't printing this for some reason.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-18 16:44:51 -07:00
Kenneth Graunke
8679d40dc7 i965: Account for TES in is_drawing_points().
Now that we implement tessellation shaders, the TES might be the last
stage enabled.  If it's outputting points, then the primitive type
reaching the SF is points.  We need to account for this.

Caught by Ilia Mirkin.

v2: Update dirty bit comment above caller (caught by Iago)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-18 16:44:15 -07:00
Pierre Moreau
1282146d4e nv50: Mark compute states as dirty on context switch
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
[ Samuel Pitoiset: Trivial rebase conflict ]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-19 00:18:00 +01:00
Samuel Pitoiset
a734c0f8ba nv50/ir: print SUBFM subops
Only 3d subop is currently emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-19 00:09:18 +01:00
Samuel Pitoiset
af0c97fb90 nv50: add a new validation path for compute
This makes use of the new state validation interface to be consistent
with 3d.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:09:14 +01:00
Samuel Pitoiset
5ed387675d nv50: rework nv50_compute_validate_program()
Reduce the amount of duplicated code by re-using
nv50_program_validate(). While we are at it, change the prototype to
return void. We don't check anymore if the translation fails but
improving the state validation is a long process.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:09:09 +01:00
Samuel Pitoiset
a07ebc1993 nv50: rework the validation path for 3D
This exposes an interface for state validation that will be also used
to rework the compute validation path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:09:05 +01:00
Samuel Pitoiset
517d2c97e1 nv50: rename 3d binding points to NV50_BIND_3D_XXX
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:09:02 +01:00
Samuel Pitoiset
9374fc1e67 nv50: rename 3d dirty flags to NV50_NEW_3D_XXX
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:08:56 +01:00
Samuel Pitoiset
e844aac40b nv50: rename NV50_COMPUTE to NV50_CP
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:08:52 +01:00
Samuel Pitoiset
dedb46f582 nv50: rename nv50_context::dirty to nv50_context::dirty_3d
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
2016-03-19 00:08:28 +01:00
Jason Ekstrand
b1c5d45872 anv/allocator: Add a size field to bo_pool_alloc 2016-03-18 11:50:53 -07:00
Brian Paul
9211b68ad3 st/mesa: clean up st_translate_texture_target()
Reformat code.  Improve assertion.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-18 12:06:31 -06:00
Brian Paul
0f73c3ab25 st/mesa: simplify drawpixels shader code with tgsi transform helper functions
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-18 12:06:30 -06:00
Brian Paul
373910f4e7 st/mesa: simplify bitmap shader code with tgsi transform helper functions
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-18 12:06:30 -06:00
Brian Paul
e9d5e68d1b tgsi: add tgsi_transform_op3_inst() function
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-18 12:06:30 -06:00
Juan A. Suarez Romero
7a712e64d6 doc: add 'vec4' option in INTEL_DEBUG
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-18 17:30:56 +01:00
Daniel Czarnowski
d4714512e4 egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...)
Patch provides a default for a set pbuffer surface size when
EGL_LARGEST_PBUFFER is used by the client. MIN2 macro is moved
to egldefines so that it can be shared.

Fixes following Piglit test:
   egl-create-largest-pbuffer-surface

From EGL 1.5 spec:
   "Use EGL_LARGEST_PBUFFER to get the largest available pbuffer
   when the allocation of the pbuffer would otherwise fail."

Currently there exists no API to query largest available pixmap size
using xlib or xcb so right now this seems most straightforward way to
ensure that we fulfill above API and also we don't attempt to allocate
'too big' pixmap which might succeed on server side but not work in
practice when driver starts to use it as a texture.

v2: add more explanation about the change (Emil)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-18 07:35:32 +02:00
George Kyriazis
dd63fa28f1 gallium/swr: Cleaned up some context-resource management
Removed bound_to_context.  We now pick up the context from the screen
instead of the resource itself.  The resource could be out-of-date
and point to a pipe that is already freed.

Fixes manywin mesa xdemo.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-03-17 20:57:52 -05:00
Timothy Arceri
952c166170 mesa: remove remaining tabs in prog_parameter.c
Acked-by: Matt Turner <mattst88@gmail.com>
2016-03-18 12:42:53 +11:00
Timothy Arceri
ce9c042ab3 mesa: inline _mesa_add_unnamed_constant()
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-18 12:42:43 +11:00
Timothy Arceri
fa9bd6b663 mesa: simplify and inline _mesa_lookup_parameter_index()
The function has only one user and strings are always null terminated.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-18 12:42:39 +11:00
Timothy Arceri
350b1ef027 mesa: make _mesa_lookup_parameter_constant static
This is not used outside of prog_parameter.c

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-18 12:42:34 +11:00
Timothy Arceri
7794b22a84 mesa: remove unused function
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-18 12:42:30 +11:00
Nicolai Hähnle
a8eea696b8 st/mesa: honour sized internal formats in st_choose_format (v2)
The bitcasting which is possible with shader images (and texture views?)
requires that when the user specifies a sized internal format for a
texture, we really allocate that format. To this end:

(1) find_exact_format should ignore sized internal formats and

(2) some of the entries in the mapping table corresponding to sized
    internal formats are reordered to use an RGBA format instead of
    a BGRA one.

This fixes arb_shader_image_load_store-bitcast in the (work in progress)
ARB_shader_image_load_store implementation for radeonsi.

v2: don't change the mapping of GL_RGB10: the change caused a regression
    because it preferred a format with an alpha channel, and GL_RGB10
    is not among the supported formats for shader images

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-17 19:26:40 -05:00
Dongwon Kim
49eb5e75bd configure.ac: enable_asm=yes when x-compiling across same X86 arch
Currently, configure script is forcing 'enable_asm' to be 'no'
whenever cross-compilation is performed on X86 host. This is
based on an assumption that target architecture is different
from host's (i.e. ARM). But there's always a case that we do
cross-compilation for target that is also X86 based just like
host in which same ASM codes will be supported. 'enable_asm'
should not be forced to be "no" anymore in this case.

v2: corrected commit message

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
2016-03-17 16:53:23 -07:00
Timothy Arceri
d6b9202873 glsl: disable varying packing when its not safe
In GL 4.4+ there is no guarantee that interpolation qualifiers will
match between stages so we cannot safely pack varyings using the
current packing pass in Mesa.

We also disable packing on outerward facing interfaces for SSO
because in ES we need to retain the unpacked varying information
for draw time validation. For desktop GL we could allow packing for
SSO in versions < 4.4 but its just safer not to do so.

We do however enable packing on individual arrays, structs, and
matrices as these are required by the transform feedback code and it
is still safe to do so.

Finally we also enable packing when a varying is only used for
transform feedback and its not a SSO.

This fixes all remaining rendering issues with the dEQP SSO tests,
the only issues remaining with thoses tests are to do with validation.

Note: There is still one remaining SSO bug that this patch doesn't fix.
Their is a chance that VS -> TCS will have mismatching interfaces
because we pack VS output in case its used by transform feedback but
don't pack TCS input for performance reasons. This patch will make the
situation better but doesn't fix it.

V4: fix out of order function params after rebase, make sure packing
still disabled in tess stages. Update comments as to why we disable
packing on SSO.

V3: ES 3.1 *does* require interpolation to match so don't disable
packing there. Rebased on master rather than on enhanced layouts
component packing series.

V2: Make is_varying_packing_safe() a function in the varying_matches
class, fix spelling (Matt) and make sure to remove the outer array
when dealing with Geom and Tess shaders where appropriate.
Lastly fix piglit regression in new piglit test and document the
undefined behaviour it depends on:
arb_separate_shader_objects/execution/vs-gs-linking.shader_test

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-18 10:26:34 +11:00
Timothy Arceri
c0ae6eeb3b glsl: pass disable_varying_packing bool to the lowering pass
This will allow us to choose to ignore the disable which will be
useful for more fine grained control over when to enable or disable
packing.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-18 10:26:30 +11:00
Marek Olšák
4ab2ac3349 radeonsi: fix Hyper-Z hangs on P2 configs
Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-17 18:30:45 +01:00
Romain Failliot
151724159d docs: Renormalize older extensions.
For older extensions, there is an explanation first and the extension
name in brackets, like that:
    Clamping controls (GL_ARB_color_buffer_float)
I inverted that so we have the extension first and then the explanation
in brackets, like that:
    GL_ARB_color_buffer_float (Clamping controls)

It will help me later to parse the few extensions that use this syntax:
    all drivers that support <GL_extension>

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-17 11:35:20 -05:00
Romain Failliot
f5d47dd428 docs: Renormalize some extensions.
This fixes some exceptions I have to deal with in mesamatrix.net.
The extensions GL_ARB_texture_buffer_object had a comment between "DONE"
and the brackets.
And the extension GL_KHR_robustness (in GL 4.5 and GLES 3.1) was using
"90% done" instead of "in progress". The "90% done" is still here
though, but as an extension comment.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-17 11:35:12 -05:00
Romain Failliot
3671bb3eaf docs: Realign the "Status" column.
The "Status" column was misaligned in some GL sections.
This is a lot of diffs, but it's only spaces in the end.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-17 11:35:09 -05:00
Romain Failliot
e571f11de8 docs: howto to read and edit GL3.txt
Added a small guide on how to read and edit GL3.txt.
I think this would help as much the devs as the users reading this file.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-17 11:34:50 -05:00
Brian Paul
84b961dd53 r300g: add missing layer argument to rws->buffer_get_handle() call
Fixes compilation error since 5aea0d691.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-03-17 09:52:21 -06:00
Christian König
5aea0d6919 radeon/winsys: add layer support for BO export
Add layer support to export individual array layers.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-17 14:17:06 +01:00
Christian König
04bc082f6a radeon/winsys: add offset support for BO import/export
Add offset support to handle NV12 offsets as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-17 14:17:03 +01:00
Christian König
f1e78a48f2 gallium/winsys/drm: add layer to struct winsys_handle
For exporting a specific layer of an array texture.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-17 14:16:59 +01:00
Christian König
29d26f1522 gallium/winsys/drm: add offset to struct winsys_handle
We are going to need this for EGL_EXT_image_dma_buf_import.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-17 14:16:03 +01:00
Connor Abbott
58fe7837b8 nir: propagate bitsize information in nir_search
When we replace an expresion we have to compute bitsize information for the
replacement. We do this in two passes to validate that bitsize information
is consistent and correct: first we propagate bitsize from child nodes to
parent, then we do it the other way around, starting from the original's
instruction destination bitsize.

v2 (Iago):
- Always use nir_type_bool32 instead of nir_type_bool when generating
  algebraic optimizations. Before we used nir_type_bool32 with constants
  and nir_type_bool with variables.
- Fix bool comparisons in nir_search.c to account for bitsized types.

v3 (Sam):
- Unpack the double constant value as unsigned long long (8 bytes) in
nir_algrebraic.py.

v4 (Sam):
- Use helpers to get type size and base type from nir_alu_type.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:54:45 +01:00
Connor Abbott
3124ce699b nir: add a bit_size parameter to nir_ssa_dest_init
v2: Squash multiple commits addressing the new parameter in different
    files so we don't break the build (Iago)

v3: Fix tgsi (Samuel)

v4: Fix nir_clone.c (Samuel)

v5: Fix vc4 and freedreno (Iago)

v6 (Sam)
- Fix build errors in nir_lower_indirect_derefs
- Use helper to get type size from nir_alu_type.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:54:45 +01:00
Iago Toral Quiroga
084b24f558 nir: rename nir_const_value fields to include bitsize information
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-17 11:16:33 +01:00
Connor Abbott
9076c4e289 nir: update opcode definitions for different bit sizes
Some opcodes need explicit bitsizes, and sometimes we need to use the
double version when constant folding.

v2: fix output type for u2f (Iago)

v3: do not change vecN opcodes to be float. The next commit will add
    infrastructure to enable 64-bit integer constant folding so this is isn't
    really necessary. Also, that created problems with source modifiers in
    some cases (Iago)

v4 (Jason):
  - do not change bcsel to work in terms of floats
  - leave ldexp generic

Squashed changes to handle different bit sizes when constant
folding since otherwise we would break the build.

v2:
- Use the bit-size information from the opcode information if defined (Iago)
- Use helpers to get type size and base type of nir_alu_type enum (Sam)
- Do not fallback to sized types to guess bit-size information. (Jason)

Squashed changes in i965 and gallium/nir drivers to support sized types.
These functions should only see sized types, but we can't make that change
until we make sure that nir uses the sized versions in all the relevant places.
A later commit will address this.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Connor Abbott
6700d7e423 nir: add nir_{src,dest}_bit_size() helpers
v2: use a ternary (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Jason Ekstrand
e172dbe5d2 nir: Add a bit_size to nir_register and nir_ssa_def
This really hacky commit adds a bit size to registers and SSA values.  It
also adds rules in the validator to validate that they do the right things.

It's still an open question as to whether or not we want a bit_size in
nir_alu_instr or if we just want to let it inherit from the destination.
I'm inclined to just let it inherit from the destination.  A similar
question needs to be asked about intrinsics.

v2 (Connor):
  - Relax validation: comparisons have explicit destination sizes
    and implicit source sizes.

v3 (Sam):
- Use helpers to get size and base types of nir_alu_type enum.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Connor Abbott
3d37de930d nir/types: add a function to get the bitsize of a base type
v2: fix it for GLSL_TYPE_SUBROUTINE (Iago)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Samuel Iglesias Gonsálvez
c38a25af2f i965/nir: fix check to resolve booleans to work with sized nir_alu_type
As nir_alu_type has now embedded the data size, the check for the
instruction's output type (to see if a boolean resolve is required)
should ignore the data size part.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Jason Ekstrand
78f1919429 nir: Add explicitly sized types
v2: Fix size/type mask to properly handle 8-bit types.

v3: Add helpers to get the bitsize and base type of a
nir_alu_type enum.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Jordan Justen
3fd308a357 Merge remote-tracking branch 'origin/master' into vulkan 2016-03-17 01:44:07 -07:00
Jordan Justen
7d021cb15e i965/nir: Lower nir compute shader shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen
b1e7cdfdcf nir: Lower shared var atomics during nir_lower_io
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen
e3cbb9d37c nir: Add support for lowering load/stores of shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen
683c359c54 nir: Add atomic operations on variables
This allows us to first generate atomic operations for shared
variables using these opcodes, and then later we can lower those to
the shared atomics intrinsics with nir_lower_io.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen
3c807607df nir: Add compute shader shared variable storage class
Previously we were receiving shared variable accesses via a lowered
intrinsic function from glsl. This change allows us to send in
variables instead. For example, when converting from SPIR-V.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen
26f8262698 nir/print: Add space after shader_storage var mode
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Iago Toral Quiroga
5be11d2236 i965: Skip execution size adjustment for instructions of width 4
This code in brw_set_dest adjusts the execution size of any instruction
with a dst.width < 8. However, we don't want to do this with instructions
operating on doubles, since these will have a width of 4, but still
need an execution size of 8 (for SIMD8). Unfortunately, we can't just check
the size of the operands involved to detect if we are doing an operation on
doubles, because we can have instructions that do operations on double
operands interpreted as UD, operating on any of its 2 32-bit components.

Previous commits have made it so we never emit instructions with a horizontal
width of 4 that don't have the correct execution size set for gen6+, so
we can skip it in this case, avoiding the conflicts with fp64 requirements.

Expanding the same fix to other hardware generations requires many more
changes but since we are not targetting fp64 support on them
wer don't really care for now.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez
22a10dd030 i965/vec4/gen6: fix exec_size for MOV with a width of 4 in generate_gs_ff_sync()
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez
b91b9e4b00 i965/vec4/gen6: fix exec_size for instructions with destination width of 4
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez
30fc3fa24d i965/vec4/gen6: fix exec_size for instructions with width of 4 in generate_gs_svb_write()
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez
2fafc6b98c i965/gs/gen6: fix execsize for instructions with width of 4 in gen6_sol_program()
v2:
- Add assert (Topi).

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Iago Toral Quiroga
f6342b5645 i965: set correct execsize for MOVS with a width of 4 in brw_find_live_channel
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Iago Toral Quiroga
31a8604252 i965/eu: set execution size for SEND message in brw_send_indirect_message
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:25 +01:00
Iago Toral Quiroga
2d6af62a0f i965/fs: Set exec size for gen7 pull const loads
v2 (Topi):
  - No need to set the execsize for the indirect send message,
    the next patch will handle that.
  - Set the execution size explicitly instead of taking it from
    the width of the dst that we set before.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:24 +01:00
Iago Toral Quiroga
ea45b6e96d i965/eu: set correct execution size in brw_NOP
v2: NOP should have an execsize of 1 (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-17 08:23:24 +01:00
Kenneth Graunke
9c1e01c4a8 meta: Don't use integer handles for shaders or programs.
Previously, we gave our internal clear/blit shaders actual GL handles
and stored them in the shader/program hash table.  We used ordinary
GL API entrypoints to work with them.

We thought this shouldn't be a problem because GL doesn't allow
applications to invent their own names for shaders or programs.
GL allocates all names via glCreateShader and glCreateProgram.

However, having them in the hash table is a bit risky: if a broken
application guesses the name of our shaders or programs, it could
alter them, potentially screwing up future meta operations.

Also, test cases can observe the programs in the hash table.  Running
a single dEQP process that executes the following test list:

dEQP-GLES3.functional.negative_api.buffer.clear
dEQP-GLES3.functional.negative_api.shader.compile_shader
dEQP-GLES3.functional.negative_api.shader.delete_shader

would result in the last two tests breaking.  The compile_shader test
calls glCompileShader(9) straight away, and since it hasn't even created
any shaders or programs, it expects to get a GL_INVALID_VALUE error
because there's no such name.  However, because the clear test ran
first, it created Meta programs, so an object named "9" did exist.

This patch reworks Meta to work with gl_shader and gl_shader_program
pointers directly.  These internal programs have bogus names, and are
never stored in the hash tables, so they're invisible to applications.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94485
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-16 23:57:11 -07:00
Kenneth Graunke
0fe254168b mesa: Expose compile_shader() and link_program() beyond the file.
This will allow me to use them directly from Meta, bypassing the
versions that work with GL integer handles.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-16 23:57:11 -07:00
Kenneth Graunke
7753657cf2 mesa: Make link_program() take a gl_shader_program, not a GLuint.
In half the callers, we already have a pointer, and don't need
to look it up again.  This will also help with upcoming meta work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-16 23:57:11 -07:00
Kenneth Graunke
a461e0003f mesa: Make compile_shader() take a gl_shader, not a GLuint.
In half the callers, we already have a pointer, and don't need
to look it up again.  This will also help with upcoming meta work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-16 23:57:11 -07:00
Kenneth Graunke
a7e9b31d5b meta: Use the _mesa_meta_compile_and_link_program helper more places.
Less boilerplate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-16 23:57:11 -07:00
Eric Anholt
2b9f0dffe0 vc4: Move discard handling to the condition flag.
Now that the field exists in the instruction, we can make discards less
special.  As a bonus, that means that we should be able to merge some more
.sf instructions together when we get around to that.

This causes some scheduling changes, as it allows tlb_color_reads to be
delayed past the discard condition setup.  Since the tlb_color_read ends
up later, this may mean performance improvements, but I haven't tested.

total instructions in shared programs: 78114 -> 78035 (-0.10%)
instructions in affected programs:     1922 -> 1843 (-4.11%)
total estimated cycles in shared programs: 234318 -> 234329 (0.00%)
estimated cycles in affected programs:     8200 -> 8211 (0.13%)
2016-03-16 11:28:47 -07:00
Eric Anholt
7c9fc43915 vc4: Don't make a temporary for setting flags.
The register allocator doesn't really do anything about the temp, so it
doesn't seem like it should matter.  However, the scheduler would think
that a new def is being created.

This doesn't change anything yet, but it avoids a bunch of regressions in
the next commit.
2016-03-16 11:28:34 -07:00
Eric Anholt
b4f45f319c vc4: Add a safety check for setting flags.
If a pack was on the src reg, should it be a float, int, or mul unpack?
Just complain, instead.
2016-03-16 11:28:34 -07:00
Eric Anholt
a298fb15af vc4: Reuse list_for_each_entry_safe_rev().
This didn't exist when I wrote the code.
2016-03-16 11:28:34 -07:00
Nanley Chery
5464f0c046 anv/blit: Reduce number of VUE headers being read
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:57:23 -07:00
Nanley Chery
f33866ae0a anv/blit: Remove completed finishme for VkFilter
This task was finished as of:
d9079648d0.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:57:19 -07:00
Nanley Chery
5647de8ba5 anv/blit2d: Only use one extent in meta_emit_blit2d
Since scaling isn't involved, we don't need multiple extents.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:57:14 -07:00
Nanley Chery
92fb65f117 anv/blit2d: Remove sampler from pipeline
Since we're using texelFetch with a sampled image, a sampler is no
longer needed. This agrees with the Vulkan Spec section 13.2.4
Descriptor Set Updates:

sampler is a sampler handle, and is used in descriptor updates for types
VK_DESCRIPTOR_TYPE_SAMPLER and VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER
if the binding being updated does not use immutable samplers.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:57:00 -07:00
Nanley Chery
f8f9886915 anv/blit2d: Use texel fetch in frag shader
The texelFetch operation requires that the sampled texture coordinates
be unnormalized integers. This will simplify the copy shader for
w-tiled images (stencil buffers).

v2 (Jason):
   Use f2i for texel coords
   Fix num_components indirectly
   Use float inputs for interpolation
   Nest tex_pos functions

Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:56:51 -07:00
Nanley Chery
b487acc489 Revert "anv/meta: Make meta_emit_blit() public"
This reverts commit f391683922.

Some conflicts had to be resolved in order for this revert to be
successful.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:56:46 -07:00
Nanley Chery
1a0c63b880 Revert "anv/meta: Prefix anv_ to meta_emit_blit()"
This reverts commit 514c055717.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:56:41 -07:00
Nanley Chery
997a873f0c anv/blit2d: Customize meta blit structs and functions for blit2d API
* Add fields in meta struct
* Add support in meta init/teardown
* Switch to custom meta_emit_blit2d()

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:56:22 -07:00
Nanley Chery
2d8c632117 anv/blit2d: Copy anv_meta_blit.c functions
These will be customized for blit2d operations.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-16 10:56:10 -07:00
Kenneth Graunke
b566317e7e meta: Use ARB_explicit_attrib_location in the rest of the meta shaders.
This is cleaner than using glBindAttribLocation().

Not all drivers support the extension, but I don't think those drivers
use GLSL in the first place.  Apparently some Meta shaders already use
GL_ARB_explicit_attrib_location, so I think it should be okay.

Honestly, I'm not sure how the old code worked anyway - we bound the
attribute location for "texcoords", while all the shaders capitalized
or spelled it differently.

v2: Convert another instance in brw_meta_fast_clear.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-16 00:09:56 -07:00
Plamena Manolova
9d9965c06f mesa: Ignore glPointSize when GL_POINT_SIZE_ARRAY_OES is enabled
When a user defines a point size array and enables it, the point
size value set via glPointSize should be ignored. To achieve this,
we can simply toggle ctx->VertexProgram.PointSizeEnabled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42187
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-15 15:49:48 -07:00
Jason Ekstrand
abaa3bed22 anv/device: Flush the fence batch rather than the start of the BO 2016-03-15 15:24:24 -07:00
Jason Ekstrand
7f6a0cb29c Merge remote-tracking branch 'public/master' into vulkan 2016-03-15 14:09:50 -07:00
Varad Gautam
e103b52aec vc4: Coalesce instructions using VPM reads into the VPM read.
This is done instead of copy propagating the VPM reads into the
instructions using them, because VPM reads have to stay in order.

shader-db results:
total instructions in shared programs: 78509 -> 78114 (-0.50%)
instructions in affected programs:     5203 -> 4808 (-7.59%)
total estimated cycles in shared programs: 234670 -> 234318 (-0.15%)
estimated cycles in affected programs:     5345 -> 4993 (-6.59%)

Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Rhys Kidd <rhyskidd@gmail.com>
2016-03-15 13:09:24 -07:00
Varad Gautam
00bdbb22a9 vc4: rename file to group vpm optimizations together
This file will contain optimization passes for both vpm reads
and writes.

Signed-off-by: Varad Gautam <varadgautam@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-03-15 12:49:37 -07:00
Eric Anholt
1c4b077409 vc4: Fix failures with nir_extract_* since the addition of the opcodes. 2016-03-15 12:49:37 -07:00
Roland Scheidegger
bb2c5e657b llvmpipe: fix lp_rast_plane alignment on 32bit
Some rasterization code relies (for sse) on the first and third planes
(but not the second for now) being 128bit aligned, and we didn't get that
on 32bit - I mistakenly thought the 64bit number in the struct would get
the thing aligned to 64bit even on 32bit archs.
Stephane Marchesin really figured this out.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

CC: <mesa-stable@lists.freedesktop.org>
2016-03-15 19:42:15 +01:00
Roland Scheidegger
12a4f0bed6 draw: fix line stippling
The logic was comparing actual ints, not true/false values.
This meant that it was emitting always multiple line segments instead of just
one even if the stipple test had the same result, which looks inefficient, and
the segments also overlapped thus breaking line aa as well.
(In practice, with the no-op default line stipple pattern, for a 10-pixel
long line from 0-9 it was emitting 10 segments, with the individual segments
ranging from 0-1, 0-2, 0-3 and so on.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94193

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

CC: <mesa-stable@lists.freedesktop.org>
2016-03-15 19:41:34 +01:00
Roland Scheidegger
4b249ed4cd softpipe: fix misleading TGSI_QUAD_SIZE usage
All these img filter loops iterate through NUM_CHANNELS, not QUAD_SIZE.
In practice both are of course the same unchangeable value (4), but it
makes the code look a bit confusing. Moreover, some of the functions were
actually given an array of 4 values according to the declaration, yet the
code was addressing values 0/4/8/12 out of it, so fix this by just saying
it's a pointer to floats like the other functions.

While here, also add comment about not quite correct filtering.

There's no actual code difference.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-15 19:37:59 +01:00
Roland Scheidegger
9e9d69979c softpipe: fix anisotropic filtering crash
The filt_args->offset wasn't assigned but was always used later leading
to a crash (as far as I can tell, texel offsets don't actually make much
sense with anisotropic filtering, but because there's no explicit setting
if offsets are enabled there the array is always accessed).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

CC: <mesa-stable@lists.freedesktop.org>
2016-03-15 16:40:05 +01:00
Nicolai Hähnle
4de25fa7b0 radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:59 -05:00
Nicolai Hähnle
0ffcc318e6 tgsi: add tgsi_full_src_register_from_dst helper function
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:49 -05:00
Nicolai Hähnle
c02d73af0b gallium/u_inlines: add util_copy_image_view
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:46 -05:00
Nicolai Hähnle
f6dc4f5558 st/mesa: set image access flags in st_bind_images
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:43 -05:00
Nicolai Hähnle
71a1b54b33 gallium: add access field to pipe_image_view
This allows drivers to make smarter decisions e.g. about whether the image
has to be decompressed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:40 -05:00
Nicolai Hähnle
8c497b8fb5 st/glsl_to_tgsi: set FS_EARLY_DEPTH_STENCIL when required
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:37 -05:00
Nicolai Hähnle
e526f930aa tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:33 -05:00
Nicolai Hähnle
1c0cee8764 st/glsl_to_tgsi: set memory access type on image intrinsics
This is required to preserve the image variable's coherent/restrict/volatile
qualifiers in TGSI.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:30 -05:00
Nicolai Hähnle
dfcf420412 st/glsl_to_tgsi: provide Texture and Format information for image ops
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:26 -05:00
Nicolai Hähnle
3243b6fc97 tgsi: add Texture and Format to tgsi_instruction_memory
Frontends should have this information readily available, and it simplifies
image LOAD/STORE/ATOM* handling especially with indirect image access.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:02 -05:00
Nicolai Hähnle
9b68bdf6f8 get: reconcile aliasing enums for MaxCombinedShaderOutputResources
The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and
MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only
appear once.

Noticed while implementing ARB_shader_image_load_store without previously
implementing SSBO.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-14 17:19:14 -05:00
Francisco Jerez
b054605722 i965/fs: Restrict inequality that can only hold equal in saturate propagation.
Should have no functional change.  The IP value of an instruction that
reads src_var cannot possibly be after the end of the live interval of
the variable it's reading from, by the definition of live interval.
Might save future readers a momentary WTF while trying to understand
this code.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:58:19 -07:00
Francisco Jerez
7d7990cf65 i965/vec4: Consider removal of no-op MOVs as progress during register coalesce.
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  The no-op MOV check in
opt_register_coalesce() was removing instructions which makes the
cached liveness analysis calculation inconsistent with the shader IR.
We were failing to set progress to true in that case though, which
means that invalidate_live_intervals() wouldn't necessarily be called
at the end of the function.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:58:11 -07:00
Francisco Jerez
93be4158ae i965/fs: Add missing analysis invalidation in fixup_3src_null_dest().
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  fixup_3src_null_dest() was allocating
registers which makes the cached liveness analysis calculation
incomplete, so it must be invalidated.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:57:58 -07:00
Francisco Jerez
6691c03fd3 i965/fs: Add missing analysis invalidation in opt_sampler_eot().
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  opt_sampler_eot() was allocating
registers and inserting and removing instructions, which makes the
cached liveness analysis calculation inconsistent with the shader IR,
so it must be invalidated.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:56:02 -07:00
Hans de Goede
4d02e91e49 clover: Fix pipe_grid_info.indirect not being initialized.
After pipe_grid_info.indirect was introduced, clover was not modified
to set it causing it to pass uninitialized memory for it to launch_grid.

This commit fixes this by zero-ing the entire pipe_grid_info struct when
declaring it, to avoid similar problems popping-up in the future.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
[ Francisco Jerez: Trivial codestyle fix. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-14 14:12:42 -07:00
Sarah Sharp
af06190760 mesa: docs: Intel i965 hardware limits.
This should help the next person working on hardware enabling figure out
where in the Intel PRMs to find the magic platform hardware values.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
2016-03-14 14:00:29 -07:00
Sarah Sharp
0f5bfc7f01 mesa: docs: i965: Use correct doxygen groupings syntax
When reading the source code, it's useful to indicate that a group of
fields in a struct are related in someway. There were several places
where people tried to group related structure members with the {@
syntax, without realizing they also needed to add the \name syntax in
order to generate correct doxygen html.

There are several files with groupings that look like this:

struct foo {
    /**
     * Related fields description
     * @{
     */
    int bar;
    char baz;
    /** @} */
    long qux;
}

However, the doxygen syntax for grouping is:

struct foo {
    /**
     * \name Related fields description
     * @{
     */
    int bar;
    char baz;
    /** @} */
    long qux;
}

https://www.stack.nl/~dimitri/doxygen/manual/grouping.html

Without the group name definition, the fields don't get properly
grouped. Instead, the group description is applied to the first field.

Fix the Intel hardware information structure, brw_device_info to
properly group the GPU hardware limitations and hardware quirks fields.

Once you've run `cd doxygen; make clean; make all`,
updated documentation can be found at

mesa/doxygen/i965/structbrw__device__info.html

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
2016-03-14 14:00:29 -07:00
Bruce Cherniak
e9d68cc3da gallium/swr: Resource management
Better tracking of resource state and synchronization.
A follow on commit will clean up resource functions into a new
swr_resource.cpp file.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2016-03-14 14:07:48 -05:00
Marek Olšák
7a2333e4ef configure.ac: require libdrm 2.4.66 for drmGetDevice
since 737b6ed13e
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c no longer compiles:
error: unknown type name ‘drmDevicePtr’
2016-03-14 16:42:41 +01:00
Francisco Jerez
63250d8178 i965: Remove useless IR self-destruct backend_shader method.
From the point it's constructed the CFG contains the only existing
copy of the program IR, and it never becomes invalid.  Calling
backend_shader::invalidate_cfg would have destroyed the program
structure irrecoverably -- We weren't calling it at all for a good
reason.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-13 18:07:53 -07:00
Pierre Moreau
8c7acd87af nv50,nvc0: Set only NEW_CP_GLOBALS upon binding
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-13 22:34:50 +01:00
Rob Clark
e73ac84b93 freedreno/ir3: lower extract_byte/word
The following commits broke things by starting to feed us unhandled
extract_u16/extract_u8 opcodes:

commit 905ff86198
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Wed Feb 3 14:28:31 2016 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Fri Mar 4 11:52:34 2016 -0800

    nir: Recognize open-coded extract_u16.

commit 76289fbfa8
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Thu Jan 21 09:09:48 2016 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Fri Mar 4 11:52:34 2016 -0800

    nir: Recognize open-coded extract_u8.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 14:10:57 -04:00
Ilia Mirkin
c1e4a6bfbf nv50,nvc0: handle SQRT lowering inside the driver
First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to
find out whether the input is less than 0). Secondly the current
approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced
instead of inf.

Instead we switch to the less accurate rcp(rsq(x)) method - this behaves
nicely for all valid inputs. We still don't do this for DSQRT since the
RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson
steps right now. Eventually we should have a separate library function
for DSQRT that does it more precisely (and perhaps move this lowering to
the post-opt phase).

This fixes a number of dEQP precision tests that were expecting better
behavior for infinite inputs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-13 13:17:24 -04:00
Ilia Mirkin
b3e7fb5234 nv50/ir: avoid folding mul + add if the mul has a dnz
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-13 13:17:24 -04:00
Ilia Mirkin
a651bc027d nvc0: fix blit triangle size to fully cover FB's > 8192x8192
The idea is that a single triangle will cover the whole area being
drawn, allowing the blit shader to do its work. However the max fb size
is 16384x16384, which means that the triangle we draw needs to be twice
that in order to cover the whole area fully. Increase the size of the
triangle to 32768x32768.

This fixes a number of dEQP tests that were failing because a blit was
involved which would miss some of the resulting texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-13 13:17:24 -04:00
Rob Clark
01b071d530 freedreno: OUT_RELOC vs OUT_RELOCW fixes
Make sure we use OUT_RELOCW() in cases where the buffer is written to.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
f68c6951b8 freedreno/a4xx: hw binning
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
b3fe196e21 freedreno/a4xx: use generated headers for draw initiator
No need to open-code this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
2224ba5976 freedreno/a4xx: remove RB_RENDER_CONTROL patching
Bitfields where shuffled around for the better on a4xx, so we don't need
any patching on this one.  It appears to be something we set entirely in
the gmem code so no conflict between tiling and render state like we had
in a3xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
8824a765a2 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
476551a21f freedreno/a3xx: move where we deal w/ binning FS
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
dd9135c452 freedreno/a4xx: move where we deal w/ binning FS
Move where we pick dummy FS for binning pass, so the whole driver sees
the same dummy/no-op FS stage.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
09b3447344 freedreno/a3xx: constify the shader variants
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
5b955f09f7 freedreno/a4xx: constify the shader variants
Most of the driver just needs read-only access, so constify..

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:40 -04:00
Rob Clark
d9395e4ed8 freedreno/a3xx: remove duplicate mark of end of binning cmds
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:40 -04:00
Nicolai Hähnle
28d2a7e67c radeonsi: avoid crash when a sampler state is bound for a buffer texture
Sampler states don't really make sense with buffer textures, but they
can be set anyway, so we need to be defensive here. This bug was lurking
for a while and was finally noticed due to PBO uploads setting sampler
states.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Tested-by: Shawn Starr <shawn.starr@rogers.com>
2016-03-13 09:37:23 -05:00
Matt Turner
61b10b4eb7 i965: Use foreach_in_list_reverse_safe() macro.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-12 19:23:50 -08:00
Jason Ekstrand
98d58e7320 nir/clone: Add support for cloning a single function_impl
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
036b209484 nir/validate: Better function validation
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
f86f3c90aa nir/print: Better function argument printing
Since we aren't going to put the function parameters or the return variable
in the list of locals, it won't get a proper declaration.  This changes
nir_print to print the type along with each parameter or return variable.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
13969565f9 nir/print: Factor variable name lookup into a helper
Otherwise, we have a problem when we go to print functions with arguments
because their names get added to the hash table during declaration which
happens after we print the prototype.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
e4bebe8a02 nir: Create function parameters in function_impl_create
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
066d3c115e nir: Add a helper for creating a "bare" nir_function_impl
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
2ef4754a20 nir: Add a new "param" variable mode for parameters and return variables
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
41ae553fda nir/glsl: Remove dead function parameter handling code
NIR has never been used on IR where we haven't already done function
inlining so this code has been dead from the beginning.  Let's just get rid
of it for now.  We can always put it back in if we decide to use NIR for
function inlining at some point in the future.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jordan Justen
b83785d86d anv/gen7: Add stall and flushes before switching pipelines
This is a port of 18c76551ee from OpenGL
to Vulkan.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 13:13:37 -08:00
Jordan Justen
c8ec65a1f5 anv: Add flush_pipeline_before_pipeline_select
flush_pipeline_before_pipeline_select adds workarounds required before
switching the pipeline.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 13:13:37 -08:00
Jordan Justen
1b126305de anv/genX: Add flush_pipeline_select_gpgpu
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 12:43:46 -08:00
Jason Ekstrand
41af9b2e51 HACK: Don't re-configure L3$ in render stages pre-BDW
This fixes a "regression" on Haswell and prior caused by merging the gen7
and gen8 flush_state functions.  Haswell should still work just fine if
you're on a 4.4 kernel, but we really should make it detect the command
parser version and do something intelligent.
2016-03-12 08:57:16 -08:00
Boyuan Zhang
6cf120ec77 st/va: add HEVC main 10 profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-11 22:33:56 -05:00
Boyuan Zhang
06c862d67d radeon/video: enable HEVC main 10 decode
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-11 22:33:56 -05:00
Boyuan Zhang
8be9efcce7 radeon/uvd: handle HEVC main 10 decode
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-11 22:33:56 -05:00
Jason Ekstrand
753ebe4457 anv/x11: Reset the SHM fence before presenting the pixmap
This seems to fix the flicker issue that I was seeing with dota2
2016-03-11 17:22:46 -08:00
Kristian Høgsberg Kristensen
9bff5266be anv/x11: Add present support
The old DRI3 implementation just used CopyArea instead of present.  We
still don't support all the MST fancyness, but it should at least avoid
some copies and allow for.

v2 (Jason Ekstrand):
   - Better object cleanup and destruction
   - Handle the CONFIGURE_NOTIFY event and return OUT_OF_DATE when needed
   - Track dirtyness via IDLE_NOTIFY rather than interating through the
     images sequentially
2016-03-11 16:54:17 -08:00
Jason Ekstrand
e920b184e9 anv/x11: Split image creation into a helper function
This lets us clean up error handling and make it correct.
2016-03-11 12:28:34 -08:00
Jason Ekstrand
41a147904a anv/wsi: Throttle rendering to no more than 2 frames ahead
Right now, Vulkan apps can pretty easily DOS the GPU by simply submitting a
lot of batches.  This commit makes us wait until the rendering for earlier
frames is comlete before continuing.  By waiting 2 frames out, we can still
keep the pipe reasonably full but without taking the entire system down.
This is similar to what the GL driver does today.
2016-03-11 11:31:13 -08:00
Jason Ekstrand
132f079a8c anv/gem: Use C99-style struct initializers for DRM structs
This is more consistent with the way the rest of the driver works and
ensures that all structs we pass into the kernel are zero'd out except for
the fields we actually want to fill.  We were previously doing then when
building with valgrind to keep valgrind from complaining.  However, we need
to start doing this unconditionally as recent kernels have been getting
touchier about this.  In particular, as of kernel commit b31e51360e88 from
Chris Wilson, context creation and destroy fail if the padding bits are not
set to 0.
2016-03-11 11:31:03 -08:00
Ben Widawsky
d1ab544bb8 i965/chv: Display proper branding
"Braswell" is a Cherryview based *thing*. It unfortunately requires extra
information to determine its marketing name. Unlike all previous products, and
hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to
brand string.

I put up a fight about adding any complexity to our GL renderer string code for
a very long time. However, a wise man made a comment to me that I couldn't argue
with: if a user installs Windows on their hardware, the brand string should be
the same as what we display in Linux. The Windows driver apparently does this
check, so we should too.

Note that I did manage to find a good use for this info anyway in the compute
shader thread counts.

v2: memcpy instead of strncpy, and some minor changes (Matt)

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com
2016-03-11 11:17:28 -08:00
Ben Widawsky
5e6a43a001 i965/chv: Update lower min for CS threads
We have better information now, and 28 was not a valid thing to support. 6 EUs
per sublice with 7 threads per EU is the minimum supported config.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com
2016-03-11 11:17:28 -08:00
Ben Widawsky
3dc3dbc8d8 i965/chv: Check that compute threads are above threshold
The way we are organizing this code, the statically configured max_cs_threads
should always be the minimum value we actually support (ie. are aware of). As a
result, we can fall back to that if we get invalid numbers from the kernel (ie.
when the query succeeds, but the result is lower than expected).

I was originally planning to use an assert, but there is no reason to be so
mean.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com
2016-03-11 11:17:28 -08:00
Ben Widawsky
9dd20b715a i965/chv: Use kernel provided info for max_cs_threads
With the previous patches, the code can find out the actual number of available
compute threads. It is enabled only for Cherryview since that is the only
platform I know for a fact has shipped devices which can benefit from this.  It
seems like other platforms /might/ benefit from this because of fused
configurations which /might/ have shipped. Fallback code is still there.

v2: Some minor adjustments from Matt

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com
2016-03-11 11:17:28 -08:00
Ben Widawsky
38eb606884 i965: Query and store GPU properties from kernel
Certain products are not uniquely identifiable based on device id alone. The
kernel exports an interface to help deal with this. This patch merely introduces
the consumer of the interface and makes sure nothing breaks.

It is also possible to use these values for programming GPGPU mode, and I plan
to do that as well.

The interface was introduced in libdrm 2.4.60, which is already required, so it
should all be fine.

v2: Some minor changes recommended by Matt

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-11 11:17:28 -08:00
Nicolai Hähnle
9908b13af6 st/mesa: check that the image unit is valid in st_bind_images
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-11 11:53:40 -05:00
Bas Nieuwenhuizen
417b6721a0 radeonsi: Lazily re-set sampler views after disabling DCC
Clear DCC flags if necessary when binding a new sampler view.

v2: Do not reset DCC flags of bound sampler views.
v3: Check that we have a real texture (Nicolai)

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-11 11:51:15 -05:00
Marek Olšák
af3454cad5 st/mesa: remove ST_NEW_MESA flag (v2)
Only used indirectly when checking dirty.st != 0

v2: also update st_cb_compute.c

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-11 16:07:18 +01:00
Nicolai Hähnle
e502801d98 r600g: clear compressed_depthtex/colortex_mask when binding buffer texture
Found by inspection of the source based on a bisected bug report.

This bug has been in the code for a long time, but the more recent PBO upload
feature exposed it because it leads to more uses of buffer textures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94388
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-11 08:00:15 -05:00
Ilia Mirkin
f8ea98e4ec st/mesa: add GL_ARB_shader_atomic_counter_ops support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-10 22:36:17 -05:00
Ilia Mirkin
075a5742bf mesa: add GL_ARB_shader_atomic_counter_ops support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-10 22:34:46 -05:00
Ilia Mirkin
a8819fb1ff nvc0: add support for TGSI FMA ops
This will allow the nouveau backend to not try and split up ops that are
fused in GLSL.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-10 22:34:28 -05:00
Nicolai Hähnle
59c5508b9a radeonsi: update compressed_colortex_masks when a cmask is created or disabled
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-10 18:22:52 -05:00
Nicolai Hähnle
da68a9b215 radeonsi: move si_decompress_textures to si_blit.c
Since it is all about calling into blitter functions, it makes more
sense here. This change also reduces the size of the interfaces between
.c files.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-10 18:22:49 -05:00
Nicolai Hähnle
f03c9e5692 r600g: update compressed_colortex_masks when a cmask is created or disabled
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-10 18:22:46 -05:00
Nicolai Hähnle
784269aa40 gallium/radeon: notify all contexts when cmasks are enabled/disabled
There is an annoying corner case that I stumbled across while looking into
piglit's arb_shader_image_load_store/execution/load-from-cleared-image.shader_test
(which can be easily adapted to demonstrate the bug without the
ARB_shader_image_load_store extension)

When we bind a texture and then clear it using glClear (by attaching it
to the current framebuffer) for the first time, we allocate a separate
cmask for the texture to do fast clear, but the corresponding bit in
compressed_colortex_mask is not set. Subsequent rendering will use
incorrect data.

Conversely, when a currently bound texture with an existing cmask is
exported leading to that cmask being disabled, the compressed_colortex_mask
bit will remain set, leading to an assertion later on in debug builds.

Since iterating through all contexts and/or remembering where every
texture is bound would be costly, and cmask enable/disable should be
rare, we will maintain a global counter to signal contexts that they
must update their compressed_colortex_masks.

This patch introduces the global counter, and subsequent patches will
do the mask update.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-10 18:22:00 -05:00
Kenneth Graunke
9ea00c6f6b i965: Set a proper _BaseFormat for window system renderbuffers in ES.
intel_alloc_private_renderbuffer_storage did:

   rb->_BaseFormat = _mesa_base_fbo_format(ctx, internalFormat);

Unfortunately, internalFormat was usually an unsized format (such as
GL_DEPTH_COMPONENT).  In OpenGL ES, _mesa_base_fbo_format() refuses to
accept unsized formats, and returns 0 rather than a real base format.

This meant that we ended up with a completely bogus rb->_BaseFormat for
window system buffers on OpenGL ES.  All other renderbuffer allocation
functions in intel_fbo.c instead use the mesa_format, and do:

   rb->_BaseFormat = _mesa_get_format_base_format(...);

We can do likewise, using rb->Format.  This appears to work just fine.

dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial
failed, as it tried to perform a GL_FRAMEBUFFER_ATTACHMENT_DEPTH_SIZE query
on the window system depth buffer.  That query relies on a proper
rb->_BaseFormat being set, so it broke because rb->_BaseFormat was 0 due
to the above bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94458
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-10 11:23:52 -08:00
Kenneth Graunke
e032e4ad5a glcpp: Fix locations when encounting "#<NEWLINE>".
We were failing to reset our location tracking when encountering a
NEWLINE in the <HASH> state.  Rip the code from the <*>{NEWLINE} rule,
which handles this properly.

Also, update 146-version-first-hash.c to have proper expectations.
When I introduced the test, I didn't verify that the line/column
numbers were correct, and it turns out they varied based on the type
of newline ending.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94447
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-10 11:23:26 -08:00
Jason Ekstrand
1f3d582cba isl/surface_state: Set the clear color 2016-03-10 10:41:52 -08:00
Jason Ekstrand
8c819b8c2b genxml/gen75: Add the clear color bits to RENDER_SURFACE_STATE 2016-03-10 10:41:52 -08:00
Jason Ekstrand
6f47ed28b4 isl: Add more helpers for determining if a format is an integer format 2016-03-10 10:41:52 -08:00
Jason Ekstrand
b0e423cc4f isl: Remove redundant check
The green channel was checked twice.
2016-03-10 10:41:52 -08:00
Tim Rowley
84f857bef7 gallium/swr: remove use of BYTE from swr driver
Remove use of a win32-style type leaked from the swr rasterizer.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-03-10 11:20:58 -06:00
Samuel Pitoiset
dad3e5f4ef nvc0: expose SM35 perf counters to AMD_performance_monitor
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-10 18:20:40 +01:00
Samuel Pitoiset
0e511400de nvc0: add driver metrics for SM35 (GK110)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-10 18:20:38 +01:00
Samuel Pitoiset
bf840aa523 nvc0: add MP performance counters for SM35 (GK110)
Because compute support is not enabled by default for these chipsets,
NVF0_COMPUTE=1 needs to be used, along with GALLIUM_HUD to enable
performance counters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-10 18:20:35 +01:00
Samuel Pitoiset
f289e99dee nvc0: explode config of Kepler hardware SM events
This is really verbose but most of the configuration will be reused
for SM35 (GK110).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-10 18:20:32 +01:00
Samuel Pitoiset
a0ce8536b3 nvc0: rework the driver metrics infrastructure
This follows the same design as MP perf counters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-10 18:20:29 +01:00
Samuel Pitoiset
41fb87249a nvc0: rework the MP counters infrastructure
This mainly improves how we define the different list of queries.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-10 18:20:26 +01:00
Marek Olšák
7b29188a3f egl: clean up typedef madness in the backend API
let's use the dd.h format

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-10 18:03:14 +01:00
Iago Toral Quiroga
3e3de9ec0a glsl: report correct number of allowed vertex inputs and fragment outputs
Before we would always report 16 for both and we would only fail if either
one exceeded 16. Now we fail if the maximum for each is exceeded, even if
it is smaller than 16 and we report the correct maximum.

Also, expand the size of to_assign[] to 32. There is code at the top
of the function handling max_index up to 32, so this just makes the
code more consistent.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-10 08:48:53 +01:00
Vinson Lee
d46feee697 nouveau: Fix clang reserved-user-defined-literal error.
CXX      codegen/nv50_ir.lo
In file included from codegen/nv50_ir.cpp:28:
./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a space between literal and identifier
      [-Wreserved-user-defined-literal]
   fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args)
                             ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-09 23:00:45 -08:00
Kenneth Graunke
3823b53ff8 mesa: Make glGetInteger64v convert float/doubles to 32-bit integers.
According to the GL 4.4 core specification, section 2.2.2 ("Data
Conversions For State Query Commands"):

"If a command returning integer data is called, such as GetIntegerv or
 GetInteger64v, a boolean value of TRUE or FALSE is interpreted as one
 or zero, respectively. A floating-point value is rounded to the nearest
 integer, unless the value is an RGBA color component, a DepthRange
 value, or a depth buffer clear value. In these cases, the query command
 converts the floating-point value to an integer according to the INT
 entry of table 18.2; a value not in [−1, 1] converts to an undefined
 value."

The INT entry of table 18.2 shows that b = 32, meaning the expectation
is to convert it to a 32-bit integer value.

Fixes:
dEQP-GLES3.functional.state_query.floats.blend_color_getinteger64
dEQP-GLES3.functional.state_query.floats.color_clear_value_getinteger64
dEQP-GLES3.functional.state_query.floats.depth_clear_value_getinteger64

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94456
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-09 19:44:18 -08:00
Nanley Chery
7fbbad0170 anv/blit2d: Use the tiling enum for simplicity
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-09 10:57:47 -08:00
Nanley Chery
514c055717 anv/meta: Prefix anv_ to meta_emit_blit()
Follow the convention for non-static functions.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-09 10:57:47 -08:00
Nanley Chery
627728cce5 anv/meta: Split anv_meta_blit.c into three files
The new organization is as follows:
* anv_meta_blit.c: Blit and state setup/teardown commands
* anv_meta_copy.c: Copy and update commands
* anv_meta_blit2d.c: 2D Blitter API commands

Also, change the formatting to contain most lines
within 80 columns.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-09 10:57:47 -08:00
Nanley Chery
f391683922 anv/meta: Make meta_emit_blit() public
This can be reverted if the only other consumer, anv_meta_blit2d(),
uses a different method.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-09 10:57:47 -08:00
Nanley Chery
ddbc645846 anv/meta: Store src and dst usage flags in a variable
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-09 10:57:47 -08:00
Nanley Chery
7ebbc3946a anv/meta: Minimize height of images used for copies
In addition to demystifying the value being added to the height,
this future-proofs the code for new tiling modes and keeps the
image height as small as possible.

v2: Actually use the smallest height possible.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-09 10:57:47 -08:00
Emil Velikov
3dc2630e45 gallium/radeon: use explicit drm_major, drm_minor check
Just like everywhere else in the radeon codebase.

v2: Don't forget about drm_major == 3 (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 17:25:22 +00:00
Emil Velikov
b9c5c4af6d egl/x11: check the return value of xcb_dri2_get_buffers_reply()
... before using it. The function can return NULL, which we should check
prior to refererencing it in the next function(s).

Cc: Fabian Vogt <fvogt@suse.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93667
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-09 17:25:22 +00:00
Emil Velikov
373f118c6c gallium: do not wrap header inclusion in
Add one missing extern C guard within include/pipe/p_video_enums.h, and
remove the wrapping throughout gallium.

On Haiku one could even use the gallium debug_printf() although
that's another topic.

v2: Leave dbghelp.h as is (Jose)

Cc: Jose Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-09 17:21:39 +00:00
Dieter Nützel
69d389c52f opencl: fix .gitignore for .install-gallium-links
Fixes: 0b6157e971 "install-gallium-links: port changes from install-lib-links"

v2: move this to the top level .gitignore and added Fixes:
    like Emil Velikov <emil.l.velikov@gmail.com> suggested

Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:52 +00:00
Emil Velikov
f3e23ead53 egl: remove remnants of MESA_drm_display
Last set in st/egl, unused in mesa-demos and superseded by
EGL_KHR_platform_gbm.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-03-09 17:16:51 +00:00
Emil Velikov
2295a4b1e1 egl: remove final pieces of KHR_vg_parent_image
Similar to previous commit - unused/unset for a long time.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-03-09 17:16:51 +00:00
Emil Velikov
c85544a10c glapi: remove the final function offset tags
A commit earlier this year reworked out python scripts to use a separate
file for these. Followed by removing support from the parser, and
removing all of the offset tags.

Seems like we either missed a few, or people added them by mistake.
Either way let's nuke the ones that are still around.

Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-03-09 17:16:51 +00:00
Emil Velikov
3ffab9a89c winsys/amdgpu/addrlib: do not wrap header inclusion in extern "C"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 17:16:51 +00:00
Emil Velikov
a07192bd63 mesa/main: do not wrap header inclusion in extern "C"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:51 +00:00
Emil Velikov
5351dc1522 i915: limit extern "C" hack only for libdrm headers
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:51 +00:00
Emil Velikov
cf215d92f6 xmesa: do not wrap header inclusion in extern "C"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:50 +00:00
Emil Velikov
2af3a0ca6f util/sha: do not wrap header inclusion in extern "C"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:50 +00:00
Emil Velikov
d426c17550 egl/wayland: do not wrap header inclusion in extern "C"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:50 +00:00
Emil Velikov
750da80b34 gbm: do not wrap header inclusion in extern "C"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-09 17:16:50 +00:00
Nicolai Hähnle
9f06e7f5c1 st/mesa: shader image atoms must be before framebuffer update
The reason is that the shader image atoms call st_finalize_texture, which
may set ST_NEW_FRAMEBUFFER.

This fixes an assertion triggered by a subtest of piglit's
arb_shader_image_load_store-invalid.

v2: add comment explaining order constraints (suggested by Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 11:40:06 -05:00
Nicolai Hähnle
4eb416bd9d gallivm: special case TGSI_OPCODE_STORE
This instruction has the resource (buffer or image) as a destination to
represent the writemask for SSBO writes. However, this is obviously not
a "real" destination for the purpose of emitting LLVM IR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 11:39:55 -05:00
Nicolai Hähnle
10b2b584ee tgsi: set correct output mode for RESQ
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 11:39:43 -05:00
Marek Olšák
dcb2b77823 gallium: add CAPs returning PCI device location
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-09 15:02:28 +01:00
Marek Olšák
737b6ed13e winsys/amdgpu: get PCI info
This will be queried by the OpenCL stack using an interop call.

I have tested that the values match lspci.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:28 +01:00
Marek Olšák
ec74deeb24 radeonsi: set amdgpu metadata before exporting a texture
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:28 +01:00
Nicolai Hähnle
ff7e9412be radeonsi: extract the texture descriptor computation into its own function
This will allow this code to be re-used for shader images.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-09 15:02:27 +01:00
Nicolai Hähnle
1197c69bdd radeonsi: extract the buffer descriptor computation into its own function
This will allow it to be re-used for shader image descriptors.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-09 15:02:27 +01:00
Nicolai Hähnle
2bf8ee34b8 radeonsi: remove resource field from si_sampler_view
view->resource is redundant with view->base.texture, so get rid of it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
2dec5e09e1 radeonsi: accept pipe_resource in si_sampler_view_add_buffer
and rename .._buffers -> .._buffer

Based loosely on Nicolai's patch. This will make it easier to cherry-pick
Nicolai's patches from his image support branch.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
f18fc70d6f radeonsi: disable DCC on handle export if expecting write access
This should be okay except that sampler views and images are not re-set.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Bas Nieuwenhuizen
1e48ec7571 radeonsi: add DCC decompression (v2)
This is currently not needed but will be necessary when we have
features that do not work with DCC enabled, such as image stores
and sharing non-scanout surfaces.

v2: Marek: rebase, remove decompression from si_flush_resource (not needed)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
b744ac9f44 radeonsi: allocate DCC in the same backing buffer as the texture
To allow sharing textures with DCC enabled.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
60c08aa90b gallium/radeon: disable CMASK on handle export if sharing doesn't allow it (v2)
v2: remove the list of all contexts

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
970b979da1 gallium/radeon: eliminate fast color clear before sharing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
abac6bf67a gallium/radeon: don't use fast color clear if sharing doesn't allow it
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
d4e847ea33 gallium/radeon: disallow handle export for MSAA & depth textures
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák
d95f593758 gallium/radeon: remember that texture_from_handle was called and its flags
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
c034d3dde0 gallium/radeon: check that handle usage doesn't change for a resource
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
6b187bbd9f gallium/radeon: disallow reallocation of shared buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
ecbd3aba17 gallium/radeon: if we can't discard a whole resource, discard the range instead
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
afdaffcbdb gallium/radeon: buffer valid range tracking only works with unshared buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
be73d35829 gallium/radeon: don't set texture metadata for buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
f914779c75 gallium/radeon: set texture metadata only once
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
69d8b75114 gallium/radeon: clean up r600_texture_get_handle
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
e3cee38e13 gallium/radeon: move code initializing texture metadata to its own function
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
f4aa3256ef winsys/amdgpu: allow drivers to set/get opaque metadata
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
bd1feb2827 gallium/radeon: rename winsys buffer_get/set_tiling to buffer_get/set_metadata
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:26 +01:00
Marek Olšák
6011d7cf25 gallium/radeon: remove rcs parameter from radeon_winsys::buffer_set_tiling
This was needed for DRM < 2.12.0 where the kernel was rewriting tiling flags
in IBs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:25 +01:00
Marek Olšák
260ef9c9be gallium/radeon: use a structure for passing tiling flags from/to winsys
and call it radeon_bo_metadata

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:25 +01:00
Marek Olšák
82db518f15 gallium: add external usage flags to resource_from(get)_handle (v2)
This will allow drivers to make better decisions about texture sharing
for DRI2, DRI3, Wayland, and OpenCL.

v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-03-09 15:02:25 +01:00
Axel Davy
d943ac432d dri: add backbuffer use flag
This will be used by the next commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-09 15:02:25 +01:00
Timothy Arceri
2188c77a0e glsl: dont allow undefined array sizes in ES
This applies the rule to empty declarations.

Fixes:
dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex
dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-09 20:30:42 +11:00
Jason Ekstrand
248ab61740 anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer 2016-03-08 17:10:05 -08:00
Jason Ekstrand
28cbc45b3c anv/cmd_buffer: Split flush_state into two functions 2016-03-08 16:54:07 -08:00
Jason Ekstrand
42b4c0fa6e anv: Pull all of the genX_foo functions into anv_genX.h
This way we only have to declare them each once and we get it for all gens
at a single go.
2016-03-08 16:49:08 -08:00
Tamil velan
353a4f844f radeon/uvd: increase max height to 4096 for VI and newer
With this issue 'mpv --hwdec=vdpau --vo=vdpau <stream>' fails
for vdpau decode if the stream height is 4096. Vdpau decode of
height upto 4096 is necessary usecase on amdgpu driver for VI
and newer platforms.

The fix is in driver specific implementation of "Decoder
Query Capabilities" API to return 4096 for VI and newer
platforms. With this fix vdpauinfo reports height support as
4096 and mpv for vdpau decode works fine for 4096 height streams.

Signed-off-by: Tamil velan <Tamil-Velan.Jayakumar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-08 19:01:19 -05:00
Bas Nieuwenhuizen
6373845d98 winsys/amdgpu: enlarge buffer_indices_hashlist
Enlarge the buffer hashlist to prevent large numbers of misses
due to adding more buffers than can be cached in the hashlist.

The game I tested had CS's with up to 1500 buffers and the overhead
of amdgpu_lookup_buffer for various sizes was:

4096 1.97% (new value)
2048 4.37%
1024 6.92%
512  9.47% (old value)

(percentage of CPU usage in render thread as determined by perf)

The time spent in amdgpu_add_buffer self is ~4.2% in all cases and
for 4096 the time needed to clear the hashlist is still < 0.10%,
so I am not expecting significant regressions.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-09 00:52:07 +01:00
Jason Ekstrand
bbbdd32c19 anv/meta_clear: Use repclear again 2016-03-08 15:40:11 -08:00
Jason Ekstrand
dc504a51fb anv/pipeline: Unconditionally emit PS_BLEND on gen8+
Special-casing the PS_BLEND packet wasn't really gaining us anything.  It's
defined to be more-or-less the contents of blend state entry 0 only without
the indirection.  We can just copy-and-paste the contents.  If there are no
valid color targets, then blend state 0 will be 0-initialized anyway so
it's basically the same as the special case we had before.
2016-03-08 15:40:11 -08:00
Jason Ekstrand
cce65471b8 anv: Compact render targets
Previously, we would always emit all of the render targets in the subpass.
This commit changes it so that we compact render targets just like we do
with other resources.  Render targets are represented in the surface map by
using a descriptor set index of UINT16_MAX.
2016-03-08 15:40:11 -08:00
Samuel Pitoiset
32e848b016 nvc0: add a new validation path for compute
This makes use of the new state validation interface to be consistent
with 3d.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-09 00:19:21 +01:00
Samuel Pitoiset
db9b41d302 nvc0: rework the validation path for 3D
This exposes an interface for state validation that will be also used
to rework the compute validation path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-09 00:19:16 +01:00
Jordan Justen
a100a57e30 i965/hsw: Initialize SLM index in state register
For Haswell, we need to initialize the SLM index in the state
register. This can be copied out of the CS header dword 0.

v2:
 * Use UW move to avoid changing upper 16-bits of sr0.1 (mattst88)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94081
Fixes: piglit arb_compute_shader/execution/shared-atomics.shader_test
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-08 14:27:18 -08:00
Jordan Justen
d8347f12ea i965/compute: Skip SIMD8 generation if it can't be used
If the local workgroup size is sufficiently large, then the SIMD8
program can't be used. In this case we can skip generating the SIMD8
program. For complex programs this can save a significant amount of
time.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-08 14:27:18 -08:00
Jordan Justen
e1d54b1ba5 i965/fs: Allow spilling for SIMD16 compute shaders
For fragment shaders, we can always use a SIMD8 program. Therefore, if
we detect spilling with a SIMD16 program, then it is better to skip
generating a SIMD16 program to only rely on a SIMD8 program.

Unfortunately, this doesn't work for compute shaders. For a compute
shader, we may be required to use SIMD16 if the local workgroup size
is bigger than a certain size. For example, on gen7, if the local
workgroup size is larger than 512, then a SIMD16 program is required.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-08 14:27:18 -08:00
Timothy Arceri
91630d7453 glsl: don't always reject shaders with mismatching ifc blocks
Since we store some member qualifiers in the interface type
we need to be more careful about rejecting shaders just because
the pointer doesn't match. Its perfectly valid for some qualifiers
such as precision to not match across shader interfaces.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-09 09:21:42 +11:00
Timothy Arceri
3026b3565a glsl: make interstage_match() static
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-09 09:21:36 +11:00
Timothy Arceri
ebc419fcbd glsl: don't validate ifc blocks using validation meant for variables
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-09 09:21:31 +11:00
Kenneth Graunke
19f13b2096 mesa: Fix error code for GetFramebufferAttachmentParameter in ES 3.0+.
The ES 3.0+ specifications contain the exact same text as the OpenGL
specification, which says that we should return GL_INVALID_OPERATION.

ES 2.0 contains different text saying we should return GL_INVALID_ENUM.

Previously, Mesa chose the error code based on API (GL vs. ES).
This patch makes ES 3.0+ follow the GL behavior.  ES 2 remains as is.

Fixes dEQP-GLES3.functional.fbo.api.attachment_query_empty_fbo.
However, breaks the dEQP-GLES2 variant of the same test for drivers
which silently promote to ES 3.0.  This can be worked around by
exporting MESA_GLES_VERSION_OVERRIDE=2.0, but is a bug in dEQP-GLES2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-08 12:46:28 -08:00
Kenneth Graunke
8b3496f378 mesa: Add GL_RED and GL_RG to ES3 effective internal format mapping.
The dEQP-GLES3.functional.fbo.completeness.renderable.texture.
{color0,depth,stencil}.{red,rg}_unsigned_byte tests appear to expect
GL_RED/GL_RG and GL_UNSIGNED_BYTE to map to GL_R8/GL_RG8, rather than
returning an INVALID_OPERATION error.

This makes perfect sense.  However, RED and RG are strangely missing
from the ES 3.0/3.1/3.2 spec's "Effective internal format corresponding
to external format and type" tables.  It may be worth filing a spec bug.

Fixes the 6 dEQP tests mentioned above.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-03-08 12:46:28 -08:00
Samuel Pitoiset
752769e053 nv50,nvc0: make sure to destroy the mutex used for blits
This mutex is initialized when the blitter is created, but it is never
destroyed. This doesn't hurt anything but it makes sense to destroy it
at blitter deletion.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-08 21:24:46 +01:00
Marek Olšák
3146014d5f gallium/radeon: don't use temporary buffers for persistent mappings
Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-08 20:08:52 +01:00
Jason Ekstrand
14b18aba89 nir: Add a pass for lower indirect variable dereferences
This new pass lowers load/store_var intrinsics that act on indirect derefs
to if-ladder of direct load/store_var intrinsics.  The if-ladders perform a
simple binary search on the indirect.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-08 10:41:54 -08:00
Alejandro Piñeiro
ef76ea4ba9 i965/fs/nir: "surface_access::" prefix not needed
"using namespace brw::surface_access" is already present at the
top of the source file.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-08 17:55:28 +01:00
Brian Paul
6857420e79 mesa: fix malformed assertion in _image_format_class_to_glenum()
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
2016-03-08 08:42:56 -07:00
Brian Paul
3ed8729f7b program: minor whitespace clean-ups in program_parse_extra.c 2016-03-08 08:42:56 -07:00
Christian König
37402aa4c6 st/mesa: conditionally enable GL_NV_vdpau_interop
Only enable it when we compile the state tracker as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-08 13:00:04 +01:00
Christian König
e148a3b6e9 radeon/uvd: disable MPEG1
The hardware simply doesn't support that correctly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-08 12:57:08 +01:00
Alejandro Piñeiro
0548844e86 i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic
Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need
to explicitly set it.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-08 08:22:26 +01:00
Alejandro Piñeiro
d3a89a7c49 i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor
surface_access emit_untyped_read and emit_untyped_atomic provides the same
functionality.

v2: surface parameter of emit_untyped_atomic is a const, no need to
    specify default predicate on emit_untyped_atomic, use retype
    (Francisco Jerez).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-08 08:22:26 +01:00
Alejandro Piñeiro
0c5c2e2c93 i965/vec4: pass the correct src_sz to emit_send at emit_untyped_atomic
If the src is invalid, so src size is zero, the src_sz passed to emit
send should be zero too, instead of a default 1 if we are in a simd4x2
case. This can happens if using emit_untyped_atomic for an atomic
dec/inc.

v2: use the proper src_sz when calling emit_send, instead of just
    avoid loading src at emit_send if BAD_FILE (Francisco Jerez)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-08 08:22:26 +01:00
Kenneth Graunke
ea9fa5ff05 glcpp: Remove empty mid-rule action which changes test behavior.
Apparently this causes a slight difference in the parser's token
expectations, leading to a different error message.

It seems harmless, but I wanted to be cautious and separate it out.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-07 23:02:05 -08:00
Kenneth Graunke
e816c8b54a glcpp: Clean up most empty mid-rule actions left by previous commit.
I didn't want to pollute the previous patch with all the $4 -> $3
changes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-07 23:02:03 -08:00
Kenneth Graunke
639bbe3cb4 glcpp: Delete unnecessary implicit version resolves.
We now have a bigger hammer.  The HASH_TOKEN NEWLINE rule still needs
to exist to ensure the 146-version-hash-first.c test still passes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-07 23:02:01 -08:00
Kenneth Graunke
07ec67d85c glcpp: Implicitly resolve version after the first non-space/hash token.
We resolved the implicit version directive when processing control lines,
such as #ifdef, to ensure any built-in macros exist.  However, we failed
to resolve it when handling ordinary text.

For example,

        int x = __VERSION__;

should resolve __VERSION__ to 110, but since we never resolved the implicit
version, none of the built-in macros exist, so it was left as is.

This also meant we allowed the following shader to slop through:

        123
        #version 120

Nothing would cause the implicit version to take effect, so when we saw
the #version directive, we thought everything was peachy.

This patch makes the lexer's per-token action resolve the implicit
version on the first non-space/newline/hash token that isn't part of
a #version directive, fulfilling the GLSL language spec:

"The #version directive must occur in a shader before anything else,
 except for comments and white space."

Because we emit #version as HASH_TOKEN then VERSION_TOKEN, we have to
allow HASH_TOKEN to slop through as well, so we don't resolve the
implicit version as soon as we see the # character.  However, this is
fine, because the parser's HASH_TOKEN NEWLINE rule does resolve the
version, disallowing cases like:

        #
        #version 120

This patch also adds the above shaders as new glcpp tests.

Fixes dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.
{gl_es_1_vertex,gl_es_1_fragment}.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-07 23:01:43 -08:00
Jason Ekstrand
75af420cb1 anv/pipeline: Move binding table setup to its own helper 2016-03-07 22:24:31 -08:00
Jason Ekstrand
2308891ede anv: Store CPU-side fence information in the BO
This reduces the number of allocations a bit and cuts back on memory usage.
Kind-of a micro-optimization but it also makes the error handling a bit
simpler so it seems like a win.
2016-03-07 22:23:44 -08:00
Jason Ekstrand
f61d40adc2 anv/allocator: Better casting in PFL macros
We cast he constant 0xfff values to a uintptr_t before applying a bitwise
negate to ensure that they are actually 64-bit when needed.  Also, the
count variable doesn't need to be explicitly cast, it will get upcast as
needed by the "|" operation.
2016-03-07 22:23:44 -08:00
Jason Ekstrand
3d4f2b0927 anv/allocator: Move the alignment assert for the pointer free list
Previously we asserted every time you tried to pack a pointer and a counter
together.  However, this wasn't really correct.  In the case where you try
to grab the last element of the list, the "next elemnet" value you get may
be bogus if someonoe else got there first.  This was leading to assertion
failures even though the allocator would safely fall through to the failure
case below.
2016-03-07 22:23:44 -08:00
Jason Ekstrand
8c2b9d1529 anv/bo_pool: Allow freeing BOs where the anv_bo is in the BO itself 2016-03-07 22:23:44 -08:00
Tim Rowley
90f9df3210 gallium/swr: fix issues preventing a 32-bit build
Not a currently tested configuration, but these couple of small changes
allow a 32-bit build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94383
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-07 17:22:24 -06:00
Nanley Chery
181b142fbd anv/device: Up device limits for 3D and array texture dimensions
The limit for these textures is 2048 not 1024.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-07 15:21:50 -08:00
Tim Rowley
035d39b539 gallium/swr: remove use of UINT64 from swr_fence
Remove use of a win32-style type leaked from the swr rasterizer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-07 16:58:48 -06:00
Jason Ekstrand
428ffc9c13 anv/device: Actually free the CPU-side fence struct again
In 23de78768, when we switched from allocating individual BOs to using the
pool for fences, we accidentally deleted the free.
2016-03-07 14:50:52 -08:00
Kenneth Graunke
af41c0b7e0 glsl: Add function parameters to the parser symbol table.
In a shader such as:

    struct S { float f; }
    float identity(float S) { return S; }

we would think that "S" in "return S" referred to a structure, even
though it's shadowed by the "float S" parameter in the inner struct.

This led to the parser's grammar seeing TYPE_IDENTIFIER and getting
confused.

Fixes dEQP-GLES2.functional.shaders.scoping.valid.
function_parameter_hides_struct_type_{vertex,fragment}.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-07 14:09:55 -08:00
Kenneth Graunke
c4960068d5 glsl: Add single declaration variables to the symbol table too.
The lexer/parser use a symbol table to classify identifiers as
variables, functions, or structure types.

For some reason, we neglected to add variables in simple declarations
such as

    int x = 5;

but did add subsequent variables in multi-declarations:

    int x = 5, y = 6; // y gets added, but not x, for some reason

Fixes four dEQP-GLES2.functional.shaders.scoping.valid subcases:
- local_int_variable_hides_struct_type_vertex
- local_int_variable_hides_struct_type_fragment
- local_struct_variable_hides_struct_type_vertex
- local_struct_variable_hides_struct_type_fragment

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-07 14:09:31 -08:00
Kenneth Graunke
1107e48b9a mesa: Change GLboolean to bool in GenerateMipmap target checker.
This is not API facing, so just use bool.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-07 14:01:34 -08:00
Kenneth Graunke
2f8a43586e mesa: Make GenerateMipmap check the target before finding an object.
If glGenerateMipmap was called with a bogus target, then it would
pass that to _mesa_get_current_tex_object(), which would raise a
_mesa_problem() telling people to file bugs.  We'd then do the
proper error checking, raise an error, and bail.

Doing the check first avoids the _mesa_problem().  The DSA variant
doesn't take a target parameter, so we leave the target validation
exactly as it was in that case.

Fixes one dEQP GLES2 test:
dEQP-GLES2.functional.negative_api.texture.generatemipmap.invalid_target.

v2: Rebase on Antia's recent patch to this area.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-07 14:01:22 -08:00
Samuel Pitoiset
8f99c1bbce gm107/ir: add emission for ATOMS
This allows to perform atomic operations on shared memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-07 22:13:14 +01:00
Samuel Pitoiset
7f8565f0b2 tgsi: fix parsing of shared memory declarations
The SHARED TGSI keyword is only allowed with TGSI_FILE_MEMORY and not
with TGSI_FILE_BUFFER. I have found this by using the nouveau_compiler
from command line.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
2016-03-07 22:13:08 +01:00
Samuel Pitoiset
c82086f7e9 gm107/ir: add emission for BAR
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-07 18:39:50 +01:00
Samuel Pitoiset
8a109c0375 gk110/ir: add missing src predicate emission for BAR.RED
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-07 18:39:48 +01:00
Samuel Pitoiset
f4d2d49152 gk110/ir: allow to emit immediates for BAR
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-07 18:39:46 +01:00
Samuel Pitoiset
cba89fdaa1 gk110/ir: fix wrong emission of BAR.SYNC
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-07 18:39:43 +01:00
Samuel Pitoiset
5777e87bed nvc0/ir: make sure that thread count immediate for BAR fit
The limit of the thread count immediate value is 12 bits.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-07 18:39:41 +01:00
Brian Paul
3af78b426e svga: add new surface-write-flushes HUD query
To know when we're flushing the command buffer because we need to
write to surface in the command buffer.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-07 09:33:15 -07:00
Brian Paul
7e8cf34546 svga: add new flush-time HUD query
To measure the time spent flushing the command buffer.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-07 09:33:15 -07:00
Brian Paul
903afc370f svga: also dump SVGA3D_BUFFER surfaces in svga_screen_cache_dump()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-07 09:33:15 -07:00
Kristian Høgsberg Kristensen
32aa01663f anv: Quiet pTessellationState warning
Some application pass a dummy for pTessellationState which results in a
lot of noise. Only warn if we're actually given tessellation shadear
stages.
2016-03-06 22:06:24 -08:00
Ilia Mirkin
0941ef3dd5 mesa: flip current tf object back to default if current is being deleted
In the rather unusual case of Bind + Delete, we need to make sure that
we unbind the current tf object.

Fixes dEQP-GLES3.functional.lifetime.delete_bound.transform_feedback

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-07 00:36:08 -05:00
Ilia Mirkin
f6827e20d1 glsl: avoid stack smashing when there are too many attributes
This fixes a crash in

dEQP-GLES3.functional.transform_feedback.array_element.separate.points.lowp_mat3x2

and likely others. The vertex shader has > 16 input variables (without
explicit locations), which causes us to index outside of the to_assign
array.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-07 00:36:08 -05:00
Jason Ekstrand
23de78768b anv: Create fences from the batch BO pool
Applications may create a *lot* of fences, perhaps as much as one per
vkQueueSubmit.  Really, they're supposed to use ResetFence, but it's easy
enough for us to make them crazy-cheap so we might as well.
2016-03-06 14:26:52 -08:00
Francisco Jerez
3dd0441f6c i965/vec4: Propagate swizzles correctly during copy propagation.
This simplifies the code that iterates over the per-component values
found in the matching copy_entry struct and checks whether the
register regions that were copied to each component are similar enough
to be treated as a single (reswizzled) value which can be propagated
into the current instruction.

Aside from being scattered between opt_copy_propagation(),
try_copy_propagate(), and try_constant_propagate(), what I found
terribly confusing about the preexisting logic was that
opt_copy_propagation() tried to reorder the array of values according
to the swizzle of the instruction source, which meant one would have
had to invert the reordering applied at the top level in order to find
out which component to take from each value (we were just taking the
i-th component from the i-th value, which is not correct in general).
The saturate mask was also being swizzled incorrectly.

This consolidates the logic for matching multiple components of a
copy_entry into a single function which returns the result as a
regular src_reg on success, as if the copy had been performed with a
single MOV instruction copying all components of the src_reg into the
destination.

Fixes several ARB_vertex_program MOV test-cases from:
 https://cgit.freedesktop.org/~kwg/piglit/log/?h=arb_program

Acked-by: Matt Turner <mattst88@gmail.com>
2016-03-06 12:22:40 -08:00
Francisco Jerez
c70b7c80e3 i965: Don't try copy propagation if constant propagation succeeded.
It cannot get any better.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-06 12:22:40 -08:00
Francisco Jerez
dcf5e19e65 i965/vec4: Use swizzle() to swizzle immediates during constant propagation.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-06 12:22:40 -08:00
Francisco Jerez
ff7a2b489e i965: Add support for swizzling arbitrary immediates to (brw_)swizzle().
Scalar immediates used to be handled correctly by swizzle() (as the
identity) but since commit 58fa9d47b5 it
will corrupt the contents of the immediate.  Vector immediates were
never handled correctly, but we had ad-hoc code to swizzle VF
immediates in the vec4 copy propagation pass.  This takes care of
swizzling V and UV in addition.

v2: Don't implement swizzling of V/UV immediates (Matt).  If you need
    to swizzle an integer vector immediate in the future apply the
    following diff to go back to v1:

--- a/src/mesa/drivers/dri/i965/brw_eu.c
+++ b/src/mesa/drivers/dri/i965/brw_eu.c
@@ -119,11 +119,10 @@ brw_swap_cmod(uint32_t cmod)
 static unsigned
 imm_shift(enum brw_reg_type type, unsigned i)
 {
-   assert(type != BRW_REGISTER_TYPE_UV && type != BRW_REGISTER_TYPE_V &&
-          "Not implemented.");
-
    if (type == BRW_REGISTER_TYPE_VF)
       return 8 * (i & 3);
+   else if (type == BRW_REGISTER_TYPE_UV || type == BRW_REGISTER_TYPE_V)
+      return 4 * (i & 7);
    else
       return 0;
 }

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-06 12:22:40 -08:00
Francisco Jerez
537d3df974 i965: Pass symbolic swizzle to brw_swizzle() as a single argument.
And replace brw_swizzle1() with brw_swizzle().  Seems slightly cleaner
and will allow reusing brw_swizzle() in the vec4 back-end more easily.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-06 12:22:39 -08:00
Ilia Mirkin
ff085d014e nvc0: reset TFB bufctx when we no longer hold a reference to the buffers
This fixes some use-after-free situations in dEQP when an xfb state is
removed, and then a clear is triggered, which only does a partial
validation. It would attempt to read the no-longer-valid buffers,
resulting in crashes.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-06 10:14:52 -05:00
Jason Ekstrand
21ee5fd326 anv: Emit null render targets
v2 (Francisco Jerez): Add the state_offset to the surface state offset
2016-03-05 20:47:10 -08:00
Ilia Mirkin
fa43c4bd99 nv50/ir: using sampleid/pos shouldn't force per-sample interpolation
See https://www.khronos.org/bugzilla/show_bug.cgi?id=1462

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-05 23:26:03 -05:00
Ilia Mirkin
313205cb8f st/mesa: don't force per-sample interp if only sampleid/pos are used
The OES extensions clarify this behaviour to differentiate between
per-sample invocation and per-sample interpolation. Using sampleid/pos
will force per-sample invocation but not per-sample interpolation.

See https://www.khronos.org/bugzilla/show_bug.cgi?id=1462

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-05 23:26:03 -05:00
Ilia Mirkin
dcbf8377be swrast: fix GL_ANY_SAMPLES_PASSED values in Result
Since commit 922be4eab, the expectation is that the query result
contains the correct value. Unfortunately swrast does not distinguish
between GL_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED. As a result, we
must fix up the query result in a post-draw fixup.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94274
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
2016-03-05 23:25:52 -05:00
Jason Ekstrand
8502794c12 anv/pipeline: Handle null wm_prog_data in 3DSTATE_CLIP 2016-03-05 14:42:16 -08:00
Kristian Høgsberg Kristensen
7b348ab8a0 anv: Fix rebase error 2016-03-05 14:33:50 -08:00
Kristian Høgsberg Kristensen
34326f46df anv: Turn pipeline cache on by default
Move the environment variable check to cache creation time so we block
both lookups and uploads if it's turned off.
2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen
f2b37132cb anv: Check if shader if present before uploading to cache
Between the initial check the returns NO_KERNEL and compiling the
shader, other threads may have added the shader to the cache. Before
uploading the kernel, check again (under the mutex) that the compiled
shader still isn't present.
2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen
30bbe28b7e anv: Always use point size from the shader
There is no API for setting the point size and the shader is always
required to set it. Section 24.4:

   "If the value written to PointSize is less than or equal to zero, or
    if no value was written to PointSize, results are undefined."

As such, we can just always program PointWidthSource to Vertex. This
simplifies anv_pipeline a bit and avoids trouble when we enable the
pipeline cache and don't have writes_point_size in the prog_data.
2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen
6139fe9a77 anv: Also cache the struct anv_pipeline_binding maps
This is state the we generate when compiling the shaders and we need it
for mapping resources from descriptor sets to binding table indices.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
584f39c65e anv: Don't re-upload shaders when merging
Using anv_pipeline_cache_upload_kernel() will re-upload the kernel and
prog_data when we merge caches. Since the kernel and prog_data is
already in the program_stream, use anv_pipeline_cache_add_entry()
instead to only add the entry to the hash table.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
626559ed37 anv: Add anv_pipeline_cache_add_entry()
This function will grow the cache to make room and then add the entry.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
07441c344c anv: Rename anv_pipeline_cache_add_entry() to 'set'
This function is a helper that unconditionally sets a hash table entry
and expects the cache to have enough room. Calling it 'add_entry'
suggests it will grow the cache as needed.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
87967a2c85 anv: Simplify pipeline cache control flow a bit
No functional change, but the control flow around searching the cache
and falling back to compiling is a bit simpler.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
2b29342fae anv: Store prog data in pipeline cache stream
We have to keep it there for the cache to work, so let's not have an
extra copy in struct anv_pipeline too.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
37c5e70253 anv: Rename 'table' to 'hash_table' in anv_pipeline_cache
A little less ambiguous.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
c028ffea70 anv: Serialize as much pipeline cache as we can
We can serialize as much as the application asks for and just stop once
we run out of memory. This lets applications use a fixed amount of
space for caching and still get some benefit.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
cd812f086e anv: Use 1.0 pipeline cache header
The final version of the pipeline cache header adds a few more fields.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
26ed943eb9 anv: Fix shader key hashing
This was copied from inline code to a helper and wasn't updated to hash
a pointer instead.
2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
3baf8af947 anv: Remove excess whitespace 2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen
ab36eae5e7 anv: Remove left-over bits of sparse-descriptor code 2016-03-05 13:50:07 -08:00
Jason Ekstrand
1afdfc3e6e anv/pipeline: Implement the depth compare EQUAL workaround on gen8+ 2016-03-05 09:59:28 -08:00
Jason Ekstrand
7c1660aa14 anv: Don't allow D16_UNORM to be combined with stencil
Among other things, this can cause the depth or stencil test to spurriously
fail when the fragment shader uses discard.
2016-03-05 09:59:28 -08:00
Jason Ekstrand
9a90176d48 anv/pipeline: Calculate the correct max_source_attr for 3DSTATE_SBE 2016-03-05 09:59:28 -08:00
Brian Paul
a4678311be st/mesa: 78-column wrapping in st_extensions.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:21:05 -07:00
Brian Paul
9e6a6bd575 gallium/util: add new comments, assertions in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:34 -07:00
Brian Paul
b6a607b221 gallium/util: update comments and URL in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:28 -07:00
Brian Paul
cbca6964e2 gallium/util: make stream variable static in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:23 -07:00
Brian Paul
fb0abedce7 gallium/util: re-indent u_debug_refcnt.[ch]
Wrap comments to 78 columns, etc.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:14 -07:00
Brian Paul
a7ba29f6d8 gallium/tests: silence warning in compute.c
compute.c: In function ‘launch_grid’:
compute.c:435:20: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
         info.input = input;
                    ^

Maybe the pipe_grid_info::input field should be const void *?

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-05 09:15:44 -07:00
Timothy Arceri
31943e6ba5 glsl: replace remaining tabs in link_varyings.cpp
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2016-03-05 20:50:10 +11:00
Timothy Arceri
e2415e8467 glsl: replace remaining tabs in link_uniforms.cpp
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2016-03-05 20:50:05 +11:00
Jordan Justen
81f30e2f50 anv/hsw: Move query code to genX file for Haswell
This fixes many CTS cases, but will require an update to the kernel
command parser register whitelist. (The CS GPRs and TIMESTAMP
registers need to be whitelisted.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-05 01:08:07 -08:00
Timothy Arceri
3322cb7b8d docs: mark align layout qualifier as DONE
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:39:13 +11:00
Timothy Arceri
037f68d81e glsl: apply align layout qualifier rules to block offsets
From Section 4.4.5 (Uniform and Shader Storage Block Layout
Qualifiers) of the OpenGL 4.50 spec:

  "The align qualifier makes the start of each block member have a
  minimum byte alignment.  It does not affect the internal layout
  within each member, which will still follow the std140 or std430
  rules. The specified alignment must be a power of 2, or a
  compile-time error results.

  The actual alignment of a member will be the greater of the
  specified align alignment and the standard (e.g., std140) base
  alignment for the member's type. The actual offset of a member is
  computed as follows: If offset was declared, start with that
  offset, otherwise start with the next available offset. If the
  resulting offset is not a multiple of the actual alignment,
  increase it to the first offset that is a multiple of the actual
  alignment. This results in the actual offset the member will have.

  When align is applied to an array, it affects only the start of
  the array, not the array's internal stride. Both an offset and an
  align qualifier can be specified on a declaration.

  The align qualifier, when used on a block, has the same effect as
  qualifying each member with the same align value as declared on
  the block, and gets the same compile-time results and errors as if
  this had been done. As described in general earlier, an individual
  member can specify its own align, which overrides the block-level
  align, but just for that member.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:39:07 +11:00
Timothy Arceri
5a27fefffe glsl: parse align layout qualifier
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:39:01 +11:00
Timothy Arceri
22b0082b9d docs: mark explicit byte offsets as DONE
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:55 +11:00
Timothy Arceri
802262c0af glsl: use explicit offset when lowering buffer access
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:49 +11:00
Timothy Arceri
96527c3cf2 glsl: copy explicit offset to uniform storage
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:44 +11:00
Timothy Arceri
e12a49ac12 glsl: update comment on offset field
The old comment was for the location not the offset, we now use
the field for block members so mention that also.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:39 +11:00
Timothy Arceri
9f24f42c49 glsl: add offset to glsl interface type
In this patch we also copy the offset value from the ast and
implement offset linking rules by adding it to the record_compare()
function.

From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers)
of the GLSL 4.50 spec:

   "Two blocks linked together in the same program with the same block
   name must have the exact same set of members qualified with
   offset and their integral-constant-expression values must be the
   same, or a link-time error results."

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:34 +11:00
Timothy Arceri
8abed7f185 glsl: apply compile-time rules for the offset layout qualifier
This implements the rules for the offset qualifier on block members.

From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers)
of the GLSL 4.50 spec:

   "The offset qualifier can only be used on block members of blocks
   declared with std140 or std430 layouts."

   ...

   "It is a compile-time error to specify an offset that is smaller than
   the offset of the previous member in the block or that lies within the
   previous member of the block."

   ...

   "The specified offset must be a multiple of the base alignment of the
   type of the block member it qualifies, or a compile-time error results."

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:30 +11:00
Timothy Arceri
6f45484ac7 glsl: enable offset layout qualifier for ARB_enhanced_layouts
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:26 +11:00
Timothy Arceri
1824ff1c2a glsl: reject invalid input layout qualifiers
Global in validation is already handled, this will do the validation
for variables, blocks and block members.

This fixes some CTS tests for the new enhanced layouts transform
feedback qualifiers.

V2: add some more valid input flags
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:07:09 +11:00
Timothy Arceri
bd53cc7b45 glsl: only apply default stream to output blocks
This is needed to allow invalid qualifier checks on inputs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:07:04 +11:00
Timothy Arceri
78d3098c05 glsl: rework parsing of blocks
Previously interface blocks were giving the global default flags of
uniform blocks. This meant we could not check for invalid qualifiers
on interface blocks because they always contained invalid flags.

This changes parsing so that interface blocks now get an empty
set of layouts.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:07:00 +11:00
Timothy Arceri
d244986bf2 glsl: don't apply uniform/buffer layouts to interface blocks
If the following patch we will stop setting these layouts by default
on interface blocks, so we need to do this to avoid hitting the
assert.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:06:56 +11:00
Nanley Chery
4e75f9b219 anv: Implement VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS}
v2: Subtract the baseMipLevel and baseArrayLayer (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-04 21:25:23 -08:00
Kenneth Graunke
4ba7ad6cc1 i965: Only magnify depth for 3D textures, not array textures.
When BaseLevel > 0, we magnify the dimensions to fill out the size of
miplevels [0..BaseLevel).  In particular, this was magnifying depth,
thinking that the depth doubles at each level.  This is perfectly
reasonable for 3D textures, but dead wrong for array textures.

Changing the depth != 1 condition to a target == GL_TEXTURE_3D check
should make this only happen in the appropriate cases.

Fixes about 32 dEQP tests:
- dEQP-GLES31.functional.texture.gather.*.level_{1,2}

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-03-04 21:25:08 -08:00
Jason Ekstrand
c1436e80ef anv/meta_clear: Set the right number of dynamic states 2016-03-04 19:18:20 -08:00
Juan A. Suarez Romero
2f76a9924e i965/vec4: add opportunistic behaviour to opt_vector_float()
opt_vector_float() transforms several scalar MOV operations to a single
vectorial MOV.

This is done when those MOV covers all the components of the destination
register. So something like:

mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf3.0.z:D, 0D

is transformed in:

mov vgrf3.0:F, [0F, 0F, 0F, 1F]

But there are cases where not all the components are written. For
example, in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf4.0.xy:D, 1065353216D
mov vgrf4.0.w:D, 0D
mov vgrf6.0:UD, u4.xyzw:UD

Nor vgrf3 nor vgrf4 .z components are written, so the optimization is
not applied.

But it could be applied anyway with the components covered, using a
writemask to select the ones written. So we could transform it in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
mov vgrf6.0:UD, u4.xyzw:UD

This commit does precisely that: opportunistically apply
opt_vector_float() when possible.

total instructions in shared programs: 7124660 -> 7114784 (-0.14%)
instructions in affected programs: 443078 -> 433202 (-2.23%)
helped: 4998
HURT: 0

total cycles in shared programs: 64757760 -> 64728016 (-0.05%)
cycles in affected programs: 1401686 -> 1371942 (-2.12%)
helped: 3243
HURT: 38

v2: change vectorize_mov() signature (Matt).
v3: take in account predicates (Juan).
v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2016-03-04 19:16:52 -08:00
Jason Ekstrand
cc57efc67a anv/pipeline: Fix depthBiasEnable on gen7
The first time I tried to fix this, I set the wrong fields.
2016-03-04 17:56:12 -08:00
Jason Ekstrand
653261285e anv/cmd_buffer: Reset the state streams when resetting the command buffer 2016-03-04 17:54:29 -08:00
Jason Ekstrand
f700d16a89 anv/cmd_buffer: Include Haswell in set_subpass 2016-03-04 17:54:29 -08:00
George Kyriazis
feb71117ae st/xlib: Don't destroy screen on XCloseDisplay()
screen may still be used by other resources that are not yet freed.
To correctly fix this there will be a need to account for resources
differently, but this quick fix is not any worse than the original
code that leaked screens anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 18:14:46 -07:00
Nanley Chery
a6fb62a864 isl: Fix RenderTargetViewExtent for mipmapped 3D surfaces
Match the comment stated above the assignment.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-04 13:20:44 -08:00
Nanley Chery
b80c8ebc45 isl: Get rid of isl_surf_fill_state_info::level0_extent_px
This field is no longer needed.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-04 13:20:03 -08:00
Jason Ekstrand
d154a5ebd6 anv/cmd_buffer: Let the pipeline set StencilBufferWriteEnable on gen9 2016-03-04 12:23:01 -08:00
Jason Ekstrand
f374765ce6 anv/cmd_buffer: Mask stencil reference values 2016-03-04 12:22:32 -08:00
Jason Ekstrand
d61dcec64d anv/clear: Pull the stencil write mask from the pipeline
The stencil write mask wasn't getting set at all so we were using whatever
write mask happend to be left over by the application.
2016-03-04 12:03:00 -08:00
Jason Ekstrand
ec18fef88d anv/pipeline: Set StencilBufferWriteEnable from the pipeline
The hardware docs say that StencilBufferWriteEnable should only be set if
StencilTestEnable is set.  It seems reasonable to set them together.
2016-03-04 12:03:00 -08:00
Jason Ekstrand
fcd8e57185 anv/pipeline: More competent gen8 clipping 2016-03-04 12:03:00 -08:00
Jason Ekstrand
a8afd29653 anv/pipeline: Use the right provoking vertex for triangle fans 2016-03-04 12:03:00 -08:00
Jason Ekstrand
fa8539dd6b anv/pipeline: Respect pRasterizationState->depthBiasEnable 2016-03-04 12:03:00 -08:00
Matt Turner
1f862e923c i965/fs: Optimize float conversions of byte/word extract.
instructions in affected programs: 31535 -> 29966 (-4.98%)
   helped: 23

   cycles in affected programs: 272648 -> 266022 (-2.43%)
   helped: 14
   HURT: 1

The patch decreases the number of instructions in the two Unigine
programs by:

 #1721: 4374 -> 4155 instructions (-5.01%)
 #1706: 3582 -> 3363 instructions (-6.11%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Matt Turner
905ff86198 nir: Recognize open-coded extract_u16.
No shader-db changes, but does recognize some extract_u16 which enables
the next patch to optimize some code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Matt Turner
76289fbfa8 nir: Recognize open-coded extract_u8.
Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack
three bytes from an integer and convert each into a float:

   float((val >> 16u) & 0xffu)
   float((val >>  8u) & 0xffu)
   float((val >>  0u) & 0xffu)

Instead of shifting, masking, and type converting like this:

   shr(8)          g15<1>UD        g25<8,8,1>UD    0x00000010UD
   and(8)          g16<1>UD        g15<8,8,1>UD    0x000000ffUD
   mov(8)          g17<1>F         g16<8,8,1>UD

   shr(8)          g18<1>UD        g25<8,8,1>UD    0x00000008UD
   and(8)          g19<1>UD        g18<8,8,1>UD    0x000000ffUD
   mov(8)          g20<1>F         g19<8,8,1>UD

   and(8)          g21<1>UD        g25<8,8,1>UD    0x000000ffUD
   mov(8)          g22<1>F         g21<8,8,1>UD

i965 can simply extract a byte and convert to float in a single
instruction:

   mov(8)          g17<1>F         g25.2<32,8,4>UB
   mov(8)          g20<1>F         g25.1<32,8,4>UB
   mov(8)          g22<1>F         g25.0<32,8,4>UB

This patch implements the first step: recognizing byte extraction. A
later patch will optimize out the conversion to float.

   instructions in affected programs: 28568 -> 27450 (-3.91%)
   helped: 7

   cycles in affected programs: 210076 -> 203144 (-3.30%)
   helped: 7

This patch decreases the number of instructions in the two Unigine
programs by:

 #1721: 4520 -> 4374 instructions (-3.23%)
 #1706: 3752 -> 3582 instructions (-4.53%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Kenneth Graunke
9d7faadd8a anv: Fix backwards shadow comparisons
sample_c is backwards from what GL and Vulkan expect.

See intel_state.c in i965.

v2: Drop unused vk_to_gen_compare_op.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-04 11:35:46 -08:00
George Kyriazis
01e92e7010 st/xlib: Hang off screen destructor off main XCloseDisplay() callback.
This resolves some order dependencies between the already existing
callback the newly created one.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 10:57:24 -07:00
George Kyriazis
51e562c3ea st/xlib: Support unlimited number of display connections
There is a limit of 10 display connections, which was a
problem for apps/tests that were continuously opening/closing display
connections.

This fix uses XAddExtension() and XESetCloseDisplay() to keep track
of the status of the display connections from the X server, freeing
mesa-related data as X displays get destroyed by the X server.

Poster child is the VTK "TimingTests"

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 10:57:09 -07:00
Brian Paul
192ee9adb1 svga: add new command-buffer-size HUD query
To plot a graph of the command buffer size.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-04 07:57:41 -07:00
Brian Paul
1258f907f4 svga: add new svga_winsys_context::get_command_buffer_size()
To ask how large the current command buffer is.  Will be used for
a new GALLIUM_HUD graph.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-04 07:57:41 -07:00
Brian Paul
6fc8d90fa9 svga: reorder SVGA_QUERY_ switch cases to match declaration order
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-03-04 07:57:41 -07:00
Sinclair Yeh
f1410c5b91 svga: Force an RGBA view creation for an RGBA resource
glXCreatePixmap() may specify a GLX_TEXTURE_FORMAT_RGB_EXT format
for an RGBA resource, causing us to create an RGBX view for an
RGBA resource, a combination vgpu10 does not support.

When this is detected, change the request to create an RGBA view
instead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 07:57:41 -07:00
Charmaine Lee
8366701f4c svga: fix an error in svga_texture_generate_mipmap
With this patch, make sure the shader resource view is properly created
before referencing it in the generate mipmap command.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 07:57:41 -07:00
Thomas Hellstrom
395c7b8fa1 winsys/svga: Increase the fence timeout
If running with a software renderer backend, the timeout may be
insufficient, and we don't want to release busy buffers too early.

In practice, SVGA gpu lockups are extremely rare.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2016-03-04 13:55:23 +01:00
Thomas Hellstrom
24ad7e16cd winsys/svga: Fix an uninitialized return value
Reported-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviwed-by: Brian Paul <brianp@vmware.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2016-03-04 13:54:38 +01:00
Kenneth Graunke
9ec246796f i965: Set MaxFramebufferWidth/Height to 16384, not viewport.
dEQP-GLES31.functional.fbo.no_attachments.maximums.{all,height,size,width}
started hitting assertion failures when emitting SURFACE_STATE, after
commit e8fd60e789 where Samuel increased the maximum viewport size to
32768, from 16384.

MaxFramebufferWidth/Height were being set to the maximum viewport size,
but are actually limited by the SURFACE_STATE width/height field range,
which is 16384 on Gen7+ (where ARB_framebuffer_no_attachments is
exposed).  So, reduce these to 16384 explicitly.

Fixes assert fails in the above mentioned dEQP tests.  (Those tests
still fail, however.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-03 21:31:22 -08:00
Francisco Jerez
a6046d217d glsl: Improve the accuracy of the acos() approximation.
The adjusted polynomial coefficients come from the numerical
minimization of the L2 norm of the relative error.  The old
coefficients would give a maximum relative error of about 15000 ULP in
the neighborhood around acos(x) = 0, the new ones give a relative
error bounded by less than 2000 ULP in the same neighborhood.

Fixes four dEQP subtests:
dEQP-GLES31.functional.shaders.builtin_functions.precision.acos.
highp_compute.{scalar,vec2,vec3,vec4}

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-03 21:31:22 -08:00
Kenneth Graunke
2795fbcae3 glsl: Parameterize asin_expr() on the fit coefficients.
This will allow us to share the implementation while using different
polynomials for asin() and acos().

Francisco Jerez did this in the SPIR-V front-end; I'm merely porting
his idea to the GLSL world.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-03 21:31:22 -08:00
Kenneth Graunke
aa37cbdff7 mesa: Allow Get*() of several forgotten IsEnabled() pnames.
From section 6.2 ("State Tables") of the GL 2.1 specification
(the text also appears in the GL 3.0 and ES 3.1 specifications):
"However, state variables for which IsEnabled is listed as the query
 command can also be obtained using GetBooleanv, GetIntegerv, GetFloatv,
 and GetDoublev."

GL_DEBUG_OUTPUT, GL_DEBUG_OUTPUT_SYNCHRONOUS, and GL_FRAGMENT_SHADER_ATI
were missing from the glGet*() functions.  All other IsEnabled() pnames
look to be present, as far as I can tell.

Fixes 8 dEQP-GLES31.functional.debug.state_query subtests:
debug_output[_synchronous]_get{boolean,float,integer,integer64}.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-03 21:31:22 -08:00
Kenneth Graunke
b4b50b074b mesa: Make glGet queries initialize ctx->Debug when necessary.
dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_*
tries to call glGet on GL_DEBUG_GROUP_STACK_DEPTH right away, before
doing any other debug setup.  This should return 1.

However, because ctx->Debug wasn't allocated, we bailed and returned 0.

This patch removes the open-coded locking and switches the two glGet
functions to use _mesa_lock_debug_state(), which takes care of
allocating and initializing that state on the first time.  It also
conveniently takes care of unlocking on failure for us, so we don't
need to handle that in every caller.

Fixes dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_
{getboolean,getfloat,getinteger,getinteger64}.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-03-03 21:31:22 -08:00
Kenneth Graunke
3ed260f54c hack to make dota 2 menus work 2016-03-03 16:21:09 -08:00
Jason Ekstrand
56ba13c994 isl/surface_state: Set L2 bypass disable for certain BC* formats 2016-03-03 16:16:57 -08:00
Eduardo Lima Mitev
47392011c0 Update docs to advertise new support for ARB_internalformat_query2
Support in Mesa main and i965 has just been added.

v2: Include note in 'New Features' of docs/relnotes/11.3.0.html.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-03 22:19:35 +01:00
Kenneth Graunke
623ce595a9 anv: Compile shader stages in pipeline order.
Instead of the arbitrary order modules might be specified in.

Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:36:19 -08:00
Nanley Chery
8dddc3fb1e anv/meta: Delete unused functions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:26:44 -08:00
Nanley Chery
d20f6abc85 anv/meta: Use blitter API for state-handling in Buffer Update/Copy
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:26:42 -08:00
Nanley Chery
318b67d157 anv/meta: Use blitter API in do_buffer_copy()
v2: Keep pitch in units of bytes (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:26:36 -08:00
Nanley Chery
96ff4d0679 anv/meta: Use blitter API in anv_CmdCopyImage()
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:26:35 -08:00
Nanley Chery
9b6c95d46e anv/meta: Use blitter API for copies between Images and Buffers
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:25:20 -08:00
Nanley Chery
91640c34c6 anv/meta: Add function which copies between Buffers and Images
v2: Keep pitch in units of bytes (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:25:15 -08:00
Nanley Chery
61ad78d0d1 anv/meta: Add function to create anv_meta_blit2d_surf from anv_image
v2: Keep pitch in units of bytes (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:25:10 -08:00
Nanley Chery
2e9b08b9b8 anv/meta: Implement the blitter API functions
Most of the code in anv_meta_blit2d() is borrowed from do_buffer_copy().

Create an image and image view for each rectangle.
Note: For tiled RGB images, ISL will align the image's row_pitch up to
the nearest tile width.

v2 (Jason):
    Keep pitch in units of bytes
    Make src_format and dst_format variables
    s/dest/dst/ in every usage
v3: Fix dst_image width

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:25:04 -08:00
Nanley Chery
032bf172b4 anv/meta: Modify blitter API fields
Some fields are unnecessary. The variables "pitch" and "bs" are used
for consistency with ISL.

v2: Keep pitch in units of bytes (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:24:53 -08:00
Jason Ekstrand
654f79a045 anv/meta: Add the beginnings of a blitter API
This API is designed to be an abstraction that sits between the VkCmdCopy
commands and the hardware.  The idea is that it is simple enough that it
*should* be implementable using the blitter but with enough extra data that
we can implement it with the 3-D pipeline efficiently.  One design
objective is to allow the user to supply enough information that we can
handle most blit operations with a single draw call even if they require
copying multiple rectangles.
2016-03-03 11:24:45 -08:00
Nanley Chery
d1e48b9945 anv/meta: Remove redundancies in do_buffer_copy()
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:24:42 -08:00
Nanley Chery
cfe7036750 anv/meta: Replace copy_format w/ block size in do_buffer_copy()
This is a preparatory commit that will simplify the future usage of
this function.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:24:38 -08:00
Nanley Chery
d50ff250ec anv/meta: Add missing command to exit meta in anv_CmdUpdateBuffer()
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:24:21 -08:00
Nanley Chery
1d9d90d9a6 anv/image: Create a linear image when requested
If a linear image is requested, the only possible result should be a
linearly-tiled surface.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:24:17 -08:00
Nanley Chery
091f1da902 isl: Don't filter tiling flags if a specific tiling bit is set
If a specific bit is set, the intention to create a surface with a
specific tiling format should be respected.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 11:23:40 -08:00
Nanley Chery
456f5b0314 isl: Add function to get intratile offsets from x/y offsets
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-03 10:56:15 -08:00
Jason Ekstrand
206414f92e anv/util: Fix vector resizing
It wasn't properly handling the fact that wrap-around in the source may not
translate to wrap-around in the destination.  This really needs unit tests.
2016-03-03 08:17:36 -08:00
Antia Puentes
4f028bfcc0 i965: Enable the ARB_internalformat_query2 extension
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:08 +01:00
Eduardo Lima Mitev
cbbdf8612d i965/formatquery: Add support for INTERNALFORMAT_PREFERRED query
This pname is tricky. The spec states that an internal format should be
returned, that is compatible with the passed internal format, and has
at least the same precision. There is no clear API to resolve this.

The closest we have (and what other drivers (i.e, NVidia proprietary) do,
is to return the same internal format given as parameter. But we validate
first that the passed internal format is supported by i965.

To check for support, we have the TextureFormatSupported map'. But
this map expects a 'mesa_format', which takes a format+typen. So, we must
first "come up" with a generic type that is suited for this internal format,
then get a mesa_format, and then do the validation.

The cleanest solution here is to add a method that does exactly what
the spec wants: a driver's preferred internal format from a given
internal format. But at this point we lack a clear view of what
defines this preference, and also there seems to be no API for it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:08 +01:00
Eduardo Lima Mitev
e064f43485 mesa/glformats: Consider DEPTH/STENCIL when resolving a mesa_format
_mesa_format_from_format_and_type() is currently not considering DEPTH and
STENCIL formats, which are not array formats and are not handled anywhere.

This patch adds cases for common combinations of DEPTH/STENCIL format and
types.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev
ec299602a6 mesa/formatquery: Add (GET_)TEXTURE_IMAGE_TYPE pnames
These basically reuse the default implementation of GL_READ_PIXELS_TYPE.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev
23f94146c9 mesa/formatquery: Add (GET_)TEXTURE_IMAGE_FORMAT pnames
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev
020671f2a3 mesa/formatquery: Add READ_PIXELS_TYPE pname
We call the driver to provide its preferred type, but also provide a
default implementation that selects a generic type based on the passed
internal format.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev
bec286f724 mesa/formatquery: Add READ_PIXELS_FORMAT pname
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev
09550c16a5 mesa/formatquery: Add support for READ_PIXELS query
This is supported since very early version of OpenGL, but we still call the
driver to give it the opportunity to report caveat or no support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Alejandro Piñeiro
8d7696f638 mesa/formatquery: added FILTER pname support
It discards out the targets and internalformats that explicitly
mention (per-spec) that doesn't support filter types other than
NEAREST or NEAREST_MIPMAP_NEAREST. Those are:

  * Texture buffers target
  * Multisample targets
  * Any integer internalformat

For the case of multisample targets, it was used the existing method
_mesa_target_allows_setting_sampler_parameter. This would scalate
better in the future if new targets appear that doesn't allow to
set sampler parameters.

We consider RENDERBUFFER to support LINEAR filters, because although
it doesn't support this filter for sampling, you can set LINEAR
on a blit operation using glBlitFramebuffer.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Alejandro Piñeiro
a8736a2567 mesa/texparam: make public target_allows_setting_sampler_parameters
In order to allow to be used on ARB_internalformat_query2 implementation.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
e8ab7727e1 mesa/formatquery: Added framebuffer renderability related queries
From the ARB_internalformat_query2 specification:

   "- FRAMEBUFFER_RENDERABLE: The support for rendering to the resource via
      framebuffer attachment is returned in <params>.

    - FRAMEBUFFER_RENDERABLE_LAYERED: The support for layered rendering to
      the resource via framebuffer attachment is returned in <params>.

    - FRAMEBUFFER_BLEND: The support for rendering to the resource
      via framebuffer attachment when blending is enabled is returned in
      <params>."

For all of them,
    "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
     If the resource is unsupported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
b4ee9f56fd mesa/formatquery: Added texture gather/shadow related queries
From the ARB_internalformat_query2 specification:

   "- TEXTURE_SHADOW: The support for using the resource with shadow samplers
      is written to <params>.

    - TEXTURE_GATHER: The support for using the resource with texture gather
      operations is written to <params>.

    - TEXTURE_GATHER_SHADOW: The support for using resource with texture gather
      operations with shadow samplers is written to <params>."

For all of them,

    "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
     If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
557939c08f mesa/formatquery: Added texture view related queries
From the ARB_internalformat_query2 specification:

   "- TEXTURE_VIEW: The support for using the resource with the TextureView
      command is returned in <params>.
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource or operation is not supported, NONE is returned.

    - VIEW_COMPATIBILITY_CLASS: The compatibility class of the resource when
      used as a texture view is returned in <params>. The compatibility
      class is one of the values from the /Class/ column of Table 3.X.2. If
      the resource has no other formats that are compatible, the resource
      does not support views, or if texture views are not supported, NONE is
      returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
04e2e0b24a mesa/textureview: Make _lookup_view_class public
It will be used by the ARB_internalformat_query2 implementation to
implement the VIEW_COMPATIBILITY_CLASS <pname> query.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
2066c7be61 mesa/formatquery: Added CLEAR_BUFFER <pname> query
From the ARB_internalformat_query2 specification:

   "- CLEAR_BUFFER: The support for using the resource with ClearBuffer*Data
      commands is returned in <params>.
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
aed633bb97 mesa/formatquery: Added compressed texture related queries
From the ARB_internalformat_query2 specification:

   "- TEXTURE_COMPRESSED: If <internalformat> is a compressed format
      that is supported for this type of resource, TRUE is returned in
      <params>. If the internal format is not compressed, or the type of
      resource is not supported, FALSE is returned.

    - TEXTURE_COMPRESSED_BLOCK_WIDTH: If the resource contains a compressed
      format, the width of a compressed block (in bytes) is returned in
      <params>. If the internal format is not compressed, or the resource
      is not supported, 0 is returned.

    - TEXTURE_COMPRESSED_BLOCK_HEIGHT: If the resource contains a compressed
      format, the height of a compressed block (in bytes) is returned in
      <params>. If the internal format is not compressed, or the resource
      is not supported, 0 is returned.

    - TEXTURE_COMPRESSED_BLOCK_SIZE: If the resource contains a compressed
      format the number of bytes per block is returned in <params>.  If the
      internal format is not compressed, or the resource is not supported,
      0 is returned.
      (combined with the above, allows the bitrate to be computed, and may be
      useful in conjunction with ARB_compressed_texture_pixel_storage)."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
467f462c75 mesa/formatquery: Added simultaneous texture and depth/stencil queries
From the ARB_internalformat_query2 specification:

   "- SIMULTANEOUS_TEXTURE_AND_DEPTH_TEST: The support for using the resource
      both as a source for texture sampling while it is bound as a buffer for
      depth test is written to <params>. For example, a depth (or stencil)
      texture could be bound simultaneously for texturing while it is bound as
      a depth (and/or stencil) buffer without causing a feedback loop, provided
      that depth writes are disabled.

    - SIMULTANEOUS_TEXTURE_AND_STENCIL_TEST: The support for using the resource
      both as a source for texture sampling while it is bound as a buffer for
      stencil test is written to <params>. For example, a depth (or stencil)
      texture could be bound simultaneously for texturing while it is bound as
      a depth (and/or stencil) buffer without causing a feedback loop,
      provided that stencil writes are disabled.

    - SIMULTANEOUS_TEXTURE_AND_DEPTH_WRITE: The support for using the resource
      both as a source for texture sampling while performing depth writes to
      the resources is written to <params>.  For example, a depth-stencil
      texture could be bound simultaneously for stencil texturing while it
      is bound as a depth buffer. Feedback loops cannot occur because sampling
      a stencil texture only returns the stencil portion, and thus writes to
      the depth buffer do not modify the stencil portions.

    - SIMULTANEOUS_TEXTURE_AND_STENCIL_WRITE: The support for using the resource
      both as a source for texture sampling while performing stencil writes to
      the resources is written to <params>.  For example, a depth-stencil
      texture could be bound simultaneously for depth-texturing while it is
      bound as a stencil buffer. Feedback loops cannot occur because sampling
      a depth texture only returns the depth portion, and thus writes to
      the stencil buffer could not modify the depth portions.

For all of them,
    "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
     If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
bd45fb3de4 mesa/formatquery: Added queries related to image textures
From the ARB_internalformat_query2 specification:

   "- IMAGE_TEXEL_SIZE: The size of a texel when the resource when used as
      an image texture is returned in <params>.  This is the value from the
      /Size/ column in Table 3.22. If the resource is not supported for image
      textures, or if image textures are not supported, zero is returned.

    - IMAGE_COMPATIBILITY_CLASS: The compatibility class of the resource when
      used as an image texture is returned in <params>.  This corresponds to
      the value from the /Class/ column in Table 3.22. The possible values
      returned are IMAGE_CLASS_4_X_32, IMAGE_CLASS_2_X_32, IMAGE_CLASS_1_X_32,
      IMAGE_CLASS_4_X_16, IMAGE_CLASS_2_X_16, IMAGE_CLASS_1_X_16,
      IMAGE_CLASS_4_X_8, IMAGE_CLASS_2_X_8, IMAGE_CLASS_1_X_8,
      IMAGE_CLASS_11_11_10, and IMAGE_CLASS_10_10_10_2, which correspond to
      the 4x32, 2x32, 1x32, 4x16, 2x16, 1x16, 4x8, 2x8, 1x8, the class
      (a) 11/11/10 packed floating-point format, and the class (b)
      10/10/10/2 packed formats, respectively.
      If the resource is not supported for image textures, or if image
      textures are not supported, NONE is returned.

    - IMAGE_PIXEL_FORMAT: The pixel format of the resource when used as an
      image texture is returned in <params>.  This is the value
      from the /Pixel format/ column in Table 3.22. If the resource is not
      supported for image textures, or if image textures are not supported,
      NONE is returned.

    - IMAGE_PIXEL_TYPE: The pixel type of the resource when used as an
      image texture is returned in <params>.  This is the value from
      the /Pixel type/ column in Table 3.22. If the resource is not supported
      for image textures, or if image textures are not supported, NONE is
      returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
990a7200e0 mesa/shaderimage: Added func to get the GL_IMAGE_CLASS from the format
It will be used by the ARB_internalformat_query2 implementation to
implement the IMAGE_COMPATIBILITY_CLASS <pname> query.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
52c3692324 mesa/formatquery: Added SHADER_IMAGE_{LOAD,STORE,ATOMIC} <pname> queries
From the ARB_internalformat_query2 specification:

    "- SHADER_IMAGE_LOAD: The support for using the resource with image load
      operations in shaders is written to <params>.
      In this case the <internalformat> is the value of the <format> parameter
      that would be passed to BindImageTexture.

    - SHADER_IMAGE_STORE: The support for using the resource with image store
      operations in shaders is written to <params>.
      In this case the <internalformat> is the value of the <format> parameter
      that is passed to BindImageTexture.

    - SHADER_IMAGE_ATOMIC: The support for using the resource with atomic
      memory operations from shaders is written to <params>."

For all of them:

    "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
     If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
876f7a7c08 mesa/shaderimage: Make is_image_format_supported public
It will be used by the ARB_internalformat_query2 implementation
to implement queries related to the ARB_shader_image_load_store
extension.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
fae2b10ff9 mesa/formatquery: Added queries related to texture sampling in shaders
From the ARB_internalformat_query2 specification:

   "- VERTEX_TEXTURE: The support for using the resource as a source for
      texture sampling in a vertex shader is written to <params>.

    - TESS_CONTROL_TEXTURE: The support for using the resource as a source for
      texture sampling in a tessellation control shader is written to <params>.

    - TESS_EVALUATION_TEXTURE: The support for using the resource as a source
      for texture sampling in a tessellation evaluation shader is written to
      <params>.

    - GEOMETRY_TEXTURE: The support for using the resource as a source for
      texture sampling in a geometry shader is written to <params>.

    - FRAGMENT_TEXTURE: The support for using the resource as a source for
      texture sampling in a fragment shader is written to <params>.

    - COMPUTE_TEXTURE: The support for using the resource as a source for
      texture sampling in a compute shader is written to <params>."

For all of them,

   "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
    If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
aeb759c7d6 mesa/formatquery: Added SRGB_DECODE_ARB <pname> query
From the ARB_internalformat_query2 specification:

   "- SRGB_DECODE_ARB: The support for toggling whether sRGB decode happens at
      sampling time (see EXT/ARB_texture_sRGB_decode) for the resource is
      returned in <params>.
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
bcb2f9cdb9 mesa/formatquery: Added SRGB_{READ,WRITE} <pname> queries
From the ARB_internalformat_query2 specification:

   "- SRGB_READ: The support for converting from sRGB colorspace on read
      operations (see section 3.9.18) from the resource is returned in
      <params>.
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource or operation is not supported, NONE is returned.

    - SRGB_WRITE: The support for converting to sRGB colorspace on write
      operations to the resource is returned in <params>.
      This indicates that writing to framebuffers with this internalformat
      will encode to sRGB color spaces when FRAMEBUFFER_SRGB is enabled (see
      section 4.1.8).
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource or operation is not supported, NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
e88cbb7a51 mesa/formatquery: Added COLOR_ENCODING <pname> query.
From the ARB_internalformat_query2 specification:

   "- COLOR_ENCODING: The color encoding for the resource is returned in
      <params>.  Possible values for color buffers are LINEAR or SRGB,
      for linear or sRGB-encoded color components, respectively. For non-color
      formats (such as depth or stencil), or for unsupported resources,
      the value NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev
b1755535ec mesa/glformats: Add a helper function _mesa_is_srgb_format()
Returns true if the passed format is an sRGB format, false otherwise.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
87b2de3998 mesa/formatquery: Added mipmap related <pname> queries
Specifically MIPMAP, MANUAL_GENERATE_MIPMAP and AUTO_GENERATE_MIPMAP <pname>
queries.

From the ARB_internalformat_query2 specification:

   "- MIPMAP: If the resource supports mipmaps, TRUE is returned in <params>.
      If the resource is not supported, or if mipmaps are not supported for
      this type of resource, FALSE is returned.

    - MANUAL_GENERATE_MIPMAP: The support for manually generating mipmaps for
      the resource is returned in <params>.
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource is not supported, or if the operation is not supported,
      NONE is returned.

    - AUTO_GENERATE_MIPMAP: The support for automatic generation of mipmaps
      for the resource is returned in <params>.
      Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE.
      If the resource is not supported, or if the operation is not supported,
      NONE is returned."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:07 +01:00
Antia Puentes
079d99b830 mesa/genmipmap: Added a function to validate the internalformat
It will be used by the ARB_internalformat_query2 implementation to
implement mipmap related queries.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
06852f4b7a mesa/genmipmap: Added a function to check if the target is valid
It will be used by the ARB_internalformat_query2 implementation to
implement mipmap related queries.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
df3a37311d mesa/formatquery: Added {COLOR,DEPTH,STENCIL}_RENDERABLE <pname> queries
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
c22ceb08bb mesa/formatquery: Added {COLOR,DEPTH,STENCIL}_COMPONENTS <pname> queries
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
e976a30db8 mesa/formatquery: support for MAX_COMBINED_DIMENSIONS
It is implemented combining the values returned by calls to the 32-bit
query _mesa_GetInternalformati32v.

The main reason is simplicity. The other option would be C&P how we
implemented the support of GL_MAX_{WIDTH/HEIGHT/DEPTH} and GL_SAMPLES.

Additionally, doing this way, we avoid adding checks on the code, as
are done by the call to the query itself.

MAX_COMBINED_DIMENSIONS is the only pname pointed on the spec of
needing a 64-bit query. We handle that possibility by packing the
returning value on the two first 32-bit integers of params. This
would work on the 32-bit query as far as the value is not greater
that INT_MAX. On the 64-bit query wrapper we unpack those values
in order to get the final value.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Eduardo Lima Mitev
c5cf16a4fc mesa/teximage: add _mesa_is_cube_map_texture utility method
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
4e33278b39 main/formatquery: support for MAX_{WIDTH/HEIGHT/DEPTH/LAYERS}
Implemented by calling GetIntegerv with the equivalent pname and
handling individually the exceptions related to dimensions.

All those pnames are used to get the maximum value for each dimension
of the given target. The only difference between this calls and
calling GetInteger with pnames like GL_MAX_TEXTURE_SIZE,
GL_MAX_3D_TEXTURE_SIZE, etc is that GetInternalformat allows to
specify a internalformat.

But at this moment, there is no reason to think that the values would
be different based on the internalformat. The spec already take that
into account, using these specific pnames as example on Issue 7 of
arb_internalformat_query2 spec.

So this seems like a hook to allow to return different values based on
the internalformat in the future.

It is worth to note that the piglit test associated to those pnames
are checking the returned values of GetInternalformat against the
values returned by GetInteger, and the test is passing with NVIDIA
proprietary drivers.

main/formatquery: support for MAX_{WIDTH/HEIGHT/DEPTH/LAYERS}

Implemented by calling GetIntegerv with the equivalent pname and
handling individually the exceptions related to dimensions.

All those pnames are used to get the maximum value for each dimension
of the given target. The only difference between this calls and
calling GetInteger with pnames like GL_MAX_TEXTURE_SIZE,
GL_MAX_3D_TEXTURE_SIZE, etc is that GetInternalformat allows to
specify a internalformat.

But at this moment, there is no reason to think that the values would
be different based on the internalformat. The spec already take that
into account, using these specific pnames as example on Issue 7 of
arb_internalformat_query2 spec.

So this seems like a hook to allow to return different values based on
the internalformat in the future.

It is worth to note that the piglit test associated to those pnames
are checking the returned values of GetInternalformat against the
values returned by GetInteger, and the test is passing with NVIDIA
proprietary drivers.

v2: use _mesa_has## instead of direct ctx->Extensions access (Nanley Chery)

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
b750144b0a mesa/formatquery: support for IMAGE_FORMAT_COMPATIBILITY_TYPE
From arb_internalformat_query2 spec:

 "IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the
  resource when used as an image textures is returned in
  <params>. This is equivalent to calling GetTexParameter with <value>
  set to IMAGE_FORMAT_COMPATIBILITY_TYPE."

Current implementation of GetTexParameter for this case returns a
field of a texture object, so the support of this pname was
implemented creating a temporal texture object and returning that
value.

It is worth to mention that right now that field is not reassigned
after initialization. So it is somehow hardcoded. An alternative
option would be return that value. That doesn't seems really scalable
though.

v2: use _mesa_has## instead of direct ctx->Extensions access (Nanley Chery)

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
e98a3c799f mesa/formatquery: handle unmodified buffer for SAMPLES on the 64-bit query
From arb_internalformat_query2 spec:

" If <internalformat> is not color-renderable, depth-renderable, or
  stencil-renderable (as defined in section 4.4.4), or if <target>
  does not support multiple samples (ie other than
  TEXTURE_2D_MULTISAMPLE, TEXTURE_2D_MULTISAMPLE_ARRAY, or
  RENDERBUFFER), <params> is not modified."

So there are cases where the buffer should not be modified. As the
64-bit query is a wrapper over the 32-bit query, we can't just copy
the values to the equivalent 32-bit buffer, as that would fail if the
original params contained values greater that INT_MAX. So we need to
copy-back only the values that got modified by the 32-bit query. We do
that by filling the temporal buffer by negatives, as the 32-bit query
should not return negative values ever.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
580816b747 mesa/formatquery: initial implementation for GetInternalformati64v
It just does a wrapping on the existing 32-bit GetInternalformativ.

We will maintain the 32-bit query as default as it is likely that
it would be the one most used.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
7241e1b5f4 mesa/formatquery: Added INTERNALFORMAT_{X}_{SIZE,TYPE} <pname> queries
From the ARB_internalformat_query2 spec:

   "- INTERNALFORMAT_RED_SIZE
    - INTERNALFORMAT_GREEN_SIZE
    - INTERNALFORMAT_BLUE_SIZE
    - INTERNALFORMAT_ALPHA_SIZE
    - INTERNALFORMAT_DEPTH_SIZE
    - INTERNALFORMAT_STENCIL_SIZE
    - INTERNALFORMAT_SHARED_SIZE
      For uncompressed internal formats, queries of these values return the
      actual resolutions that would be used for storing image array components
      for the resource.
      For compressed internal formats, the resolutions returned specify the
      component resolution of an uncompressed internal format that produces
      an image of roughly the same quality as the compressed algorithm.
      For textures this query will return the same information as querying
      GetTexLevelParameter{if}v for TEXTURE_*_SIZE would return.
      If the internal format is unsupported, or if a particular component is
      not present in the format, 0 is written to <params>.

    - INTERNALFORMAT_RED_TYPE
    - INTERNALFORMAT_GREEN_TYPE
    - INTERNALFORMAT_BLUE_TYPE
    - INTERNALFORMAT_ALPHA_TYPE
    - INTERNALFORMAT_DEPTH_TYPE
    - INTERNALFORMAT_STENCIL_TYPE
      For uncompressed internal formats, queries for these values return the
      data type used to store the component.
      For compressed internal formats the types returned specify how components
      are interpreted after decompression.
      For textures this query returns the same information as querying
      GetTexLevelParameter{if}v for TEXTURE_*TYPE would return.
      Possible values return include, NONE, SIGNED_NORMALIZED,
      UNSIGNED_NORMALIZED, FLOAT, INT, UNSIGNED_INT, representing missing,
      signed normalized fixed point, unsigned normalized fixed point,
      floating-point, signed unnormalized integer and unsigned unnormalized
      integer components. NONE is returned for all component types if the
      format is unsupported."

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
675418182b mesa/main: Extend _mesa_get_format_bits to accept new pnames
The new pnames accepted by the function are:

        - INTERNALFORMAT_RED_SIZE
        - INTERNALFORMAT_GREEN_SIZE
        - INTERNALFORMAT_BLUE_SIZE
        - INTERNALFORMAT_ALPHA_SIZE
        - INTERNALFORMAT_DEPTH_SIZE
        - INTERNALFORMAT_STENCIL_SIZE

It will be used by the ARB_internalformat_query2 implementation to
implement those pnames.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
4a8dae6247 mesa/main: Extend _mesa_base_format_has_channel to accept new pnames
The new pnames accepted by the function are:

        - INTERNALFORMAT_RED_SIZE
        - INTERNALFORMAT_GREEN_SIZE
        - INTERNALFORMAT_BLUE_SIZE
        - INTERNALFORMAT_ALPHA_SIZE
        - INTERNALFORMAT_DEPTH_SIZE
        - INTERNALFORMAT_STENCIL_SIZE
        - INTERNALFORMAT_RED_TYPE
        - INTERNALFORMAT_GREEN_TYPE
        - INTERNALFORMAT_BLUE_TYPE
        - INTERNALFORMAT_ALPHA_TYPE
        - INTERNALFORMAT_DEPTH_TYPE
        - INTERNALFORMAT_STENCIL_TYPE

It will be used by the ARB_internalformat_query2 implementation to
implement those pnames.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
f1c789fa00 mesa/main: Make legal_get_tex_level_parameter_target public
It will be used by the ARB_internalformat_query2 implementation
to check if the target is valid for those <pnames> that are said
in the spec that should return the same values than the
'glGetTexLevelParameter{if}v' function:

    - INTERNALFORMAT_RED_SIZE
    - INTERNALFORMAT_GREEN_SIZE
    - INTERNALFORMAT_BLUE_SIZE
    - INTERNALFORMAT_ALPHA_SIZE
    - INTERNALFORMAT_DEPTH_SIZE
    - INTERNALFORMAT_STENCIL_SIZE
    - INTERNALFORMAT_SHARED_SIZE
    - INTERNALFORMAT_RED_TYPE
    - INTERNALFORMAT_GREEN_TYPE
    - INTERNALFORMAT_BLUE_TYPE
    - INTERNALFORMAT_ALPHA_TYPE
    - INTERNALFORMAT_DEPTH_TYPE
    - INTERNALFORMAT_STENCIL_TYPE
    - IMAGE_FORMAT_COMPATIBILITY_TYPE

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Eduardo Lima Mitev
eacb2c971e mesa/formatquery: Added INTERNALFORMAT_PREFERRED pname
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
56ec2dfcb1 mesa/formatquery: Added the INTERNALFORMAT_SUPPORTED <pname> query
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
4722abc630 mesa/formatquery: Added a func to check <internalformat> supported
From the ARB_internalformat_query2 specification:

  "The INTERNALFORMAT_SUPPORTED <pname> can be used to determine if
   the internal format is supported, and the  other <pnames> are defined
   in terms of whether or not the format is supported."

v2: Consider also FBO base formats when checking if the internalformat is
    supported.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
5f6e3a0370 mesa/formatquery: Added func to check if the 'resource' is supported
Checks that the 'resource', as defined by the ARB_internalformat_query2
specification, is supported by the implementation for those 'pnames'
that require this check.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
95392cfa9d mesa/main: not fill mesa_error on _mesa_legal_texture_base_format_for_target
This would allow to use this method if you are just querying if it is
allowed, like for arb_internalformat_query2.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
aaf5ad513b mesa/teximage: Make _mesa_format_no_online_compression public
It will be used by the ARB_internalformat_query2 implementation
to check if a certain compressed 'internalformat' is supported
by texture 'targets'.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
5eef355823 mesa/teximage: make public is_renderable_texture_format
It will be used by the ARB_internalformat_query2 implementation
to check if the 'internalformat' passed is supported by texture
MULTISAMPLE 'targets'.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
b5d27bc5dd mesa/main: Added empty skeleton of glGetInternalformati64v
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Alejandro Piñeiro
2453bba504 mesa: Add dispatch and extension XML for GL_ARB_internalformat_query2
Equivalent to commit bda540 (that added GL_ARB_internalformat_query)

v2: include the new xml to to API_XML list at Makefile.am (Emil Velikov)

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:06 +01:00
Antia Puentes
d432337e2d mesa/formatquery: Added boilerplate code to extend GetInternalformativ
The goal is to extend the GetInternalformativ query to implement the
ARB_internalformat_query2 specification, keeping the behaviour defined
by the ARB_internalformat_query if ARB_internalformat_query2 is not
supported.

v2: Don't require ARB_internalformat_query when profile is GLES3.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Antia Puentes
806bc2bf22 mesa/formatquery: Added a func to check if the <target> is supported
From the ARB_internalformat_query2 spec:

  "If the particular <target> and <internalformat> combination do not make
   sense, or if a particular type of <target> is not supported by the
   implementation the "unsupported" answer should be given. This is not an
   error."

This function checks if the <target> is supported by the implementation.

v2: Allow RENDERBUFFER targets also on GLES 3 profiles.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Antia Puentes
4af3e5e9f1 mesa/formatquery: Added function to set 'unsupported' responses
The ARB_internalformat_query2 specification defines which is the
reponse best representing "not supported" or "not applicable" for
each <pname>.

Queries for unsupported features, targets, internalformats, combinations
of: target and internalformat, target and pname, pname and internalformat,
do not return an error but the corresponding 'unsupported' response.
We will use that response as the default answer.

For SAMPLES the 'unsupported' response is to not modify the 'params' buffer.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Antia Puentes
a6434f41cc mesa/formatquery: Added function to validate parameters
Handles the cases where an error should be returned according
to the ARB_internalformat_query and ARB_internalformat_query2
specifications.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Antia Puentes
b89463cdfd mesa/main: Add extension tracking bit for ARB_internalformat_query2
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
a347a0f53f mesa: Completely remove QuerySamplesForFormat from driver func table
At this point, all uses have been replaced by the more general hook
QueryInternalFormat, introduced by ARB_internalformat_query2.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
993d7345b7 mesa/formatquery: Use new driver hook QueryInternalFormat
Implements SAMPLES and NUM_SAMPLE_COUNTS queries using the new generic
driver call QueryInternalFormat, which is being introduced as replacement
of QuerySamplesForFormat to support ARB_internalformat_query2.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
25ee5c60dc mesa/formatquery: Remove tracking of number of elements in the response
Currently, the number of integers returned in the response to
GetInternalFormativ is being tracked by a 'count' variable.
This is so only the modified elements from the temporary buffer are copied into
the original user buffer.

However, with the introduction of ARB_internalformat_query2, keeping track
of 'count' would complicate the code a lot, considering the high number of
queries.

So, we propose to forget about tracking count, and move all the 16 elements
in the temporary buffer, back to the user buffer (clamped to user buffer size
of course). This is basically a trade-off between performance and code clarity.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
1f0b2ce8ec mesa/multisample: Check sample count using the new driver hook
Use QueryInternalFormat instead of QuerySamplesForFormat to obtain the
highest supported sample. QuerySamplesForFormat is to be removed.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
ee31b0b1d0 st/format: Replace QuerySamplesForFormat by new QueryInternalFormat hook
The previous code for SAMPLES and NUM_SAMPLE_COUNTS is reused as a private function.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
82be7735f3 i965/formatquery: Respond queries SAMPLES and NUM_SAMPLE_COUNTS
This effectively disables old QuerySamplesForFormat driver hook, since it is
never called by Mesa anymore.

v2: Call brw_query_samples_for_format() with a dummy buffer to calculate num
    samples, to avoid modifying the original buffer.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
2dabff9068 i965: Move brw_query_samples_for_format() to brw_queryformat.c
Now that there is a dedicated source file for internal format queries, this
function belongs there.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
28144c4476 i965: Add boilerplate function for QueryInternalFormat driver hook
By default, we call back the driver's hook fallback function that has generic
implementations for the all the queries.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
45054f9702 mesa: Add a default QueryInternalFormat() function for drivers
This is a fallback function for drivers not implementing
ARB_internalformat_query2.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev
93d30c3de9 mesa: Add QueryInternalFormat to device driver virtual table
This new function queries different driver parameters for a particular target
and texture format. It is basically a driver hook to support
ARB_internalformat_query2.

Since ARB_internalformat_query2 introduced several new query parameters
over ARB_internalformat_query, having one driver hook for each parameter
is no longer feasible. So this is the generic entry-point for calls
to glGetInternalFormativ and glGetInternalFormati64v.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-03-03 15:14:05 +01:00
Iago Toral Quiroga
283c8372cb glsl/opt_array_splitting: Fix indentation
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-03 09:12:41 +01:00
Iago Toral Quiroga
4a60002424 glsl/opt_array_splitting: Fix crash when doing array indexing into other arrays
When we find indirect indexing into an array, the current implementation
of the array spliiting optimization pass does not look further into the
expression tree. However, if the variable expression involves variable
indexing into other arrays, we can miss that these other arrays also have
variable indexing. If that happens, the pass will crash later on after
hitting an assertion put there to ensure that split arrays are in fact
always indexed via constants:

shader_runner: opt_array_splitting.cpp:296:
void ir_array_splitting_visitor::split_deref(ir_dereference**): Assertion `constant' failed.

This patch fixes the problem by letting the pass step into the variable
index expression to identify these cases properly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89607
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-03-03 09:02:30 +01:00
Oded Gabbay
914d4967d7 radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING
There is an old if statement (dated to 2011) that prevented doing
endian swap for colorformat, in case the buffer is marked as
PIPE_USAGE_STAGING.

This is now wrong because st_ReadPixels() reads into a destination
texture that is marked with PIPE_USAGE_STAGING. Therefore, even if
the texture is rendered correctly to the monitor, when reading it
back we get unswapped/wrong values.

This patch makes the check_rgba() function in gl-1.0-readpixsanity
piglit test pass in big-endian.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-03 09:20:08 +02:00
Oded Gabbay
ef5183faea r600g: Do colorformat endian swap for PIPE_USAGE_STAGING
There is an old if statement (dated to 2011) that prevented doing
endian swap for colorformat, in case the buffer is marked
as PIPE_USAGE_STAGING.

This is now wrong because st_ReadPixels() reads into a destination
texture that is marked with PIPE_USAGE_STAGING. Therefore, even if
the texture is rendered correctly to the monitor, when reading it
back we get unswapped/wrong values.

This patch makes the check_rgba() function in gl-1.0-readpixsanity
piglit test pass in big-endian.

v2: removed duplicate call to r600_colorformat_endian_swap() inside
evergreen_init_color_surface_rat()

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-03 09:20:08 +02:00
Tim Rowley
7bb193d28c mesa/build: add OpenSWR to build
Tested on Linux (centos, ubuntu, and suse variants)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:42 -06:00
Tim Rowley
d003be2a30 gallium/docs - add OpenSWR documentation
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Tim Rowley
da4f95d168 gallium/target-helpers: add OpenSWR driver
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Tim Rowley
ea37602273 gallium/auxilary: more __cplusplus exports
swr driver which is written in C++ needs access to some more
gallium utility functions than are currently exposed.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Tim Rowley
c6e67f5a93 gallium/swr: add OpenSWR rasterizer
Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Tim Rowley
2b2d3680bf gallium/swr: add OpenSWR driver
OpenSWR is a new software rasterizer for x86 processors designed
for high performance and high scalablility on visualization workloads.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-02 18:38:41 -06:00
Timothy Arceri
2eec41f6f1 glsl: replace remaining tabs in ir_builder.cpp
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2016-03-03 11:25:57 +11:00
Anuj Phogat
7026f27e33 mesa: Update comment
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-02 15:06:46 -08:00
Anuj Phogat
6ccead5b48 mesa: Fix function description
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-03-02 15:06:46 -08:00
Anuj Phogat
de61849994 meta: Remove the 'allocate_storage' parameter in _mesa_meta_pbo_GetTexSubImage()
Texture is already allocated before calling this meta function. So,
the value of 'allocate_storage' passed to the function is always false.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-02 15:06:46 -08:00
Anuj Phogat
6d4ebbe9e5 meta: Fix the pbo usage in meta for GLES{1,2} contexts
OpenGL ES 1.0 doesn't support using GL_STREAM_DRAW and both
ES 1.0 and 2.0 don't support GL_STREAM_READ in glBufferData().
So, handle it correctly by calling the _mesa_meta_begin()
before create_texture_for_pbo().

V2: Remove the changes related to allocate_storage. (Ian)

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-02 15:06:45 -08:00
Matt Turner
0d047d10f1 program: Clean up after condition code removal. 2016-03-02 12:15:58 -08:00
Matt Turner
961ead6746 program: Remove variable used only in assert(). 2016-03-02 12:15:58 -08:00
Matt Turner
de2ef0401b program: Drop GL_FRAGMENT_PROGRAM_NV from switch statement. 2016-03-02 12:15:58 -08:00
Jordan Justen
98cdce1ce4 anv/gen7: Use predicated rendering for indirect compute
For OpenGL, see commit 9a939ebb47.

Fixes:
 * dEQP-VK.compute.indirect_dispatch.upload_buffer.empty_command
 * dEQP-VK.compute.indirect_dispatch.gen_in_compute.empty_command

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-02 12:03:05 -08:00
Jordan Justen
da4745104c anv: Save batch to local variable for indirect compute
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-02 12:03:05 -08:00
Jason Ekstrand
b0867ca4b2 anv: Fix make check 2016-03-02 11:45:29 -08:00
Samuel Pitoiset
b94a46aa8e gk110/ir: fix wrong emission of NOT modifier for VOTE
Spotted by Coverity.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reported-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-02 20:36:18 +01:00
Jason Ekstrand
2168082a48 isl: Fix make check 2016-03-02 11:31:22 -08:00
Jason Ekstrand
8f5a64e44f gen8/cmd_buffer: Properly return flushed push constant stages
This is required on SKL so that we can properly re-emit binding table
pointers commands.
2016-03-02 10:48:40 -08:00
Thomas Hindoe Paaboel Andersen
535002f4da gallium/cso: fix indentation
Only one of these were recently introduced. However, since
we keep copy/pasting the same wrong indentation we should
probably just fix it.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-02 08:55:20 -07:00
Thomas Hindoe Paaboel Andersen
37cfc51b13 st/mesa: move dereference after null check
We should not dereference shader before we have done the
null check.

Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-02 08:55:20 -07:00
Matt Turner
ad17511302 i965/gen6/gs: Replace V-immediate with VF-immediate.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-02 07:28:52 -08:00
Marek Olšák
43f74ac67c gallium: fix PIPE_BIND_QUERY_BUFFER - PIPE_BIND_SCANOUT overlap
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-03-02 15:32:52 +01:00
Samuel Iglesias Gonsálvez
e8fd60e789 i965: set ctx->Const.MaxViewport{Width,Height} to 32k
From ARB_viewport_array spec:

" * On GL3-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least
    [-16384, 16383].
  * On GL4-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least
    [-32768, 32767]."

This range is set using ctx->Const.MaxViewportWidth value, so just bump
those constants to 32k for gen7+ which can support OpenGL 4.0.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-02 07:19:01 +01:00
Samuel Iglesias Gonsálvez
add57b3fa8 main: remove MAX_VIEWPORT_WIDTH and MAX_VIEWPORT_HEIGHT constants
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-02 07:19:01 +01:00
Samuel Iglesias Gonsálvez
aa849d97a0 main: call invalidate_framebuffer_storage() with driver's viewport limits
Don't use hardcoded ones because the driver can set different ones.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-02 07:19:01 +01:00
Jason Ekstrand
5b70aa11ee anv/meta_blit: Use unorm formats for 8 and 16-bit RGB and RGBA values
While Broadwell is very good about UINT formats, HSW is more restrictive.
Neither R8G8B8_UINT nor R16G16B16_UINT really exist on HSW.  It should be
safe to just use the unorm formats.
2016-03-01 21:45:20 -08:00
Kenneth Graunke
89e421369c Merge remote-tracking branch 'origin/master' into vulkan 2016-03-01 17:11:29 -08:00
Rob Clark
c4ae047cab freedreno/ir3: enable shareable shaders
Now that we are no longer using the pctx reference in the shader, drop
it and turn on shareable shaders.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-01 19:21:45 -05:00
Rob Clark
c3f2f8cbe4 freedreno/ir3: pass ctx to constant-emit code
Rather than fishing it out of the shader.  This removes the other big
user of shader->pctx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-01 19:20:44 -05:00
Rob Clark
5fd152bae8 freedreno/ir3: add dev ptr to ir3_compiler
And use this for allocating bo's to hold the shader binary, rather than
accessing the dev via ctx ptr.  One step towards making shaders sharable
across contexts.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-01 19:20:33 -05:00
Jason Ekstrand
e941fd8470 genxml: Make the border color pointer consistent across gens 2016-03-01 14:43:05 -08:00
Jason Ekstrand
eecd1f8001 gen7/pipeline: Add competent blending
This is mostly a copy-and-paste from gen8.  Blending still isn't 100% but
it fixes about 1100 CTS blend tests on HSW.
2016-03-01 13:51:58 -08:00
Jason Ekstrand
8b091deb5e anv: Unify gen7 and gen8 state
Now that we've pulled surface state setup into ISL, there's not much to do
here.
2016-03-01 12:17:23 -08:00
Matt Turner
1be953797e mesa: Remove NV_fragment_program remnants from dlist.c.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:30 -08:00
Matt Turner
89abb22a85 mesa: Remove NV_fragment_program_option enable bit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:30 -08:00
Matt Turner
ed72a1c118 program: Remove NV_fragment_program opcode parsing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
5429554f09 program: Remove NV_fragment_program scalar suffix parsing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
409c24f9cc program: Remove NV_fragment_program_option parsing support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
fe2d2c7ad8 program: Remove NV_fragment_program Abs support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
0d1f6c752f program: Remove incorrect comment about OPCODE_TXD.
The table in prog_instruction.h is correct.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
624d06708d program: Remove OPCODE_TXP_NV.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
aaef6cf4e3 program: Clean up after previous commit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
7b50b0457d program: Remove condition-code and precision support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
9e11ff7e11 program: Remove OPCODE_KIL_NV.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
a0c3650ad3 program: Remove RelAddr2 support.
Looks like more never-used crap from the first geometry shader attempt.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
6b1fb4862e program: Mark table const.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
fc61b41a95 mesa: Remove EmitCondCodes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
7fe206da28 docs: Remove descriptions of long dead Emit* fields.
Dead since commit d8a366200 in 2010.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2016-03-01 11:41:29 -08:00
Matt Turner
f3b68fc5fc glsl: Initialize gl_shader_program::EmptyUniformLocations.
Commit 65dfb30 added exec_list EmptyUniformLocations, but only
initialized the list if ARB_explicit_uniform_location was enabled,
leading to crashes if the extension was not available.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2016-03-01 11:41:29 -08:00
Ian Romanick
1a80ca22fe i965/meta: Don't pollute the framebuffer namespace
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

 - Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

 - Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

 - Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
8f1b1878a0 i965/meta: Use _mesa_bind_framebuffers instead of _mesa_BindFramebuffer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
3071da3032 meta: Don't pollute the framebuffer namespace
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

 - Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

 - Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

 - Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Fixes piglit tests:
    - object-namespace-pollution glGetTexImage-compressed framebuffer
    - object-namespace-pollution glGenerateMipmap framebuffer

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
91e5825b8a meta/decompress: Track framebuffer using gl_framebuffer instead of GL API object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
3ed44fab18 meta/generate_mipmap: Track framebuffer using gl_framebuffer instead of GL API object handle
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
ec5757f9c9 meta: Use _mesa_bind_framebuffers instead of _mesa_BindFramebuffer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
7c254f0200 meta: Use _mesa_CreateFramebuffers instead of _mesa_GenFramebuffers
This enables later patches that will stop calling _mesa_GenFramebuffers
or _mesa_CreateFramebuffers which pollute the framebuffer namespace.

For framebuffers, the Bind call is still necessary.

sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \
    src/mesa/drivers/common/*.c

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:20 -08:00
Ian Romanick
6b70c9ea98 i965/meta: Use _mesa_CreateFramebuffers instead of _mesa_GenFramebuffers
This enables later patches that will stop calling _mesa_GenFramebuffers
or _mesa_CreateFramebuffers which pollute the framebuffer namespace.

For framebuffers, the Bind call is still necessary.

sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \
    src/mesa/drivers/dri/i965/*.c

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:19 -08:00
Ian Romanick
f76462cb6f meta: Save and restore the framebuffer using gl_framebuffer instead of GL API object handle
Some meta operations can be called recursively.  Future changes (the
"Don't pollute the ... namespace" changes) will cause objects with
invalid names to be used.  If a nested meta operation tries to restore
an object named 0xDEADBEEF, it will fail.

This also fixes another latent bug in meta.  In a multithreaded,
multicontext application, one thread can delete an object that is bound
in another thread.  That object continues to exist until it is unbound
(i.e., its refcount drops to zero).  Meta unbinds objects all over the
place.  As a result, the rebind in _mesa_meta_end could fail because the
object vanished!

See https://bugs.freedesktop.org/show_bug.cgi?id=92363#c8.

Using _mesa_reference_<object type> to save and restore the objects
prevents the refcount from going to zero.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:19 -08:00
Ian Romanick
fed9b0ed5a mesa: Refactor bind_framebuffer to make _mesa_bind_framebuffers
Fixing dd_function_table::BindFramebuffer will come later because that
change is probably not suitable for stable.

v2: Fix whitespace issue noticed by Topi.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:19 -08:00
Ian Romanick
64aff35f84 meta: Use _mesa_check_framebuffer_status instead of _mesa_CheckFramebufferStatus
sed -i -e 's/_mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \
    -e 's/_mesa_CheckFramebufferStatus(GL_FRAMEBUFFER[^)]*/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \
    -e 's/_mesa_CheckFramebufferStatus(GL_READ_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->ReadBuffer/' \
    $(grep -rl _mesa_CheckFramebufferStatus src/mesa/drivers)

The second expression catches both GL_FRAMEBUFFER and GL_FRAMEBUFFER_EXT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:19 -08:00
Ian Romanick
92266ff7a3 meta: Obvious refactor of _mesa_meta_framebuffer_texture_image
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:19 -08:00
Ian Romanick
f69c743069 meta: Convert _mesa_meta_bind_fbo_image to take a gl_framebuffer instead of a GL API handle
Also change the name of the function to
_mesa_meta_framebuffer_texture_image.  The function is basically a
wrapper around _mesa_framebuffer_texture (which is used to implement
glFramebufferTexture1D and friends), so it makes sense for it's name to
be similar to that.

The next patch will clean _mesa_meta_framebuffer_texture_image up
considerably.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-03-01 11:07:19 -08:00
Jason Ekstrand
6e20c1e058 anv/cmd_buffer: Look at both sides for stencil enable
Now it's all consistent with gen9
2016-03-01 11:03:29 -08:00
Jason Ekstrand
4cfdd16500 anv/cmd_buffer: Clean up stencil state setup on gen7 2016-03-01 11:02:21 -08:00
Jason Ekstrand
bb08d86efe anv/cmd_buffer: Clean up stencil state setup on gen8 2016-03-01 10:58:43 -08:00
Kristian Høgsberg Kristensen
22d8666d74 anv: Add in image->offset when setting up depth buffer
Fix from Neil Roberts.

https://bugs.freedesktop.org/show_bug.cgi?id=94348
2016-03-01 09:19:39 -08:00
Jason Ekstrand
38f4c11c2f anv/pipeline: Pull 3DSTATE_SBE into a shared helper 2016-03-01 08:46:32 -08:00
Dave Airlie
ac222626ad virgl: add support for passing render condition flags to host.
This just passes the extra blit info to fix the render condition
tests.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-01 15:50:00 +10:00
Jason Ekstrand
3f8df795c1 genxml: Break output detail of 3DSTATE_SBE on gen7 into a struct
This makes it work like 3DSTATE_SBE_SWIZ on gen8+ which is much more
convenient.
2016-02-29 16:47:42 -08:00
Kenneth Graunke
24994ae926 i965: Push most TES inputs in vec4 mode.
(This is commit 4a1c8a3037 for vec4 mode.)

Using the push model for inputs is much more efficient than pulling
inputs - the hardware can simply copy a large chunk into URB registers
at thread creation time, rather than having the thread send messages to
request data from the L3 cache.  Unfortunately, it's possible to have
more TES inputs than fit in registers, so we have to fall back to the
pull model in some cases.

However, it turns out that most tessellation evaluation shaders are
fairly simple, and don't use many inputs.  An arbitrary cut-off of
24 vec4 slots (12 registers) should suffice.  (I chose this instead of
the 32 vec4 slots used in the scalar backend to avoid regressing a few
Piglit tests due to the vec4 register allocator being too stupid to
figure out what to do.  We probably ought to fix that, but it's a
separate issue.)

Improves performance in GPUTest's tessmark_x64 microbenchmark by
41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU
(with Iris Pro 5200).

Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by
38.3576% +/- 0.759748% (n = 42).

v2: Simplify abs/negate handling, as requested by Matt.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-02-29 16:12:50 -08:00
Marek Olšák
c54f38494c r600g: remove support for DRM < 2.12.0 2016-03-01 00:18:54 +01:00
Marek Olšák
b7da8fa11d r300g: remove support for DRM < 2.12.0 2016-03-01 00:18:54 +01:00
Marek Olšák
a5e2a173dd winsys/radeon: drop support for DRM 2.12.0 (kernel < 3.2)
in order to make some winsys interface changes easier

This distros should use new DRM if they want to use new Mesa:
  Distro    kernel  mesa    eol
  SLES 10   2.6.16  6.4.2   2016-07
  SLED 11   3.0     9.0.3   2022-03
  RHEL 5    2.6.18  6.5.1   2017-03
  RHEL 6    2.6.32  10.4.3  2020-11
  Debian 6  2.6.32  7.7.1   2016-02

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-01 00:18:54 +01:00
Marek Olšák
69a8e435ce radeonsi: also dump shaders on a VM fault
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-03-01 00:18:54 +01:00
Marek Olšák
18df72b50b radeonsi: dump full shader disassemblies into ddebug logs
including prolog and epilog disassemblies

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-01 00:18:54 +01:00
Marek Olšák
74b4ce81fb radeonsi: allow dumping shader disassemblies to a file
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-01 00:18:54 +01:00
Marek Olšák
d0f3b524cd radeonsi: use re-Z
This can increase perf for shaders that kill pixels (kill, alpha-test,
alpha-to-coverage).

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-01 00:18:19 +01:00
Marek Olšák
09bfbd43a0 tgsi/scan: count memory instructions
for radeonsi

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-01 00:11:32 +01:00
Jason Ekstrand
097564bb8e anv/cmd_buffer: Dirty push constants when changing pipelines. 2016-02-29 14:36:24 -08:00
Jason Ekstrand
d29fd1c7cb anv/cmd_buffer: Re-emit push constants packets for all stages 2016-02-29 14:36:24 -08:00
Jason Ekstrand
9715724015 anv/pipeline: Follow push constant alignment restrictions on BDW+ and HSW gt3 2016-02-29 14:36:24 -08:00
Jason Ekstrand
6986ae35ad anv/pipeline: Avoid a division by zero 2016-02-29 14:36:24 -08:00
Jason Ekstrand
51b618285d anv/pipeline: Use dynamic checks for max push constants
The GEN_GEN macros aren't available in anv_pipeline since it only gets
compiled once for the whold driver.
2016-02-29 14:36:24 -08:00
Dave Airlie
35859d5bbb mesa/fbobject: propogate Layered when reusing attachments.
When reusing a depth attachment as a stencil, we need to propogate
the layered bit, otherwise we fail to complete the framebuffer.

discovered running ./bin/fbo-depth-array depth-layered-clear
on virgl on haswell.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-01 07:34:37 +10:00
Nanley Chery
74b7b59db5 isl/surface_state: Fix array spacing on Gen7
v2: Don't cast the enum to a boolean (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-29 11:43:33 -08:00
Kristian Høgsberg Kristensen
9d8bae6137 anv: Don't advertise pipelineStatisticsQuery
We don't support that just yet.

Reported-by: Jacek Konieczny <jajcus@jajcus.net>
2016-02-29 10:55:39 -08:00
Axel Davy
83bc2acfe9 st/nine: Fix second Multithreading issue with MANAGED buffers
Here is another threading issue with MANAGED buffers:

Thread 1: buffer creation
Thread 1: buffer lock
Thread 2: Draw call
Thread 1: writes data
Thread 1: Unlock

Without this patch, the buffer is initially dirty
and in the list of things to upload after its creation.
The draw call will then upload the data and unset the dirty flag,
and the Unlock won't trigger a second upload.

Fixes regression introduced by cc0114f30b:
"st/nine: Implement Managed vertex/index buffers"

Cc: "11.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-02-29 18:55:58 +01:00
Axel Davy
44246fe99d st/nine: Fix Multithreading issue with MANAGED buffers
d3d calls are protected by mutexes, however if app is doing in
two threads:

Thread 1: buffer Lock
Thread 2: Draw call
Thread 1: writes data
Thread 1: Unlock

Then before this patch, the Draw call would begin to upload
the buffer.

Solves this by moving the moment we add the buffer to the queue
of things to upload (We move it from Lock time to Unlock time).

Cc: "11.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-02-29 18:55:58 +01:00
Axel Davy
35c858c42c st/nine: Handle READONLY for buffer MANAGED pool
READONLY won't trigger an upload.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-02-29 18:55:58 +01:00
Axel Davy
8a8affdfda st/nine: Use Position input helper for ps3 declared inputs
When the semantic is Position (which can happen with index 0 only),
use the helper to get Position input.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-02-29 18:55:58 +01:00
Axel Davy
f08c990af5 st/nine: Introduce helper for Position shader input
Cc: "11.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-02-29 18:55:58 +01:00
Marc-André Lureau
f1d12e7392 virtio_gpu: Add virtio 1.0 PCI ID to driver map
Add the virtio-gpu PCI ID for virtio 1.0 (according to the
specification, "the PCI Device ID is calculated by adding 0x1040 to the
Virtio Device ID")

Support for virtio 1.0 was added in qemu 2.4 (same time virtio-gpu
landed).

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 11:31:36 +00:00
Koop Mast
04bc09fdf9 st/clover: Add libelf cflags to the build
Otherwise the build will fail, when the library is in a non default
location.

v2 [Emil Velikov]
 - drop the unneeded cflags from targets/opencl.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Fixes: 7f585a6a98 "configure.ac: use pkg-config for libelf"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93524
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 11:30:15 +00:00
Emil Velikov
c212a70cd9 mesa; add get-extra-pick-list.sh script into bin/
This is a very rudimentary script that checks if any of the applied
cherry-picks have been referenced (fixed?) by another patch. With the
latter either missing the stable tag or hasn't yet been picked.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-02-29 11:25:35 +00:00
Emil Velikov
64500f21f3 automake: explicitly set distcheck configure flags
Pretty much all of these are enabled by default. Considering the recent
updates (see previous commits) one might as well list most/all of these
here.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:45 +00:00
Emil Velikov
325bc6fb4a automake: add more missing options for make distcheck
Namely - opencl, osmesa (only the gallium flavour as it conflicts with
the classic one), surfaceless egl platform and a couple gallium drivers
(virgl and vc4).

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:45 +00:00
Emil Velikov
0b6157e971 install-gallium-links: port changes from install-lib-links
Namely:
b662d5282f mesa: Add clean-local rule to remove .lib links.
5c1aac17ad install-lib-links: don't depend on .libs directory
fece147be5 install-lib-links: remove the .install-lib-links file

With these in place, make distcheck now passes and a race condition has
been avoided.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:45 +00:00
Rob Herring
51b22bd468 r600: Make enum alu_op_flags unsigned
In builds with clang, there are several errors related to the enum
alu_op_flags like this:

src/gallium/drivers/r600/sb/sb_expr.cpp:887:8:
error: case value evaluates to -1610612736, which cannot be narrowed to
type 'unsigned int' [-Wc++11-narrowing]

These are due to the MSB being set in the enum. Fix these errors by
making the enum values unsigned as needed. The flags field that stores
this enum also needs to be unsigned.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-02-29 10:51:45 +00:00
Rob Herring
92dd38df5a gallium/radeon: Add space between string literal and identifier
Fix compiles with clang that have this C++11 error:

src/gallium/drivers/radeon/r600_pipe_common.h:662:34:
error: invalid suffix on literal; C++11 requires a space between literal
and identifier [-Wreserved-user-defined-literal]

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-02-29 10:51:45 +00:00
Rob Herring
0156a33aa3 freedreno: drop unnecessary -Wno-packed-bitfield-compat
Enabling this warning doesn't generate any warnings with gcc, but is an
unknown option for clang, so drop it.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Rob Clark <robdclark@gmail.com> (v1)

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
v2: keep the warning around, commented out
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:45 +00:00
Rob Herring
8949edf018 Android: clean-up and fix DRI module path handling
MESA_DRI_MODULE_PATH is only getting set for classic DRI drivers and may or
may not be set correctly for gallium_dri.so depending on the makefile
include ordering. For Android 6 and earlier it is fine, but with build
system changes in AOSP master, it is not.

Move the path variables to a single place at the top level and introduce
MESA_DRI_MODULE_REL_PATH for Android 5 and later which require relative
paths. With this, there is a single variable to change.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Rob Herring
0663edf85b Android: remove headers from LOCAL_SRC_FILES
The Android build system now spits out warnings for header files listed in
LOCAL_SRC_FILES, so strip them out.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Rob Herring
6dae9176d6 Android: add -Wno-date-time flag for clang
clang complains about date/time macros:

src/mesa/main/context.c:403:25: error: expansion of date or time macro is not reproducible [-Werror,-Wdate-time]

Disable this warning.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Rob Herring
a2f16db19b Android: glsl: fix dependence on YACC_HEADER_SUFFIX from build system
The makefile was implicitly picking up YACC_HEADER_SUFFIX from the Android
build system, but this variable is now gone. Add it locally to fix the
build with AOSP master.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Rob Herring
794221fbb7 Android: remove dependence on .SECONDEXPANSION
With the Android build system changes to ninja/kati, the use of
.SECONDEXPANSION is no longer supported. Fix this by avoiding rule specific
variables and using $(transform-generated-source).

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Rob Herring
574a92b048 Android: fix build break from nir/glsl move to compiler/
Commits a39a8fbbaa ("nir: move to compiler/") and eb63640c1d
("glsl: move to compiler/") broke Android builds. Fix them.

There is also a missing dependency between generated NIR headers and
several libraries. This isn't a new issue, but seems to have been
exposed by the NIR move.

Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-29 10:51:44 +00:00
Oded Gabbay
a640ad15e1 gallium/radeon: disable evergreen_do_fast_color_clear for BE
This function is currently broken for BE. I assume it's because of
util_pack_color(). Until I fix this path, I prefer to disable it so users
would be able to see correct colors on their desktop and applications.

Together with the two following patches:
- gallium/r600: Don't let h/w do endian swap for colorformat
- gallium/radeon: remove separate BE path in r600_translate_colorswap

it fixes BZ#72877 and BZ#92039

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-29 12:26:27 +02:00
Oded Gabbay
e3dfc0e095 gallium/r600: Don't let h/w do endian swap for colorformat
Since the rework on gallium pipe formats, there is no more need to do
endian swap of the colorformat in the h/w, because the conversion between
mesa format and gallium (pipe) format takes endianess into account (see
the big #if in p_format.h).

v2: return ENDIAN_NONE only for four 8-bits components
(V_0280A0_COLOR_8_8_8_8)

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-29 12:26:27 +02:00
Oded Gabbay
9559071ed6 gallium/radeon: remove separate BE path in r600_translate_colorswap
After further testing, it appears there is no need for
separate BE path in r600_translate_colorswap()

The only fix remaining is the change of the last if statement, in the 4
channels case. Originally, it contained an invalid swizzle configuration
that never got hit, in LE or BE. So the fix is relevant for both systems.

This patch adds an additional 120 available visuals for LE and BE,
as seen in glxinfo

v2:
Tested for regressions by running piglit gpu.py with CAICOS (r600g) on
x86-64 machine. No regressions found.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-29 12:26:27 +02:00
Samuel Pitoiset
07ed003faf nv50/ir: emit VOTE instruction
Changes from v2:
 - add missing NOT modifier for GK110/GM107

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-28 23:58:11 +01:00
Jordan Justen
635c0e92b7 anv: Set CURBEAllocationSize in MEDIA_VFE_STATE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-02-28 11:54:49 -08:00
Jordan Justen
1af5dacd76 anv/gen7: Enable SLM in L3 cache control register
Port 1983003 to gen7.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-02-28 11:54:49 -08:00
Kristian Høgsberg Kristensen
b00b42d99b nir/spirv: Use the new bare sampler type 2016-02-28 11:24:05 -08:00
Jordan Justen
72efb68d48 anv/pipeline: Set URB offset to zero if size is zero
After 3ecd357d81, it may be possible for
the VS to get assigned all of the URB space.

On Ivy Bridge, this will cause the offset for the other stages to be
16, which cannot be packed into the ConstantBufferOffset field of
3DSTATE_PUSH_CONSTANT_ALLOC_*.

Instead we can set the offset to zero if the stage size is zero.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-02-28 10:51:38 -08:00
Jordan Justen
ef06ddb08a anv/pipeline: Set FS URB space to zero if the FS is unused
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-02-28 10:51:38 -08:00
Jordan Justen
45d8ce07a5 anv/pipeline: Set stage URB size to zero if it is unused
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-02-28 10:49:39 -08:00
Samuel Pitoiset
b3efa0a59e gk110/ir: add ld lock/st unlock emission
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-28 19:20:20 +01:00
Ilia Mirkin
aa3b85fd18 nv50,nvc0: bump minimum texture buffer offset alignment
It appears that it actually needs to be aligned to the datum size, so it
was 1 when testing with R8, but it can be as high as 16 with RGBA32.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-02-27 16:26:34 -05:00
Jason Ekstrand
46b7c242da anv/gen7: Clean up the dummy PS case
Fix whitespace and remove dead comments
2016-02-27 11:24:09 -08:00
Jason Ekstrand
e18a2f037a anv/gen7: Set MaximumNumberofThreads in the dummy PS packet 2016-02-27 11:23:56 -08:00
Jason Ekstrand
ad50896c87 anv/gen7: Only try to get the depth format the surface has depth 2016-02-27 11:23:18 -08:00
Jason Ekstrand
4b34f2ccb8 anv/image: Use isl for filling brw_image_param 2016-02-27 10:26:14 -08:00
Jason Ekstrand
bd6470fa6c isl: Add helpers for filling out brw_image_param 2016-02-27 10:26:14 -08:00
Jason Ekstrand
7363024cbd anv: Fill out image_param structs at view creation time 2016-02-27 10:26:14 -08:00
Jason Ekstrand
e9d126f23b anv/image: Add a ussage_mask field to image_view_init
This allows us to avoid doing some unneeded work on the meta paths where we
know that the image view will be used for exactly one thing.  The meta
paths also sometimes do things that aren't quite valid like setting the
array slice on a 3-D texture and we want to limit the number of paths that
need to be able to sensibly handle the lies.
2016-02-27 10:26:14 -08:00
Jason Ekstrand
b4c16fd01a isl: Move isl_image.c to isl_storage_image.c 2016-02-27 10:26:14 -08:00
Jason Ekstrand
eb19d640eb anv: Use isl to fill buffer surface states 2016-02-27 10:26:14 -08:00
Jason Ekstrand
a0cd20eb7f isl: Add a helper for filling a buffer surface state 2016-02-27 10:26:14 -08:00
Jason Ekstrand
9d5b8f7709 anv: Remove unneeded fiels from anv_image_view 2016-02-27 10:26:14 -08:00
Jason Ekstrand
b70a8d40fa anv/state: Remove unused fill_surface_state functions 2016-02-27 10:26:14 -08:00
Jason Ekstrand
ded57c3cca anv: Use ISL to fill out surface states 2016-02-27 10:26:14 -08:00
Jason Ekstrand
4a9b805ce5 anv/device: Store the default MOCS in the device 2016-02-27 10:26:13 -08:00
Jason Ekstrand
d798762cdb isl: Add a function for filling out a surface state 2016-02-27 10:26:13 -08:00
Jason Ekstrand
6b06072ba8 isl: Create per-gen helper libraries for gens 7, 8, and 9 2016-02-27 10:26:13 -08:00
Jason Ekstrand
82d2db80bb genxml: Add MOCS fields to RENDER_SURFACE_STATE
This allows us to set MOCS as a single uint32_t on all platforms.
2016-02-27 10:26:13 -08:00
Jason Ekstrand
452782f68b gen/genX_pack: Add genxml to the pack header path
If you have an out-of-tree build, gen8_pack.h and friends will not be in
the same folder as genX_pack.h so this will be a problem.  We fixed
out-of-tree earlier by adding the genxml folder to the includes for the
vulkan driver.  However, this is not a good long-term solution because we
want to use it in ISL as well.
2016-02-27 10:26:13 -08:00
Ilia Mirkin
e2dce1a340 mesa: add GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 support
The two extensions are identical, and are largely taking bits of already
existing desktop functionality. We continue to do a poor job of
supporting the 'precise' keyword, just like we do on desktop.

This passes the relevant dEQP tests that I could find.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-02-27 00:08:28 -05:00
Ilia Mirkin
2875183463 mesa: expose GL_EXT_texture_sRGB_decode on GLES 3.0+
Could be exposed on earlier GLES versions if we supported EXT_sRGB, but
we don't, for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-02-26 23:55:45 -05:00
Nanley Chery
265d4c415c isl: Fix isl_surf_get_image_intratile_offset_el()
Consecutive tiles are separated by the size of the tile, not by the
logical tile width.

v2: Remove extra subtraction (Ville)
    Add parenthesis (Jason)
v3: Update the unit tests for the function

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-26 16:59:36 -08:00
Ian Romanick
585b18f305 i965/cfg: Fix comment list punctuation
Trivial

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-02-26 16:51:27 -08:00
Ian Romanick
5bfb302783 i965/cfg: Split out dead control flow paths to simplify both paths
v2: Fix some bad indentation.  Suggested by Curro.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-26 16:51:27 -08:00
Ian Romanick
2513a20240 i965/cfg: Don't handle fully empty if/else/endif
This will now never occur.  The empty if-else part would have already
been removed leaving an empty if-endif part.

No shader-db changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-26 16:51:27 -08:00
Ian Romanick
69bb063ec2 i965/cfg: Eliminate an empty then-branch of an if/else/endif
On BDW,

total instructions in shared programs: 8448571 -> 8448367 (-0.00%)
instructions in affected programs: 21000 -> 20796 (-0.97%)
helped: 116
HURT: 0

v2: Remove spurious attempt to combine the if_block with the (removed!)
else_block.  Suggested by Matt and Curro.  Correct the comment
describing what the new pass does.  Suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-26 16:51:27 -08:00
Ian Romanick
c7deee69ea i965/cfg: Track prev_block and prev_inst explicitly in the whole function
This provides a trivial simplification now, and it makes some future
changes more straight forward.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-26 16:51:27 -08:00
Ian Romanick
70cf0eb5c7 i965/cfg: Slightly rearrange dead_control_flow_eliminate
'git diff -w' is a bit more illustrative.  A couple declarations were
moved, the continue was removed, and the code was reindented.  This will
simplify future changes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-26 16:51:27 -08:00
Thomas Hindoe Paaboel Andersen
6bb6b5c341 anv: remove stray ; after if
Both logic and indentation suggests that the ; were not intended here.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-02-26 16:05:28 -08:00
Jason Ekstrand
b7bc52b5b1 anv/gen8: Emit the 3DSTATE_PS_BLEND packet 2016-02-26 16:04:48 -08:00
Kenneth Graunke
a0294c2cf3 i965: Simplify brw_nir_lower_vue_inputs() slightly.
The same code appeared in both branches; pull it above the if statement.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
8151003ade i965: Avoid recalculating the normal VUE map for IO lowering.
The caller already computes it.  Now that we have stage specific
functions, it's really easy to pass this in.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
15b3639bf1 i965: Avoid recalculating the tessellation VUE map for IO lowering.
The caller already computes it.  Now that we have stage specific
functions, it's really easy to pass this in.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
cfbd9831f8 i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions.
Now that each stage is directly calling brw_nir_lower_io(), and we have
per-stage helper functions, it makes sense to just call the relevant one
directly, rather than going through multiple switch statements.

This also eliminates stupid function parameters, such as the two that
only apply to vertex attributes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
b96ddd2e52 i965: Split brw_nir_lower_inputs/outputs into per-stage functions.
These functions are both giant switch statements where most cases don't
overlap at all.  Let's put the bulk of the work in per-stage helpers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
d33c478bed i965: Remove catch-all nir_lower_io call with specific cases.
Most cases already call nir_lower_io explicitly for input and output
lowering.  This catch all isn't very useful anymore - we can just add it
to the remaining cases.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
51f8797993 i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir.
This simplifies things.  Every caller of brw_nir_lower_io() immediately
calls brw_postprocess_nir().  The only real change this will have is
that we get an extra brw_nir_optimize() call when compiling compute
shaders, but that seems fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
dcd4a841e9 i965: Always do NIR IO lowering at specialization time.
We've now hit literally every case other than geometry shaders (and
compute shaders, but those are a no-op).  So, let's just move geometry
shaders over too and be done with it.

The only advantage to doing this at link time was to save the expense
of running the pass on recompiles.  But we're already running a lot of
passes, and the extra code complexity isn't worth it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Kenneth Graunke
fa7135107f i965: Make an is_scalar boolean in brw_compile_gs().
Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can
help with line-wrapping.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Jason Ekstrand
b3cb6e78aa i965/nir: Do lower_io late for fragment shaders
The Vulkan driver wants to be able to delete fragment outputs that are
beyond key.nr_color_regions; this is a lot easier if we lower outputs at
specialization time rather than link time.

(Rationale added to commit message by Ken)

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-26 15:55:59 -08:00
Jordan Justen
7428e6f86a i965: Set dest type to UW for several send messages
Without this, on SIMD 16 the send instruction destination will appear
to write more than one destination register, causing the simulator to
report an error.

Of course, the send instruction can actually write more than one
destination register regardless of the type set for the destination,
so this is a bit strange.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-02-26 12:03:56 -08:00
Samuel Pitoiset
aad48f8691 nvc0: rework nvc0_compute_validate_program()
Reduce the amount of duplicated code by re-using
nvc0_program_validate(). While we are at it, change the prototype
to return void and remove nvc0_compute.h which is now useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-26 14:00:27 +01:00
Samuel Pitoiset
e1f5c76047 nvc0: make sure to validate compute global buffers on Fermi
No reason to not validate those global buffers and this might avoid
fails if someone try to use the global memory from compute programs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-26 14:00:23 +01:00
Samuel Pitoiset
dcf7938833 nvc0: move nvc0_validate_global_residents() to nvc0_compute.c
While we are at it, rename it to nvc0_compute_validate_globals() and
update its prototype.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-26 14:00:18 +01:00
Derek Foreman
d085a5dff5 egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage
Since commit d1314de293 we ignore
damage passed to SwapBuffersWithDamage.

Wayland 1.10 now has functionality that allows us to properly
process those damage rectangles, and a way to query if it's
available.

Now we can use wl_surface.damage_buffer and interpret the incoming
damage as being in buffer co-ordinates.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
2016-02-26 11:49:09 +00:00
Dave Airlie
840aa52f50 virgl: add missing CAP turned off. 2016-02-26 04:03:09 +00:00
Miklós Máté
847f1cc698 program: Remove extra reference_program()
It was already done in get_mesa_program()

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-02-25 22:02:50 +01:00
Emil Velikov
51c65a4c48 automake: add nine to make distcheck
Will allow us to catch/prevent issues, like the one in mesa 11.2.0-rc1.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-25 19:56:07 +00:00
Emil Velikov
b08dbc84fe st/nine: don't forget to bundle the nine_limits.h file
Without this mesa 11.2.0-rc1 ended up busted :-(

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Repored-by: Ondřej Súkup <mimi.vx@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-25 19:56:07 +00:00
Matt Turner
4009a9ead4 i965/fs: Allow saturate propagation to propagate negations into MADs.
Allows us to transform

   mad      res  src0   src1   src2
   mov.sat  dst  -res

into

   mad.sat  dst  -src0 -src1   src2

instructions in affected programs: 3712 -> 3688 (-0.65%)
helped: 24

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-02-25 10:51:15 -08:00
Matt Turner
65d3217cb0 i965/fs: Allow saturate propagation to propagate negations into ADDs.
Allows us to transform

   add      res  src0   src1
   mov.sat  dst  -res

into

   add.sat  dst  -src0 -src1

No shader-db changes.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-02-25 10:51:13 -08:00
Matt Turner
7b6113bc2d i965/fs: Allow saturate propagation to propagate negations into MULs.
Allows us to transform

   mul      res  src0   src1
   mov.sat  dst  -res

into

   mul.sat  dst  src0  -src1

instructions in affected programs: 45246 -> 45054 (-0.42%)
helped: 162

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-02-25 10:51:10 -08:00
Matt Turner
1567da1e28 i965/fs: Don't CSE negated multiplies with saturation.
It's not correct to CSE these multiplies

   mul.sat dst1, -a, b
   mul.sat dst2,  a, b

by emitting a negated MOV from dst1 to dst2:

   mul.sat dst1, -a, b
   mov     dst2, -dst1

Take 2.0*2.0 for example. The first multiply would produce 0.0 and the
second would produce 1.0.

Fixes bad generated code in 18 to 22 shaders:

instructions in affected programs: 432 -> 464 (7.41%)
helped: 4
HURT: 18

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-02-25 10:51:04 -08:00
Matt Turner
3da789f1e9 glsl: Consider ubo_load to be a horizontal operation.
Unclear to me whether it actually is a horizontal operation that cannot
be vectorized, but the fact that i965 generates the same code in either
case makes me less interested in finding out.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94199
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-25 10:50:34 -08:00
Jason Ekstrand
c32273d246 anv/device: Properly handle apiVersion == 0
From the Vulkan 1.0 spec section 3.2:

"If apiVersion is 0 the implementation must ignore it"
2016-02-25 08:52:37 -08:00
Andres Gomez
d1509a5848 glsl/ast: Implicit conversion from double to float is not allowed
Also, renamed get_conversion_operation to avoid
future misunderstandings.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-25 13:10:50 +01:00
Oded Gabbay
439b5b008f gallium/radeon: return correct values for BE in r600_translate_colorswap
Because I changed the swizzle check, I also need to adapt the return
values for each check.

It's basically almost the same as before, we just cross between STD and
STD_REV, and cross between ALT and ALT_REV

This fixes the rgba test in gl-1.0-readpixsanity (piglit) and also
fixes tri-flat (mesa demos).

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-25 09:21:08 +02:00
Oded Gabbay
ff8b41b702 gallium: remove duplicate define from enum pipe_format
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-25 09:21:08 +02:00
Ian Romanick
9d9aeb91b1 glsl: Detect do-while-false loops and unroll them
Previously loops like

   do {
      // ...
   } while (false);

that did not have any other loop-branch instructions would not be
unrolled.  This is commonly used to wrap multiline preprocessor macros.

This produces IR like

    (loop (
       ...
       break
    ))

Since limiting_terminator was NULL, the loop unroller would
throw up its hands and say, "I don't know how many iterations.  How
can I unroll this?"

We can detect this another way.  If there is no limiting_terminator
and the only loop-branch is a break as the last IR, there's only one
iteration.

On my very old checkout of shader-db, this removes a loop from Orbital
Explorer, but it does not otherwise affect the shader.  The loop removed
is the one the compiler inserts surrounding the switch statement.

This change does prevent some seriously bad code generation in some
patches to meta shaders that I recently sent out for review.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-02-24 18:43:40 -08:00
Nanley Chery
3eb476fa14 i965: Enable tiled mem_copy with sRGB-formatted resources
RGBA8 and BGRA8 unorm formats are compatible with the various
mem_copy functions. Their sRGB counterparts are also compatible
because they're also color-renderable (of importance when the
specified resource is a readbuffer) and they share the same
physical layout.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-02-24 14:40:34 -08:00
Kristian Høgsberg Kristensen
59f5728995 Merge remote-tracking branch 'origin/master' into vulkan 2016-02-24 13:04:54 -08:00
Kristian Høgsberg Kristensen
25c2470b24 anv: Set max_hs_threads/max_ds_threads 2016-02-24 12:21:26 -08:00
Kenneth Graunke
3ecd357d81 anv: Allocate more push constant space.
Previously we allocated 4kB of push constant space for VS, GS, and PS
(for a total of 12kB) no matter what.  This works, but doesn't fully
utilize the space - we have 16kB or 32kB of space.

This makes anv use the same method as brw - divide up the space evenly
among all active shader stages.  This means HS and DS would get space,
if those shader stages existed.

In the future, we can probably do better by inspecting how many push
constants each shader stage uses, and weight things accordingly.  But
this is strictly better than the old code, and ideally we'd justify
a fancier solution with actual performance data.
2016-02-24 11:22:05 -08:00
Kenneth Graunke
3f11517730 anv: Properly size the push constant L3 area.
We were assuming it was 32kB everywhere, reducing the available URB
space.  It's actually 16kB on Ivybridge, Baytrail, and Haswell GT1-2.
2016-02-24 11:13:08 -08:00
Kenneth Graunke
7f9b03cc8b anv: Emit 3DSTATE_PUSH_CONSTANT_ALLOC_* via a loop.
Now we're emitting HS and DS packets as well.
2016-02-24 11:13:08 -08:00
Kenneth Graunke
1024a66fc4 anv: Emit 3DSTATE_URB_* via a loop.
Rather than keeping separate {vs,hs,ds,gs}_start fields, we now store an
array indexed by the shader stage (MESA_SHADER_*).  The 3DSTATE_URB_*
commands are also sequentially numbered.  This makes it easy to just
emit them in a loop.

This simplifies the code a little, and also will make it easier to add
more credible HS and DS code later.
2016-02-24 11:13:02 -08:00
Brian Paul
c95d5c5f6f mesa: replace for loop with bitshifting in supported_buffer_bitmask()
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:32:01 -07:00
Brian Paul
ac37d0475c mesa: updates some comments in buffers.c
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:53 -07:00
Brian Paul
d8412029bb mesa: make _mesa_draw_buffers() static
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:44 -07:00
Brian Paul
24d8080507 mesa: make _mesa_draw_buffer() static
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:41 -07:00
Brian Paul
ebfcf9de43 mesa: make _mesa_read_buffer() static
Not called from any other file.  Remove _mesa_ prefix and update comments.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:37 -07:00
Brian Paul
1e41c2e135 mesa: move declaration of buffer var in handle_first_current()
Declare the var in the scopes where it's used.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:31 -07:00
Brian Paul
c8fdb42c91 mesa: use gl_buffer_index in a few places
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:28 -07:00
Brian Paul
363019e17a st/mesa: remove useless break statement
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:23 -07:00
Brian Paul
953cb24e65 st/mesa: rename st_readpixels to st_ReadPixels
To match the convention of other device driver functions.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:31:17 -07:00
Brian Paul
83b589301f st/mesa: fix frontbuffer glReadPixels regressions
The change "mesa/readpix: Don't clip in _mesa_readpixels()" caused a
few piglit regressions.  The failing tests use glReadPixels to read
from the front color buffer.  The problem is we were trying to read
from a non-existant front color buffer.  The front color buffer is
created on demand in st/mesa.  Since the missing buffer bounds were
effectively 0 x 0 the glReadPixels was totally clipped and returned
early.

The fix involves creating the real front color buffer when we're about
to try reading from it.

Tested with llvmpipe and VMware driver on Linux, Windows.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94253
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94254
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94257
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-24 08:30:07 -07:00
Jason Ekstrand
c9564fd598 nir/spirv: Allow but warn for a few capabilities
Unfortunately, glslang gives us cull/clip distance and GS streams even if
the shader doesn't use it whenever a shader is declared as version 450.
This is a glslang bug, but we can easily enough ignore it for now.
2016-02-23 22:07:25 -08:00
Jason Ekstrand
f0f7cc22f3 anv/descriptor_set: Use the correct size for the descriptor pool
The descriptor sizes array gives the total number of each type of
descriptor that will ever be allocated from the pool, not the total amount
that may be in any particular set.  In our case, this simply means that we
have to sum a bunch of things up and there we go.
2016-02-23 21:25:37 -08:00
Jason Ekstrand
040355b688 nir/spirv: Add more capabilities 2016-02-23 21:01:00 -08:00
Jason Ekstrand
bd3db3d665 anv/meta: Allocate descriptor pools on-the-fly
We can't use a global descriptor pool like we were because it's not
thread-safe.  For now, we'll allocate them on-the-fly and that should work
fine.  At some point in the future, we could do something where we
stack-allocate them or allocate them out of one of the state streams.
2016-02-23 17:04:19 -08:00
Oded Gabbay
4b7e219e61 gallium/radeon: Correctly translate colorswaps for big endian
The current code in r600_translate_colorswap uses the swizzle information
to determine which colorswap to use.

This works for BE & LE when the nr_channels is <4, but when nr_channels==4
(e.g. PIPE_FORMAT_A8R8G8B8_UNORM), this method can not be used for both BE
and LE, because the swizzle info is the same for both of them.

As a result, r600g doesn't support 24bit color formats, only 16bit, which
forces the user to choose 16bit color in X server.

This patch fixes this bug by separating the checks for LE and BE and
adapting the swizzle conditions in the BE part of the checks.

Tested on an Evergreen GPU (Cedar GL FirePro 2270) running inside POWER7
Big-Endian Machine.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: "11.2" "11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-23 20:55:40 +02:00
Thomas Hindoe Paaboel Andersen
1807806add mesa: use sizeof on the correct type
Before the luminance stride was based on the size of GL_FLOAT
which is just the type constant (0x1406). Change it to use the
size of GLfloat.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-23 08:55:35 -07:00
Marek Olšák
190a291b03 tgsi/scan: handle holes between VS inputs, assert-fail in other cases
"st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap"
added a vertex shader declaring IN[0] and IN[2], but not IN[1].

Drivers relying on tgsi_shader_info can't handle holes in declarations,
because tgsi_shader_info doesn't track that.

This is just a quick workaround meant for stable that will work for vertex
shaders.

This fixes radeonsi DrawPixels and CopyPixels crashes.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-23 16:42:16 +01:00
Jason Ekstrand
bfbb238dea anv/descriptor_set: Set descriptor type for immuatable samplers 2016-02-22 21:39:14 -08:00
Jason Ekstrand
64e1c84059 intel/genxml: Update macro documentation 2016-02-22 21:20:04 -08:00
Francisco Jerez
31a0affa28 docs: Mark off GL_OES_shader_image_atomic as done.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 19:59:56 -08:00
Francisco Jerez
058ed980c6 i965/fs: Return result of image atomic in a register of the expected type.
So the result is of float type if we're implementing the float
overload of imageAtomicExchange.  This is the only back-end change
required to support OES_shader_image_atomic AFAICT.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 19:57:09 -08:00
Francisco Jerez
81c16a2dab glsl: Implement the required built-in functions when OES_shader_image_atomic is enabled.
This is basically just the same atomic functions exposed by
ARB_shader_image_load_store, with one exception:

    "highp float imageAtomicExchange(
         coherent IMAGE_PARAMS,
         float data);"

There's no float atomic exchange overload in the original
ARB_shader_image_load_store or GL 4.2, so this seems like new
functionality that requires specific back-end support and a separate
availability condition in the built-in function generator.

v2: Move image availability predicate logic into a separate static
    function for clarity.  Had to pull out the image_function_flags
    enum from the builtin_builder class for that to be possible.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 19:56:54 -08:00
Francisco Jerez
be125af95e glsl: Add usual extension boilerplate for OES_shader_image_atomic.
v2: No need for extension enable bits (Ilia).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 19:56:35 -08:00
Francisco Jerez
009bbecf6d mesa: Add extension table entry for OES_shader_image_atomic.
v2: No need for extension enable bits (Ilia).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 19:55:35 -08:00
Jason Ekstrand
ae619a0355 anv/state: Replace a bunch of ANV_GEN with GEN_GEN 2016-02-22 19:19:00 -08:00
Jason Ekstrand
442dff8cf4 anv/descriptor_set: Stop marking everything as having dynamic offsets 2016-02-22 17:23:29 -08:00
Kristian Høgsberg Kristensen
2570a58bcd anv: Implement descriptor pools
Descriptor pools are an optimization that lets applications allocate
descriptor sets through an externally synchronized object (that is,
unlocked).  In our case it's also plugging a memory leak, since we
didn't track all allocated sets and failed to free them in
vkResetDescriptorPool() and vkDestroyDescriptorPool().
2016-02-22 17:13:51 -08:00
Kristian Høgsberg Kristensen
353d5bf286 anv/x11: Free swapchain images and memory on destroy 2016-02-22 16:23:47 -08:00
Samuel Pitoiset
2999257e0f nvc0: rename 3d binding points to NVC0_BIND_3D_XXX
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
9c6a7bfb40 nvc0: rename 3d dirty flags to NVC0_NEW_3D_XXX
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
2c48369f54 nvc0: prefix compute macros with _CP_ instead of _COMPUTE_
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
bbff97ae39 nvc0: rename NVXX_COMPUTE to NVXX_CP
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
5330ed959e nvc0: rename nvc0_context::dirty to nvc0_context::dirty_3d
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
84b9b8f0a3 nvc0/ir: add missing emission of locked load predicate
Like unlocked store on shared memory, locked store can fail and the
second dest which is a predicate must be emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
9f0d059d4b nvc0/ir: add ld lock/st unlock emission on GK104
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Samuel Pitoiset
6526225f88 nv50/ir: restore OP_SELP to be a regular instruction
Actually OP_SELP doesn't need to be a compare instruction. Instead we
just need to set the NOT modifier when building the instruction.
While we are at it, fix the dst register type and use a GPR.

Suggested by Ilia Mirkin.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-02-22 21:28:51 +01:00
Mark Janes
08b408311c vulkan: fix out-of-tree builds 2016-02-22 11:31:15 -08:00
Brian Paul
9de3b0273d svga: unbind index buffer when drawing non-indexed primitives
Silences a warning reported by the svga3d device.

v2: also null-out the index buffer pointer

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-02-22 12:14:48 -07:00
Kristian Høgsberg Kristensen
f843aabdd4 intel/genxml: Add README
I've had people ask about the design of the pack functions, for example,
why aren't we using bitfields. I wrote up a bit of background on why and
how we ended up with the current design and we might as well keep that
with the code.
2016-02-22 09:14:25 -08:00
Nanley Chery
7b2c63a53c anv/meta_blit: Handle compressed textures in anv_CmdCopyImage
As with anv_CmdCopyBufferToImage, compressed textures require special
handling during copies.

Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-02-22 09:04:28 -08:00
Ilia Mirkin
571bd9ac42 mesa: add GL_EXT_texture_border_clamp support
This extension is identical to GL_OES_texture_border_clamp. But dEQP has
tests that want the EXT variant.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-02-22 10:38:56 -05:00
Ilia Mirkin
b6654831c3 mesa: add GL_OES_texture_border_clamp support
Only minor differences to the existing ARB_texture_border_clamp support.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-02-22 10:38:56 -05:00
Ilia Mirkin
af8ad49541 mesa: bump version
11.2 has been branched, we're on 11.3 now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-02-22 10:38:37 -05:00
Jason Ekstrand
f49ba0f7d8 nir/spirv: Add support for multisampled textures 2016-02-21 22:02:38 -08:00
Jason Ekstrand
f1dddeadc2 anv: Fix a typo in apply_dynamic_offsets
shader->num_uniforms is in terms of bytes in i965.
2016-02-20 21:24:31 -08:00
Jason Ekstrand
b5868d2343 anv: Zero out the WSI array when initializing the instance 2016-02-20 19:30:14 -08:00
Jason Ekstrand
bc696f1db6 isl: Stop including mesa/main/imports.h
It pulls in all sorts of stuff we don't want.
2016-02-20 10:35:25 -08:00
Jason Ekstrand
853fc3e431 genxml: Add mote includes in the generated headers 2016-02-20 09:33:20 -08:00
Jason Ekstrand
1f1cf6fcb0 anv: Get rid of GENX_FUNC
It was a bad idea.
2016-02-20 09:12:38 -08:00
Jason Ekstrand
371b4a5b33 anv: Switch over to the macros in genxml 2016-02-20 09:09:28 -08:00
Jason Ekstrand
0d76aa9485 intel/genxml: Add a couple of helper headers 2016-02-20 08:35:36 -08:00
Jason Ekstrand
2b85807458 genxml: Stop using unicode in the pack generator
This causes python problems and problems when people don't have a locale
set properly in their shell.
2016-02-19 08:05:35 -08:00
Dave Airlie
1375cb3c27 anv: fix warning about unused width variable.
We don't use width outside the debug clause here.
2016-02-19 08:01:54 -08:00
Jason Ekstrand
698ea54283 anv/pipeline: Fix a typo in the pipeline layout code 2016-02-18 13:55:57 -08:00
Jason Ekstrand
d5bb23156d anv/allocator: Set is_winsys_bo to false for block pool BOs 2016-02-18 13:55:57 -08:00
Mark Janes
1b37276467 vulkan: fix out-of-tree build
We need to be able to find the generated gen*pack.h headers.

Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-18 12:30:27 -08:00
Jason Ekstrand
e0565f40ea anv/pipeline: Use nir's num_images for allocating image_params 2016-02-18 11:44:26 -08:00
Jason Ekstrand
79c0781f44 nir/gather_info: Count textures and images 2016-02-18 11:42:36 -08:00
Jason Ekstrand
e881c73975 anv/pipeline: Don't leak the binding map 2016-02-18 11:09:30 -08:00
Jason Ekstrand
8c23392c26 anv/formats: Don't use a compound literal to initialize a const array
Doing so makes older versions of GCC rather grumpy.  Newere GCC fixes this,
but using a compound literal isn't really gaining us anything anyway.
2016-02-18 10:44:08 -08:00
Jason Ekstrand
9851c8285f Move the intel vulkan driver to src/intel/vulkan 2016-02-18 10:37:59 -08:00
Jason Ekstrand
47b8b08612 Move isl to src/intel 2016-02-18 10:34:47 -08:00
Jason Ekstrand
f6d9587688 vulkan: Move XML and generator into src/intel/genxml 2016-02-18 10:30:29 -08:00
Kristian Høgsberg Kristensen
542c38df36 anv/meta: Initialize blend state for the right attachment
We were always initializing only RT 0. We need to initialize the RT
we're creating the clear pipeline for.
2016-02-18 10:22:50 -08:00
Kristian Høgsberg Kristensen
05f75a3026 anv/meta: Don't use the blit ds layout in resolve code 2016-02-18 10:22:50 -08:00
Jason Ekstrand
40c76d4efa Delete nir_lower_samplers.cpp
Somehow, in one of the merges with mesa master, the old file must have been
kept when nir_lower_samplers.cpp was moved to nir_lower_samplers.c.
2016-02-17 20:16:11 -08:00
Jason Ekstrand
005b9ac758 anv: Gut anv_pipeline_layout
Almost none of the data in anv_pipeline_layout is used anymore thanks to
doing real layout in the pipeline itself.
2016-02-17 18:04:40 -08:00
Kristian Høgsberg Kristensen
c2581a9375 anv: Build the real pipeline layout in the pipeline
This gives us the chance to pack the binding table down to just what the
shaders actually need.  Some applications use very large descriptor sets
and only ever use a handful of entries.  Compacted binding tables should be
much more efficient in this case.  It comes at the down-side of having to
re-emit binding tables every time we switch pipelines, but that's
considered an acceptable cost.
2016-02-17 18:04:39 -08:00
Jason Ekstrand
581e4468f9 nir/spirv: Add some more capabilities 2016-02-17 18:04:39 -08:00
Jason Ekstrand
fed8b7f817 anv/pipeline: Delete out-of-bounds fragment shader outputs 2016-02-17 18:04:39 -08:00
Jason Ekstrand
979732fafc nir: Add a helper for getting the one function from a shader 2016-02-17 18:04:39 -08:00
Jason Ekstrand
8c05b44bbb nir: Add a nir_foreach_variable_safe helper 2016-02-17 18:04:39 -08:00
Jason Ekstrand
d67d84f5e5 i965/nir: Do lower_io late for fragment shaders 2016-02-17 18:04:39 -08:00
Jason Ekstrand
7c26d8d471 anv/gen7_pipeline: Set WriteDisable = true if we have no color attachments 2016-02-17 18:04:39 -08:00
Jason Ekstrand
9f9cd3de44 anv/gen8_pipeline: Default color attachments to WriteDisable = true 2016-02-17 18:04:39 -08:00
Jason Ekstrand
da9fd74d34 anv: Pull StencilBufferWriteEnable from both sides 2016-02-17 18:04:39 -08:00
Nanley Chery
9963af8bbd anv: Ignore unused dimensions in vkCreateImage's anv_image
We ignore unused dimensions in the isl surface; do the same for the
resulting anv_image.

Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2016-02-17 17:32:26 -08:00
Kristian Høgsberg Kristensen
b8da261dc7 spirv: Fix SpvOpFwidth, SpvOpFwidthFine and SpvOpFwidthCoarse
"Result is the same as computing the sum of the absolute values of
    OpDPdx and OpDPdy on P."

We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.
2016-02-17 15:28:52 -08:00
Kristian Høgsberg Kristensen
ae3e249d57 anv: Remove hacky PIPE_CONTROL in vkCmdEndRenderPass()
The vkCmdPipelineBarrier() command should work as intended now and we
need to pull the plug on this old hack.
2016-02-17 15:19:07 -08:00
Kristian Høgsberg Kristensen
5e92e91c61 anv: Rework vkCmdPipelineBarrier()
We don't need to look at the stage flags, as we don't really support any
fine-grained, stage-level synchronization. We have to do two
PIPE_CONTROLs in case we're both flushing and
invalidating.

Additionally, if we do end up doing two PIPE_CONTROLs, the first,
flusing one also has to stall and wait for the flushing to finish, so we
don't re-dirty the caches with in-flight rendering after the second
PIPE_CONTROL invalidates.
2016-02-17 15:18:06 -08:00
Kristian Høgsberg Kristensen
3b9b908054 anv: Ignore unused dimensions in vkCreateImage
We would assert on unused dimensions (eg extent.depth for
VK_IMAGE_TYPE_2D) not being 1, but the specification doesn't put any
constraints on those. For example, for VK_IMAGE_TYPE_1D:

   "If imageType is VK_IMAGE_TYPE_1D, the value of extent.width must be
    less than or equal to the value of
    VkPhysicalDeviceLimits::maxImageDimension1D, or the value of
    VkImageFormatProperties::maxExtent.width (as returned by
    vkGetPhysicalDeviceImageFormatProperties with values of format,
    type, tiling, usage and flags equal to those in this structure) -
    whichever is higher"

We'll fix up the arguments to isl to keep isl strict in what it expects.
2016-02-17 12:21:51 -08:00
Kristian Høgsberg Kristensen
b63e28c0e1 anv: Set correct write domain on window system BOs
We need to make sure GEM understands that we're writing to the BO, in
case it needs to synchronize with other rings (blitter use in display
server, for example).
2016-02-17 11:19:56 -08:00
Kristian Høgsberg Kristensen
5caa995c32 Revert "anv: Disable snooping for allocator pools again"
This reverts commit c136672c59.

We still have the intermittent missing flush for VkEvent in certain
vulkancts cases:

  piglit.deqp-vk.api.command_buffers.execute_large_primary
  piglit.deqp-vk.api.command_buffers.submit_count_non_zero,

Let's reenable the snooping until we figure out the root cause.
2016-02-16 23:23:49 -08:00
Kristian Høgsberg Kristensen
ecc67f1aac anv: Make driver and icd file installable
Change the name of the .so to libvulkan_intel.so and add an installable
icd with the installed paths.  Keep the icd file with build-tree paths,
but rename to dev_icd.json to make it clear that it's for development
purposes.
2016-02-16 23:23:17 -08:00
Kristian Høgsberg Kristensen
4a2d17f606 anv: Revise PhysicalDeviceFeatures and remove FINISHME 2016-02-16 15:43:12 -08:00
Philipp Zabel
ecd1d94d1c anv: pCreateInfo->pApplicationInfo parameter to vkCreateInstance may be NULL
Fix a NULL pointer dereference in anv_CreateInstance in case
the pApplicationInfo field of the supplied VkInstanceCreateInfo
structure is NULL [1].

[1] https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html#VkInstanceCreateInfo

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
2016-02-16 14:42:26 -08:00
Jason Ekstrand
48087cfc4e anv/icd.json: Update the ABI version 2016-02-16 08:02:17 -08:00
Jason Ekstrand
0a3324e66c anv: Pull Khronos stuff from the README 2016-02-16 07:43:21 -08:00
Kristian Høgsberg Kristensen
a3672a241b anv/genxml: Include MBO bits for gen7 and gen75 2016-02-15 17:57:03 -08:00
Kristian Høgsberg Kristensen
c2b2ebf1ed anv: Add missing gen75_cmd_buffer_set_subpass() prototype 2016-02-15 17:40:15 -08:00
Adam Jackson
80ec20351c anv: Bump to 1.0.3
Probably this should be picked up from <vulkan.h> directly, or we should
just assume that any 1.0.x is legal.
2016-02-15 17:38:26 -08:00
Kristian Høgsberg Kristensen
b53edea76c anv/gen7: Make disabling the FS work
We disable the fragment shader for depth/stencil-only pipelines. This
commit makes that work for gen7.
2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
85f67cf16e anv: Deduplicate render pass code
This lets us share the renderpass code and depth/stencil state code
between gen 7 and gen 8.
2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
ac4fd0ed21 anv/gen7: Fix pipeline selection in init_device_state()
We need the 3D pipeline for the initial setup, not GPGPU.
2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
ea694637ac anv/gen7: Set 3DSTATE_SF depth buffer format correctly
We need to pull this from the render pass information at state flush
time.
2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
18dd59538b anv/gen7: Call flush_pipeline_select_3d() from CmdBeginRenderPass 2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
832f73f512 anv: Share flush_pipeline_select_3d() between gen7 and gen8 2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
53eaa0a6b8 anv: Fix warning 3DSTATE_VERTEX_ELEMENTS setup
This is a little more subtle. If elem_count is 0, nothing else happens
in this function, so we return early to avoid warning about
uninitialized 'p'.
2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen
5d72d7b12d anv: Fix misc simple warnings 2016-02-15 17:32:07 -08:00
Jason Ekstrand
08ecd8a8d1 anv/meta_resolve: Set origin_upper_left on gl_FragCoord
It's required by the spec and any shaders that don't set it will be broken.
I'm not really sure how multisampling was even working before...
2016-02-15 12:45:03 -08:00
Jason Ekstrand
88042b9f10 nir: Get rid of the C++ NIR_SRC/DEST_INIT macros
These were originally added to reduce compiler warnings but aren't really
needed.  Getting rid of them reduces the diff between the Vulkan branch and
master, so we might as well.
2016-02-12 21:35:02 -08:00
Kristian Høgsberg Kristensen
c136672c59 anv: Disable snooping for allocator pools again
The race we were seeing on cherryview was caused by the multi-submit
problem with fences. We can now turn snooping off again an rely on
clflush and we intended.
2016-02-12 15:11:31 -08:00
Kristian Høgsberg Kristensen
b0c30b77d4 anv: Submit fence bo only after all command buffers
We were submitting the fence bo after each command buffer in a multi
command buffer submit, causing us to occasionally complete the fence too
early.
2016-02-12 15:08:09 -08:00
Kristian Høgsberg Kristensen
39a120aefe anv: Implement VkPipelineCache
We hash the input SPIR-V, specialization constants, entrypoint and the
shader key using SHA1 to determine a unique identifier for the
combination. A VkPipelineCache is then a hash table mapping these
identifiers to the corresponding prog_data and kernel data.
2016-02-12 11:53:49 -08:00
Chad Versace
03bea8fda7 anv/meta_blit: Remove references to clearing
Long ago, the blit code used to handle clearing and blitting.

- Fix any comments that refer to clearing.
- Rename shader var 'attr' to 'tex_pos'. The name 'attr' is an artifact
  of the time when the shader was used for blitting as well as clearing.
2016-02-12 11:29:29 -08:00
Chad Versace
97b5a07378 anv/meta_blit: Coalesce glsl_vec4_type vars
Just a refactor. No behavior change.

Several expressions have the same value: they point to
glsl_vec4_type(). Coalesce them into a single variable.
2016-02-12 11:29:29 -08:00
Jason Ekstrand
699f21216f anv/device: clflush simple batches if !LLC 2016-02-12 11:00:42 -08:00
Jason Ekstrand
42155abdd7 anv: Add a clfush_range helper function 2016-02-12 11:00:08 -08:00
Jason Ekstrand
3c8dc1afd1 nir/spirv/glsl: Clean up the row-skipping swizzle logic a bit 2016-02-12 10:40:39 -08:00
Chad Versace
37f4dfb19d anv/meta: Move blit code to anv_meta_blit.c
The clear code lived in anv_meta_clear.c. The resolve code in
anv_meta_resolve.c. Only the blit code lived in anv_meta.c, alongside
the shareed meta code.

This is just a copy-paste patch. No change in behavior.
2016-02-12 09:56:24 -08:00
Chad Versace
cf7fd53850 anv/meta: Hardcode smooth texcoord interpolation in blit shaders
Trivial cleanup. No change in behavior.

Function argument 'attr_flat', in anv_meta.c:build_nir_vertex_shader(),
was always false.
2016-02-12 09:15:58 -08:00
Jason Ekstrand
ea93041ccc anv/device: Use a normal BO in submit_simple_batch 2016-02-11 21:39:15 -08:00
Jason Ekstrand
3a2b23a447 anv: Add a vk_icdGetInstanceProcAddr entrypoint
Aparently there are some issues in symbol resolution if an application
packages its own loader and you have a system-installed one.  I don't
really understand the details, but it's not onorous to add.
2016-02-11 21:20:12 -08:00
Jason Ekstrand
25b09d1b5d anv/event: Use a 64-bit value
The immediate write from PIPE_CONTROL is 64-bits at least on BDW.  This
used to work on 64-bit archs because the compiler would align the following
anv_state struct up for us.  However, in 32-bit builds, they overlap and it
causes problems.
2016-02-11 19:00:56 -08:00
Jason Ekstrand
3086c5a5e1 gen8/pipeline: Properly set bits in PS_EXTRA for W, depth, and samaple mask 2016-02-11 15:22:18 -08:00
Jason Ekstrand
4016619931 nir/spirv: Allow the clip distance capability. 2016-02-11 15:14:46 -08:00
Jason Ekstrand
da4a6bbbea gen8/pipeline: Pull gs_vertex_count from prog_data 2016-02-11 15:13:54 -08:00
Jason Ekstrand
ff8895ba56 Merge remote-tracking branch 'mesa-public/master' into vulkan 2016-02-11 15:09:30 -08:00
Kristian Høgsberg Kristensen
2009e304f7 anv/pack: Handle case where a struct field covers multiple dwords
We also didn't add start to field.end to get the absolute field end
position.
2016-02-10 22:36:38 -08:00
Jason Ekstrand
f710f3ca37 Merge remote-tracking branch 'mesa-public/master' into vulkan
This also reverts commit 1d65abfa58 because
now NIR handles texture offsets in a much more sane way.
2016-02-10 17:12:11 -08:00
Jason Ekstrand
7ef3e47c27 Merge commit '85f5c18fef1ff2f19d698f150e23a02acd6f59b9' into vulkan 2016-02-10 17:09:56 -08:00
Kristian Høgsberg Kristensen
d2623a3247 anv: Handle dwords that are all MBZ correctly
A few packets have dwords in them that are all MBZ and we failed to
write those. This change makes sure we iterate through all dwords and
write them all.
2016-02-10 16:36:47 -08:00
Kristian Høgsberg Kristensen
09bb7ea4b7 anv: Fix out-of-tree build
We need to be able to find the generated nir_opcodes.h header.
2016-02-10 15:54:28 -08:00
Kristian Høgsberg Kristensen
9cc939d82f nir: Fix out-of-tree build for spirv2nir
This needs to be able to find the generated nir_opcodes.h header.
2016-02-10 15:54:28 -08:00
Jason Ekstrand
9be5a4bc29 nir/spirv: Fix handling of OpGroupMemberDecorate
We were pulling the member index from the wrong dword
2016-02-10 15:36:42 -08:00
Jason Ekstrand
ac04c6de2c nir/spirv: Assert that struct member ids are in-bounds 2016-02-10 15:36:41 -08:00
Mark Janes
8179834030 nir/spirv: fix build_mat_subdet stack smasher
The sub-determinate implementation pattern fixed by
6a7e2904e0 has a second instance in the
same file.

With the previous algorithm, when row and j are both 3, the index
overruns the array.  This only impacts the stack on 32 bit builds.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-10 14:43:03 -08:00
Kristian Høgsberg Kristensen
51c01e292c anv: Generate pack headers from XML definition
This huge commit switches us over to using a simple xml format (genxml)
for defining our command streamer commands and a python script for
generating the pack headers we use in the driver.
2016-02-10 14:31:26 -08:00
Jason Ekstrand
09b3e30dc6 anv: Fix up spirv for new texture/sampler split stuff 2016-02-09 16:48:36 -08:00
Jason Ekstrand
b14f4c1fd3 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in the separate texture/sampler stuff from upstream
2016-02-09 16:47:37 -08:00
Jason Ekstrand
e01dd59b73 vtn: Use const_index helpers 2016-02-09 16:32:38 -08:00
Jason Ekstrand
e15f7551d1 anv/apply_pipeline_layout: Use the new const_index helpers 2016-02-09 16:32:38 -08:00
Jason Ekstrand
768bd7f272 Merge commit '8b0fb1c152fe191768953aa8c77b89034a377f83' into vulkan
This pulls in Rob Clark's const_index changes for NIR
2016-02-09 15:30:39 -08:00
Chad Versace
4c5dcccfba anv/image: Fix usage for depthstencil images
The tests assertion-failed in vkCmdClearDepthStencilImage because the
isl surface lacked ISL_SURF_USAGE_DEPTH_BIT.

Fixes: https://gitlab.khronos.org/vulkan/mesa/issues/26
Fixes: dEQP-VK.pipeline.timestamp.transfer_tests.host_stage_with_clear_depth_stencil_image_method
Fixes: dEQP-VK.pipeline.timestamp.transfer_tests.transfer_stage_with_clear_depth_stencil_image_method
2016-02-09 12:54:30 -08:00
Chad Versace
c5e521f391 anv/image: Refactor choose_isl_surf_usage()
- Rename local var isl_flags -> isl_usage.
- Fix comment.
2016-02-09 12:54:30 -08:00
Chad Versace
2f4bb00c2b anv/image: Fix choose_isl_surf_usage()
Don't translate VkImageCreateInfo::usage into an isl_surf_usage bitmask.
Instead, translate anv_image::usage, which is a superset of
VkImageCreateInfo::usage.

For-Issue: https://gitlab.khronos.org/vulkan/mesa/issues/26
2016-02-09 12:54:04 -08:00
Chad Versace
bdab29a312 isl: Add more assertions to isl_surf_get_depth_format()
R16_UNORM and R32_FLOAT are illegal formats for interleaved depthstencil
surfaces.
2016-02-09 11:40:08 -08:00
Jason Ekstrand
1d65abfa58 nir/spirv: Better handle constant offsets in texture lookups 2016-02-09 10:29:05 -08:00
Jason Ekstrand
209820739b nir/spirv: Set the vtn_mode and interface type for sampler parameters 2016-02-09 10:29:05 -08:00
Jason Ekstrand
de6c9c5f2e nir/inline_functions: Don't shadown variables when it isn't needed
Previously, in order to get things working, we just always shadowed
variables.  Now, we rewrite derefs whenever it's safe to do so and only
shadow if we have an in or out variable that we write or read to
respectively.
2016-02-09 10:29:05 -08:00
Jason Ekstrand
b6c00bfb03 nir: Rework function parameters 2016-02-09 10:29:05 -08:00
Jason Ekstrand
a485567d3a anv/WSI/X11: Use the right allocator for freeing swapchains 2016-02-09 10:29:05 -08:00
Chad Versace
e6d3432c81 anv: Replace anv_format::depth_format with ::has_depth
isl now understands depth formats. We no longer need depth formats in
the anv_format table.
2016-02-09 10:02:50 -08:00
Chad Versace
0a93067993 isl: Add func isl_surf_get_depth_format()
For depth surfaces, it gets the value for
3DSTATE_DEPTH_BUFFER.SurfaceFormat.
2016-02-09 10:02:50 -08:00
Chad Versace
4d037b551e anv: Rename anv_format::surface_format -> isl_format
Because that's what it is, an isl format.
2016-02-09 10:02:50 -08:00
Francisco Jerez
cec6fe2ad8 vtn: Clean up acos implementation.
Parameterize build_asin() on the fit coefficients so the
implementation can be shared while still using different polynomials
for asin and acos.  Also switch back to implementing acos in terms of
asin -- The improvement obtained from cancelling out the pi/2 terms
was negligible compared to the approximation error.
2016-02-08 15:23:43 -08:00
Francisco Jerez
f50a651726 nir/spirv: Create integer types of correct signedness.
vtn_handle_type() creates a signed type regardless of the value of the
signedness flag, which usually doesn't make much of a difference
except when the type is used as base sampled type of an image type,
what will cause the base type of the NIR image variable to be
inconsistent with its format and cause an assertion failure in the
back-end (most likely only reproducible on Gen7), and may change the
semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).
2016-02-08 15:23:35 -08:00
Kristian Høgsberg Kristensen
6c4c04690f anv: Deduplicate dispatch calls
This can all be shared between gen8+ and pre-gen8.
2016-02-05 22:36:53 -08:00
Kristian Høgsberg Kristensen
bdefaae2b9 anv: Deduplicate anv_CmdDraw calls
These were all duplicated between gen7_cmd_buffer.c and
gen8_cmd_buffer.c. This commit consolidates both copies in
genX_cmd_buffer.c.
2016-02-05 16:41:56 -08:00
Kristian Høgsberg Kristensen
6cdada0360 anv: Move invariant state to small initial batch
We use the simple batch helper to submit a batch at driver startup time
which holds all the state that never changes.  We don't have a whole lot
and once we enable tesselation there'll be even less. Even so, it's a
simple mechanism and reduces our steady state batch sizes a bit.
2016-02-05 16:13:53 -08:00
Kristian Høgsberg Kristensen
c9c3344c4f anv: Split out batch submit helper from anv_DeviceWaitIdle
We'll reuse this mechanism in the next commit.
2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen
381d85545a anv: Share scratch_space helper between gen7 and gen8+
The gen7 pipeline has a useful helper function for this, let's use it in
gen8_pipeline.c too.  The gen7 function has an off-by-one bug though: we
have to compute log2(size / 1024) - 1, but we divide by 2048 instead so
as to avoid the case where size is less than 1024 and we'd return -1.
2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen
d1617dbec3 anv: Share URB setup between gen7 and gen8+ 2016-02-05 16:13:52 -08:00
Jason Ekstrand
9401516113 Merge remote-tracking branch 'mesa-public/master' into vulkan 2016-02-05 15:21:11 -08:00
Jason Ekstrand
741744f691 Merge commit mesa-public/master into vulkan
This pulls in the patches that move all of the compiler stuff around
2016-02-05 15:03:44 -08:00
Jason Ekstrand
9645b8eb1f Merge branch mesa-public/master into vulkan 2016-02-05 14:21:13 -08:00
Chad Versace
3eebf3686b anv: Drop anv_image::needs_*_surface_state
anv_image::needs_sampler_surface_state was a redundant member,
identical to (usage & VK_IMAGE_USAGE_SAMPLED_BIT). Likewise for the
other needs_* members.
2016-02-04 12:20:51 -08:00
Chad Versace
42b9320fbf anv/image: Rename nonrt_surface_state
Let's call it what it is, not what it is not. Rename it to
'sampler_surface_state'.
2016-02-04 12:20:51 -08:00
Jason Ekstrand
1f5d56304f anv/descriptor_set: Fix descriptor copies
We weren't pulling the actual binding location information out of the set
layout.  The new code mirrors the descriptor write code.
2016-02-03 22:44:33 -08:00
Mark Janes
6a7e2904e0 nir/spirv: fix build_mat4_det stack smasher
When generating a sub-determinate matrix, a 3-element swizzle array was
indexed with clever inline boolean logic.  Unfortunately, when i and j
are both 3, the index overruns the array, smashing the next variable on
the stack.

For 64 bit builds, the alignment of the 3-element unsigned array leaves
32 bits of spacing before the next local variable, hiding this bug.  On
i386, a subcolumn pointer was smashed then dereferenced.
2016-02-02 15:30:54 -08:00
Mark Janes
ea8c2d118a anv: Fix anv_descriptor_set reference error on deletion
anv_descriptor_set_destroy uses the descriptor sets's set_layout member
to iterate the set's buffer views.  However, the set_layout reference
may have previously been freed.

On 64 bit builds, this bug generated valgrind errors but did not affect
CTS test results.  On 32 bit builds, it reliably produces assertions and
memory corruption.
2016-02-02 15:28:01 -08:00
Kristian Høgsberg Kristensen
5a06bac4a0 anv: Use @LIB_DIR@ in anv_icd.json
Otherwise we may get a lib vs lib64 mismatch.
2016-02-02 14:36:22 -08:00
Jason Ekstrand
fd99f3d658 anv/device: Improve version error reporting 2016-02-02 13:16:13 -08:00
Jason Ekstrand
c7f26bbed9 vulkan: Bump the header to 1.0.3 2016-02-02 13:08:47 -08:00
Jason Ekstrand
0d2145b50f anv/fence: Default to not ready
This is kind-of silly.  We *really* need to do a better job of making sure
all objects have all their default values set.  We probably also want to,
eventually, put everything into the BO (to save memory) and, more
specifically, make the GPU write the "ready" flag.  That way GetFenceStatus
won't ever have to call into the kernel.
2016-02-02 12:22:03 -08:00
Mark Janes
ac0589b213 i965: fix unsigned long overflows for i386
bit-shifts on 32 bit unsigned longs overflow in several places.  The
intention was for 64 bit integers to be used.
2016-02-01 14:52:22 -08:00
Jason Ekstrand
8776d3cb8e nir/spirv: Fix UBO loads of a single element of a row-major matrix 2016-02-01 14:03:05 -08:00
Jason Ekstrand
499f7c2f0b nir/spirv: Handle the LOD parameter of OpImageQuerySizeLod 2016-02-01 14:03:05 -08:00
Jason Ekstrand
b1a1623293 nir/spirv: Add support for SpvOpImage 2016-02-01 14:03:05 -08:00
Jason Ekstrand
593f88c0db nir/spirv: Fix the UBO loading case of a single row-major matric column 2016-02-01 14:03:05 -08:00
Jason Ekstrand
abc0e5c1b8 nir/spirv: Fix the UBO loading case of a single row-major matric column 2016-02-01 13:26:59 -08:00
Jason Ekstrand
2d2c6fc6bb anv/wsi/wayland: Advertise sRGB 2016-02-01 13:06:35 -08:00
Jason Ekstrand
443c578bca anv/wsi/x11: Expose SRGB all the time
After a long discussion with Eric Anholt and Owen Taylor, I learned that
X11 is basically always sRGB as that's what the scanout hardware does and X
doesn't modify anything.  Therefore, we should just always expose sRGB
formats.
2016-02-01 13:06:35 -08:00
Chad Versace
afb327a985 anv: Structify a one-member union
anv_descriptor contained a union with one member.
2016-02-01 12:18:10 -08:00
Kristian Høgsberg Kristensen
dc5fdcd6b7 anv: Advertise robustBufferAccess
The GPU does most of this for us as long as we set up tight bounds for
the buffers, which we do. Additionally, we range check dynamically
buffers in the shader. With that it's safe to turn on robustBufferAccess.
2016-02-01 12:00:05 -08:00
Chad Versace
ffbc32f8d9 anv/meta: Strip trailing whitespace 2016-02-01 10:51:01 -08:00
Chad Versace
aa5e257860 anv: Update MSAA status in README 2016-02-01 10:46:24 -08:00
Jason Ekstrand
a88b1eeb13 Update the README 2016-02-01 06:10:51 -08:00
Jason Ekstrand
ea63663a72 wsi/x11: Remove B8G8R8_UNORM
We don't actually support that format yet because ISL doesn't have an enum
for it.  We need to beef up the formats table to allow for tiled-only
formats.
2016-02-01 06:00:50 -08:00
Jordan Justen
f96a6c65a3 anv/gen7: Rename gen7_batch_lr* to emit_lr*
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 15:06:03 -08:00
Jordan Justen
b207a6b5aa anv/gen7: Set BypassGatewayControl in MEDIA_VFE_STATE
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 15:06:03 -08:00
Jordan Justen
2d8726a4b7 anv/genX_pipeline: Remove unnecessary #include files
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:30:54 -08:00
Jordan Justen
8e48ff3ad6 anv/gen7: Set SLM size in interface descriptor
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:10:54 -08:00
Jordan Justen
ab0d8608d2 anv: Support MEDIA_VFE_STATE for gen7
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:08:34 -08:00
Jordan Justen
dd2effb0e7 anv/gen7: Subtract 1 from num_elements when setting up buffer surface state
e8f51fe4 for gen7

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:00:00 -08:00
Jordan Justen
4bb1e7937a anv/gen7: Disable fs dispatch for depth/stencil only pipelines
292031a for gen7

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:00:00 -08:00
Jordan Justen
f5b3a2fe32 anv/gen7: Add support for gl_NumWorkGroups
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:00:00 -08:00
Jordan Justen
7e46cc8603 anv/gen7/compute: Setup push constants and local ids
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 09:00:00 -08:00
Jordan Justen
b1158ced45 anv/genX: Add genX_pipeline.c for compute_pipeline_create
Adds initial compute_pipeline_create implementation for gen7.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-30 08:58:11 -08:00
Jason Ekstrand
1a442a7923 Merge branch 'vulkan' into 'vulkan'
Vulkan WSI Wayland fixes

Two small fixes to make mailbox mode actually work again.

See merge request !4
2016-01-30 10:28:12 -05:00
Jason Ekstrand
c668dc9f75 anv/pass: Initialize has_resolve 2016-01-30 07:16:33 -08:00
Jason Ekstrand
ad813b072a anv/wsi: Set the platform field of VkIcdSurfaceBase 2016-01-30 07:05:53 -08:00
Jason Ekstrand
5acc4e2ebf anv/wsi/x11: Actually pull information from the window's visual 2016-01-30 03:51:47 -08:00
Jason Ekstrand
66e8b5cf2b anv/wsi/x11: Actually check for DRI3 2016-01-30 03:50:31 -08:00
Jason Ekstrand
44ec860cd6 anv/WSI: Support more usage bits
They're just images and we have no intention of stompping alpha channels
(at least not yet), so there's no reason why you can't sample.
2016-01-29 20:52:44 -08:00
Jason Ekstrand
337c1e0871 anv/formats: Add more compressed formats
This adds support for the DX compression formats.  Given that ETC and EAC
are working fine, these should be ok too.
2016-01-29 20:46:31 -08:00
Jason Ekstrand
c688e4db11 anv/wsi: Rework to be compatable with the loader 2016-01-29 20:39:21 -08:00
Jason Ekstrand
d4953fb340 vulkan: Import vk_icd.h 2016-01-29 20:37:45 -08:00
Jason Ekstrand
a19ceee46c anv/device: Fix version check
The bottom-end check was wrong so it was only working on <= 1.0.0.  Oops.
2016-01-29 20:36:58 -08:00
Kristian Høgsberg Kristensen
f28645f71c anv: Don't disable snooping for mempools
There's an intermittent flushing problem with VkEvent that we need to
root cause. For now, using the snooping feature keeps the memory pools
up to date with GPU writes and fixes the problem.
2016-01-29 17:19:51 -08:00
Kristian Høgsberg Kristensen
0c4ef36360 anv: clflush is only orderered against mfence
We can't use the more fine-grained load and store fence commands (lfence
and mfence), since clflush is only guaranteed to be ordered with respect
to mfence.
2016-01-29 14:56:41 -08:00
Kristian Høgsberg Kristensen
31d3486bd2 anv: Limit flushing to the range of mapped memory 2016-01-29 14:56:41 -08:00
Ben Widawsky
89ec36f221 anv/cmd_buffer: Emit gen9 style SF state for CHV
The state for line width changes on Cherryview to use the GEN9 bits (for extra
precision).
2016-01-29 14:12:32 -08:00
Ben Widawsky
31508bd0ce anv/gen8: Extract SF state
For upcoming patch to address difference in Cherryview.
2016-01-29 14:11:53 -08:00
Chad Versace
f8a4abcd15 anv: Do resolves at end of subpass 2016-01-28 10:49:50 -08:00
Chad Versace
bef8456ede anv/meta: Remove unneeded resolve pipeline
Vulkan does not allow resolving a single-sample image. So remove that
pipeline from anv_meta_state::resolve::pipelines.
2016-01-28 10:45:11 -08:00
Chad Versace
ac5594fa71 anv/meta_resolve: Remove redundant initialization params 2016-01-28 10:14:39 -08:00
Chad Versace
142da00486 anv: Drop const on anv_framebuffer::attachments
The attachments should be const, but the driver's function signatures
are generally not const-friendly.

Drop the const because it conflicts with upcoming
anv_cmd_buffer_resolve_subpass().
2016-01-28 10:03:00 -08:00
Chad Versace
22258e279d anv: Add anv_subpass::has_resolve
Indicates that the subpass has at least one resolve attachment.
2016-01-28 10:03:00 -08:00
Chad Versace
3d863e8dad anv/meta_resolve: Save/Restore viewport and scissor 2016-01-28 10:03:00 -08:00
Chad Versace
8487569fa7 anv/meta_resolve: Begin pass outside emit_resolve()
This refactor is preparation for handling subpass resolve attachments.
2016-01-28 10:03:00 -08:00
Chad Versace
2bab3cd681 anv/image: Update usage flags for multisample images
Meta resolves multisample images by binding them as textures. Therefore
we must add VK_IMAGE_USAGE_SAMPLED_BIT.
2016-01-28 10:03:00 -08:00
Jason Ekstrand
608b411e9f anv/device: Add a better version check.
We now check that the requested version is precicely within the range of
versions that we support.
2016-01-28 08:19:40 -08:00
Jason Ekstrand
6286a74f6b anv/device: Advertise 1.0.2 2016-01-27 22:02:51 -08:00
Jason Ekstrand
ec80d6388a anv/formats: Properly set FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
This was added last minute and the API bumped to 1.0.2.
2016-01-27 22:02:06 -08:00
Jason Ekstrand
ac75746448 vulkan.h: Update to 1.0.2 2016-01-27 21:59:00 -08:00
Jason Ekstrand
c64bc5463d anv/device: Improve the api version check to allow 1.0.X 2016-01-27 21:56:46 -08:00
Francisco Jerez
4604b2871a vtn: Improve accuracy of acos approximation.
The adjusted polynomial coefficients come from the numerical
minimization of the L2 norm of the relative error.  The old
coefficients would give a maximum relative error of about 15000 ULP in
the neighborhood around acos(x) = 0, the new ones give a relative
error bounded by less than 2000 ULP in the same neighborhood.
2016-01-27 19:55:21 -08:00
Jason Ekstrand
7fb35a8228 An alternate arccosine implementation 2016-01-27 19:55:21 -08:00
Jason Ekstrand
983db2b804 anv/meta_resolve: Fix a bug in the meta pipeline destroy path 2016-01-27 19:48:43 -08:00
Chad Versace
9b240a1e3d anv/skl: Fix crash in 16x multisampling
We built meta clear and resolve pipelines for only up to 8x samples.
There were no 16x pipelines.
2016-01-27 18:38:15 -08:00
Chad Versace
61d3d49820 anv: Fix comment for anv_meta_state arrays
Array element i is for 2^i samples, not log2(i) samples.
2016-01-27 18:32:05 -08:00
Ben Widawsky
2af3281fee anv/push constants: Use constant buffer #2
SKL has a workaround which requires either some weird programming of buffer 3,
OR, just never using buffer 0. Since we don't actually use multiple constant
buffers, it's easier to just not use 0.

Only SKL requires this workaround, but there is no harm in applying it to all
platforms. The big change here is that buffer #0 is relative to dynamic state
base normally (depending upon ISTPM), where buffer 1-3 is a GPU virtual address.
2016-01-27 17:09:36 -08:00
Chad Versace
5d4f3298ae anv/meta: Implement multisample clears 2016-01-27 17:01:59 -08:00
Chad Versace
eb6fb65fd1 anv/meta: Simplify failure handling during clear init
Remove all the fine-grained cleanup in
anv_device_init_meta_clear_state(). Instead, if anything fails during
initialization, simply call anv_device_finish_meta_clear_state() and let
it clean up the partially initialized anv_meta_state::clear.
2016-01-27 17:01:56 -08:00
Chad Versace
4085f1f230 anv/meta: Implement vkCmdResolveImage
This handles multisample color images that have a floating-point or
normalized format and have a single array layer.

This does not yet handle integer formats nor multisample array images.
2016-01-27 16:55:59 -08:00
Chad Versace
8cc1e59d61 anv/meta: Add func anv_meta_get_iview_layer()
This function is just meta_blit_get_dest_view_base_array_slice(), but
moved to the shared header anv_meta.h.

Will be needed by anv_meta_resolve.c.
2016-01-27 16:52:30 -08:00
Chad Versace
8cc6f058ce anv/gen8: Begin enabling pipeline multisample state
As far as I can tell, this patch sets all pipeline multisample state
except:
    - alpha to coverage
    - alpha to one
    - the dispatch count for per-sample dispatch
2016-01-27 16:52:27 -08:00
Chad Versace
57e4a5ea99 anv/gen8: Set multisample surface state 2016-01-27 16:48:20 -08:00
Chad Versace
9b3d660878 anv/meta: Merge anv_meta_clear.h into anv_meta.h
The header was too small.
2016-01-27 16:48:20 -08:00
Kenneth Graunke
32e4c5ae30 vtn: Make tanh implementation even stupider
The dEQP "precision" test tries to verify that the reference functions

   float sinh(float a) { return ((exp(a) - exp(-a)) / 2); }
   float cosh(float a) { return ((exp(a) + exp(-a)) / 2); }
   float tanh(float a) { return (sinh(a) / cosh(a)); }

produce the same values as the built-ins.  We simplified away the
multiplication by 0.5 in the numerator and denominator, and apparently
this causes them not to match for exactly 1 out of 13,632 values.

So, put it back in, fixing the test, but making our code generation
(and precision?) worse.
2016-01-27 15:34:50 -08:00
Jason Ekstrand
8f0ef9bbeb nir/opt_algebraic: Use a more elementary mechanism for lowering ldexp 2016-01-27 15:21:28 -08:00
Jason Ekstrand
f7d6b8ccfe gen8/state: Fix QPitch for compressed textures on Broadwell 2016-01-27 15:12:42 -08:00
Jason Ekstrand
162c662585 anv/image: Use the entire image height for compressed meta blits 2016-01-27 15:12:42 -08:00
Nanley Chery
235abfb7e6 anv/image: Enlarge the image level 0 extent
The extent previously was supposed to match the mip at a given level
under the assumption that the base address would be that of the mip
as well.

Now however, the base address only matches the offset of the
containing tile. Therefore, enlarge the extent to match that of
phys_slice0, so that we don't draw/fetch in out of bounds territory.

This solution isn't perfect because the base adress isn't always at
the first tile, therefore the assumed valid memory region by the HW
contains some number of invalid tiles on two edges.
2016-01-27 15:12:42 -08:00
Jason Ekstrand
96cf5cfee1 anv/image: Minify before dividing by block dimensions 2016-01-27 15:12:42 -08:00
Jason Ekstrand
1bea1eff38 anv/meta: Don't double-call choose_buffer_format
This fixes all the renderpass tests
2016-01-27 15:12:42 -08:00
Nanley Chery
dd22b5c914 anv/meta: Modify make_image_for_buffer()'s image
Always use a valid buffer format and convert the extent to units of
elements with respect to original image format.
2016-01-27 15:12:42 -08:00
Nanley Chery
d3c1fd53e2 anv/image: Use custom VkBufferImageCopy for iview initialization
Use a custom VkBufferImageCopy with the user-provided struct as
the base. A few fields are modified when the iview is uncompressed
and the underlying image is compressed.
2016-01-27 15:12:42 -08:00
Nanley Chery
6a579ded87 anv: Add offset parameter to anv_image_view_init()
This is the offset of the tile that contains the mip specified by
isl_surf_get_image_intratile_offset_el(). Used to draw to/from the
specified mip.
2016-01-27 15:12:42 -08:00
Nanley Chery
4a0075feeb anv/meta: Calculate mip offset for compressed texture
This value will be used in a later commit.
2016-01-27 15:12:42 -08:00
Nanley Chery
1c87cb51be anv/meta: Disambiguate slice variable value
This will simplify the usage of
isl_surf_get_image_intratile_offset_el().
2016-01-27 15:12:42 -08:00
Nanley Chery
8c0c25abde gen8_state: use iview extent to program RENDER_SURFACE_STATE
When creating an uncompressed ImageView on an compressed Image, the
SurfaceFormat is updated to match the ImageView's. The surface
dimensions must also change so that the HW sees the same size image
instead of a 4x larger one.

Fixes the following error which results from running many VulkanCTS
compressed tests in one shot:
  ResourceError (vk.queueSubmit(queue, 1, &submitInfo, *m_fence):
  VK_ERROR_OUT_OF_DEVICE_MEMORY at
  vktPipelineImageSamplingInstance.cpp:921)

Makes all compressed format tests with a height > 1 pass.
2016-01-27 15:12:42 -08:00
Nanley Chery
3f01bbe7f3 anv/image: Scale iview extent by backing image
Aligns with formula's presented in Vulkan spec concerning CopyBufferToImage.
18.4 Copying Data Between Buffers and Images

This won't conflict with valid API usage, because:
1) Users are not allowed to create an uncompressed ImageView with a
compressed Image.
see: VkSpec - 11.5 Image Views - VkImageViewCreateInfo's Valid Usage box

2) If users create a differently formatted compressed ImageView with a
compressed Image, the block dimensions will still match.
see: VkSpec - 28.3.1.5 Format Compatibility Classes - Table 28.5
2016-01-27 15:12:42 -08:00
Nanley Chery
010ab34839 anv/meta: Set depth to 0 for buffer image in CopyBufferToImage()
The buffer image is a flat 2D surface. Each surface represents an
array/depth layer, therefore, the Z-offset is 0 when blitting.
2016-01-27 15:12:42 -08:00
Nanley Chery
2fb8b859f6 anv/meta: Use the uncompressed rectangle when blitting
For an uncompressed ImageView of a compressed Image, the
dimensions and offsets are all divided by the appropriate
block dimensions.

We are not yet using an uncompressed ImageView for a
compressed Image, but will do so in a future commit.
2016-01-27 15:12:42 -08:00
Nanley Chery
c3546685ed i965: Update the surface_format table for ETC formats
Enable ETC support for BDW+. In Vulkan, an array lookup on
surface_format[] is used to determine HW support for certain
formats. In contrast, Mesa dynamically populates an array
which reports this information.
2016-01-27 15:12:42 -08:00
Nanley Chery
308ec0279b anv/image: Update usages of isl_surf_get_image_offset_sa 2016-01-27 15:12:42 -08:00
Nanley Chery
02629a16d1 isl: Add logical z offset to GEN4_2D surfaces
3D surfaces in Skylake are stored with ISL_DIM_LAYOUT_GEN4_2D. Any
delta in the logical z offset causes an equivalent delta in the
surface's array layer.
2016-01-27 15:12:42 -08:00
Chad Versace
a6ecfe1dd3 isl/tests: Add some tests for intratile offsets
Test isl_surf_get_image_intratile_offset_el() in the tests:
  test_bdw_2d_r8g8b8a8_unorm_512x512_array01_samples01_noaux_tiley0
  test_bdw_2d_r8g8b8a8_unorm_1024x1024_array06_samples01_noaux_tiley0
2016-01-27 15:12:42 -08:00
Chad Versace
7ab0d2e2c0 isl: Add func isl_get_intratile_image_offset_el() 2016-01-27 15:12:42 -08:00
Chad Versace
18a83eaa8c isl/tests: Rename t_assert_offset()
Rename it to t_assert_offset_el(), clarifying that the offset in units
of surface elements, not samples.
2016-01-27 15:12:42 -08:00
Chad Versace
fa08f95ff5 isl: Add func isl_surf_get_image_offset_el()
This replaces function isl_surf_get_image_offset_sa()
2016-01-27 15:12:42 -08:00
Chad Versace
ea44d31528 isl: Fix row pitch for compressed formats
When calculating row pitch, the row's width in samples must be divided
by the format's block width. The commit below accidentally removed the
division.

    commit eea2d4d059
    Author: Chad Versace <chad.versace@intel.com>
    Date:   Tue Jan 5 14:28:28 2016 -0800
    Subject: isl: Don't align phys_slice0_sa.width twice
2016-01-27 15:12:42 -08:00
Chad Versace
45ecfcd637 isl: Add func isl_surf_get_tile_info() 2016-01-27 15:12:42 -08:00
Kenneth Graunke
9f954310e8 vtn: Fix atan2 for non-scalars.
The if/then/else block was bogus, as it can only take a scalar
condition, and we need to select component-wise.  The GLSL IR
implementation of atan2 handles this by looping over components,
but I decided to try and do it vector-wise, and messed up.

For now, just bcsel.  It means that we do the atan1 math even if
all components hit the quick case, but it works, and presumably
at least one component will hit the expensive path anyway.
2016-01-27 15:07:42 -08:00
Kenneth Graunke
f92a35d831 vtn: Fix Modf.
We were botching this for negative numbers - floor of a negative rounds
the wrong way.  Additionally, both results are supposed to retain the
sign of the original.

To fix this, just take the abs of both values, then put the sign back.
There's probably a better way to do this, but this works for now.
2016-01-27 14:21:08 -08:00
Kenneth Graunke
4acfc9effb i965: Fix SIN/COS precision problems.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-01-27 13:56:54 -08:00
Kristian Høgsberg Kristensen
b833e7a63c anv: Put back code to grow shader scratch space
This was lost in commit a71e614d33.
2016-01-27 11:36:56 -08:00
Kenneth Graunke
38a3a535eb anv: Update the device limits.
Fixes dEQP-VK.api.info.device.properties.  I haven't tested any others.
2016-01-26 23:09:45 -08:00
Jason Ekstrand
d3607351fe gen7/cmd_buffer: SCISSOR_RECT structs are tightly packed
The pointer has to be 32-byte aligned, but the structs themselves are 2
dwords each, tightly packed.
2016-01-26 22:10:14 -08:00
Jason Ekstrand
f2f03c5b65 anv/pipeline: Set MaximumVPIndex in 3DSTATE_CLIP 2016-01-26 21:52:59 -08:00
Jason Ekstrand
dc3de6f8df anv/pipeline: Only lower input indirects if EmitNoIndirectInput is set 2016-01-26 21:45:21 -08:00
Jason Ekstrand
9ac624751e anv/formats: Use is_power_of_two instead of is_rgb to determine renderability 2016-01-26 20:29:16 -08:00
Jason Ekstrand
2af3acd061 HACK/i965/surface_formats: Mark A4B4G4R4 as being supported
The table has this marked as unsupported on all gens, but I don't really
believe that given how early it is in the table.  I've tested and it seems
to work on Broadwell.  The Bspec says that it sould be renderable on SKL+
but alpha blending is questionable.

Side note: We really need to audit the format table again.
2016-01-26 20:29:16 -08:00
Jordan Justen
c20f78dc5d anv: Support swizzled formats.
Some formats require a swizzle in order to map them to actual hardware
formats.  This allows us to turn on two new Vulkan formats.
2016-01-26 20:29:16 -08:00
Jason Ekstrand
9bc72a9213 anv/image: Do swizzle remapping in anv_image.c
TODO: At some point, we really need to make an image_view_init_info that's
a flyweight and stop stuffing everything into image_view.
2016-01-26 20:23:59 -08:00
Jason Ekstrand
7d84fe9b1f HACK: Expose support for stencil blits
If someone actually tries to use them, they won't work, but at least we
don't fail to return format properties now.
2016-01-26 17:29:49 -08:00
Kenneth Graunke
32dcfc953e vtn: Delete references to IMix opcode.
This is being removed in SPIR-V.

Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15452
2016-01-26 17:02:35 -08:00
Ben Widawsky
c5dc6cdf26 i965/skl: Utilize new 5th bit for gateway messages
Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
2016-01-26 15:44:48 -08:00
Jason Ekstrand
a1ea45b857 genX/pipeline: Don't make vertex bindings with holes 2016-01-26 15:44:18 -08:00
Jason Ekstrand
7ef0d39cb2 anv/cmd_buffer: Put base_instance in the second component 2016-01-26 15:44:02 -08:00
Francisco Jerez
6840cc1513 anv/image: clflush surface state map in anv_fill_buffer_surface_state().
Some of its users had the required clflush on non-LLC platforms, some
didn't.  Put the clflush in anv_fill_buffer_surface_state() so we
don't forget.
2016-01-26 15:14:50 -08:00
Francisco Jerez
fc7a7b31c5 anv/image: clflush the right state map in anv_fill_image_surface_state().
It was clflushing the nonrt_surface_state structure regardless of
which state structure was actually being initialized.
2016-01-26 15:14:50 -08:00
Francisco Jerez
a50dc70e21 anv/image: Upload raw buffer surface state for untyped storage image and texel buffer access. 2016-01-26 15:14:50 -08:00
Francisco Jerez
d2ec510dda anv/image: Fix image parameter initialization. 2016-01-26 15:14:50 -08:00
Francisco Jerez
d9e0b9a06a isl/gen9: Fix slice offset calculation for 1D array images.
The X component of the offset is set to the layer index times layer
height which is obviously bogus, return the vertical offset of the
slice as Y component instead.  Fixes a few image load/store tests that
use 1D arrays on SKL when forcing it to fall back to untyped reads and
writes.
2016-01-26 15:14:50 -08:00
Jason Ekstrand
cc065e0ad7 i965/fs_surface_builder: Mask signed integers after conversion 2016-01-26 15:14:50 -08:00
Jason Ekstrand
ba393c9d81 anv/image: Actually fill out brw_image_param structs 2016-01-26 15:14:50 -08:00
Jason Ekstrand
aa9987a395 anv/image_view: Add base mip and base layer fields
These will be needed by image_load_store
2016-01-26 15:14:50 -08:00
Jason Ekstrand
42cd994177 gen7: Add support for base vertex/instance 2016-01-26 14:56:37 -08:00
Jason Ekstrand
4bf3cadb66 gen8: Add support for base vertex/instance 2016-01-26 14:56:37 -08:00
Jason Ekstrand
6ba67795db nir/spirv: Add proper support for InstanceIndex 2016-01-26 14:56:37 -08:00
Jason Ekstrand
1c3b7fe1ee nir/lower_io: Lower INSTNACE_INDEX 2016-01-26 14:56:37 -08:00
Jason Ekstrand
b2b7c93318 glsl/enums: Add an enum for Vulkan instance index 2016-01-26 14:56:37 -08:00
Jason Ekstrand
da75492879 genX/pipeline: Break emit_vertex_input out into common code
It's mostly the same and contains some non-trivial logic, so it really
should be shared.  Also, we're about to make some modifications here that
we would really like to share.
2016-01-26 14:56:37 -08:00
Kristian Høgsberg Kristensen
fe6ccb6031 anv: Remove long unused anv_aub.h 2016-01-26 14:53:00 -08:00
Kristian Høgsberg Kristensen
074a7c7d7c anv: Dirty fragment shader descriptors in meta restore
We need to reemit render targets, so dirtying VK_SHADER_STAGE_VERTEX_BIT
doesn't help us much.
2016-01-26 14:44:02 -08:00
Kristian Høgsberg Kristensen
725d969753 anv: Reemit STATE_BASE_ADDRESS after second level cmd buffers
Otherwise the primary batch will continue using the state base addresses
set by the secondary.  Fixes remaining renderpass tests.
2016-01-26 14:44:02 -08:00
Chad Versace
df5f6d824b anv/meta: Fix sample mask in clear pipelines
Once we begin emitting the correct sample mask,
genX_3DSTATE_SAMPLE_MASK_pack will hit an assertion if the mask contains
too many bits.
2016-01-26 11:04:44 -08:00
Jason Ekstrand
725fb3623f i965/compiler: Set nir_options.vertex_id_zero_based 2016-01-25 16:10:28 -08:00
Jason Ekstrand
6b6a8a99f8 HACK/i965: Default to scalar GS on BDW+ 2016-01-25 15:52:53 -08:00
Jason Ekstrand
e462d4d815 Merge remote-tracking branch 'mattst88/nir-lower-pack-unpack' into vulkan 2016-01-25 15:50:31 -08:00
Jason Ekstrand
6bbf3814dc gen7/state: Apply min/mag filters individually for samplers
This fixes tests which apply different min and mag filters, and depend on
the min filter to be correct.
2016-01-25 15:33:08 -08:00
Ben Widawsky
9c69f4632d gen8/state: Apply min/mag filters individually for samplers
This fixes tests which apply different min and mag filters, and depend on the
min filter to be correct.
2016-01-25 15:29:18 -08:00
Jason Ekstrand
2434ceabf4 i965/fs: Feel free to spill partial reads/writes
Now that we properly handle write-masking, this should be safe.
2016-01-25 15:23:10 -08:00
Jason Ekstrand
9c0109a1f6 i965/fs: Properly write-mask spills
For unspills (scratch reads), we can just set WE_all all the time because
we always unspill into a new GRF.  For spills, we have two options: If the
instruction has a 32-bit-per-channel destination and "normal" regioning,
then we just do a regular write and it will interleave channels from
different control-flow paths properly.  If, on the other hand, the the
regioning is non-normal, then we have to unspill, run the instruction, and
spill afterwards.  In this second case, we need to do the spill with
we_ALL.
2016-01-25 15:23:10 -08:00
Kristian Høgsberg Kristensen
8e07f7942e anv: Remove a few finished finishme 2016-01-25 15:16:13 -08:00
Kristian Høgsberg Kristensen
76c096f0e7 anv: Remove stale assert
This goes back to when we didn't have the subpass number in the command
buffer begin info.
2016-01-25 15:15:59 -08:00
Matt Turner
874ede4983 i965/gen7+: Use NIR for lowering of pack/unpack opcodes. 2016-01-25 14:48:34 -08:00
Matt Turner
5deba3f00a i965/vec4: Implement nir_op_pack_uvec2_to_uint.
And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced
by lowering pack[SU]norm4x8 which the vec4 backend does not need.
2016-01-25 14:24:07 -08:00
Matt Turner
8bb22dc351 nir: Add lowering support for unpacking opcodes. 2016-01-25 14:24:07 -08:00
Matt Turner
d7781038f5 nir: Add lowering support for packing opcodes. 2016-01-25 14:24:07 -08:00
Matt Turner
6c1b3bc950 i965/fs: Implement support for extract_word.
The vec4 backend will lower it.
2016-01-25 14:24:07 -08:00
Matt Turner
26f0444ead nir: Add opcodes to extract bytes or words.
The uint versions zero extend while the int versions sign extend.
2016-01-25 14:24:07 -08:00
Nanley Chery
2c94f659e8 anv/meta: Fix CopyBuffer when size matches HW limit
Perform a copy when the copy_size matches the HW limit (max_copy_size).
Otherwise the current behavior is that we fail the following assertion:

      assert(height < max_surface_dim);

because the values are equal.
2016-01-25 12:26:39 -08:00
Kristian Høgsberg Kristensen
c21de2bf04 anv: Don't use uninitialized barycentric_interp_modes
If we don't have a fragment shader, wm_prog_data in undefined.
2016-01-25 11:34:32 -08:00
Kristian Høgsberg Kristensen
292031a1a5 anv: Disable fs dispatch for depth/stencil only pipelines
Fixes most renderpass bugs.
2016-01-25 11:26:19 -08:00
Matt Turner
26b2cc6f3a glsl: Remove 2x16 half-precision pack/unpack opcodes.
i965/fs was the only consumer, and we're now doing the lowering in NIR.
2016-01-25 11:12:36 -08:00
Matt Turner
24d385f85c i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 lowering. 2016-01-25 11:11:56 -08:00
Matt Turner
5eb1145434 nir: Add lowering of nir_op_unpack_half_2x16. 2016-01-25 11:11:56 -08:00
Matt Turner
84166aed92 i965: Make separate nir_options for scalar/vector stages.
We'll want to have different lowering options set for scalar/vector
stages.
2016-01-25 11:11:26 -08:00
Matt Turner
b6bb3b9bcd i965: Move brw_compiler_create() to new brw_compiler.c.
A future patch will want to use designated initalizers, which aren't
available in C++, but this is C.
2016-01-25 11:11:25 -08:00
Matt Turner
b126039784 nir: Make argument order of unop_convert match binop_convert.
Strangely the return and parameter types were reversed.
2016-01-25 11:11:08 -08:00
Jason Ekstrand
a804d82ef6 anv/cmd_buffer: Zero out binding tables and samplers in state_reset
This fixes a use of an undefined value if the client uses push constants in
a stage without ever setting any descriptors on GEN8-9.
2016-01-22 22:57:05 -08:00
Jason Ekstrand
9e0bc29f80 nir/opcodes: Properly flush denormals in fquantize2f16 2016-01-22 22:18:31 -08:00
Jason Ekstrand
89672d81f3 i965/nir: Properly flush denormals in nir_op_fquantize2f16 2016-01-22 22:18:31 -08:00
Jason Ekstrand
2bfb9f29b8 anv/format: Add a helpful comment about format names 2016-01-22 19:14:41 -08:00
Jason Ekstrand
259e1bdf79 anv/formats: Add support for 3 more formats 2016-01-22 19:03:27 -08:00
Jason Ekstrand
0b6c1275d0 anv/pipeline: Add a default L3$ setup 2016-01-22 19:02:55 -08:00
Chad Versace
99a4885328 anv/formats: Rename ambiguous func parameter
vkGetPhysicalDeviceImageFormatProperties has multiple 'flags'
parameters.
2016-01-22 17:51:24 -08:00
Chad Versace
149d5ba64d anv/formats: Advertise multisample formats
Teach vkGetPhysicalDeviceImageFormatProperties() to advertise
multisampled formats.
2016-01-22 17:50:15 -08:00
Chad Versace
d96d78c3b6 anv/image: Drop assertion that samples == 1 2016-01-22 17:19:57 -08:00
Chad Versace
fda074b23f isl: Fix gen8_choose_msaa_layout()
Gen8 requires any Y tiling, not any *standard* Y tiling.
2016-01-22 17:19:57 -08:00
Chad Versace
2fa1f745ea isl: Add func isl_tiling_is_any_y() 2016-01-22 17:19:57 -08:00
Chad Versace
fa5f45e8aa anv/meta: Assert correct sample counts for blit funcs
Add assertions to:
    anv_CmdBlitImage
    anv_CmdCopyImage
    anv_CmdCopyImageToBuffer
    anv_CmdCopyBufferToImage
2016-01-22 17:19:57 -08:00
Chad Versace
dfcb4ee6df anv: Add anv_image::samples
It's set but not yet used.
2016-01-22 17:19:57 -08:00
Chad Versace
1c5d7b38e2 anv: Use isl_device_get_sample_counts()
Use it in vkGetPhysicalDeviceProperties.
2016-01-22 17:19:57 -08:00
Chad Versace
14b753f666 isl: Add func isl_device_get_sample_counts() 2016-01-22 17:19:57 -08:00
Nanley Chery
d4de918ad0 gen8/state: Remove SKL special-casing for MinimumArrayElement
MinimumArrayElement carries the same meaning for BDW and SKL.
Suggested by Jason.

No regressions in dEQP-VK.pipeline.image.view_type.cube_array.*
Fixes a number of cube tests, including cube_array_base_slice
and cube_base_slice tests.
2016-01-22 17:10:14 -08:00
Chad Versace
6a03c69adb anv/state: Dedupe code for lowering surface format
Add helper anv_surface_format().
2016-01-22 16:49:17 -08:00
Francisco Jerez
11d5c1905c anv/meta: Set sampler type and instruction arrayness consistently in blit shader. 2016-01-22 16:43:18 -08:00
Francisco Jerez
bf151b8892 anv/meta: Fix meta blit fragment shader for 1D arrays. 2016-01-22 16:43:15 -08:00
Jason Ekstrand
53b83899e0 genX/state: Set CubeSurfaceControlMode to OVERRIDE
This makes it act like the address mode is set to TEXCOORDMODE_CUBE
whenever this sampler is combined with a cube surface.  This *should* be
what we need for Vulkan.  Interestingly, the PRM contains a programming
note for this field that says simply, "This field must be set to
CUBECTRLMODE_PROGRAMMED".  However, emprical evidence suggests that it does
what the PRM says it does and OVERRIDE is just fine.
2016-01-22 16:34:13 -08:00
Jason Ekstrand
35879fe829 gen8/state: Divide depth by 6 for cube maps for GEN8
For Broadwell cube maps, MinimumArrayElement is in terms of 2d slices (a
multiple of 6) but Depth is in terms of whole cubes.
2016-01-22 16:14:54 -08:00
Nanley Chery
3cd8c0bb04 gen8_state: Enable all cube faces
These fields are ignored for non-cube surfaces. For cube surfaces
these fields should be enabled when using TEXCOORDMODE_CLAMP and
TEXCOORDMODE_CUBE.

TODO: Determine if these are the only two modes used in Vulkan.
2016-01-22 16:12:52 -08:00
Jason Ekstrand
107a109d1c isl/format_layout: R11G11B10_FLOAT is unsigned 2016-01-22 11:57:49 -08:00
Jason Ekstrand
e5558ffa64 anv/image: Move common code to anv_image.c 2016-01-22 11:57:01 -08:00
Jason Ekstrand
84612f4014 anv/state: Refactor surface state setup into a "fill" function 2016-01-22 11:40:56 -08:00
Francisco Jerez
448285ebf2 anv/state: Add missing clflushes for storage image surface state. 2016-01-22 11:12:09 -08:00
Francisco Jerez
d533c3796d anv/state: Factor out surface state calculation from genX_image_view_init.
Some fields of the surface state template were dependent on the
surface type, which is dependent on the usage of the image view, which
wasn't known until the bottom of the function after the template had
been constructed.  This caused failures in all image load/store CTS
tests using cubemaps.  Refactor the surface state calculation into a
function that is called once for each required usage.
2016-01-22 11:12:09 -08:00
Jason Ekstrand
16780632c2 i965/nir: Temporariliy disable mul+add fusion
We don't want to do this in the long-run but it's needed for passing the
NoContraction tests at the moment.  Eventually, we want to plumb this
through NIR properly.
2016-01-22 11:10:54 -08:00
Chad Versace
d9abbbe0d8 isl: Fix indentation of isl_format_layout comment 2016-01-22 09:48:11 -08:00
Chad Versace
65f3c420c3 isl/tests: Give tests less cryptic names 2016-01-22 09:46:48 -08:00
Chad Versace
f9d4d09549 isl: Fix isl_surf_get_image_offset_sa for gen4_3d layout
Bug found by unit test
test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0.
2016-01-22 09:45:22 -08:00
Chad Versace
891ed5ca8c isl/tests: Add test for bdw 3d surface
test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0

Currently fails.
2016-01-22 09:45:21 -08:00
Chad Versace
fbc87ce4be isl/tests: Remove copy-paste assertion 2016-01-22 07:18:04 -08:00
Chad Versace
63d999b762 isl/tests: Fix build
isl_device_init() acquired a new param for bit6 swizzling.
2016-01-22 07:17:57 -08:00
Francisco Jerez
2e54381622 anv/batch_chain: Fix patching up of block pool relocations on Gen8+.
Relocations are 64 bits on Gen8+.  Most CTS tests that send
non-trivial work to the GPU would fail when run from a single deqp-vk
invocation because they were effectively relying on reloc presumed
offsets to be wrong so the kernel would come and apply relocations
correctly.
2016-01-21 16:30:44 -08:00
Jason Ekstrand
13aaf90048 nir/spirv: Ignore cull distance 2016-01-21 16:20:39 -08:00
Jason Ekstrand
13858a1c1a nir/lower_system_values: Use the correct invication id for CS 2016-01-21 16:20:39 -08:00
Jason Ekstrand
d8c0e0805b nir/spirv: Properly assign locations to split structures 2016-01-21 16:20:39 -08:00
Jason Ekstrand
514507825c nir/spirv: Improve handling of variable loads and copies
Before we were asuming that a deref would either be something in a block or
something that we could pass off to NIR directly.  However, it is possible
that someone would choose to load/store/copy a split structure all in one
go.  We need to be able to handle that.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
7e5e64c8a9 nir/spirv: Make vectors a proper array time with an array_element
This makes dealing with single-component derefs easier
2016-01-21 16:20:39 -08:00
Jason Ekstrand
a8af0f536c nir/spirv: Rework access chains a bit to allow for literals
This makes them much easier to construct because you can also just specify
a literal number and it doesn't have to be a valid SPIR-V id.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
5d9a6fd526 vtn/variables: Compact local loads/stores into one function
This is similar to what we did for block loads/stores.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
b298743d7b nir/spirv: Add an actual variable struct to spirv_to_nir
This allows us, among other things, to do structure splitting on-the-fly to
more correctly handle input/output structs.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
2892693d56 nir/spirv: Split variable handling out into its own file
It's 1300 lines all by itself and it will only grow.
2016-01-21 16:20:39 -08:00
Jason Ekstrand
1112bf633f nir/spirv: Rework access chains
Previously, we were creating nir_deref's immediately.  Now, instead, we
have an intermediate vtn_access_chain structure.  While a little more
awkward initially, this will allow us to more easily do structure splitting
on-the-fly.
2016-01-21 16:18:37 -08:00
Kenneth Graunke
824f776355 nir/spirv: Implement ModfStruct opcode. 2016-01-21 14:57:47 -08:00
Kenneth Graunke
f89d5cb807 nir/spirv: Delete stray fmod remnants.
Jason left these stray code fragments in
22804de110.
2016-01-21 14:54:20 -08:00
Kristian Høgsberg Kristensen
ac60e98a58 vk: Do render cache flush for GEN8+
This is needed for SKL as well.
2016-01-21 14:18:52 -08:00
Kristian Høgsberg Kristensen
9eab8fc683 vk: Emit surface state base address before renderpass
If we're continuing a render pass, make sure we don't emit the depth and
stencil buffer addresses before we set the state base addresses.

Fixes crucible func.cmd-buffer.small-secondaries
2016-01-21 14:18:52 -08:00
Kristian Høgsberg Kristensen
c5490d0277 vk: Fix indirect push constants
This currently sets the base and size of all push constants to the
entire push constant block. The idea is that we'll use the base and size
to eventually optimize the amount we actually push, but for now we don't
do that.
2016-01-21 11:10:11 -08:00
Kristian Høgsberg Kristensen
83c86e09a8 Merge remote-tracking branch 'jekstrand/wip/i965-uniforms' into vulkan 2016-01-21 11:09:58 -08:00
Jordan Justen
b1a7a27d60 nir/spirv: Handle compute shared atomics
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
a7e5b683ca nir/spirv: Support workgroup (shared) variable translation
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
bc035db3c8 anv/gen8: Set SLM size in interface descriptor
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
819cb69434 anv/gen8+9: Invalidate color calc state when switching to the GPGPU pipeline
Port 044acb9256 to anv.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
19830031cb anv/gen8: Enable SLM in L3 cache control register
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
97b09a9268 anv/pipeline: Set size of shared variables in prog_data
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
86daceb7f2 i965/nir: Lower nir compute shader shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
ca55817fa1 nir: Lower shared var atomics during nir_lower_io
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
36157cd5ea nir: Add support for lowering load/stores of shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
7a9a54b5c8 nir: Add atomic operations on variables
This allows us to first generate atomic operations for shared
variables using these opcodes, and then later we can lower those to
the shared atomics intrinsics with nir_lower_io.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
10db985fa0 nir: Add compute shader shared variable storage class
Previously we were receiving shared variable accesses via a lowered
intrinsic function from glsl. This change allows us to send in
variables instead. For example, when converting from SPIR-V.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
65a5407931 nir/print: Add space after shader_storage var mode
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Jordan Justen
9f4a72c9e3 i965/fs/nir: Move shared variable load/store to nir_emit_cs_intrinsic
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-21 00:31:29 -08:00
Chad Versace
5ce5a7d021 anv/image: Stop including gen8_pack.h in common file 2016-01-20 15:42:17 -08:00
Chad Versace
8ab527de03 isl: Add a README
Most of the file-level comment in isl.h is moved to the README.
2016-01-20 15:24:40 -08:00
Kristian Høgsberg Kristensen
7b7a7c2bfc vk: Make maxSamplerAllocationCount more reasonable
We can't allocate 4 billion samplers. Let's go with 64k.
2016-01-20 14:36:52 -08:00
Kristian Høgsberg Kristensen
8ef002dd7a vk/tests: Add stub for anv_gem_get_bit6_swizzle() 2016-01-20 13:47:40 -08:00
Kristian Høgsberg Kristensen
420e8664cb vk/tests: Add isl include path 2016-01-20 13:47:40 -08:00
Kenneth Graunke
b76e4458f9 nir/spirv/glsl450: Use fabs not iabs in ldexp.
This was just wrong.
2016-01-20 12:18:02 -08:00
Kristian Høgsberg Kristensen
947ebd9c71 isl: Add ish.h to libsil_la_SOURCES 2016-01-20 12:03:46 -08:00
Jason Ekstrand
21b2d87408 nir/spirv/glsl450: Implement FrexpStruct 2016-01-20 11:36:41 -08:00
Jason Ekstrand
c7896d1868 spirv/nir/glsl450: Use vtn_create_ssa_value to create SSA values 2016-01-20 11:36:26 -08:00
Jason Ekstrand
e45748bade anv/device: Default to scalar GS on BDW+ 2016-01-20 11:16:44 -08:00
Jason Ekstrand
34f9a5f301 nir/spirv: Pull texture dimensionality out of the image when available 2016-01-20 11:11:30 -08:00
Jason Ekstrand
59ef7c6507 anv/meta: fix UpdateBuffer in the case where we do multiple updates 2016-01-20 07:56:48 -08:00
Jason Ekstrand
a0516cfbac anv/meta: Fix a finishme 2016-01-20 07:33:41 -08:00
Jason Ekstrand
c7203aa621 nir/spirv: Move OpPhi handling to vtn_cfg.c
Phi handling is somewhat intrinsically tied to the CFG.  Moving it here
makes it a bit easier to handle that.  In particular, we can now do SSA
repair after we've done the phi node second-pass.  This fixes 6 CTS tests.
2016-01-19 19:00:00 -08:00
Jason Ekstrand
891564adb9 nir/spirv: Handle OpLine and OpNoLine in foreach_instruction
This way we don't have to explicitly handle them everywhere.
2016-01-19 19:00:00 -08:00
Kenneth Graunke
e79f8a4926 nir: Lower ldexp to arithmetic.
This is a port of Matt's GLSL IR lowering pass to NIR.  It's required
because we translate SPIR-V directly to NIR, bypassing GLSL IR.

I haven't introduced a lower_ldexp flag, as I believe all current NIR
consumers would set the flag.  i965 wants this, vc4 doesn't implement
this feature, and st_glsl_to_tgsi currently lowers ldexp
unconditionally anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-01-19 18:10:30 -08:00
Kenneth Graunke
b3cc10f3b2 nir: Let nir_opt_algebraic rules contain unsigned constants > INT_MAX.
struct.pack('i', val) interprets `val` as a signed integer, and dies
if `val` > INT_MAX.  For larger constants, we need to use 'I' which
interprets it as an unsigned value.

This patch makes us use 'I' for all values >= 0, and 'i' for negative
values.  This should work in all cases.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-01-19 18:10:30 -08:00
Jason Ekstrand
eb2a119da2 anv/meta: Implement UpdateBuffer 2016-01-19 16:53:35 -08:00
Jason Ekstrand
0ae1bd321e anv/meta: Implement CmdFillBuffer 2016-01-19 16:53:35 -08:00
Jason Ekstrand
46eef31311 anv/meta_clear: Call emit_clear directly in ClearImage
Using the load op means that we end up with recursive meta.  We shouldn't
be doing that.
2016-01-19 16:53:35 -08:00
Jason Ekstrand
6325a75011 anv/meta_clear: Do save/restore in actual entry points 2016-01-19 16:53:35 -08:00
Jason Ekstrand
56dbf13045 anv: Add support for VK_WHOLE_SIZE several places 2016-01-19 16:53:35 -08:00
Kenneth Graunke
549be68258 nir/spirv/glsl450: Implement Frexp. 2016-01-19 16:46:03 -08:00
Kenneth Graunke
68c9ca1a94 nir/spirv/glsl450: Blindly implement Atan2.
This is untested and probably broken.

We already passed the atan2 CTS tests before implementing this opcode.
Presumably, glslang or something was giving us a plain Atan opcode
instead of Atan2.  I don't know why.
2016-01-19 16:14:05 -08:00
Kenneth Graunke
2ab3efa0ad nir/spirv/glsl450: Implement Atan. 2016-01-19 16:14:05 -08:00
Kenneth Graunke
bc9d9bc2e3 nir/spirv/glsl450: Implement Asin and Acos. 2016-01-19 16:14:05 -08:00
Jason Ekstrand
5e57a87dcf anv/pipeline: Fix point size 2016-01-19 12:03:13 -08:00
Daniel Stone
f9ca780ea4 anv/wsi: Mark Wayland buffers as busy
We were diligently setting Wayland buffers as non-busy, but nowhere in
the code did we set them to busy when submitted to the server. This
meant that acquire_next_image would only ever find the same buffer in
a loop, over and over.

Signed-off-by: Daniel Stone <daniels@collabora.com>
2016-01-19 16:54:55 +00:00
Daniel Stone
ba5ef49dcb anv/wsi: Avoid stuck Wayland connection
In acquire_next_image, we are waiting for a wl_buffer::release to arrive
and release one of the buffers in our swapchain. Most compositors don't
explicitly flush release events, so we may need to perform a roundtrip
instead, to ensure the event arrives.

Signed-off-by: Daniel Stone <daniels@collabora.com>
2016-01-19 16:54:55 +00:00
Jason Ekstrand
3276610ea6 getX/state: Set LOD pre-clamp to OpenGL mode
This gets us another couple hundred sampler tests
2016-01-18 17:51:35 -08:00
Jason Ekstrand
580b2e85e4 isl/device: Add a flag for bit 6 swizzling 2016-01-18 17:21:05 -08:00
Jason Ekstrand
587842a0ca anv/gem: Add a helper for getting bit6 swizzling information 2016-01-18 17:21:05 -08:00
Jason Ekstrand
c2a6f4302e nir/spirv: Patch through image qualifiers 2016-01-18 17:21:05 -08:00
Jason Ekstrand
56c8a5f2b8 nir/spirv: Implement ImageQuerySize for storage iamges
SPIR-V only has one ImageQuerySize opcode that has to work for both
textures and storage images.  Therefore, we have to special-case that one a
bit and look at the type of the incoming image handle.
2016-01-18 17:21:05 -08:00
Jason Ekstrand
bb8cadd169 nir/spirv: Insert movs around image intrinsics
Image intrinsics always take a vec4 coordinate and always return a vec4.
This simplifies the intrinsics a but but also means that they don't
actually match the incomming SPIR-V.  In order to compensate for this, we
add swizzling movs for both source and destination to get the right number
of components.
2016-01-18 17:21:05 -08:00
Jason Ekstrand
6f956b0b22 anv/meta: Improve meta clear cleanup a bit 2016-01-18 14:07:46 -08:00
Jason Ekstrand
45d17fcf9b anv: Misc allocation scope fixes 2016-01-18 14:04:13 -08:00
Jason Ekstrand
378af64e30 anv/meta: Add a meta allocator that uses SCOPE_DEVICE
The Vulkan spec requires all allocations that happen for device creation to
happen with SCOPE_DEVICE.  Since meta calls into other things that allocate
memory, the easiest way to do this is with an allocator.
2016-01-18 14:03:24 -08:00
Jason Ekstrand
3dfa6a881c anv/meta: Initialize a handle to null 2016-01-18 13:05:02 -08:00
Jason Ekstrand
d49298c702 gen8: Fix border color
The border color packet is specified as a 64-byte aligned address relative
to dynamic state base address.  The way the packing functions are currently
set up, we need to provide it with (offset >> 6) because it just shoves the
bits in where the PRM says they go and isn't really aware that it's an
address.
2016-01-18 12:16:31 -08:00
Jason Ekstrand
bfcc744892 genX/pack: Add a __gen_fixed helper and use it for TextureLODBias
The __gen_fixed helper properly clamps the value and also handles negative
values correctly.  Eventually, we need to make the scripts generate this
and use it for more things.
2016-01-18 11:35:04 -08:00
Jason Ekstrand
5a67df2546 anv/pack: Make TextureLODBias a proper 4.8 float
XXX: We need to update the generators so this doesn't get stompped.
2016-01-18 10:36:53 -08:00
Jason Ekstrand
15e6af0708 nir/spirv: Handle if's where the merge is also a break or continue 2016-01-18 10:10:47 -08:00
Jason Ekstrand
14ebd0fdd7 nir/spirv: Hanle continues that use SSA values from the loop body
Instead of emitting the continue before the loop body we emit it
afterwards.  Then, once we've finished with the entire function, we run
nir_repair_ssa to add whatever phi nodes are needed.
2016-01-18 09:43:12 -08:00
Jason Ekstrand
61ba97522e nir/lower_returns: Repair SSA after doing return lowering 2016-01-18 09:43:12 -08:00
Jason Ekstrand
b11825590d nir: Add a pass to repair SSA form 2016-01-18 09:43:12 -08:00
Jason Ekstrand
a7a5e8a2de nir/vars_to_ssa: Use the new nir_phi_builder helper
The efficiency should be approximately the same.  We do a little more work
per phi node because we have to sort the predecessors.  However, we no
longer have to walk the blocks a second time to pop things off the stack.
The bigger advantage, however, is that we can now re-use the phi placement
and per-block SSA value tracking in other passes.
2016-01-18 09:18:42 -08:00
Jason Ekstrand
8aab4a7bd2 nir: Add a phi node placement helper
Right now, we have phi placement code in two places and there are other
places where it would be nice to be able to do this analysis.  Instead of
repeating it all over the place, this commit adds a helper for placing all
of the needed phi nodes for a value.
2016-01-18 09:18:42 -08:00
Jason Ekstrand
b1f1200e80 util/bitset: Allow iterating over const bitsets 2016-01-18 09:18:42 -08:00
Jason Ekstrand
f509a89082 nir/lower_system_values: Lower vertexID to id+base if needed 2016-01-15 16:15:50 -08:00
Jason Ekstrand
6b64dddd71 anv/batch_chain: Remove padding from the BO before emitting BUFFER_END 2016-01-15 15:59:58 -08:00
Jason Ekstrand
67bf74f020 anv/batch_chain: Don't call current_batch_bo() again
We call it once at the top of the function and then hold on to the pointer.
It shouldn't have changed, so there's no reason to query for it again.
2016-01-15 15:49:32 -08:00
Jason Ekstrand
117cac75d0 nir/spirv: Stop trusting the SPIR-V for the number of texture coordinates 2016-01-15 11:13:51 -08:00
Chad Versace
0e420cb67f anv: Populate SURFACE_STATE more safely
genX_image_view_init allocates up to 3 separate SURFACE_STATE structures,
and populates each from a single template. Stop mutating the template
between each final SURFACE_STATE.
2016-01-15 11:00:22 -08:00
Chad Versace
eab6212efd anv/meta: Stop leaking renderpass and framebuffer 2016-01-15 10:14:07 -08:00
Chad Versace
482a1f5eab anv/meta: Reuse code for vkCmdClear{Color,DepthStencil}Image
The two function bodies were very similar. Move common code to
anv_cmd_clear_image().

Fixes all 'dEQP-VK.renderpass.formats.*' on Skylake.
2016-01-15 07:46:10 -08:00
Chad Versace
1afe33f8b3 anv/gen8: Fix SF_CLIP_VIEWPORT's Z elements
SF_CLIP_VIEWPORT does not clamp Z values. It only scales and shifts
them. Clamping to VkViewport::minDepth,maxDepth is instead handled by
CC_VIEWPORT.

Fixes dEQP-VK.renderpass.simple.depth on Broadwell.
2016-01-14 22:53:05 -08:00
Chad Versace
842b424d3b anv/meta: Implement vkCmdClearDepthStencilImage 2016-01-14 22:53:05 -08:00
Chad Versace
e4b17a2e1a anv/meta: Implement vkCmdClearAttachments 2016-01-14 22:53:05 -08:00
Chad Versace
0038ae2e4a anv/meta: Add VkClearRect param to emit_clear()
Prepares for vkCmdClearAttachments.
2016-01-14 22:53:05 -08:00
Chad Versace
11f5433715 anv: Distinguish between subpass setup and subpass start
vkCmdBeginRenderPass, vkCmdNextSubpass, and vkBeginCommandBuffer with
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all *setup* the
command buffer for recording commands for some subpass.  But only the
first two, vkCmdBeginRenderPass and vkCmdNextSubpass, can *start*
a subpass.

Therefore, calling anv_cmd_buffer_begin_subpass() inside
vkCmdBeginCommandBuffer is misleading. Clarify its purpose by renaming
it to anv_cmd_buffer_set_subpass() and adding comments.
2016-01-14 22:53:05 -08:00
Chad Versace
deb8dd89b5 anv: Emit load clears at start of each subpass
This should improve cache residency for render targets.

Pre-patch, vkCmdBeginRenderPass emitted all the meta clears for
VK_ATTACHMENT_LOAD_OP_CLEAR before any subpass began. Post-patch,
vCmdBeginRenderPass and vkCmdNextSubpass emit only the clears needed for
that current subpass.
2016-01-14 22:53:05 -08:00
Chad Versace
0679bef49f anv/meta: Create 8 pipelines for color clears
This prepares for moving the clear ops from the start of the render pass
into each subpass.

Pipeline N will be used to clear color attachment N of the current
subpass. Currently meta color clears still create a throwaway subpass
with exactly one attachment, so currently only pipeline 0 is used.

This is an ugly hack to workaround the compiler's current inability to
dynamically set the render target index in the render target write
message.
2016-01-14 22:53:05 -08:00
Chad Versace
2997b0da4a anv: Allow override of pipeline color attachment count
Add anv_graphics_pipeline_create_info::color_attachment_count. If
non-negative, then it overrides the color attachment count in the
pipeline's subpass. Useful for meta. (All the hacks for meta!)
2016-01-14 22:53:05 -08:00
Chad Versace
13610c03a7 anv/meta: Name the nir shaders
The names appear in debug output.
2016-01-14 22:53:05 -08:00
Chad Versace
6a1a760e3c anv: Move MAX_* defs to top of anv_private.h
Because I need to use MAX_RTS in struct anv_meta_state.
2016-01-14 22:53:05 -08:00
Chad Versace
4c2bafb9bf anv: Define zero() macro
zero(x) memsets x to zero. Eliminates bugs due to errors in memset's
size param.
2016-01-14 22:53:05 -08:00
Chad Versace
f2700d665c anv/meta: Rename emit_load_*_clear funcs
The functions will soon handle clears unrelated to
VK_ATTACHMENT_LOAD_OP_CLEAR, namely vkCmdClearAttachments. So remove
"load" from their name:

    emit_load_color_clear -> emit_color_clear
    emit_load_depthstencil_clear -> emit_depthstencil_clear
2016-01-14 22:53:05 -08:00
Chad Versace
356f952f87 anv/meta: Use anv_cmd_state::attachments for clears
Rewrite anv_cmd_buffer_clear_attachments, which emits the top-of-pass
clears, to use the data provided in anv_cmd_state::attachments. This
prepares for deferring each attachment clear to the first subpass that
uses the attachment.
2016-01-14 22:53:05 -08:00
Chad Versace
a4b045ca44 anv: Add anv_cmd_state::attachments
This array contains attachment state when recording a renderpass instance.
It's populated on each call to anv_cmd_buffer_set_pass.

The data is currently set but unused. We'll use it later to defer each
attachment clear to the subpass that first uses the attachment.
2016-01-14 22:53:05 -08:00
Jason Ekstrand
5d1c2736b6 i965/fs/generator: Change a comment as per jordan's suggestion 2016-01-14 22:03:15 -08:00
Jason Ekstrand
6be517b20e i965/fs: Always set hannel 2 of texture headers in some stages 2016-01-14 20:42:47 -08:00
Jason Ekstrand
e1d13cd058 i965/fs/generator: Take an actual shader stage rather than a string 2016-01-14 20:27:56 -08:00
Jason Ekstrand
47af950df5 anv/apply_pipeline_layout: Stomp texture array size to 1 2016-01-14 18:58:25 -08:00
Jason Ekstrand
6483d3f8fe nir/spirv: Fix texture return types
We were just hard-coding everything to a vec4.  This meant we weren't
handling shadow samplers at all and integer things were getting the wrong
return type.
2016-01-14 18:48:57 -08:00
Kristian Høgsberg Kristensen
2eb52198ff vk: Fix struct field indentation 2016-01-14 15:18:40 -08:00
Chad Versace
5dea9d0039 anv: Document anv_cmd_state::current_pipeline
It's the value of PIPELINE_SELECT.PipelineSelection.
2016-01-14 13:18:40 -08:00
Chad Versace
ed33ccde63 anv: Make vkBeginCommandBuffer reset the command buffer
If its the command buffer's first call to vkBeginCommandBuffer, we must
*initialize* the command buffer's state. Otherwise, we must *reset* its
state. In both cases, let's use anv_ResetCommandBuffer.

From the Vulkan 1.0 spec:

   If a command buffer is in the executable state and the command buffer
   was allocated from a command pool with the
   VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag set, then
   vkBeginCommandBuffer implicitly resets the command buffer, behaving
   as if vkResetCommandBuffer had been called with
   VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT not set. It then puts
   the command buffer in the recording state.
2016-01-14 13:14:40 -08:00
Chad Versace
ea20389320 anv: Add FIXME for vkResetCommandPool
vkResetCommandPool currently destroys its command buffers. The Vulkan
1.0 spec requires that it only reset them:

    Resetting a command pool recycles all of the resources from all of
    the command buffers allocated from the command pool back to the
    command pool. All command buffers that have been allocated from the
    command pool are put in the initial state.
2016-01-14 13:14:40 -08:00
Chad Versace
20fd816b6b anv: Remove duplicate func prototype
anv_private.h declared anv_cmd_buffer_begin_subpass twice.
2016-01-14 13:14:40 -08:00
Chad Versace
0415dfcfe7 anv/meta: Add FINISHME for clearing multi-layer framebuffers 2016-01-14 13:14:40 -08:00
Jason Ekstrand
32f8bcb84f i965/vec4: Use UW type for multiply into accumulator on GEN8+
BDW adds the following restriction: "When multiplying DW x DW, the dst
cannot be accumulator."
2016-01-14 12:04:25 -08:00
Jason Ekstrand
45349acad0 Merge remote-tracking branch 'mesa-public/master' into vulkan
This fixes the bitfieldextract and bitfieldinsert CTS tests
2016-01-14 11:36:27 -08:00
Jason Ekstrand
f46f4e4886 nir/spirv: Add initial support for Vertex/Instance index 2016-01-14 09:12:32 -08:00
Jason Ekstrand
3d0fac7aca vulkan.h: Pull in 1.0.1 header 2016-01-14 08:37:54 -08:00
Jason Ekstrand
24a6fcba77 vulkan-1.0.0: Bump the version to 1.0.0 2016-01-14 08:26:37 -08:00
Jason Ekstrand
c310fb032d vulkan-1.0.0: Rework memory barriers 2016-01-14 08:09:39 -08:00
Jason Ekstrand
b14a78cfb8 vulkan-1.0.0: No-op WSI changes 2016-01-14 08:02:44 -08:00
Jason Ekstrand
6d3322d0e5 vulkan-1.0.0: Make extents unsigned 2016-01-14 08:00:18 -08:00
Jason Ekstrand
b57c72d964 vulkan-1.0.0: Rework blits to use four offsets 2016-01-14 07:59:37 -08:00
Jason Ekstrand
f6cae99294 vulkan-1.0.0: Split out command buffer inheritance info 2016-01-14 07:45:15 -08:00
Jason Ekstrand
f99f847412 vulkan-1.0.0: Re-order some structs in the header 2016-01-14 07:43:05 -08:00
Jason Ekstrand
aab9517f3d vulkan-1.0.0: Misc. field and argument renames 2016-01-14 07:41:45 -08:00
Jason Ekstrand
d877095e66 vulkan-1.0.0: Get rid of MIPMAP_MODE_BASE 2016-01-14 07:32:16 -08:00
Jason Ekstrand
7b81637762 vulkan-1.0.0: Convert pPreserveAttachments to a uint32_t 2016-01-14 07:30:46 -08:00
Jason Ekstrand
802f00219a anv/device: Update features and limits 2016-01-14 07:30:46 -08:00
Jason Ekstrand
08735ba91c anv/cmd_buffer: Fix setting of viewport/scissor count 2016-01-14 07:30:46 -08:00
Jason Ekstrand
ed4fe3e9ba anv/state: Respect SamplerCreateInfo.anisotropyEnable 2016-01-14 07:30:46 -08:00
Jason Ekstrand
8a81d136f8 anv/image: Fill out VkSubresourceLayout.arrayPitch 2016-01-14 07:30:46 -08:00
BogDan Vatra
102c74277f WIP: Partially upgrade to vulkan v0.221.0
TODO, make use of:
- VkPhysicalDeviceFeatures.drawIndirectFirstInstance,
- VkPhysicalDeviceFeatures.inheritedQueries
- VkPhysicalDeviceLimits.timestampComputeAndGraphics
- VkSubmitInfo.pWaitDstStageMask
- VkSubresourceLayout.arrayPitch
- VkSamplerCreateInfo.anisotropyEnable
2016-01-14 07:30:46 -08:00
Jordan Justen
8ce2b0e140 nir/spirv: Add support for ArrayLength op
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-13 23:34:45 -08:00
Jason Ekstrand
4507d8a57a nir/spirv/alu: Properly implement mod/rem 2016-01-13 16:53:02 -08:00
Jason Ekstrand
7d5ae2d34b i965: Implement nir_op_irem and nir_op_srem 2016-01-13 16:53:02 -08:00
Jason Ekstrand
cac99fffdb nir: Add more modulus and remainder opcodes
SPIR-V makes a distinction between "modulus" and "remainder" for both
floating-point and signed integer variants.  The difference is primarily
one of which source they take their sign from.  The "remainder" opcode for
integers is equivalent to the C/C++ "%" operation while the "modulus"
opcode is more mathematically correct (at least for an unsigned divisor).
This commit adds corresponding opcodes to NIR.
2016-01-13 15:18:36 -08:00
Jason Ekstrand
0079523a0d nir/spirv: Add support for OpSpecConstantOp 2016-01-13 15:18:36 -08:00
Jason Ekstrand
8c408b9b81 nir/spirv/alu: Factor out the opcode table 2016-01-13 15:18:36 -08:00
Jason Ekstrand
9b7e08118b anv/pipeline: Pass through specialization constants 2016-01-13 15:18:36 -08:00
Jason Ekstrand
c95c3b2c21 nir/spirv: Add initial support for specialization constants 2016-01-13 15:18:36 -08:00
Jason Ekstrand
610aa00cdf nir/spirv: Add support for OpQuantize 2016-01-12 15:36:38 -08:00
Jason Ekstrand
282a837317 i965: Implement nir_op_fquantize2f16 2016-01-12 15:35:00 -08:00
Jason Ekstrand
15a56459d7 nir: Add a fquantize2f16 opcode
This opcode simply takes a 32-bit floating-point value and reduces its
effective precision to 16 bits.
2016-01-12 15:33:02 -08:00
Jason Ekstrand
aee970c844 anv/device: Bump the max program size again
No one will ever need more than 128K, right?
2016-01-12 13:49:05 -08:00
Kristian Høgsberg Kristensen
d7a193327b vk: Implement workaround for occlusion queries
We have an issue with occlusion queries (PIPE_CONTROL depth writes)
after using the pipeline with the VS disabled. We work around it by
using a depth cache flush PIPE_CONTROL before doing a depth write.

Fixes dEQP-VK.query_pool.*
2016-01-12 11:50:36 -08:00
Jason Ekstrand
6fc278ae4f anv/UpdateDescriptorSets: Respect write.dstArrayElement 2016-01-12 11:45:12 -08:00
Kristian Høgsberg Kristensen
af422fe9b3 Merge ../mesa into vulkan
Merge master again to get the brw_device_info with the
correct slice counts for KBL.
2016-01-12 10:54:26 -08:00
Kristian Høgsberg Kristensen
7df20f0c14 vk: Support SpvBuiltInViewportIndex 2016-01-12 10:53:59 -08:00
Kristian Høgsberg Kristensen
2b4bacb84b vk: Use the correct stride for CC_VIEWPORT structs 2016-01-12 10:53:59 -08:00
Jason Ekstrand
62e56492c3 nir/spirv: Allow non-block variables with interface types in lists
The original objective was to disallow UBO and SSBO variables from the
variable lists.  This was accidentally broken in b208620fd when fixing some
other interface issues.
2016-01-12 01:32:19 -08:00
Jason Ekstrand
4141d13de5 nir/spirv: Handle matrix decorations on arrays of matrices
Connor's original shallow-copy plan works great except that a couple of the
decorations apply to a matrix which may be some levels down in an array.
We weren't properly unpacking that.  This fixes most of the remaining SSBO
and UBO layout tests.
2016-01-12 01:04:44 -08:00
Jason Ekstrand
b208620fd2 nir/spirv: Allow creating local/global variables from interface types
Not sure if this is actually allowed, but it's not that hard to just strip
the interface information from the type.
2016-01-11 17:45:54 -08:00
Jason Ekstrand
350bbd3d15 nir/spirv: Allow base derefs in get_vulkan_resource_index 2016-01-11 17:45:24 -08:00
Jason Ekstrand
1c5393d57d nir/spirv: Allow OpBranchConditional without a merge
This can happen if you have a predicated break/continue.
2016-01-11 17:03:52 -08:00
Jason Ekstrand
24523e98a4 nir/spirv/cfg: Allow breaking from the continue block 2016-01-11 17:03:16 -08:00
Jason Ekstrand
c381906bbd nir/spirv: Stop wrapping carry/borrow in b2i
The upstream versions now return an integer like GLSL/SPIR-V want.
2016-01-11 17:02:30 -08:00
Jason Ekstrand
dee09d7393 nir/spirv: Better handle OpCopyMemory 2016-01-11 16:29:38 -08:00
Jason Ekstrand
1ca97cefb0 nir/spirv: Add no-op support for OpSourceContinued 2016-01-11 16:06:11 -08:00
Jason Ekstrand
bb5882e6af nir/spirv/cfg: Handle unreachable instructions 2016-01-11 15:35:15 -08:00
Jason Ekstrand
fc3f659aa9 nir/vars_to_ssa: Add phi sources for unreachable predecessors
It is possible to end up with unreachable blocks if, for instance, you have
an "if (...) { break; } else { continue; } unreachable()".  In this case,
the unreachable block does not show up in the dominance tree so it never
gets visited.  Instead, we go and visit all of those in follow-on pass.
2016-01-11 15:33:44 -08:00
Jason Ekstrand
c974b94578 nir/spirv: Properly handle OpConstantNull 2016-01-11 14:30:46 -08:00
Jason Ekstrand
96683065f2 nir/spirv: Assert that matrix types are valid 2016-01-11 14:30:46 -08:00
Jason Ekstrand
d032ede26f nir/types: Add an is_error helper 2016-01-11 14:30:46 -08:00
Jason Ekstrand
17cfafd83a nir/spirv: Handle OpNoLine 2016-01-11 14:30:46 -08:00
Chad Versace
52d4af6a3c anv/gen7: Remove unheeded helper begin_render_pass()
The helper didn't help much. It looks like a leftover from past
code-reuse.  Now it's called from exactly one location,
gen7_CmdBeginRenderPass(). So fold it into its caller.
2016-01-11 14:08:30 -08:00
Jason Ekstrand
790565b06e anv/pipeline: Handle output lowering in anv_pipeline instead of spirv_to_nir
While we're at it, we delete any unused variables.  This allows us to prune
variables that are not used in the current stage from the shader.
2016-01-11 11:06:06 -08:00
Jason Ekstrand
b8ec48ee76 anv/pipeline: Only delete functions for SPIR-V shaders
We can assume that direct NIR shaders only have one entrypoint
2016-01-11 11:06:06 -08:00
Jason Ekstrand
30883adfb8 nir/spirv: Get rid of a bunch of stage asserts
Since we may have multiple entrypoints from different stages, we don't know
what stage we are actually in so these asserts are invalid.
2016-01-11 11:06:06 -08:00
Jason Ekstrand
9f4ba499d1 nir/spirv: Take an entrypoint stage as well as a name 2016-01-11 11:06:06 -08:00
Jason Ekstrand
83bf1f752d nir/dead_variables: Add a a mode parameter
This allows dead_variables to be used on any type of variable.
2016-01-11 11:06:06 -08:00
Kristian Høgsberg Kristensen
a9c0e8f00f vk: Handle uninitialized FS inputs and gl_PrimitiveID
These show up as varying_to_slot[attr] == -1. Instead of storing -1 - 2
in swiz.Attribute[input_index].SourceAttribute, handle it correctly.
2016-01-09 01:03:20 -08:00
Kristian Høgsberg Kristensen
b538ec5409 vk: Support reseting timestamp query pools 2016-01-09 00:51:50 -08:00
Kristian Høgsberg Kristensen
925ad84700 vk: Advertise number of timestamp bits
We have 36 bits.
2016-01-09 00:51:14 -08:00
Kristian Høgsberg Kristensen
dae800daa8 vk: Expose correct timestampPeriod for SKL
Skylake uses 83.333ms per tick.
2016-01-09 00:50:04 -08:00
Kristian Høgsberg Kristensen
ec8e261208 vk: Mark VkEvent and VkSemaphore as done 2016-01-09 00:48:41 -08:00
Kristian Høgsberg Kristensen
bbb2a85c81 vk: Assert on use of uninitialized surface state
This exposes a case where we want to anv_CmdCopyBufferToImage() on an
image that wasn't created with VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT and
end up using uninitialized color_rt_surface_state from the meta image
view.
2016-01-08 23:51:11 -08:00
Kristian Høgsberg Kristensen
a8cdef3dce vk: Only begin subpass if we're continuing a render pass
If VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT is not set in
pBeginInfo->flags, we don't have a render pass or framebuffer. Change
the condition that guard looking up render pass and framebuffer to test
for VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT instead of
VK_COMMAND_BUFFER_LEVEL_SECONDARY.

Fixes all remaining crashes in dEQP-VK.api.command_buffers.*.
2016-01-08 23:02:46 -08:00
Kristian Høgsberg Kristensen
7c5e1fd998 vk: Remove unsupported warnings for Skylake and Broxton
These are working as well as Broadwell and Cherryiew. The recent merge
from mesa master brings in Kabylake device info and that should be all
we need to enable that.
2016-01-08 22:29:06 -08:00
Kristian Høgsberg Kristensen
f0993f81c7 Merge ../mesa into vulkan 2016-01-08 22:16:43 -08:00
Jason Ekstrand
cfdc955fd5 anv/reloc_list: Make valgrind explicitly check relocation data 2016-01-08 16:44:54 -08:00
Jason Ekstrand
7a1c4a0ccc nir/spirv: Add matrix determinants and inverses 2016-01-08 16:02:30 -08:00
Jordan Justen
c7f6e42a7d anv: Increate dynamic pool block size from 2k to 16k
This is needed because compute push constant data is replicated per
invocation. For gen7, this can be up to 64. With a push constant data
max of 128 bytes, this is 8k of data. We need additional space for
local-id payloads, so we are going with 16k for now.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-08 13:03:30 -08:00
Jason Ekstrand
4e15d26e47 nir/spirv: Fix a small bug in row-major matrix loading 2016-01-08 12:27:25 -08:00
Jason Ekstrand
fe2f44f2a4 nir/spirv: Use create_ssa_value for block_load_store 2016-01-08 11:50:34 -08:00
Jason Ekstrand
8b9dfb4b6d nir/spirv: Add real support for outer products 2016-01-08 11:38:59 -08:00
Jason Ekstrand
927ef0ea4e nir/spirv: Add support for add, subtract, and negate on matrices 2016-01-08 11:26:43 -08:00
Jason Ekstrand
393562f47b nir/spirv: Split ALU operations out into their own file 2016-01-08 11:26:43 -08:00
Jason Ekstrand
72bff62e7f nir/spirv: Add support for SSBO atomics 2016-01-07 22:13:46 -08:00
Jason Ekstrand
fe57ad62a6 nir/spirv: Rework UBOs and SSBOs
This completely reworks all block load/store operations.  In particular, it
should get row-major matrices working.
2016-01-07 22:13:46 -08:00
Chad Versace
1818463733 anv/gen9: Fix cube surface state
For gen9 SURFTYPE_CUBE, the RENDER_SURFACE_STATE's Depth,
MinimumArrayElement, and RenderTargetViewExtent is in units of full
cubes and so must be divided by 6.

Fixes 'dEQP-VK.pipeline.image.view_type.cube_array.cube_array.*'.

Now all of 'dEQP-VK.pipeline.image.*' passes.
2016-01-07 17:20:25 -08:00
Chad Versace
24d82a3f79 anv/gen8: Refactor genX_image_view_init()
Drop the temporary variables for RENDER_SURFACE_STATE's Depth and
RenderTargetViewExtent. Instead, assign them in-place.

This simplifies the next commit, which fixes gen9 cube surfaces.
2016-01-07 17:20:25 -08:00
Kristian Høgsberg Kristensen
1b1dca75a4 vk: Make sure we emit binding table pointers after push constants
SKL needs this to make sure we flush the push constants. It gets a
little tricky, since we also need to emit binding tables before push
constants, since that may affect the push constants (dynamic buffer
offsets and storage image parameters).  This patch splits emitting
binding tables from emitting the pointers so that we can emit push
constants after binding tables but before emitting binding table
pointers.
2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen
a18b5e642c vk: Implement VK_QUERY_RESULT_WITH_AVAILABILITY_BIT 2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen
bbf3fc815b vk: Add missing DepthStallEnable to OQ pipe control 2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen
067dbd7a17 vk: Issue PIPELINE_SELECT before setting up render pass
We need to make sure we're selected the 3D pipeline before we start
setting up depth and stencil buffers.
2016-01-07 16:31:57 -08:00
Jordan Justen
d24e88b98e anv/gen7: Setup state to enable barrier() function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-07 17:11:46 -08:00
Jordan Justen
36a2304686 anv/gen8: Setup state to enable barrier() function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-07 17:11:46 -08:00
Chad Versace
4c7f4c25d0 anv/meta: Fix hardcoded format size in anv_CmdCopy*
When looping through VkBufferImageCopy regions, for each region we
incremented the offset into the VkBuffer assuming the format size was 4.

Fixes CTS tests dEQP-VK.pipeline.image.view_type.cube_array.3d.* on
Skylake.
2016-01-07 13:56:58 -08:00
Chad Versace
a50c78a5cf isl: Add missing break statement in array pitch calculation
Fixes regression in ed98c374bd3f1952fbab3031afaf5ff4d178ef41.
2016-01-07 11:08:12 -08:00
Chad Versace
d1e6c1b29b isl/gen9: Fix array pitch of 3d surfaces
For tiled 3D surfaces, the array pitch must aligned to the tile height.

From the Skylake BSpec >> RENDER_SURFACE_STATE >> Surface QPitch:

   Tile Mode != Linear: This field must be set to an integer multiple of
   the tile height

Fixes CTS tests 'dEQP-VK.pipeline.image.view_type.3d.format.r8g8b8a8_unorm.*'.
Fixes Crucible tests 'func.miptree.r8g8b8a8-unorm.aspect-color.view-3d.*'.
2016-01-07 11:04:17 -08:00
Chad Versace
0af77fe5b6 isl: Refactor func isl_calc_array_pitch_sa_rows
Update the function to calculate the array pitch is *element rows*, and
it rename it accordingly to isl_calc_array_pitch_el_rows.
2016-01-07 11:04:17 -08:00
Jordan Justen
2f0a10149c isl: Assert that alignments are not 0 for isl_align
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-07 10:37:35 -08:00
Jordan Justen
4d68c477ad anv: Assert that alignments are not 0 for align_*
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-07 10:37:35 -08:00
Jordan Justen
be91f23e3b isl: Fix image alignment calculation
The previous code was resulting in an alignment of 0.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-07 10:37:35 -08:00
Jason Ekstrand
d8cd5e333e anv/state: Pull sampler vk-to-gen maps into genX_state_util.h 2016-01-06 19:53:45 -08:00
Jason Ekstrand
195c60deb4 nir/spirv: Wrap borrow/carry ops in b2i
NIR specifies them as booleans but SPIR-V wants ints.
2016-01-06 17:13:06 -08:00
Jason Ekstrand
000eb00862 nir/spirv/cfg: Only set fall to true at the start of a case
Previously, we were setting it to true at the top of the switch statement.
However, this causes all of the cases to get executed until you hit a
break.  Instead, you want to be not executing at the start, start executing
when you hit your case, and end at a break.
2016-01-06 17:00:55 -08:00
Jordan Justen
de65d4dcaf anv: Fix build without VALGRIND
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-01-06 15:54:51 -08:00
Jason Ekstrand
5bbf060ece i965/compiler: Enable more lowering in NIR
We don't need these for GLSL or ARB, but we need them for SPIR-V
2016-01-06 15:30:53 -08:00
Jason Ekstrand
573351cb0f nir/algebraic: Add more lowering
This commit adds lowering options for the following opcodes:

 - nir_op_fmod
 - nir_op_bitfield_insert
 - nir_op_uadd_carry
 - nir_op_usub_borrow
2016-01-06 15:30:53 -08:00
Jason Ekstrand
1f503603d3 nir/opcodes: Fix the folding expression for usub_borrow 2016-01-06 15:30:53 -08:00
Jason Ekstrand
22804de110 nir/spirv: Properly implement Modf 2016-01-06 15:30:53 -08:00
Jason Ekstrand
1f3593d8a1 nir/builder: Add a helper for storing to a deref 2016-01-06 15:30:53 -08:00
Chad Versace
8284786c5d anv/gen9: Teach gen9_image_view_init() about 1D surface qpitch
QPitch is usually expressed as rows of surface elements (where a surface
element is an compression block or a single surface sample. Skylake 1D
is an outlier; there QPitch is expressed as individual surface
elements.
2016-01-06 09:38:57 -08:00
Chad Versace
e05b307942 isl: Add isl_surf_get_array_pitch_el()
Will be needed to program SurfaceQPitch for Skylake 1D arrays.
2016-01-06 09:38:57 -08:00
Chad Versace
c1e890541e isl/gen9: Support ISL_DIM_LAYOUT_GEN9_1D 2016-01-06 09:38:57 -08:00
Chad Versace
eea2d4d059 isl: Don't align phys_slice0_sa.width twice
It's already aligned to the format's block width. Don't align it again
in isl_calc_row_pitch().
2016-01-06 09:38:57 -08:00
Chad Versace
39d043f94a isl: Fix the documented units of isl_surf::row_pitch
It's the pitch between surface elements, not between surface samples.
2016-01-06 09:38:57 -08:00
Chad Versace
dcb9c11dc7 anv/gen9: Fix oob lookup of surface halign, valign
For 1D surfaces and for surfaces with Yf or Ys tiling, the hardware
ignores SurfaceVerticalAlignment and SurfaceHorizontalAlignment.
Moreover, the anv_halign[] and anv_valign[] lookup tables may not even
contain the surface's actual alignment values. So don't do the lookup
for those surfaces.
2016-01-06 09:38:57 -08:00
Chad Versace
94566d9b68 anv/meta: Teach meta how to blit from a 1D image
Meta needed a VkShader with a 1D sampler type.
2016-01-06 09:38:57 -08:00
Jason Ekstrand
7a069bea5d nir/spirv: Fix switch statements with duplicate cases 2016-01-05 16:18:01 -08:00
Jason Ekstrand
506a467f16 nir/spirv/cfg: Assert that blocks only ever get added once
This effectively prevents infinite loops in cfg_walk_blocks.
2016-01-05 15:56:59 -08:00
Jason Ekstrand
71a25a0b07 nir/spirv: Simplify phi node handling
Instead of trying to crawl through predecessor chains and build phi nodes,
we just do a poor-man's out-of-ssa on the spot.  The into-SSA pass will
deal with putting the actual phi nodes in for us.
2016-01-05 14:59:40 -08:00
Jason Ekstrand
ec899f6b42 anv/pipeline: Lower indirect temporaries and inputs 2016-01-05 13:42:52 -08:00
Jason Ekstrand
bff45dc44e nir: Add an indirect deref lowering pass 2016-01-05 13:42:52 -08:00
Kristian Høgsberg Kristensen
30521fb19e vk: Implement a basic pipeline cache
This is not really a cache yet, but it allows us to share one state
stream for all pipelines, which means we can bump the block size without
wasting a lot of memory.
2016-01-05 12:03:21 -08:00
Kristian Høgsberg Kristensen
f551047751 vk: Destroy device->mutex when destroying the device 2016-01-05 12:03:21 -08:00
Chad Versace
8d6f0a1b80 isl: Don't force linear for 1d surfaces in gen7_filter_tiling()
gen7_filter_tiling() should filter out only tiling flags that are
incompatible with the surface. It shouldn't make performance decisions,
such as forcing linear for 1D; that's the role of the caller.
2016-01-05 11:37:32 -08:00
Chad Versace
8135786605 isl: Document gen7_filter_tiling() 2016-01-05 11:35:13 -08:00
Chad Versace
33f06842be isl: Prefer linear tiling for 1D surfaces 2016-01-05 11:35:13 -08:00
Chad Versace
98af1cc6d7 isl: Remove isl_format_layout::bpb
struct isl_format_layout contained two near-redundant members: bpb (bits
per block) and bs (block size). There do exist some hardware formats for
which bpb != 8 * bs, but Vulkan does not use them. Therefore we don't
need bpb.
2016-01-05 10:00:39 -08:00
Chad Versace
89b68dc8d0 anv: Use isl_format_layout::bs instead of ::bpb
For all formats used by Vulkan, 8 * bs == bpb.
(bs=block_size_in_bytes, bpb=bits_per_block)
2016-01-05 10:00:39 -08:00
Chad Versace
a1d64ae561 isl: Align isl_surf::phys_level0_sa to the format's compression block 2016-01-05 09:52:07 -08:00
Chad Versace
2172f0e9bb isl: Fix mis-documented units of isl_surf::phys_level_sa
It's in physical surface samples. Hence the _sa suffix.
2016-01-05 09:52:07 -08:00
Jason Ekstrand
8b403d599b nir/spirv: Add support for the ControlBarrier instruction 2016-01-04 22:08:24 -08:00
Jason Ekstrand
ba7b5edc26 anv/UpdateDescriptorSets: Use the correct index for the buffer view 2016-01-04 21:36:11 -08:00
Jason Ekstrand
b8f0bea07a nir/spirv: Implement extended add, sub, and mul 2016-01-04 20:59:16 -08:00
Jason Ekstrand
3a3c4aecf1 nir/spirv: Add support for bitfield operations 2016-01-04 17:37:10 -08:00
Jason Ekstrand
01ba96e059 nir/spirv: Add support for msb/lsb opcodes 2016-01-04 17:37:10 -08:00
Jason Ekstrand
f32370a536 nir/spirv: Add a documenting assert for OpConstantSampler 2016-01-04 17:37:10 -08:00
Jason Ekstrand
0309199802 nir/spirv: Add initial support for ConstantNull 2016-01-04 17:37:10 -08:00
Chad Versace
8cc21d3aea isl: Align single-level 2D surfaces to compression block
This fixes an assertion failure at isl.c:1003.

Reported-by: Nanley Chery <nanley.g.chery@intel.com>
2016-01-04 16:48:58 -08:00
Jason Ekstrand
151694228d anv/formats: Hand out different formats based on tiled vs. linear 2016-01-04 16:08:05 -08:00
Jason Ekstrand
f665fdf0e7 anv/image_view: Separate vulkan and isl formats
Previously, anv_image_view had a anv_format pointer that we used for
everything.  This commit replaces that pointer with a VkFormat enum copied
from the API and an isl_format.  In order to implement RGB formats, we have
to use a different isl_format for the actual surface state than the obvious
one from the VkFormat.  Separating the two helps us keep things streight.
2016-01-04 16:08:05 -08:00
Jason Ekstrand
ceb05131da anv_get_isl_format: Support depth+stencil aspect value
You just get the depth format in this case.
2016-01-04 16:08:05 -08:00
Jason Ekstrand
a7cc12910d anv/image: Do more work in anv_image_view_init
There was a bunch of common code in gen7/8_image_view_init that we really
should be sharing.
2016-01-04 16:08:05 -08:00
Jason Ekstrand
87dd59e578 anv/formats: Rework GetPhysicalDeviceFormatProperties
It now calls get_isl_format to get both linear and tiled views of the
format and determines linear/tiled properties from that.  Buffer properties
are determined from the linear format.
2016-01-04 16:08:05 -08:00
Jason Ekstrand
2712c0cca3 anv/formats: Add a tiling parameter to get_isl_format
Currently, this parameter does nothing.
2016-01-04 16:08:05 -08:00
Jason Ekstrand
603a3a9439 isl/format: Add some helpers for working with RGB formats 2016-01-04 16:08:05 -08:00
Jason Ekstrand
0639f44d0f isl: Add a file for format helpers 2016-01-04 16:08:05 -08:00
Jason Ekstrand
5f5fc23e7c genX/state: Pull some generic helpers into a shared header 2016-01-04 16:08:05 -08:00
Jason Ekstrand
ad9ff4f2b2 meta/blit: Rework how format and aspect choices are made
This commit does two things.  First, it introduces choose_* functions for
chosing formats and aspects.  Second, it changes the copy (not blit) code
to use appropreately sized UINT formats for everything except depth.  There
are two main reasons for this:  First, it means that compressed and other
non-renderable texture upload should "just work" because it won't be
tripping over non-renderable formats.  Second, it allows us to easly copy
an RGB buffer to and from an RGBX image because the formats will get
switched over to their UINT variants and the shader will deal with the
extra channel for us.
2016-01-04 16:08:05 -08:00
Jason Ekstrand
3200a81a55 anv/image: Add a vk_format field
We've been trying to move away from anv_format for a while and this should
help with the transition.  There are cases (mostly in meta) where we need
the original format for the image and not the isl_format.  These will be
moved over to the new vk_format and everythign else will use the isl_format
from the particular anv_surface.
2016-01-04 16:08:05 -08:00
Chad Versace
0d7614dce6 isl: Document mnemonic in Yf and Ys tiling
The 'f' means "four K". The 's' means "sixty-four K".
2016-01-04 15:37:39 -08:00
Kristian Høgsberg Kristensen
0f34a4ec4e isl: Use isl_align_npot for row_pitch
Many formats are not power-of-two bytes per pixels and we need the
non-power-of-two align macro here.

This reverts the revert from 4f9a211b, but keeps the change from a827b553
that fixed the yuv if-else mix-up.
2016-01-04 10:53:47 -08:00
Kristian Høgsberg Kristensen
abc1c9878f vk: Don't leak pipeline if initialization fails 2016-01-04 10:42:50 -08:00
Kristian Høgsberg Kristensen
fca1c08e34 vk: Allocate subpass attachment in one big block
This avoids making a lot of small allocations and handles allocation
failure correctly.

Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.
2016-01-04 10:07:10 -08:00
Kristian Høgsberg Kristensen
5526c1782a vk: Handle allocation failures in meta init paths
Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.
2016-01-04 10:07:08 -08:00
Kristian Høgsberg Kristensen
b2ad2a20b6 vk: Handle allocation failure in anv_pipeline_init()
Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.
2016-01-04 10:06:50 -08:00
Kristian Høgsberg Kristensen
3954594eb4 vk: Call vk_error when we generate a VK_ERROR 2016-01-04 10:02:50 -08:00
Kristian Høgsberg Kristensen
75e01c8b2d vk: Only finish wayland wsi if we created it
Failure during instance creation will leave instance->wayland_wsi
undefined. When we then try to clean that up we crash. Set
instance->wayland_wsi to NULL on failure and only clean it up if it's
non-NULL.

Fixes part of dEQP-VK.api.object_management.alloc_callback_fail.*
2016-01-04 10:02:50 -08:00
Chad Versace
05c22f2d74 isl: Fix row pitch for linear buffers
isl always aligned the row pitch to the surface's image alignment.  This
was sometimes wrong when the surface backed a VkBuffer. For a VkBuffer,
the surface's row pitch is set by VkBufferImageCopy::bufferRowLength,
whose required alignment is only that of the VkFormat.

In particular, VkBuffer rows are packed in many dEQP and Crucible tests.
And packed rows are rarely aligned to the surface's image alignment.

Fixes: dEQP-VK.pipeline.image.view_type.2d.format.r8g8b8a8_unorm.size.13x13
2016-01-04 09:57:25 -08:00
Chad Versace
a827b553d9 isl: Fix swapped if-else in isl_calc_row_pitch
The YUV case was applied to non-YUV formats. Oops.
2016-01-04 09:57:23 -08:00
Jason Ekstrand
f6c4658cde nir/spirv: Fix group decorations
They were completely bogus before.  For one thing, OpDecorationGroup
created a value of type undef rather than decoration_group.  Also
OpGroupMemberDecorate didn't properly apply the decoration to the different
members of the different groups.  It *should* be correct now but there's no
good way to test it yet.
2016-01-02 11:53:36 -08:00
Jason Ekstrand
6b0b57225c anv/device: Only allocate whole pages in AllocateMemory
The kernel is going to give us whole pages anyway, so allocating part of a
page doesn't help.  And this ensures that we can always work with whole
pages.
2016-01-02 07:52:24 -08:00
Jason Ekstrand
f076d5330d anv/device: Handle non-4k-aligned calls to MapMemory
As per the spec:

   minMemoryMapAlignment is the minimum required alignment, in bytes, of
   host-visible memory allocations within the host address space. When
   mapping a memory allocation with vkMapMemory, subtracting offset bytes
   from the returned pointer will always produce a multiple of the value of
   this limit.
2016-01-01 09:29:29 -08:00
Jason Ekstrand
6b5cbdb317 anv/format: Get rid of num_channels 2015-12-31 12:07:43 -08:00
Jason Ekstrand
3fe1f118f8 anv/cmd_buffer: Fix a pointer-cast typo 2015-12-31 12:07:43 -08:00
Chad Versace
86ecb28ec6 isl: Document some isl_surf::phys_level0_sa invariants
isl_dim_layout restricts the range of isl_surf::phys_level0_sa.
2015-12-31 12:06:02 -08:00
Jason Ekstrand
5318424d49 anv/pipeline: Better vertex input channel setup
First off, it now uses isl formats instead of anv_format.  Also, it
properly handles integer vs. floating-point default channels and can
properly handle alpha-only channels.  (Not sure if those are allowed).
2015-12-31 12:02:08 -08:00
Jason Ekstrand
c6364495b2 anv/pipeline: Move vk_to_gen tables into a shared header 2015-12-31 12:02:08 -08:00
Chad Versace
d25cff687b isl: Better document surface units
Logical pixels, physical surface samples, and physical surface elements.

Requested-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-12-31 11:56:13 -08:00
Chad Versace
373fd89e4b isl: Document the 3D block extent of isl_format 2015-12-31 11:55:48 -08:00
Jason Ekstrand
1ddcbbf05f nir/spirv: Add a missing break statement in handle_image 2015-12-30 21:57:04 -08:00
Jason Ekstrand
4f9a211b4a Revert "isl: Fix assertion failure for npot pixel formats"
This reverts commit 96d1baa88d.
2015-12-30 21:01:55 -08:00
Jason Ekstrand
0bb103d010 nir/spirv: Handle push constants after decorations 2015-12-30 20:54:27 -08:00
Jason Ekstrand
3421ba1843 anv/device: Place memory types at heapIndex == 0
Previously, they were at heapIndex == 1 even though we only advertised one
heap.
2015-12-30 19:32:43 -08:00
Jason Ekstrand
cf6ce424e0 nir/spirv: Fix constant num_elements and allocation
Thanks to the addition of nir_clone, we now have a num_elements field in
nir_constant which we weren't setting.  Also, constants have to be parented
to the variable they initialize, so we have to make a copy.
2015-12-30 18:51:59 -08:00
Jason Ekstrand
601b7d5f98 nir/lower_outputs_to_temporaries: Reparent constant initializers 2015-12-30 18:51:06 -08:00
Jason Ekstrand
7d57528233 nir/clone: Expose nir_constant_clone 2015-12-30 18:44:19 -08:00
Jason Ekstrand
fed98df428 nir/gather_info: Add support for end_primitive_with_counter 2015-12-30 17:45:43 -08:00
Jason Ekstrand
5afac62b28 nir/spirv: Handle OpLine 2015-12-30 17:45:43 -08:00
Jason Ekstrand
149f35bbba nir/spirv: Let OpEntryPoint act as an OpName 2015-12-30 17:45:43 -08:00
Jason Ekstrand
5f7f88524c nir/lower_outputs_to_temporaries: Take a nir_function entrypoint 2015-12-30 17:45:43 -08:00
Jason Ekstrand
0fe4580e64 nir/spirv: Add support for multiple entrypoints per shader
This is done by passing the entrypoint name into spirv_to_nir.  It will
then process the shader as if that were the only entrypoint we care about.
Instead of returning a nir_shader, it now returns a nir_function.
2015-12-30 17:45:43 -08:00
Jason Ekstrand
e993e45eb1 nir/spirv: Get the shader stage from the SPIR-V
Previously, we depended on it being passed in.
2015-12-30 17:45:43 -08:00
Jason Ekstrand
db3a64fcea nir/spirv: Use shader stage for determining variable locations 2015-12-30 17:45:43 -08:00
Jason Ekstrand
d7ae2200f9 nir/spirv: Get rid of default GS info
shaderc has been fixed for a while now.
2015-12-30 17:45:43 -08:00
Jason Ekstrand
d9c9a117dc nir/spirv: Handle execution modes as decorations
They're basically the same thing.
2015-12-30 17:45:43 -08:00
Jason Ekstrand
2b6bcaf91a nir/spirv: Separate handling of preamble from type/var/const instructions 2015-12-30 17:45:43 -08:00
Chad Versace
96d1baa88d isl: Fix assertion failure for npot pixel formats
When aligning to isl_format_layout::bs (which is the number of bytes in
the pixel), use isl_align_npot() instead of isl_align(), because
isl_align() works only for power-of-2 alignment.

Fixes assertion in
dEQP-VK.pipeline.image.view_type.1d.format.r16g16b16_sfloat.size.512x1.
2015-12-30 16:28:19 -08:00
Jason Ekstrand
07b4f17aaf nir/spirv/GLSL450: Add support for SAbs 2015-12-30 14:41:49 -08:00
Kenneth Graunke
e6cd0c0e1c nir/spirv: Implement IsInf and IsNan built-ins. 2015-12-30 14:10:44 -08:00
Jason Ekstrand
a7e827192b isl: Tile-align height in image size calculation
This fixes a bunch of gpu hangs on the dEQP-VK.glsl.ShaderExecutor.common
group of CTS tests.
2015-12-30 14:03:47 -08:00
Kenneth Graunke
9f23116bfa Revert "nir/spirv: Update to the 1.0 GLSL.std.450 header"
This reverts commit b33f5d3889,
and also removes the (empty) case statements for the new built-ins.

It doesn't look like glslang has updated yet, so updating the header
just breaks everything, as we no longer agree on opcode numbers.
2015-12-30 13:26:56 -08:00
Jason Ekstrand
e6fc170afb anv/allocator: Rework state streams again
If we're going to hav valgrind verify state streams then we need to ensure
that once we choose a pointer into a block we always use that pointer until
the block is freed.  I was trying to do this with the "current_map" thing.
However, that breaks down because you have to use the map from the block
pool to get to the stream_block to get at current_map.  Instead, this
commit changes things to track the stream_block by pointer instead of by
offset into the block pool.
2015-12-30 11:40:38 -08:00
Jason Ekstrand
28243b2fba gen7/8/cmd_buffer: Allocate the correct ammount for COLOR_CALC_STATE
We were allocating 6 bytes when we should have been allocating 6 dwords.
2015-12-30 10:37:57 -08:00
Jason Ekstrand
a0b2829f20 anv/stream_alloc: Properly manage valgrind NOACCESS and UNDEFINED status
When I first did the valgrindifying for stream allocators, I misunderstood
some things about valgrind's expectations for NOACCESS and UNDEFINED.
First off, valgrind expects things to be marked NOACCESS before you
allocate out of them.  Since our blocks came from a pool backed by a
mmapped memfd, they came in as UNDEFINED; we needed to mark them as
NOACCESS.  Also, I didn't realize that VALGRIND_MEMPOOL_CHANGE only updated
the mempool allocation state and didn't actually change definedness; we had
to add a VALGRIND_MAKE_MEM_UNDEFINED to get rid of the NOACCESS on the
newly allocated portion.
2015-12-30 10:36:19 -08:00
Kristian Høgsberg Kristensen
91d93f7908 nir/spirv: Lower gl_GlobalInvocationID correctly
Use nir_intrinsic_load_local_invocation_id, not
nir_intrinsic_load_invocation_id (missing 'local'), which is a geometry
shader built-in.
2015-12-30 00:03:54 -08:00
Jason Ekstrand
451fe2670c nir/spirv/cfg: Handle discard 2015-12-29 19:23:25 -08:00
Jason Ekstrand
5693637faa nir/print: Handle variables with var->name == NULL 2015-12-29 16:58:00 -08:00
Jason Ekstrand
8cc55780fd nir/inline_functions: Switch to inlining everything 2015-12-29 16:58:00 -08:00
Kenneth Graunke
7cdcee3bed nir/spirv/glsl450: Enumerate more built-in opcodes. 2015-12-29 16:06:35 -08:00
Kenneth Graunke
ccd84848f0 anv/state: Fix reversed MIN vs. MAX in levelCount handling.
The point is to promote a levelCount of 0 to 1 before subtracting 1.
This needs MAX, not MIN.
2015-12-29 15:51:14 -08:00
Jason Ekstrand
2a58cb03d0 nir/spirv: Use instr_rewrite_src for updating phi sources
You can't just add a new source to a phi because use/def information won't
get updated properly.  Instead, you have to use one of the core helpers.
Some day, we may want to add a nir_phi_instr_add_src helper.
2015-12-29 15:44:39 -08:00
Jason Ekstrand
69d5838aee nir/validate: Don't validate the return deref for void function calls 2015-12-29 15:35:29 -08:00
Jason Ekstrand
51b04d03d5 nir/dominance: Handle unreachable blocks
Previously, nir_dominance.c didn't properly handle unreachable blocks.
This can happen if, for instance, you have something like this:

loop {
   if (...) {
      break;
   } else {
      break;
   }
}

In this case, the block right after the if statement will be unreachable.
This commit makes two changes to handle this.  First, it removes an assert
and allows block->imm_dom to be null if the block is unreachable.  Second,
it properly skips unreachable blocks in calc_dom_frontier_cb.
2015-12-29 15:29:27 -08:00
Kenneth Graunke
b4a1c9b506 nir/spirv/glsl450: Implement inverse hyperbolic trig built-ins. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
2ea111664c nir/spirv/glsl450: Implement Refract built-in. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
74529a2c50 nir/spirv/glsl450: Implement hyperbolic trig built-ins. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
0b1a436ac8 nir/spirv/glsl450: implement Reflect built-in. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
659a3623b0 nir/spirv/glsl450: Implement FaceForward built-in. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
b10af36d93 nir/spirv/glsl450: Implement SmoothStep. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
6a0fa2d758 nir/spirv/glsl450: Implement Cross built-in. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
083fd6ec2a nir/spirv/glsl450: Implement Clamp/SClamp/UClamp. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
034010924e nir/spirv/glsl450: Implement the Log built-in. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
ffc5ae7c9e nir/spirv/glsl450: Implement Exp built-in. 2015-12-29 15:27:03 -08:00
Kenneth Graunke
227e250005 nir/spirv/glsl450: Add a helper for doing fclamp(). 2015-12-29 15:27:03 -08:00
Kenneth Graunke
0f801752f2 nir/spirv/glsl450: Add helpers for calculating exp() and log(). 2015-12-29 15:27:03 -08:00
Kenneth Graunke
9c9edd1ce8 nir/spirv/glsl450: Add an 'nb' shortcut variable.
"nb" is shorter and more convenient than "&b->nb", especially
when several operations are composed together into a larger expression
tree.
2015-12-29 15:27:03 -08:00
Jason Ekstrand
5f04a61219 nir/lower_returns: Don't just change the type of a jump.
It doesn't give core NIR the opportunity to update predecessors and
successors.  Instead, we have to remove and re-insert the instruction.
2015-12-29 14:51:47 -08:00
Jason Ekstrand
6fa47c9c17 nir/builder: Add a nir_jump helper 2015-12-29 14:48:34 -08:00
Jason Ekstrand
37a38548d4 glsl/types.cpp: Fix function_key_compare 2015-12-29 14:32:10 -08:00
Jason Ekstrand
b33f5d3889 nir/spirv: Update to the 1.0 GLSL.std.450 header 2015-12-29 14:29:03 -08:00
Jason Ekstrand
a33fcc0fd4 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in nir_builder_init_simple_shader and allows us to delete
anv_nir_builder.h entirely.
2015-12-29 13:53:41 -08:00
Jason Ekstrand
5dd4386b92 nir/spirv: Use a C99-style initializer for structure fields
This ensures that all unknown fields get zero-initizlied so we don't have
undefined values floating around.
2015-12-29 13:15:20 -08:00
Jason Ekstrand
e10b0e2b49 anv/pipeline: Use vs_prog_data.inputs_read when computing vb_used 2015-12-29 13:03:01 -08:00
Jason Ekstrand
0a2ab87947 nir/spirv: Move CF emit code into vtn_cfg.c 2015-12-29 12:50:31 -08:00
Jason Ekstrand
4e22cd2e32 nir/spirv: Add support for switch statements 2015-12-29 12:50:31 -08:00
Jason Ekstrand
cf555dc1c2 nir/spirv: A couple simple loop fixes 2015-12-29 12:50:31 -08:00
Jason Ekstrand
303d095f58 nir/spirv: Add an actual CFG data structure
The current data structure doesn't handle much that we couldn't handle
before.  However, this will be absolutely crucial for doing swith
statements.  Also, this should fix structured continues.
2015-12-29 12:50:31 -08:00
Jason Ekstrand
bbf99511d0 gen7/8/pipeline: s/vb_used/elements in emit_vertex_input 2015-12-29 09:40:22 -08:00
Kristian Høgsberg Kristensen
fc03723bcd vk: Fill out buffer surface state when updating descriptor set
We can do this when we update the descriptor set instead of on the
fly.
2015-12-28 21:57:56 -08:00
Kristian Høgsberg Kristensen
a00524a216 vk: Unstub VkSemaphore implementation
There really is nothing to do for us here, at least with the current
kernel interface.
2015-12-28 21:57:56 -08:00
Jason Ekstrand
5fab35d090 gen7/pipeline: Actually use inputs_read from the VS for laying out inputs 2015-12-28 18:21:11 -08:00
Jason Ekstrand
b090f9dce1 gen8/pipeline: Actually use inputs_read from the VS for laying out inputs 2015-12-28 18:21:11 -08:00
Jason Ekstrand
3eb108ef87 anv/meta: Fix the pos_out location for the vertex shader 2015-12-28 18:21:11 -08:00
Jason Ekstrand
b005fd62f9 nir/spirv: Add GLSL.std.450.h
It accidentally got removed during the mass rename.
2015-12-28 15:46:22 -08:00
Jason Ekstrand
9c84b6cce0 anv/device: Set device->info sooner in CreateDevice
anv_block_pool_init calls anv_block_pool_grow which checks
device->info.has_llc to see if it needs to set caching parameters.
If we don't set device->info early enough, this reads an undefined value
which is probably 0 and not what we want on llc platforms.

Found with valgrind.
2015-12-28 13:29:01 -08:00
Jason Ekstrand
763176a3e2 nir/lower_returns: Fix a bug in loop lowering 2015-12-28 13:22:09 -08:00
Jason Ekstrand
7aaed91581 nir/spirv: Move to its own directory 2015-12-28 11:49:39 -08:00
Jason Ekstrand
d5fa51bdee Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in the removal of nir_function_overload
2015-12-28 10:56:31 -08:00
Jason Ekstrand
d9dcfafacc nir/spirv: Use nir_build_alu for alu instructions 2015-12-28 10:35:31 -08:00
Jason Ekstrand
ea77b384e8 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in tessellation and the store_var changes that go with it.
2015-12-27 23:23:05 -08:00
Jason Ekstrand
f948767471 nir/lower_returns: Better algorithm as per connor 2015-12-27 22:50:45 -08:00
Jason Ekstrand
3489f66056 nir: Add a cursor helper for getting a cursor after any phi nodes 2015-12-27 22:50:14 -08:00
Jason Ekstrand
c60456dfaa nir/gather_info: Handle multi-slot variables in io bitfields 2015-12-24 00:47:20 -08:00
Jason Ekstrand
bbebd2de13 nir: Add a helper for getting the bitmask for a variable's location 2015-12-24 00:47:20 -08:00
Jason Ekstrand
4ff4310a78 nir/types: Expose glsl_type::count_attribute_slots() 2015-12-24 00:47:19 -08:00
Jason Ekstrand
0bc1b0fd23 nir/lower_return: Do it for real this time 2015-12-24 00:47:19 -08:00
Jason Ekstrand
e1b1d58bec nir/cf: Make extracting or re-inserting nothing a no-op 2015-12-23 23:46:04 -08:00
Jason Ekstrand
eae352e75c nir: Add a function for comparing cursors 2015-12-23 18:09:42 -08:00
Jason Ekstrand
54c870ff61 nir/spirv: Add support for undefs in vtn_ssa_value() 2015-12-23 14:14:39 -08:00
Jason Ekstrand
2e823d5754 nir/spirv: Properly handle vector times matrix 2015-12-23 13:49:56 -08:00
Jason Ekstrand
452ba4db2b nir/spirv: Create the correct type if a matrix-vector multiply produces a vector 2015-12-23 13:49:56 -08:00
Jason Ekstrand
5b30132388 nir/spirv: Fix some mem_ctx issues with create_vec 2015-12-23 13:49:56 -08:00
Jason Ekstrand
66168a798b nir/spirv: Better document vtn_ssa_value.transposed 2015-12-23 13:49:56 -08:00
Jason Ekstrand
3b391892aa anv/descriptor_set: Use anv_foreach_stage 2015-12-23 13:49:56 -08:00
Jason Ekstrand
72ceb99bab anv: Mask out invalid stages in foreach_stage 2015-12-23 13:49:56 -08:00
Jason Ekstrand
5644b1cece nir/spirv: Handle LogicalNot 2015-12-23 13:49:56 -08:00
Jason Ekstrand
6219a69589 nir/spirv: Handle derefs in vtn_ssa_value
This is kind of a hack, but it makes vtn_ssa_value insert a load if the
value requested is actually a deref.  This shouldn't happen normally but,
thanks to the impedence mismatch of the NIR function parameter model vs.
the SPIR-V model, this can happen for function arguments.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
3ab1b7afa8 nir/spirv: Do boolean fixup on block loads
We used to do it for variable loads on things of type "uniform" but that
never got ported to block loads.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
af74ce5a19 spirv/nir: Handle non-vector extractions in vtn_composite_extract 2015-12-23 13:49:56 -08:00
Jason Ekstrand
79b8b42081 nir/spirv: Handle function calls 2015-12-23 13:49:56 -08:00
Jason Ekstrand
95990c96cc nir: Create the params array in function_impl_create 2015-12-23 13:49:56 -08:00
Jason Ekstrand
a7f3e113ad i965/nir: Remove return handling
This was added because we were getting spurrious returns coming out of
SPIR-V.  Now that we're calling lower_returns, we don't need this.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
ac975b73cf anv/pipeline: Run lower_returns and inline_functions after spirv_to_nir 2015-12-23 13:49:56 -08:00
Jason Ekstrand
8fba4bf79f nir: Add a function inlining pass 2015-12-23 13:49:56 -08:00
Jason Ekstrand
b21db9cea5 nir/builder: Add a copy_deref_var helper 2015-12-23 13:49:56 -08:00
Jason Ekstrand
23cfa683d5 nir: move nir_copy_var from anv_nir_builder to nir_builder 2015-12-23 13:49:56 -08:00
Jason Ekstrand
4aac03fe61 nir/clone: Add support for cloning a single function_impl
This will be useful for things such as function inlining.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
98291b8f2c nir: Add a helper for creating a "bare" nir_function_impl
This is useful if you want to clone a single function_impl if, for
instance, you wanted to do function inlining.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
86772c2488 nir/control_flow: Handle relinking top-level blocks
This can happen if a function ends in a return instruction and you remove
the return.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
1749e667ea nir: Add a stub function inlining pass
All it does is remove the return at the end, but it's good enough for
simple functions.
2015-12-23 13:49:56 -08:00
Jason Ekstrand
413a9d3517 nir/print: Factor variable name lookup into a helper
Otherwise, we have a problem when we go to print functions with arguments
because their names get added to the hash table during declaration which
happens after we print the prototype.
2015-12-23 13:49:56 -08:00
Kristian Høgsberg Kristensen
220ac9337b vk: Only require wc bo mmap for !llc GPUs 2015-12-19 22:25:57 -08:00
Kristian Høgsberg Kristensen
b49aaf5de0 vk: Remove stale 48 bit addresses FIXMEs
This has worked fine for a long time.
2015-12-19 22:20:45 -08:00
Kristian Høgsberg Kristensen
c4802bc44c vk/gen8: Implement VkEvent for gen8
We use PIPE_CONTROL for setting and resetting the event from cmd buffers
and MI_SEMAPHORE_WAIT in polling mode for waiting on an event.
2015-12-19 22:17:19 -08:00
Kristian Høgsberg Kristensen
8ac46d84ff vk: Fix check for I915_PARAM_MMAP_VERSION
Comparing the wrong thing for < 1.
2015-12-18 17:24:19 -08:00
Jordan Justen
5e82a91324 anv/gen8: Add support for gl_NumWorkGroups
Co-authored-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-18 01:45:11 -08:00
Jason Ekstrand
d7f66f9f6f nir/spirv: Array lengths are constants not literals 2015-12-17 16:36:29 -08:00
Jason Ekstrand
1473a8dc6f anv/formats: Add more 64-bit formats 2015-12-17 13:51:09 -08:00
Jason Ekstrand
167809365b anv/formats: Add more PACK32 formats 2015-12-17 13:44:50 -08:00
Jason Ekstrand
952bf05897 anv/image: Properly report buffer features 2015-12-17 11:52:31 -08:00
Jason Ekstrand
3395ca17d1 isl: Add a is_storage_image_format helper 2015-12-17 11:45:04 -08:00
Jason Ekstrand
b1325404c5 anv/device: Handle zero-sized memory allocations 2015-12-17 11:00:38 -08:00
Jason Ekstrand
c643e9cea8 anv/state: Allow levelCount to be 0
This can happen if the client is creating an image view of a textureable
surface and they only ever intend to render to that view.
2015-12-16 17:34:57 -08:00
Jason Ekstrand
b2fe8b4673 nir/spirv: Add a missing break statement 2015-12-15 17:24:18 -08:00
Jason Ekstrand
1c51d91bfe anv/pipeline: Allow the user to pass a null MultisampleCreateInfo
According to section 5.2 of the Vulkan spec, this is allowed for color-only
rendering pipelines.
2015-12-15 16:26:10 -08:00
Jason Ekstrand
d61ff1ed08 anv/descriptor_set: Initialize immutable_samplers to NULL
Previously this wasn't a problem.  However, with the new API update,
descriptor sets can now be sparse so the client doesn't have to provide an
entry for every binding.  This means that it's possible for a binding to be
uninitialized other than the memset.  In that case, we want to have a null
array of immutable samplers.
2015-12-15 16:24:22 -08:00
Jason Ekstrand
28c4ef9d6c anv/device: Bump the size of the instruction block pool
Some CTS test shaders were failing to compile.  At some point soon, we
really need to make a real pipeline cache and stop using a block pool for
this.
2015-12-15 11:49:28 -08:00
Jason Ekstrand
306abbead3 anv/pipeline: Properly set IncludeVertexHandles in 3DSTATE_GS 2015-12-15 11:37:18 -08:00
Jason Ekstrand
2d4b7eda23 nir/spirv: Add support for more CS intrinsics 2015-12-15 10:20:23 -08:00
Jason Ekstrand
1035108a7f nir/lower_system_values: Add support for computed builtins.
In particular, this commit adds support for computing gl_GlobalInvocationID
and gl_LocalInvocationIndex from other intrinsics.
2015-12-15 10:20:23 -08:00
Jason Ekstrand
630b9528b3 shader_enums: Add enums for gl_GlobalInvocationID and gl_LocalInvocationIndex 2015-12-15 10:20:23 -08:00
Jason Ekstrand
7ebd84fa4b nir/lower_system_values: Refactor and use the builder.
Now that we have a helper in the builder for system values and a helper in
core NIR to get the intrinsic opcode, there's really no point in having
things split out into a helper function.  This commit "modernizes" this
pass to use helpers better and look more like newer passes.
2015-12-15 10:20:23 -08:00
Jason Ekstrand
c26e889a44 nir/builder: Add a load_system_value helper
While we're at it, go ahead and make nir_lower_clip use it.

Cc: Rob Clark <robclark@gmail.com>
2015-12-15 10:20:23 -08:00
Jason Ekstrand
de67456d6d nir/lower_system_values: Stop supporting non-SSA
The one user of this (i965) only ever calls it while in SSA form.
2015-12-15 10:20:23 -08:00
Chad Versace
64f0ee73e0 isl: Add func isl_surf_get_image_offset_sa
The function calculates the offset to a subimage within the surface, in
units of surface samples.

All unit tests pass with `make check`. (Admittedly, though, there are
too few unit tests).
2015-12-15 08:46:09 -08:00
Chad Versace
53504b884e isl: Fix calculation of array pitch for layout GEN4_2D
The height of the miptree's right half was not large enough.

Found by `make check` in test_isl_surf_get_offset, which is added in the
next commit.
2015-12-15 08:46:09 -08:00
Chad Versace
f7e36f9f66 isl: Move it a standalone directory
The plan all along was to eventualyl move isl out of the Vulkan
directory, because I intended i965 and anvil to share it.

A small problem I encountered when attempting to write unit tests for
isl precipitated the move.  I discovered that it's easier to get isl
unit tests to build if I remove the extra, unneeded dependencies
injected by src/vulkan/Makefile.am. And the easiest way to remove those
unneeded dependencies is to move isl out of src/vulkan. (Unit tests come
in subsequent commits).
2015-12-15 08:45:49 -08:00
Jason Ekstrand
8224571ef8 vec4/generator: Actually pass the sampler into generate_tex
This is an artifact of the way the separate samplers/textures series ended
up getting sent out and rebased.  This should fix a number of CTS tests
involving geometry shaders.
2015-12-14 21:13:52 -08:00
Jordan Justen
7edcc59a7b anv: Rename gs_vec4 to gs_kernel
The code generated may be vec4 or simd8 depending on how we start the
compiler.

To run the GS in SIMD8, set the INTEL_SCALAR_GS environment variable.
This was added in:

    commit 36fd653817
    Author: Kenneth Graunke <kenneth@whitecape.org>
    Date:   Wed Mar 11 23:14:31 2015 -0700

        i965: Add scalar geometry shader support.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-14 18:23:14 -08:00
Jordan Justen
a3c5c339a8 nir/spirv_to_nir: Use a minimum of 1 for GS invocations
glslang is giving us 0, which causes the SIMD8 GS compile to hit an
assert.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-14 18:23:14 -08:00
Jason Ekstrand
f46544dea1 anv: Fix CUBE storage images 2015-12-14 16:59:59 -08:00
Jason Ekstrand
783a21192c anv: Add support for storage texel buffers 2015-12-14 16:51:12 -08:00
Jason Ekstrand
1f98bf8da0 anv: Pass an isl_format into fill_buffer_surface_state 2015-12-14 16:14:20 -08:00
Jason Ekstrand
091b6156dd i965/fs: Push small uniform arrays
Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations.  The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing.  Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up.  To do this,
we use an algorithm similar to the on in split_virtual_grfs.
2015-12-14 15:58:10 -08:00
Jason Ekstrand
63c313de84 i965/fs: Rename demote_pull_constants to lower_constant_loads 2015-12-14 15:58:10 -08:00
Jason Ekstrand
75f33a6420 i965/vec4: Get rid of the uniform_size array 2015-12-14 15:58:09 -08:00
Jason Ekstrand
eb76f226cf i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD 2015-12-14 15:58:09 -08:00
Jason Ekstrand
a487f0284f i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants
This commit moves us to an instruction based model rather than a
register-based model for indirects.  This is more accurate anyway as we
have to emit instructions to resolve the reladdr.  It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.

One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants.  This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.
2015-12-14 15:58:09 -08:00
Jason Ekstrand
46f5396846 i965/vec4: Inline get_pull_constant_offset
It's not really doing enough anymore to justify a helper function.
2015-12-14 15:58:09 -08:00
Jason Ekstrand
9c36c40845 i965/fs: Get rid of the param_size array 2015-12-14 15:58:09 -08:00
Jason Ekstrand
9024353db3 i965/fs: Stop relying on param_size in assign_constant_locations
Now that we have MOV_INDIRECT opcodes, we have all of the size information
we need directly in the opcode.  With a little restructuring of the
algorithm used in assign_constant_locations we don't need param_size
anymore.  The big thing to watch out for now, however, is that you can have
two ranges overlap where neither contains the other.  In order to deal with
this, we make the first pass just flag what needs pulling and handle
assigning pull constant locations until later.
2015-12-14 15:58:09 -08:00
Jason Ekstrand
9f46af9e41 i965/fs: Get rid of reladdr
We aren't using it anymore.
2015-12-14 15:58:09 -08:00
Jason Ekstrand
a3cd95a884 i965/fs: Use MOV_INDIRECT for all indirect uniform loads
Instead of using reladdr, this commit changes the FS backend to emit a
MOV_INDIRECT whenever we need an indirect uniform load.  We also have to
rework some of the other bits of the backend to handle this new form of
uniform load.  The obvious change is that demote_pull_constants now acts
more like a lowering pass when it hits a MOV_INDIRECT.
2015-12-14 15:58:09 -08:00
Jordan Justen
c4219bc6ff anv/cmd_buffer: Gen 8 requires 64 byte alignment for push constant data
See MEDIA_CURBE_LOAD, CURBE Data Start Address & CURBE Total Data Length

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-14 15:39:07 -08:00
Jason Ekstrand
f0313a5569 anv: Add initial support for cube maps
This fixes 486 cubemap CTS tests.
2015-12-14 15:36:30 -08:00
Jason Ekstrand
7ba70b1b51 nir: Add another index to load_uniform to specify the range read 2015-12-14 14:28:31 -08:00
Jason Ekstrand
4115648a6b i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT 2015-12-14 14:28:31 -08:00
Jason Ekstrand
2f1455dbb0 i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware
While we're at it, we also add support for the possibility that the
indirect is, in fact, a constant.  This shouldn't happen in the common case
(if it does, that means NIR failed to constant-fold something), but it's
possible so we should handle it.
2015-12-14 14:28:31 -08:00
Jason Ekstrand
4be9a1c7bb i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr
The subnr field is in bytes so we don't need to multiply by type_sz.
2015-12-14 14:28:31 -08:00
Jason Ekstrand
653d8044ab i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions
It should work fine without it and the visitor can set it if it wants.
2015-12-14 14:28:30 -08:00
Jason Ekstrand
2c90f08bf7 i965/fs: Add support for doing MOV_INDIRECT on uniforms 2015-12-14 14:28:30 -08:00
Jason Ekstrand
dba28da075 anv/buffer_view: Store a bo + offset instead of buffer pointer
This is what image_view does.  Also, we really need to do this so that we
can properly handle the combined offsets from the buffer and from
pCreateInfo.

This fixes some of the nonzero offset buffer view CTS tests.
2015-12-14 14:10:40 -08:00
Chad Versace
ee57062e1e anv: Remove anv_image::surface_type
When building RENDER_SURFACE_STATE, the driver set
SurfaceType = anv_image::surface_type, which was calculated during
anv_image_init(). This was bad because the value of
anv_image::surface_type was taken from a gen-specific header,
gen8_pack.h, even though the anv_image structure is used for all gens.

Replace anv_image::surface_type with a gen-specific lookup function,
anv_surftype(), defined in gen${x}_state.c.

The lookup function contains some useful asserts that caught some nasty
bugs in anv meta, which were fixed in the previous commit.
2015-12-14 10:46:27 -08:00
Chad Versace
f0d11d5a81 anv/meta: Fix VkImageViewType
Meta unconditionally used VK_IMAGE_VIEW_TYPE_2D in the functions below.
This caused some out-of-bound memory accesses.
  anv_CmdCopyImage
  anv_CmdBlitImage
  anv_CmdCopyBufferToImage
  anv_CmdClearColorImage

Fix it by adding a new function, anv_meta_get_view_type().
2015-12-14 09:03:58 -08:00
Chad Versace
0bebaeacd7 isl: Rename s/lod_align/image_align/ for consistency
Regarding the subimages within a surface, sometimes isl called them
"images" and sometimes "LODs". This patch make isl consistently refer to
them as "images".  I choose the term "image" over "LOD" because LOD is
an misnomer when applied to 3D surfaces. The alignment applies to each
individual 2D subimage, not to the LOD as a whole.

This patch changes no behavior. It's just a manually performed,
case-insensitive, replacement s/lod/image/ that maintains correct
indentation.  any behavior.
2015-12-14 09:01:51 -08:00
Chad Versace
85a6384014 anv/tests: gitignore block_pool_no_free 2015-12-14 09:00:28 -08:00
Chad Versace
0da776b733 anv: Fix build for unit tests
Clearly no one has been running `make check`, because the unittestbuild
has been broken for a long time. After this buildfix, all tests now
pass.
2015-12-14 09:00:28 -08:00
Jason Ekstrand
c56186026f anv: Add initial support for texel buffers 2015-12-12 16:11:23 -08:00
Jason Ekstrand
fd944197f2 i965/nir: Provide a default LOD for buffer textures
Our hardware requires an LOD for all texelFetch commands even if they are
on buffer textures.  GLSL IR gives us an LOD of 0 in that case, but the LOD
is really rather meaningless.  This commit allows other NIR producers to be
more lazy and not provide one at all.
2015-12-12 16:09:54 -08:00
Jason Ekstrand
1c605c8dfa Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in a shared local memory fix.
2015-12-11 14:29:13 -08:00
Jason Ekstrand
d12ea21dd5 gen8/pipeline: Support vec4 vertex shaders
In order to actually get them, you need INTEL_DEBUG=vec4.
2015-12-11 13:25:17 -08:00
Kristian Høgsberg Kristensen
e803276148 Revert "i965/HACK: Build brw_cs into libcompiler"
This reverts commit 6df7963531.
2015-12-11 13:09:42 -08:00
Kristian Høgsberg Kristensen
21d5e52da8 Merge ../mesa into vulkan 2015-12-11 13:09:06 -08:00
Jason Ekstrand
6ae4e59fac anv/pipeline: Get rid of the no kernel input parameters hack
Previously, meta would pass null shaders in for the VS when it intended to
disable the VS.  However, this meant that we didn't know what inputs we had
and would dead-code things in the FS.  In order to solve this, we
hard-coded a number.  Now meta passes in a VS even if it plans to disable
the stage so this is no longer needed.
2015-12-10 22:37:30 -08:00
Jason Ekstrand
bd0e25d41e anv/apply_pipeline_layout: Multiply uniform sizes by 4
This is because uniforms are now in terms of bytes everywhere.
2015-12-10 22:36:49 -08:00
Jason Ekstrand
6df7963531 i965/HACK: Build brw_cs into libcompiler
We need it for CS push constants
2015-12-10 22:36:07 -08:00
Jason Ekstrand
21cf55ab54 gen8/cmd_buffer: Don't push CS constants if there aren't any
Issuing MEDIA_CURB_LOAD with a size of zero causes GPU hangs on BDW.
2015-12-10 18:56:27 -08:00
Jason Ekstrand
3893e11f4b anv: Use 4 instead of sizeof(gl_constant_value)
We no longer have access to gl_constant_value and, really, it's 4 because
our uniform layout code works entirely in dwords.
2015-12-10 18:55:16 -08:00
Jason Ekstrand
13d1dd465c nir/spirv: Put SSBO store writemasks in the right index
It moved with the nir_intrinsic_load/store update.
2015-12-10 18:54:44 -08:00
Jason Ekstrand
d5c9955d3e Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in nir_intrinsic_load/store changes and the switch of all
uniforms in i965 to bytes.  This accounts for the Vulkan changes.
2015-12-10 18:29:36 -08:00
Jason Ekstrand
8beea9d45b anv/icd: Advertise the right ABI version 2015-12-10 12:27:13 -08:00
Chad Versace
5ba9121fe8 anv/image: Remove some vkCreateImage validation
Don't validate the baseArrayLayer and layerCount of cube images.  This
allows us to remove a bloated lookup table and an unneeded struct
definition (anv_image_view_info).
2015-12-09 16:33:23 -08:00
Chad Versace
9a9c551f3e anv/image: Drop unused halign, valign lookup tables 2015-12-09 15:36:39 -08:00
Jason Ekstrand
46bcf9d777 vulkan: Pull in the 0.210.1 vk_platform header
Somehow this got missed in the API update.
2015-12-09 11:55:38 -08:00
Jordan Justen
47e5fb52f4 gen8/compute: Setup push constants and local ids
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-09 11:04:30 -08:00
Jordan Justen
f8d5fb4293 anv: Add anv_cmd_buffer_cs_push_constants
Similar to anv_cmd_buffer_push_constants, but handles the compute
pipeline, which requires different setup from the other stages.

This also handles initializing the compute shader local IDs.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-09 11:02:20 -08:00
Jordan Justen
974bdfa9ad i965: Move brw_cs_fill_local_id_payload to brw_compiler.h
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-08 18:09:31 -08:00
Jordan Justen
d28df86c87 anv/compute: Fix thread width max off by 1
See cooresponding code in:

commit 8d87070af2
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Thu Aug 28 14:47:19 2014 -0700

    i965/cs: Implement brw_emit_gpgpu_walker

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-08 18:09:31 -08:00
Chad Versace
db66424218 anv: Remove unused anv_image_view_info_for_vk_image_view_type() 2015-12-08 14:25:28 -08:00
Jason Ekstrand
f4aee5d82f gen8/cmd_buffer: Flush push constants after descriptor sets
This is because, if storage images are used, flushing descriptor sets can
cause push constants to become dirty.
2015-12-07 21:45:43 -08:00
Jason Ekstrand
43ac954e25 anv: Add initial support for pushing image params
The helper to fill out the image params data-structure is stilly a dummy,
but this puts the infastructure in place.
2015-12-07 21:08:26 -08:00
Jason Ekstrand
1eb731d9fe anv/descriptor_set: Add support for storage images in layouts 2015-12-07 21:08:26 -08:00
Jason Ekstrand
ff05f634f6 anv/image: Add a separate storage image surface state
Thanks to hardware limitations, storage images may need a different surface
format and/or other bits in the surface state.
2015-12-07 21:08:22 -08:00
Jason Ekstrand
8f83222d37 isl: Add initial support for storage images 2015-12-07 21:08:08 -08:00
Jason Ekstrand
42b4417031 HACK/i965: Disable assign_var_locations on uniforms
This conflicts with the way we're doing uniforms in Vulkan.
2015-12-07 17:19:55 -08:00
Jason Ekstrand
cd75ff5d17 anv/pipeline: Only apply a pipeline layout if we have one 2015-12-07 16:56:02 -08:00
Chad Versace
9098e0f074 anv/image: Refactor anv_image_make_surface()
Reduce the number of function parameters. Deduce the
anv_image::*_surface from the parameters instead of requiring the caller
to do that.
2015-12-07 09:28:14 -08:00
Chad Versace
3d85a28e90 anv: Assert the succes of isl_surf_init() 2015-12-07 08:54:59 -08:00
Chad Versace
64e8af69b1 anv: Use isl_tiling_flags in anv_image_create_info
Replace
    anv_image_create_info::force_tiling
    anv_image_create_info::tiling
with the bitmask
    anv_image_create_info::isl_tiling_flags

This allows us to drop the function
anv_image.c:choose_isl_tiling_flags().
2015-12-07 08:50:28 -08:00
Chad Versace
c97d8af9aa anv: Fix anv_gem_set_tiling to respect tiling param
Function anv_gem_set_tiling() ignored its 'tiling' parameter. It
unconditionally set the bo's tiling to I915_TILING_X.
2015-12-07 08:42:11 -08:00
Chad Versace
01e2932d6a anv: Remove unused anv_format_s8_uint
This is no longer needed after migrating to isl.
2015-12-07 08:40:14 -08:00
Kristian Høgsberg Kristensen
0a5bee1fe6 vk: Don't override and hardcode autoconf CFLAGS
To disable optimizations pass CFLAGS="-O0 -g" on the configure command
line.
2015-12-04 21:24:15 -08:00
Kristian Høgsberg Kristensen
7337870036 vk: Move isl files to libisl.la helper library
These will be in their own library eventually - let's just do that now.
2015-12-04 21:24:15 -08:00
Chad Versace
2f270f0d15 anv/image: Fix choice of isl_surf_usage for depthstencil images
Fixes assertion in vkCreateImage when VkFormat is combined depthstencil.

Fixed many vulkancts tests that use combined depthstencil. For example,
fixes dEQP-VK.pipeline.depth.format.d16_unorm_s8_uint.compare_ops.\
not_equal_less_or_equal_not_equal_greater.
2015-12-04 16:37:05 -08:00
Chad Versace
a09b4c298c anv: Add func anv_get_isl_format() 2015-12-04 16:37:05 -08:00
Chad Versace
8b9ceda9f1 anv/image: Delete old ifdef'd out code 2015-12-04 16:37:05 -08:00
Jason Ekstrand
4dd5ef9e09 vk: Add needed builddir subdirectories to the include path
This fixes out-of-tree builds and closes #1
2015-12-04 15:48:27 -08:00
Kristian Høgsberg Kristensen
f1f78a371e vk: gem handles are uint32_t
No functional difference, but lets be consistent with the kernel API.
2015-12-04 12:53:27 -08:00
Jason Ekstrand
8f722c2fa3 vk: Update the README for 0.210.1 2015-12-04 11:08:45 -08:00
Kristian Høgsberg
dac57750db vk: Turn on Bay Trail, Cherryview and Broxton support 2015-12-04 09:51:47 -08:00
Kristian Høgsberg Kristensen
bbb6875f35 vk: Map uncached, coherent memory as write-combine
This gives us the required characteristics for the memory type.
2015-12-04 09:51:47 -08:00
Kristian Høgsberg Kristensen
c3c61d210f vk: Expose two memory types for non-LLC GPUs
We're required to expose a host-visible, coherent memory type. On big
core GPUs that share, LLC, we can expose one such memory type that's
also cached.  However, on non-LLC GPUs we can't both be cached and
coherent. Thus, we expose both the required coherent type and the cached
but non-coherent combination.
2015-12-04 09:51:47 -08:00
Kristian Høgsberg
773592051b vk: clflush all state for non-LLC GPUs 2015-12-04 09:51:47 -08:00
Kristian Høgsberg
b431cf59a3 vk: Set I915_CACHING_NONE for userptr BOs when !llc
Regular objects are created I915_CACHING_CACHED on LLC platforms and
I915_CACHING_NONE on non-LLC platforms. However, userptr objects are
always created as I915_CACHING_CACHED, which on non-LLC means
snooped. That can be useful but comes with a bit of overheard.  Since
we're eplicitly clflushing and don't want the overhead we need to turn
it off.
2015-12-04 09:51:47 -08:00
Kristian Høgsberg
e0b5f0308c vk: Implement vkFlushMappedMemoryRanges()
We'll do a runtime switch on device->info.has_llc for now.
2015-12-04 09:51:47 -08:00
Jason Ekstrand
cb2382882e nir/spirv: Update to SPIR-V version 1.0 2015-12-03 18:28:10 -08:00
Chad Versace
371fc2bc20 anv/gen9: Fix SURFACE_STATE halign and valign
Pre-Skylake, RENDER_SUFFACE_STATE.SurfaceVerticalAlignment is in units
of surface samples. A surface sample is equivalent to a pixel in all
surfaces except interleaved multisample surfaces.

In Skylake, it is in units of surface elements. A surface element is
equivalent to a surface sample except for compressed formats, in which
case the element is a compression block.
2015-12-03 15:33:08 -08:00
Chad Versace
981ef2f02d anv: Embed isl_surf into anv_surface
This reduces struct anv_surface to just two members: an offset and the
embedded isl_surf.
2015-12-03 15:31:00 -08:00
Chad Versace
594e673fcc anv/image: Drop assertions on SURFTYPE extent limits
In anv_image_create(), stop asserting that VkImageCreateInfo::extent
does not exceed the hardware limits for the given SURFTYPE. The
assertions were incorrect because they did not take into account the
hardware gen. Anyways, these types of assertions belong in isl, not
anvil.
2015-12-03 15:29:52 -08:00
Chad Versace
b369389640 anv/image: Use isl to calculate surface layout
Remove the surface layout calculations in anv_image_make_surface().  Let
isl_surf_init() do the heavy lifting.

Fixes 8 Crucible tests and regresses none. (hw=Broadwell and
crucible@33d91ec).
2015-12-03 15:29:08 -08:00
Chad Versace
afdadec77f isl: Implement isl_surf_init() for gen4-gen9
This is a big code push. The patch is about 3000 lines.

Function isl_surf_init() calculates the physical layout of a surface.

The implementation is "complete" (but untested) for all 1D, 2D, 3D, and
cube surfaces for gen4 through gen9, except:
    * gen9 1D surfaces
    * gen9 Ys multisampled surfaces
    * auxiliary surfaces (such as hiz, mcs, ccs)
2015-12-03 15:26:11 -08:00
Chad Versace
bda43a0f59 isl: Rename legacy Y tiling to ISL_TILING_Y0
Rename legacy Y tiling from ISL_TILING_Y to ISL_TILING_Y0 in order to
clearly distinguish it from Yf and Ys.  Using ISL_TILING_Y to denote
legacy Y tiling would lead to confusion with i965, because i965 uses
I195_TILE_Y to denote *any* Y tiling.
2015-12-03 15:26:11 -08:00
Chad Versace
57941b61ab anv/image: Vulkan's depthPitch is in bytes, not rows
Fix for VkGetImageSubresourceLayout.
2015-12-03 15:26:11 -08:00
Jason Ekstrand
bfeaf67391 anv/device: Give a version of 0.210.1 in apiVersion 2015-12-03 15:23:33 -08:00
Jason Ekstrand
d666487dc6 vk: Add new WSI support and bump the API to 0.210.1 2015-12-03 15:15:29 -08:00
Jason Ekstrand
fde60c1684 anv/entrypoints: Run the headers through the preprocessor first
This allows us to filter based on preprocessor directives.  We could build
a partial preprocessor into the generator, but we would likely get it
wrong.  This allows us to filter out, for instance, windows-specific WSI
stuff.
2015-12-03 14:13:55 -08:00
Jason Ekstrand
4c19243562 vk/0.210.0: Advertise version 0.210.0 2015-12-03 13:44:02 -08:00
Jason Ekstrand
888744cabf vk/0.210.0: Update queries to the new API 2015-12-03 13:44:02 -08:00
Jason Ekstrand
924fbfc9a1 vk/0.210.0: Fix how we handle access flags in barriers
The initial implementation in the 0.210.0 API update was misguieded as to
what the access flags meant.  This should be more correct.
2015-12-03 13:44:02 -08:00
Jason Ekstrand
fa2435de3c vk/0.210.0: Update the VkFormat enum 2015-12-03 13:44:02 -08:00
Jason Ekstrand
4e904a0310 vk/0.210.0: Rework vkQueueSubmit 2015-12-03 13:44:02 -08:00
Jason Ekstrand
5757ad2959 vk/0.210.0: Remove depth clip and add depth clamp 2015-12-03 13:43:59 -08:00
Jason Ekstrand
d689745303 vk/0.210.0: Rework device features and limits 2015-12-03 13:43:54 -08:00
Jason Ekstrand
74c4c4acb6 vk/0.210.0: Rework QueueFamilyProperties 2015-12-03 13:43:54 -08:00
Jason Ekstrand
fed3586f34 vk/0.210.0: Rework result and structure type enums
By and large, this is just moving enum values around.  However, it also
removed VK_UNSUPPORTED which we were returning a number of places.  Those
places now return VK_ERROR_INCOMPATABLE_DRIVER.
2015-12-03 13:43:54 -08:00
Jason Ekstrand
a5f19f64c3 vk/0.210.0: Remove the VkShaderStage enum
This made for an unfortunately large amount of work since we were using it
fairly heavily internally.  However, gl_shader_stage does basically the
same things, so it's not too bad.
2015-12-03 13:43:54 -08:00
Jason Ekstrand
e10dc002e9 vk/0.210.0: Remove VkShader 2015-12-03 13:43:54 -08:00
Jason Ekstrand
e6ab06ae7f vk/0.210.0: Rework memory property flags 2015-12-03 13:43:54 -08:00
Jason Ekstrand
93071482f9 vk/0.210.0: Remove some unused enum values 2015-12-03 13:43:54 -08:00
Jason Ekstrand
b264012fcf vk/0.210.0: Update VkPipelineStageFlagBits 2015-12-03 13:43:54 -08:00
Jason Ekstrand
1aaf15bf19 vk/0.210.0: Trivial function argument name change 2015-12-03 13:43:53 -08:00
Jason Ekstrand
938a2939c8 vk/0.210.0: We now allocate command buffers; not create them 2015-12-03 13:43:53 -08:00
Jason Ekstrand
5a02441789 vk/0.210.0: Rename a parameter to GetImageSparseMemoryRequirements 2015-12-03 13:43:53 -08:00
Jason Ekstrand
a9fc0ce0e3 vk/0.210.0: Delete three no longer existant entrypoints 2015-12-03 13:43:53 -08:00
Jason Ekstrand
fcfb404a58 vk/0.210.0: Rework allocation to use the new pAllocator's 2015-12-03 13:43:53 -08:00
Jason Ekstrand
d3547e7334 vk/0.210.0: Use VkSampleCountFlagBits for sample counts 2015-12-03 13:43:53 -08:00
Jason Ekstrand
9349625d60 vk/0.210.0: Rework VkInstanceCreateInfo 2015-12-03 13:43:53 -08:00
Jason Ekstrand
c30a021820 vk/0.210.0: More function argument renaming 2015-12-03 13:43:53 -08:00
Jason Ekstrand
b1cd025b88 vk/0.210.0: Replace MemoryInput/OutputFlags with AccessFlags 2015-12-03 13:43:53 -08:00
Jason Ekstrand
43f3e92348 vk/0.210.0: Rework render pass description structures 2015-12-03 13:43:53 -08:00
Jason Ekstrand
299f8f1511 vk/0.210.0: More structure field renaming 2015-12-03 13:43:53 -08:00
Jason Ekstrand
407b8cc5e0 vk/0.210.0: Get rid of VkImageAspect 2015-12-03 13:43:53 -08:00
Jason Ekstrand
3f6abd0161 vk/0.210.0: Rework descriptor sets 2015-12-03 13:43:52 -08:00
Jason Ekstrand
6a6da54ccb vk/0.210.0: Rename parameters to memory binding/mapping functions 2015-12-03 13:43:52 -08:00
Jason Ekstrand
aadb7dce9b vk/0.210.0: Update to the new instance/device create structs 2015-12-03 13:43:52 -08:00
Jason Ekstrand
607fe31598 vk/0.210.0: More trivial struct/enum changes 2015-12-03 13:43:52 -08:00
Jason Ekstrand
dde7172a8a vk/0.210.0: Trivial flag enum updates 2015-12-03 13:43:52 -08:00
Jason Ekstrand
4cf0b57bbf vk/0.210.0: Rename ChannelFlags to ColorComponentFlags 2015-12-03 13:43:52 -08:00
Jason Ekstrand
7f2284063d vk/0.210.0: s/raster/rasterization/ 2015-12-03 13:43:52 -08:00
Jason Ekstrand
1ab9f843bc vk/0.210.0: Don't allow chaining of description structs 2015-12-03 13:43:52 -08:00
Jason Ekstrand
17486b8664 vk/0.210.0: More fun with flags fields 2015-12-03 13:43:52 -08:00
Jason Ekstrand
f5ba1f994a vk/0.210.0: Make pCode a uint32_t pointer 2015-12-03 13:43:52 -08:00
Jason Ekstrand
5f348bd0e5 vk/0.210.0: Rename origin fields of VkViewport 2015-12-03 13:43:52 -08:00
Jason Ekstrand
9fa6e328eb vk/0.210.0: Move alphaToOne and alphaToCoverate to multisample state 2015-12-03 13:43:52 -08:00
Jason Ekstrand
f97c3b6d58 vk/0.210.0: Add flags fields to various pipeline create structs 2015-12-03 13:43:51 -08:00
Jason Ekstrand
e673d64209 vk/0.210.0: Change field names in vertex input structs 2015-12-03 13:43:51 -08:00
Jason Ekstrand
fd53603e42 vk/0.210.0: Misc. no-op structure changes
The only non-trivial change is to sparse resources that we don't handle
anyway.
2015-12-03 13:43:51 -08:00
Jason Ekstrand
fe644721aa vk/0.210.0: Rename property pCount parameters 2015-12-03 13:43:51 -08:00
Jason Ekstrand
e8f2294cd2 vk/0.210.0: Rework sampler filtering and mode enums 2015-12-03 13:43:51 -08:00
Jason Ekstrand
2e10ca5748 vk/0.210.0: Misc. function argument renames 2015-12-03 13:43:51 -08:00
Jason Ekstrand
569f70be56 vk/0.210.0: Rework copy/clear/blit API 2015-12-03 13:43:47 -08:00
Jason Ekstrand
4ab9391fbb vk/0.210.0: Rework dynamic states 2015-11-30 14:19:41 -08:00
Jason Ekstrand
73ef7d47d2 vk/0.210.0: Rework color blending enums 2015-11-30 13:49:28 -08:00
Jason Ekstrand
2c77b0cd01 gen7/8/cmd_buffer: Inline vk_to_gen_swizzle
It's currently unused on IVB so we get compiler warnings.
2015-11-30 13:29:51 -08:00
Jason Ekstrand
9b1cb8fdbc vk/0.210.0: Rework a few raster/input enums 2015-11-30 13:28:17 -08:00
Jason Ekstrand
a53f23d93f vk/0.210.0: Rework texture view component mapping 2015-11-30 13:06:12 -08:00
Jason Ekstrand
f1a7c7841f vk/0.210.0: Switch to the new VKAPI function decorations
While we're at it, we do a bunch of the VkResult -> void updates
2015-11-30 12:46:30 -08:00
Jason Ekstrand
a89a485e79 vk/0.210.0: Rename CmdBuffer to CommandBuffer 2015-11-30 11:48:08 -08:00
Jason Ekstrand
6a8a542610 vk/0.210.0: A pile of minor enum updates 2015-11-30 11:12:44 -08:00
Jason Ekstrand
3db43e8f3e vk/0.210.0: Switch to the new-style handle declarations 2015-11-30 10:58:02 -08:00
Jason Ekstrand
5cb57806b2 vk: Add connonical 0.170.2 and 0.210.0 headers
This is in preparation for the API update
2015-11-30 10:24:35 -08:00
Kristian Høgsberg Kristensen
d6d82f1ab3 vk: Fix 3DSTATE_WM_DEPTH_STENCIL for gen8
This packet is a different size on gen8 and we hit an assertion when we
try to merge a gen9 size dword array from the pipeline with the gen8
sized array we create from dynamic state.

Use a static assert in the merge macro and fix this issue by using different
wm_depth_stencil arrays on gen8 and gen9.
2015-11-26 10:11:52 -08:00
Kristian Høgsberg Kristensen
cd4721c062 vk: Add SKL support
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-11-25 22:34:10 -08:00
Kristian Høgsberg Kristensen
c445fa2f77 vk: Make entrypoint generator output gen9 entry points
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-11-25 20:58:25 -08:00
Kristian Høgsberg Kristensen
0e02a88ad4 vk: Add GEN9 pack header 2015-11-25 20:56:41 -08:00
Kristian Høgsberg Kristensen
0c59cb42b5 vk: Move all gen8 files to gen8 lib 2015-11-25 14:13:53 -08:00
Jason Ekstrand
179fc4aae8 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in nir cloning and some much-needed upstream refactors.
2015-11-23 14:03:47 -08:00
Jason Ekstrand
e14b2c76b4 anv/meta_clear: Don't trash state if no clears are needed 2015-11-21 11:39:12 -08:00
Jason Ekstrand
83c305f8ef anv/meta_clear: Don't try to clear depth-stencil without LOAD_OP_CLEAR 2015-11-21 00:05:18 -08:00
Jason Ekstrand
438eaa3ae7 anv/meta: Add initial support for multi-slice array and 3-D copies
We still need to fix up a few bits once we have real CPP values, but this
should get us a long ways.
2015-11-20 18:25:06 -08:00
Jason Ekstrand
d6a7c659c7 anv/meta: Use array textures for 2D
This a total of 1 extra instruction in the shader and gives us a lot more
flexibility in how we do blits.
2015-11-20 16:00:34 -08:00
Jason Ekstrand
e3ec964e44 anv/meta: Keep z coordinate flat while blitting 2015-11-20 15:48:03 -08:00
Jason Ekstrand
1157b0360d nir/spirv: Rework decoration iteration
The old code didn't work correctly if you had member decorations after
non-member decorations.  Since glslang never gave us any of those, it
wasn't properly tested.
2015-11-20 15:15:40 -08:00
Jason Ekstrand
cff74d6fb8 nir/spirv: Handle OpNop 2015-11-20 15:02:45 -08:00
Jason Ekstrand
1d42f773d3 gen8_state: Clamp sampler values to HW limitations 2015-11-20 14:45:44 -08:00
Jason Ekstrand
48228c114e nir/spirv: Add support for runtime arrays 2015-11-20 12:49:20 -08:00
Jason Ekstrand
55d16c090e gen8/pipeline: Properly handle MIN/MAX blend ops 2015-11-20 11:53:10 -08:00
Jason Ekstrand
b43ce6768d gen8/pipeline: Set IndependentAlphaBlendEnable properly 2015-11-20 11:52:54 -08:00
Jason Ekstrand
e69db9159b gen8/pipeline: Minor blending fixes
This makes various fields match upstream mesa
2015-11-20 11:52:30 -08:00
Jason Ekstrand
fa8db0dfcc anv: Put all of the descriptor set stuff together in one file
The stuff to take descriptor sets and turn them into binding tables and
sampler tables is still in anv_cmd_buffer.c.  We may want to consider
putting it in anv_descriptor_set.c eventually.
2015-11-18 14:58:43 -08:00
Jason Ekstrand
828b1a6eb6 anv/device: Update the right sampler in UpdateDescriptorSets 2015-11-18 14:48:28 -08:00
Jason Ekstrand
6f613abc2b anv/cmd_buffer: Add a new genX_cmd_buffer file for shared code
This file contains code that can be shared across gens modulo recompiling.
In particular, we can share STATE_BASE_ADDRESS setup and handling of the
vkPipelineBarrier call.  Not sharing STATE_BASE_ADDRESS setup has already
been a source of bugs and the gen7 and gen8 implementations of
PipelineBarrier were line-for-line identical.

Incidentally, this should fix MOCS settings for dynamic and surface state
on Haswell.
2015-11-18 12:26:57 -08:00
Jason Ekstrand
fb8b2f5f9e anv/gen7: A bunch of depth-stencil fixes
There are various bits which move around between Haswell and Ivy Bridge
that we weren't taking into account.  This also makes us actually set the
StencilWriteEnable in a sane way.
2015-11-18 11:43:52 -08:00
Jason Ekstrand
e9d634f4ad gen7/pipeline: Re-arrange stencil parameters to match gen8 2015-11-17 19:10:31 -08:00
Jason Ekstrand
9e39bdabad anv/gen7: Implement CmdPipelineBarrier 2015-11-17 17:09:27 -08:00
Jason Ekstrand
b707e90b6e anv/gen7: Don't use the upper bound on dynamic state base address
It doesn't do much for us and, if we have to resize the dynamic state block
pool for any reason, it becomes out-of-date.
2015-11-17 17:08:44 -08:00
Jason Ekstrand
f0390bcad6 anv: Add initial Haswell support 2015-11-17 12:14:24 -08:00
Jason Ekstrand
45320f677b anv: Add macros for doing per-gen compilation 2015-11-17 08:27:51 -08:00
Jason Ekstrand
92d164b1c3 anv/entrypoints: Add dispatch support for haswell 2015-11-17 08:27:51 -08:00
Jason Ekstrand
aa3002bd42 anv/entrypoints: Use devinfo instead of a gen number 2015-11-17 08:27:51 -08:00
Jason Ekstrand
0508046dc8 anv/cmd_buffer: Pack the 3DSTATE_VF packet on-demand 2015-11-17 08:27:51 -08:00
Jason Ekstrand
34d55d69cf anv/formats: Don't advertise stencil texture/blit prior to Broadwell 2015-11-17 08:23:29 -08:00
Jason Ekstrand
de54b4b18f anv: Only include the pack headers where needed
Previously, we were including gen7_pack.h, gen75_pack.h, and gen8_pack.h
in anv_private.h.  As we add more gens, this is going to become untenable.
This commit moves things around so that we only use the pack headers when
and if we need them.
2015-11-16 12:29:09 -08:00
Jason Ekstrand
cb9e2305f8 anv/cmd_buffer: Move gen-specific stuff into the appropreate files 2015-11-16 12:10:11 -08:00
Jason Ekstrand
22d024e031 nir/spirv: Add support for separate samplers and textures
This gets tricky in a few places because we have to pass vtn_sampled_image
values through OpAccessChain, but it works ok.  At some point, it probably
needs to be cleaned up but it doesn't occur to me exactly how to do that at
the moment.  We'll see how this approach goes.
2015-11-14 22:32:54 -08:00
Jason Ekstrand
002db3ee15 anv/cmd_buffer: Add a default descriptor type case
This silences a bunch of compiler warnings.
2015-11-14 09:16:55 -08:00
Jason Ekstrand
e9dba80430 anv/apply_pipeline_layout: Handle separate samplers and textures 2015-11-14 09:00:35 -08:00
Jason Ekstrand
b5d4027c35 Merge branch 'wip/i965-separate-sampler-tex' into vulkan 2015-11-14 08:23:27 -08:00
Jason Ekstrand
c7d504ad93 i965/vec4: Plumb separate surfaces and samplers through from NIR 2015-11-14 08:05:31 -08:00
Jason Ekstrand
3dd84822df i965/vec4: Separate the sampler from the surface in generate_tex 2015-11-14 08:05:31 -08:00
Jason Ekstrand
c09e140b65 i965/fs: Plumb separate surfaces and samplers through from NIR 2015-11-14 08:04:47 -08:00
Jason Ekstrand
c2a373ec85 i965/fs: Separate the sampler from the surface in generate_tex 2015-11-14 08:01:50 -08:00
Jason Ekstrand
b169bb902a nir: Separate texture from sampler in nir_tex_instr
This commit adds the capability to NIR to support separate textures and
samplers.  As it currently stands, glsl_to_nir only sets the sampler and
leaves the texture alone as it did before and nir_lower_samplers assumes
this.  However, backends can, if they wish, assume that they are separate
because nir_lower_samplers sets both texture and sampler index (they are
the same in this case).
2015-11-14 07:57:31 -08:00
Jason Ekstrand
1469ccb746 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in Matt's big compiler refactor.
2015-11-14 07:56:10 -08:00
Jason Ekstrand
e8f51fe4de anv/gen8: Subtract 1 from num_elements when setting up buffer surface state 2015-11-13 22:50:54 -08:00
Jason Ekstrand
91bc4e7cec anv/pipeline: Don't free blend states that don't exist
Compute pipelines don't need a blend state so we shouldn't be
unconditionally freeing it.
2015-11-13 21:49:41 -08:00
Jason Ekstrand
c1733886a6 nir/spirv: Add support for SSBO stores
This only handles vector stores, not component-of-a-vector stores.
2015-11-13 21:41:52 -08:00
Jason Ekstrand
c68e28d766 nir/spirv: Refactor vtn_block_load
We pull the offset calculations out into their own function so we can
re-use it for stores.
2015-11-13 21:32:00 -08:00
Jason Ekstrand
99494b96f0 nir/spirv: Add support for image_load_store 2015-11-13 17:54:43 -08:00
Jason Ekstrand
164b3ca164 nir/builder: Add a nir_ssa_undef helper 2015-11-13 17:54:43 -08:00
Jason Ekstrand
ffbc31d13b nir/spirv: Add support for creating image variables 2015-11-13 17:54:43 -08:00
Jason Ekstrand
453239f6a5 nir/spirv: Add support for image types 2015-11-13 17:54:43 -08:00
Jason Ekstrand
0572444a0e nir/types: Add image type helpers 2015-11-13 17:54:43 -08:00
Jason Ekstrand
d5ba7a26d9 glsl/types: Add a get_image_instance helper 2015-11-13 17:54:43 -08:00
Chad Versace
738eaa8acf isl: Embed brw_device_info in isl_device
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-13 11:14:03 -08:00
Chad Versace
ba467467f4 anv: Use enum isl_tiling everywhere
In anv_surface and anv_image_create_info, replace member
'uint8_t tile_mode' with 'enum isl_tiling'.

As a nice side-effect, this patch also reduces bug potential because the
hardware enum values for tile modes are unstable across hardware
generations.
2015-11-13 10:44:09 -08:00
Chad Versace
af392916ff anv/device: Embed isl_device
Embed struct isl_device into anv_physical_device and anv_device.  It
will later be used for surface layout calculations.
2015-11-13 10:44:09 -08:00
Chad Versace
a4a2ea3f79 isl: Add enum isl_tiling and a query func
The query func is isl_tiling_get_extent.
2015-11-13 10:44:07 -08:00
Chad Versace
652727b029 isl: Add structs isl_extent2d, isl_extent3d
They are nowhere used yet.
2015-11-13 10:31:49 -08:00
Chad Versace
b1bb270590 isl: Add struct isl_device
The struct is incomplete (it contains only the gen). And it's nowhere
used yet.

It will be used later for surface layout calculations.
2015-11-13 10:31:37 -08:00
Chad Versace
477383e9ac anv: Strip trailing whitespace from anv_device.c 2015-11-13 10:27:40 -08:00
Chad Versace
c6493dff79 anv: Strip trailing space in anv_private.h 2015-11-12 12:24:01 -08:00
Chad Versace
addc2a9d02 anv: Remove redundant fields anv_format::bs,bw,bh,bd
Instead, use the equivalent fields in anv_format::isl_layout.
2015-11-12 12:23:49 -08:00
Chad Versace
cbc31f453d anv/formats: Re-indent the fmt() macro
Use one line per struct member.
2015-11-12 12:21:46 -08:00
Chad Versace
1bea1669c5 anv: Use enum isl_format in anv_format
This patch begins using isl.h in Anvil. More refactors will follow.

Change type of anv_format::surface_format from uint16_t -> enum
isl_format.
2015-11-12 12:21:46 -08:00
Chad Versace
bfb022a235 isl: Generate isl_format_layout.c
Generate an array of struct isl_format_layout, using
isl_format_layout.csv as input.

Each entry follows the patten:

   [ISL_FORMAT_R32G32B32A32_FLOAT] = {
      ISL_FORMAT_R32G32B32A32_FLOAT,
      .bs = 16, .bpb = 128,
      .bw = 1, .bh = 1, .bd = 1,
      .channels = {
          .r = { ISL_SFLOAT, 32 },
          .g = { ISL_SFLOAT, 32 },
          .b = { ISL_SFLOAT, 32 },
          .a = { ISL_SFLOAT, 32 },
          .l = {},
          .i = {},
          .p = {},
      },
      .colorspace = ISL_COLORSPACE_LINEAR,
      .txc = ISL_TXC_NONE,
   },
2015-11-12 12:21:46 -08:00
Chad Versace
7986efc644 isl: Add CSV of format layouts
Add file isl_format_layout.csv, which describes the block layout,
channel layout, and colorspace of all hardware surface formats.
2015-11-12 11:56:16 -08:00
Chad Versace
67362698a9 isl: Add enum isl_format 2015-11-12 11:34:45 -08:00
Jason Ekstrand
3a3d79b38e anv/gen7: Implement the VS state depth-stall workaround 2015-11-10 16:42:34 -08:00
Jason Ekstrand
750b8f9e98 anv/gen7: Properly handle a GS with zero invocations 2015-11-10 16:41:23 -08:00
Jason Ekstrand
9d18555c8d anv/gen7: Add push constant support 2015-11-10 15:14:11 -08:00
Jason Ekstrand
427978d933 anv/device: Use an actual int64_t in WaitForFences 2015-11-10 15:02:52 -08:00
Jason Ekstrand
d9079648d0 anv/meta: Create a sampler in meta_emit_blit 2015-11-10 14:43:18 -08:00
Jason Ekstrand
b461744c52 anv/gen7: Properly handle VS with VertexID but no vertices 2015-11-10 11:31:31 -08:00
Jason Ekstrand
aafc87402d anv/device: Work around the i915 kernel driver timeout bug
There is a bug in some versions of the i915 kernel driver where it will
return immediately if the timeout is negative (it's supposed to wait
indefinitely).  We've worked around this in mesa for a few months but never
implemented the work-around in the Vulkan driver.

I rediscovered this bug again while working on Ivy Bridge becasuse the
drive in my Ivy Bridge currently has Fedora 21 installed which has one of
the offending kernels.
2015-11-10 11:24:11 -08:00
Jason Ekstrand
06f466a770 anv/nir: Fix codegen in lower_push_constants 2015-11-09 16:29:05 -08:00
Jason Ekstrand
abede04314 anv/gen7: Fix the length of 3DSTATE_SF 2015-11-09 16:04:07 -08:00
Jason Ekstrand
e8c2a52a70 anv/gen7: Properly handle missing color-blend state 2015-11-09 16:04:06 -08:00
Jason Ekstrand
862da6a891 anv/device: Add a newline to the end of a comment 2015-11-09 16:04:06 -08:00
Nanley Chery
9c2b37a9c3 anv/formats: Define ETC2 formats
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery
41cf35d1d8 anv/image: Determine the alignment units for compressed formats
Alignment units, i and j, match the compressed format block
width and height respectively.

v2: Don't assert against HALIGN* and VALIGN* enums (Chad)

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery
381f602c6b anv/image: Handle compressed format qpitch and padding
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery
300f7c2be3 anv/image: Handle compressed format stride and size
These formulas did not take compressed formats into account.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery
7b4244dea0 anv/formats: Add fields for block dimensions
A non-compressed texture is a 1x1x1 block. Compressed
textures could have values which vary in different
dimensions WxHxD.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery
a6c7d1e016 anv/formats: Add surface_format initializer
v2: Rename __brw_fmt to __hw_fmt (Chad)

Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chad Versace chad.versace@intel.com
2015-11-09 15:41:41 -08:00
Nanley Chery
3ee923f1c2 anv: Rename cpp variable to "bs"
cpp (chars-per-pixel) is an integer that fails to give useful data
about most compressed formats. Instead, rename it to "bs" which
stands for block size (in bytes).

v2: Rename vk_format_for_bs to vk_format_for_size (Chad)
    Use "block size" instead of "bs" in error message (Chad)

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Jason Ekstrand
17fa3d3572 nir/spirv: Give both block and buffer_block types an interface type 2015-11-07 08:03:25 -08:00
Jason Ekstrand
a10d59c09a nir/spirv: Increment num_ubos/ssbos when creating variables 2015-11-06 16:53:27 -08:00
Jason Ekstrand
046563167c anv/apply_dynamic_offsets: Use the right sized immediate zero 2015-11-06 16:49:24 -08:00
Jason Ekstrand
104525c33b anv/pipeline: Set the right SSBO binding table start index for FS 2015-11-06 15:57:51 -08:00
Jason Ekstrand
399d5314f6 anv/cmd_buffer: Rework the way we emit UBO surface state
The new mechanism should be able to handle SSBOs as well as properly handle
emitting surface state on gen7 where we need different strides depending on
shader stage.
2015-11-06 15:14:12 -08:00
Jason Ekstrand
1b5c7e7ecd anv/pipeline: Expose is_scalar_shader_stage 2015-11-06 15:12:33 -08:00
Jason Ekstrand
5ba281e794 nir/spirv: Add a helper for determining if a block is externally visable 2015-11-06 15:09:57 -08:00
Jason Ekstrand
220261a0c9 anv: Use VkDescriptorType instead of anv_descriptor_type 2015-11-06 14:09:52 -08:00
Jason Ekstrand
612e35b2c6 anv: Do range-checking in the shader for dynamic buffers 2015-11-06 13:32:52 -08:00
Jason Ekstrand
f8052351ac anv/device: Increase the block size for instructions 2015-11-06 13:29:47 -08:00
Jason Ekstrand
d7cc9929bb anv: Remove all support for BufferViews
We never *actually* supported them, we just used them for binding UBOs.
Now that we have BufferInfo and we aren't supporting texture buffers yet,
we should get rid of them until we can do them properly.
2015-11-06 13:16:18 -08:00
Jason Ekstrand
0360c3608b anv/device: Only support binding UBOs through BufferInfo 2015-11-06 12:52:12 -08:00
Jason Ekstrand
3aa2fc82dd anv: Rework UpdateDescriptorSets
Previously, UpdateDescriptorSets was wrong because it assumed that the
binding was the offset into the descriptor set.
2015-11-06 12:28:03 -08:00
Jason Ekstrand
45b1bbe801 anv: Add a descriptor_index to anv_descriptor_set_binding_layout 2015-11-06 12:16:54 -08:00
Jason Ekstrand
f029e0ce13 anv: Add a layout to anv_descriptor_set 2015-11-06 12:16:54 -08:00
Chad Versace
16119ad884 anv/meta: Finish load clears for stencil attachments
Tested by Crucible "func.depthstencil.stencil_triangles.*" in

  commit c194292d5eadb84e9d7489fc01ce0b653cdd4ca5 (HEAD -> master)
  Author: Chad Versace <chad.versace@intel.com>
  Date:   Wed Nov 4 16:19:24 2015 -0800
  Subject: func.depthstencil: Remove stencil clear workaround for Mesa
2015-11-05 15:45:43 -08:00
Jason Ekstrand
a40f682c71 anv/cmd_buffer: Fix SURFACE_STATE for non-view buffer bindings
We were treating it as if it's a BufferView and weren't taking the offset
into account properly.
2015-11-04 19:56:18 -08:00
Jason Ekstrand
1b68120760 anv/cmd_buffer: Don't use an anv_state pointer in emit_binding_table
The anv_state is supposed to be a flyweight so we're not really saving
anything by using a pointer.  Also, we were creating one, setting a pointer
to it, and then having it go out-of-scope which is bad.
2015-11-04 19:56:16 -08:00
Chad Versace
d259af3fbb anv: Remove unused anv_render_pass members
Remove members
  num_color_clear_attachments
  has_depth_clear_attachment
  has_stencil_clear_attachment

The new clear code in anv_meta_clear.c does not use them.
2015-11-04 15:54:38 -08:00
Chad Versace
a9a3071fc4 anv/meta: Rewrite clear code
Fixes Crucible test "func.clear.load-clear.attachments-8".

The old clear code, when clearing attachments for
VK_ATTACHMENT_LOAD_OP_CLEAR, suffered from some fundamental bugs. The
bugs were not fixable with the old code's approach.

    - It assumed that a VkRenderPass contained at most one depthstencil
       attachment.

    - It tried to clear all attachments (color and the sole
      depthstencil) with a single instanced draw call, using the VUE
      header's RenderTargetArrayIndex to specify the instance's target
      color attachment. But the RenderTargetArrayIndex does not select
      entries in the binding table; it only selects an array index of
      a singled layered surface.

    - If at least one attachment of VkRenderPass had
      VK_ATTACHMENT_LOAD_OP_CLEAR,
      then the old code cleared *all* attachments. This was
      a consequence of using a single draw call and single pipeline for
      the clear.

The new clear code fixes those bugs by making a separate draw call for
each attachment, and using one pipeline when clearing color attachments
and a different pipeline for depth attachments.

The new code, like the old code, does not clear stencil attachments. It
is left as a FINISHME.
2015-11-04 15:20:52 -08:00
Chad Versace
49c96a14c5 anv/meta: Clear color attribute is always flat
No behavioral change. This patch just removes an unneeded function
parameter.
2015-11-04 15:15:19 -08:00
Chad Versace
7f82cc718f anv/meta: Use consistent naming for dynamic state mask
Consistently rename bitmasks of Vulkan dynamic state to 'dynamic_mask'.

  anv_meta_saved_state::dynamic_flags -> dynamic_mask
  anv_meta_save(dynamic_state)        -> dynamic_mask
2015-11-04 15:15:19 -08:00
Chad Versace
2bdb9e2ed9 anv/meta: Rename anv_cmd_buffer_save/restore
As the functions are now exposed in anv_meta.h, let's rename them
to clarify that they are meta functions.

    anv_cmd_buffer_save -> anv_meta_save
    anv_cmd_buffer_restore -> anv_meta_restore
2015-11-04 15:15:19 -08:00
Chad Versace
16b2a489db anv: Move meta clear code to new file anv_meta_clear.c
anv_meta.c currently handles blits, copies, clears, and resolves.  The
clear code is about to grow, and anv_meta.c is already busting at the
seams.
2015-11-04 15:15:19 -08:00
Chad Versace
c56727037a anv: Move struct anv_vue_header to anv_private.h
Move it from anv_meta.c to the common header anv_private.h. This allows
us to split the meta blit and meta clear code into separate files.
2015-11-04 15:15:19 -08:00
Jason Ekstrand
b00e3f221b Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-11-03 15:45:04 -08:00
Jason Ekstrand
a1e7b8701a nir: remove sampler_set from nir_tex_instr
Now that descriptor sets are handled in a lowering pass, this is no longer
needed.
2015-11-03 14:58:20 -08:00
Chad Versace
4d1c76485b anv: Drop stale comment in anv_cmd_buffer_emit_binding_table()
When emitting the binding table for the fragment shader stage, we no
longer "walk all of the attachments, [inserting only] the color
attachments into the binding table". Instead, we iterate only over the
subpass's color attachments, which is the minimal possible iteration.

While killing the comment, also rename the variable 'attachments' to
'color_count', as it's no longer a count of all framebuffer attachments
but only the subpass's color attachment count.
2015-11-03 13:46:40 -08:00
Jason Ekstrand
584f9d4442 anv: Report 0 physical devices when not on Broadwell or Ivy Bridge
Right now, Broadweel and Ivy Bridge are the only supported platforms.
Hopefully, this reduces the chances that someone will try the driver on
unsupported hardware and be confused that it doesn't work.
2015-11-02 12:14:37 -08:00
Jason Ekstrand
3883728730 anv: Add better push constant support
What we had before was kind of a hack where we made certain untrue
assumptions about the incoming data.  This new support, while it still
doesn't support indirects properly (that will come), at least pulls the
offsets and strides from SPIR-V like it's supposed to.
2015-10-29 22:26:36 -07:00
Jason Ekstrand
1f2624e6dd nir/spirv: Add support for push constants 2015-10-29 22:26:00 -07:00
Jason Ekstrand
a2283508b0 nir/intrinsics: Add a load_push_constant intrinsic 2015-10-29 22:26:00 -07:00
Jason Ekstrand
f2a8c9db24 nir/spirv: Rework the way we handle interface types 2015-10-29 22:26:00 -07:00
Chad Versace
4073219cf1 anv/pass: Remove redundant assert
Trivial fix.
2015-10-29 11:47:39 -07:00
Chad Versace
1e98177439 anv/pass: Move VkRenderPass code to new file
Move it from anv_device.c to new file anv_pass.c. Because it will soon
grow bigger.
2015-10-29 11:10:03 -07:00
Chad Versace
c284c39b13 anv: Fix parsing of load ops in VkAttachmentDescription
My original understanding of VkAttachmentDescription::loadOp,
stencilLoadOp was incorrect. Below are all possible combinations:

  VkFormat       | loadOp=clear   stencilLoadOp=clear
  ---------------+---------------------------
  color          | clear-color    ignored
  depth-only     | clear-depth    ignored
  stencil-only   | ignored        clear-stencil
  depth-stencil  | clear-depth    clear-stencil
2015-10-29 10:59:55 -07:00
Jason Ekstrand
8bcba083db anv: Update the README
Adds a note that we support SPIR-V revision 32.  Also, we now support
geometry shaders.
2015-10-28 12:30:34 -07:00
Jason Ekstrand
12feda0c09 Revert "nir/intrinsic: Allow up to four indices"
This reverts commit 5eccd0b4b9.

This was only needed for the store_ssbo_vk_indirect intrinsic
2015-10-27 13:44:14 -07:00
Jason Ekstrand
423e7a55cc Revert "nir/intrinsics: Add new Vulkan load/store intrinsics"
This reverts commit 24bcc89c8f.

Now that we have the new vulkan_resource_index intrinsic, these variants of
the classic UBO/SSBO instrinsics aren't needed.
2015-10-27 13:43:25 -07:00
Jason Ekstrand
a6be53223e anv/nir: Work with the new vulkan_resource_index intrinsic 2015-10-27 13:42:51 -07:00
Jason Ekstrand
3d44b3aaa6 nir/spirv: Use the new vulkan_resource_index intrinsic
This is instead of using the _vk versions of UBO/SSBO load/store intrinsics
2015-10-27 13:41:59 -07:00
Jason Ekstrand
800a9706f0 nir: Add a vulkan_resource_index intrinsic 2015-10-27 13:41:08 -07:00
Jason Ekstrand
37b6afb3d9 Add a todo comment about intput_slots_valid in the FS shader key 2015-10-26 16:25:02 -07:00
Jason Ekstrand
ab6ed2e1ac anv/gen8_pipeline: Emit a real 3DSTATE_SBE_SWIZ packet 2015-10-26 16:25:02 -07:00
Jason Ekstrand
9006e555ce anv/pipeline: Bump the size of the pipeline batch to accomodate GS
The 1k batch size wasn't big enough for a full pipeline setup including
geometry shaders.  Some day we should make it dynamic.
2015-10-23 16:50:31 -07:00
Jason Ekstrand
4c59ee808f anv/gen8_pipeline: Various 3DSTATE_GS fixes 2015-10-23 16:49:26 -07:00
Jason Ekstrand
8aba8cf513 anv/pipeline: Use separate-shader 2015-10-23 10:53:00 -07:00
Jason Ekstrand
760c4b894d anv/pipeline: Pull separate_shader from NIR for vue map setup 2015-10-23 10:48:52 -07:00
Jason Ekstrand
ee8c67abe8 nir/spirv: Add support for builtins in arrays 2015-10-22 17:58:20 -07:00
Jason Ekstrand
9fe907ec79 nir/spirv: Make the builtins array distinguish between in and out 2015-10-22 17:54:24 -07:00
Jason Ekstrand
d11ea76168 nir/spirv: Make vtn_get_builtin_location smarter
Instead of just stomping on the mode, it now validates asserts that the
previously set mode is correct and only changes it if needed.  We need to
do this because, in geometry shaders, there are some builtins that can be
either an input or an output depending on context.  We can get that
information from the SPIR-V source but we can't throw it away.
2015-10-22 17:45:41 -07:00
Jason Ekstrand
9abef3e817 nir/spirv: Make get_builtin_variable take a nir_variable_mode
We'll want this in a moment for validation but, for now, it just gets
stompped by get_builtin_variable.
2015-10-22 17:28:25 -07:00
Jason Ekstrand
2ce6636c75 nir/spirv: Remove the vtn_type argument from _vtn_variable_load/store
Now that builtins are handled in deref chains, we don't really need this
anymore.
2015-10-22 16:56:42 -07:00
Jason Ekstrand
f23d951083 nir/validate: Add better validation of load/store types 2015-10-22 16:53:01 -07:00
Jason Ekstrand
82c579e314 anv/gen8: Set the correct maximum number of GS threads
This equation was pulled from mesa gen8_gs_state.c
2015-10-21 21:51:18 -07:00
Jason Ekstrand
d0e8c78407 anv/pipeline: set the gs_vertex_count in compile_gs
This was missed in the initial enabling commit.
2015-10-21 21:50:47 -07:00
Jason Ekstrand
8af2a09956 anv/pipeline: Make the has_push_constants computation more accurate
The computation used to only look for uniforms that weren't samplers.  Now
it also filters out arrays of samplers.
2015-10-21 21:50:16 -07:00
Jason Ekstrand
0329a252bd nir/spirv: Add defaults for GS input/output primitive types
These are supposed to be specified in the SPIR-V source as SpvExecutionMode
enums but glslang isn't giving them to us.  A bug has been filed:

https://github.com/KhronosGroup/glslang/issues/84
2015-10-21 21:46:22 -07:00
Jason Ekstrand
4032549885 i965/vec4: Handle returns at the end of functions 2015-10-21 20:42:23 -07:00
Jason Ekstrand
5f29dacda2 i965: Move get_hw_prim_for_gl_prim to brw_util.c 2015-10-21 20:40:28 -07:00
Jason Ekstrand
ea23cb3543 nir/spirv: Add capabilities and decorations for basic geometry shaders 2015-10-21 20:36:25 -07:00
Jason Ekstrand
d538fe849d anv/pipeline: Add back basic geometry shader support
Now that we've done the refactoring upstream, it's much easier to to get
hooked up.  We haven't tested things well enough to know that we're setting
up the GPU state correctly for them yet but at least we can compile them now.
2015-10-21 18:45:48 -07:00
Jason Ekstrand
164abff0c0 nir/spirv: Add support for more CS system values 2015-10-21 18:39:06 -07:00
Jason Ekstrand
5790ee2bbb nir/spirv: Add support for various barrier type instructions 2015-10-21 18:17:11 -07:00
Jason Ekstrand
3d35e4361f Fix a couple of dereferences 2015-10-21 18:16:50 -07:00
Jason Ekstrand
55a7ee730c spirv/nir: Add more stage asserts 2015-10-21 18:00:05 -07:00
Jason Ekstrand
27393c8630 nir/spirv: Add support for GS metadata 2015-10-21 17:58:34 -07:00
Jason Ekstrand
a8ffd6e72c nir/gather_info: Add more info for geometry shaders 2015-10-21 17:42:47 -07:00
Jason Ekstrand
fed60e3c73 Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-10-21 17:40:13 -07:00
Chad Versace
0ab926dfbf anv: Don't teardown uninitialized anv_physical_device
If the user called vkDestroyDevice but never called
vkEnumeratePhysicalDevices, then the driver tried to ralloc_free() an
unitialized anv_physical_device.

Fixes test 'dEQP-VK.api.device_init.create_instance_name_version'.
2015-10-21 11:55:37 -07:00
Jason Ekstrand
c8572d0f9c anv/pipeline: Remove a redundant line
We set compute_sample_id based on multisample state two lines below.
2015-10-20 16:02:03 -07:00
Jason Ekstrand
72d99f8a40 anv/pipeline: Update a comment 2015-10-20 16:00:55 -07:00
Jason Ekstrand
27d868500a anv/pipeline: Set key->render_to_fbo to false for fragment shaaders
Vulkan uses the upper-left convention.  This is the same as DX one and what
our hardware does.  We had it flipped around.
2015-10-20 15:37:16 -07:00
Jason Ekstrand
59bae36ffb nir/spirv: Fix a typo 2015-10-20 15:35:13 -07:00
Jason Ekstrand
44b22ca441 nir/spirv: Handle SpvExecutionMode 2015-10-20 15:23:56 -07:00
Jason Ekstrand
a71e614d33 anv: Completely rework shader compilation
Now that we have a decent interface in upstream mesa, we can get rid of all
our hacks.  As of this commit, we no longer use any fake GL state objects
and all of shader compilation is moved into anv_pipeline.c.  This should
make way for actually implementing a shader cache one of these days.

As a nice side-benifit, this commit also gains us an extra 300 passing CTS
tests because we're actually filling out the texture swizzle information
for vertex shaders.
2015-10-20 13:02:03 -07:00
Jason Ekstrand
2d9e899e35 nir: Add a pass to gather info from the shader
This pass fills out a bunch of the fields in nir_shader_info by inspecting
the shader.
2015-10-20 13:02:03 -07:00
Jason Ekstrand
6fb4469588 anv: Move the brw_compiler from anv_compiler to physical_device 2015-10-20 13:02:03 -07:00
Jason Ekstrand
9e3615cc7d i965: Move brw_compiler_create to brw_compiler.h 2015-10-20 13:02:03 -07:00
Jason Ekstrand
bf6407079b i965: Split process_nir into two haves; pre- and post- 2015-10-20 13:02:03 -07:00
Jason Ekstrand
611ace6861 anv/compiler: Remove more pre-SNB shader key setup 2015-10-20 13:02:03 -07:00
Jason Ekstrand
b3a344db30 anv/compiler: Get rid of GS support.
The geometry shader support is currently completely untested.  As I go
through and re-factor the compiler, I'd rather not refactor dead code that
I don't have a way to know if I broke.  Let's just remove it for now.  We
can put it back in easily enough later and then we'll do it properly.
2015-10-20 13:02:03 -07:00
Jason Ekstrand
5f5224f256 anv/meta: Use the actual render pass for creating blit pipelines 2015-10-20 13:02:02 -07:00
Chad Versace
4d4e559b6a vk: Use consistent names for anv_cmd_state dirty bits
Prefix all anv_cmd_state dirty bit tokens with ANV_CMD_DIRTY. For
example:

    old                           -> new
    ANV_DYNAMIC_VIEWPORT_DIRTY    -> ANV_CMD_DIRTY_DYNAMIC_VIEWPORT
    ANV_CMD_BUFFER_PIPELINE_DIRTY -> ANV_CMD_DIRTY_PIPELINE

Change type of anv_cmd_state::dirty and ::compute_dirty from uint32_t to
the self-documenting type anv_cmd_dirty_mask_t.
2015-10-20 11:40:24 -07:00
Chad Versace
2484d1a01f anv/pipeline: Fix requirement for depthstencil state
The Vulkan spec allows VkGraphicsPipelineCreateInfo::pDepthStencilState
to be NULL when the pipeline's subpass contains no depthstencil
attachment (see spec quote below). anv_pipeline_init_dynamic_state()
required it unconditionally.

This path fixes anv_pipeline_init_dynamic_state() to access
pDepthStencilState only when there is a depthstencil attachment.

From the Vulkan spec (20 Oct 2015, git-aa308cb)

   pDepthStencilState [...] may only be NULL if renderPass and subpass
   specify a subpass that has no depth/stencil attachment.
2015-10-20 11:29:16 -07:00
Chad Versace
b51468b519 anv/pipeline: Validate VkGraphicsPipelineCreateInfo
The Vulkan spec (20 Oct 2015, git-aa308cb) states that some fields of
VkGraphicsPipelineCreateInfo are required under certain conditions.
Add a new function, anv_pipeline_validate_create_info() that asserts the
requirements hold.

The assertions helped me discover bugs in Crucible and anv_meta.c.
2015-10-20 10:55:54 -07:00
Chad Versace
855180b3d9 anv: Define anv_validate macro
If a block of code is annotated with anv_validate, then the block runs
only in debug builds.
2015-10-20 10:55:54 -07:00
Chad Versace
81f8b82fc8 vk/meta: Add required renderpass to pipeline
The Vulkan spec (20 Oct 2015, git-aa308cb) requires that
VkGraphicsPipelineCreateInfo::renderPass be a valid handle. To satisfy
that, define a static dummy render pass used for all meta operations.
2015-10-20 10:48:26 -07:00
Chad Versace
0d84a0d58b vk/meta: Add required multisample state to pipeline
The Vulkan spec (20 Oct 2015, git-aa308cb) requires that
VkGraphicsPipelineCreateInfo::pMultisampleState not be NULL.
2015-10-20 10:48:09 -07:00
Jason Ekstrand
60e8439237 anv/compiler: Remove irrelevant wm key setup
Most of this applies to Iron Lake and prior only.  While we're at it, we
get rid of the legacy GL shading model code.
2015-10-19 17:00:26 -07:00
Jason Ekstrand
27ca9ca4e1 anv/compiler: Get rid of legacy shader key setup
Most of the shader key setup we did was for pre-Sandybridge and the stuff
for SNB+ wasn't in the key setup.  That stuff still isn't there but at
least we've left ourselves notes for now.
2015-10-19 16:45:11 -07:00
Jason Ekstrand
661d0db077 anv/compiler: Delete legacy clipping code
This is a Vulkan driver.  We don't need legacy clipping stuff and, even if
we did, we don't plan on supporting pre-Sandybridge anyway.
2015-10-19 16:26:16 -07:00
Jason Ekstrand
fba55b711e anv/compiler: Remove unneeded wm prog data setup
As of upstream mesa changes, brw_compile_fs does this for us so there's no
need to have the code in the Vulkan driver anymore.
2015-10-19 16:17:41 -07:00
Jason Ekstrand
12c30c9498 nir/spirv: Use the new nir_variable helpers 2015-10-19 16:08:23 -07:00
Jason Ekstrand
7e6959402d nir/spirv: Handle builtins in OpAccessChain
Previously, we were trying to handle them later when loading.  However, at
that point, you've already lost information and it's harder to handle
certain corner-cases.  In particular, if you have a shader that does

gl_PerVertex.gl_Position.x = foo

we have trouble because we see the .x and we don't know that we're in
gl_Position.  If we, instead, handle it in OpAccessChain, we have all the
information we need and we can silently re-direct it to the appropreate
variable.  This also lets us delete some code which is a nice side-effect.
2015-10-19 15:50:45 -07:00
Jason Ekstrand
958fc04dc5 Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-10-19 14:14:21 -07:00
Jason Ekstrand
995d9c4ac7 anv/pipeline: Remove the ViewportState finishme
We should be doing everything we need to with the viewport state
2015-10-17 10:35:29 -07:00
Jason Ekstrand
3e47e34036 anv: Add support for immutable descriptors 2015-10-17 08:17:00 -07:00
Jason Ekstrand
7010fe61c8 anv: Add facilities for dumping an image to a file
The ability to dump an arbitrary miplevel or array slice of an anv_image to
a file is very useful for debugging.  Nothing inside of the driver calls
this right now, but it's very useful to call from GDB.
2015-10-16 20:03:06 -07:00
Jason Ekstrand
368e703a01 anv/pipeline: Rework dynamic state handling
Aparently, we had the dynamic state array in the pipeline backwards.
Instead of enabling the bits in the pipeline, it disables them and marks
them as "dynamic".
2015-10-16 16:30:02 -07:00
Jason Ekstrand
8ed23654c9 nir/spirv: Fix handling of vector component selects via OpAccessChain
When we get to the end of the _vtn_load/store_varaible recursion, we may
have one link left in the deref chain if there is a vector component select
on the end.  In this case, we need to truncate the deref chain early so
that, when we make the copy for the load, we don't get the extra deref.
The final deref will be handled by the vector extract/insert that comes
later.
2015-10-15 21:18:57 -07:00
Jason Ekstrand
2552df41a1 anv/cmd_buffer: Reset the command buffer in BeginCommandBuffer 2015-10-15 18:28:00 -07:00
Jason Ekstrand
298d031642 anv/batch_chain: Add some sanity-check asserts for relocations 2015-10-15 17:24:32 -07:00
Jason Ekstrand
3130851add anv/x11: Only advertise VK_FORMAT_B8R8G8A8_UNORM
The others don't work at the moment so we shouldn't be advertising them.
2015-10-15 16:16:17 -07:00
Jason Ekstrand
f5eec407ea anv/x11: Treat the pPlatformWindow as a xcb_window_t* instead of xcb_window_t 2015-10-15 15:38:20 -07:00
Jason Ekstrand
03952b1513 anv/device: Add support for combined image and sampler descriptors 2015-10-15 15:17:27 -07:00
Jason Ekstrand
b459b3d82c anv/device: Remove some unneeded anv_finishmes 2015-10-15 15:17:07 -07:00
Jason Ekstrand
ba20569626 anv/device: Make the CreateSemaphore stub return success 2015-10-15 14:34:07 -07:00
Jason Ekstrand
bed7d1e03c anv: Add support for BufferInfo in descriptor sets 2015-10-15 13:45:53 -07:00
Jason Ekstrand
6dc4cad994 anv/cmd_buffer: Add an alloc_surface_state helper 2015-10-15 13:45:07 -07:00
Jason Ekstrand
896c1c65d6 anv: Get rid of the descriptor_set_binding struct
We no longer need it as we have a better way to deal with dynamic offsets.
2015-10-14 19:02:29 -07:00
Jason Ekstrand
42683e3757 anv: Get rid of backend compiler hacks for descriptor sets
Now that we have anv_nir_apply_pipeline_layout, we can hand the backend
compiler intrinsics and texture instructions that use a flat buffer index
just like it wants.  There's no longer any reason for any of these hacks.
2015-10-14 18:38:33 -07:00
Jason Ekstrand
da994f4b7e anv/nir: Rewrite apply_dynamic_offsets to handle the new vk intrinsics 2015-10-14 18:38:33 -07:00
Jason Ekstrand
9c9b7d79c8 anv/nir: Add a pass for applying a applying a pipeline layout to a shader
This new pass lowers the _vk intrinsics which take a (set, binding, index)
tripple to the single-index non-vk intrinsics based on the pipeline layout.
2015-10-14 18:38:33 -07:00
Jason Ekstrand
de608153fb nir/spirv: Use the Vulkan ubo intrinsics 2015-10-14 18:38:33 -07:00
Jason Ekstrand
24bcc89c8f nir/intrinsics: Add new Vulkan load/store intrinsics 2015-10-14 18:38:33 -07:00
Jason Ekstrand
5eccd0b4b9 nir/intrinsic: Allow up to four indices 2015-10-14 18:38:33 -07:00
Jason Ekstrand
b37c38c1ca anv: Completely rework descriptor set layouts
This patch reworks a bunch of stuff in the way we do descriptor set
layouts.  Our previous approach had a couple of problems.  First, it was
based on a misunderstanding of arrays in descriptor sets.  Second, it
didn't properly handle descriptor sets where some bindings were missing
stages.  The new apporach should be correct and also makes some operations,
particularly those on the hot-path, a bit easier.

We use the descriptor set layout for four things:

 1) To determine the map from bindings to the actual flattened descriptor
    set in vkUpdateDescriptorSets().

 2) To determine the descriptor <-> binding table entry mapping to use in
    anv_cmd_buffer_flush_descriptor_sets().

 3) To determine the mappings of dynamic indices.

 4) To determine the (set, binding, array index) -> binding table entry
    mapping inside of shaders.

The new approach is directly taylored towards these operations.
2015-10-14 18:38:33 -07:00
Chad Versace
7965fe7da6 vk: Add README
Requested by developers outside Intel.

During the driver's pre-release development, let's make the README easy
to find for external experimenters. Keep it at the top of the source
tree.
2015-10-14 13:58:29 -07:00
Jason Ekstrand
d2d8945eb8 nir/spirv: Fix a bug in indirect OpAccessChain handling 2015-10-13 20:00:18 -07:00
Jason Ekstrand
db5a5fcd18 anv/image: Add a basic implementation of GetImageSubresourceLayout 2015-10-13 20:00:17 -07:00
Jason Ekstrand
28ed02588a anv/formats: Use the surface_format_info struct from brw_surface_formats.h
The surface_format_info struct changed in mesa but the copied-and-pasted
version didn't get updated on the last mesa master merge.  This both fixes
the bug and should prevent this in the future.
2015-10-13 15:23:24 -07:00
Jason Ekstrand
accbf178eb i965/surface_formats: Pull the surface_format_info struct into a header 2015-10-13 15:23:24 -07:00
Jason Ekstrand
fd2ec1c8ad anv/x11: Do something sensible if get_geometry fails in GetSurfaceProperties 2015-10-13 15:10:40 -07:00
Jason Ekstrand
c31f926726 anv/wsi: Add the GetSurfacePresentModesKHR stub
Support has existed in the X11 and Wayland backends for a while but,
somehow, the entrypoint got missed in the API shuffle.
2015-10-13 11:47:03 -07:00
Jason Ekstrand
e21ecb841c anv: Declare/validate the correct API version 2015-10-12 18:25:19 -07:00
Jason Ekstrand
0689a0f0f3 anv/device: Return VK_SUCCESS after setting pCount in QueueFamilyProperties 2015-10-10 15:25:08 -07:00
Kristian Høgsberg Kristensen
fc2a66cfcd Merge ../mesa into vulkan 2015-10-08 17:20:24 -07:00
Jason Ekstrand
48a87f4ba0 anv/queue: Get rid of the serial
This was a remnant of the object tagging implementation we had at one
point.  We haven't used it for a long time so there's no good reason to
keep it around.
2015-10-08 12:16:00 -07:00
Jason Ekstrand
8984559892 vk/0.170.2: Update to the new VK_EXT_KHR_swapchain extensions 2015-10-08 12:11:18 -07:00
Chad Versace
2228ec0112 Merge branch 'vulkan-0.170.2' into vulkan
This updates the API from 0.138.2 to 0.170.2,
and updates SPIR-V to v32.
2015-10-07 11:49:07 -07:00
Chad Versace
7fa98ab182 vk: Remove temporary vulkan headers
Remove vulkan-0.138.2.h and vulkan-0.170.2.h. Their purpose was to aid
the header update to 0.170.2.
2015-10-07 11:45:48 -07:00
Chad Versace
2f1ca71360 vk/0.170.2: Bump header version
The header is now fully updated.
2015-10-07 11:44:44 -07:00
Chad Versace
c2f94e3a0d vk/0.170.2: Update C++ errata and typedefs 2015-10-07 11:44:33 -07:00
Chad Versace
0ca3c8480d vk/0.170.2: Update remaining enums 2015-10-07 11:39:49 -07:00
Chad Versace
f9c948ed00 vk/0.170.2: Update VkResult
Version 0.170.2 removes most of the error enums. In many cases, I had to
replace an error with a less accurate (or even incorrect) one.
In other cases, the error path is replaced with an assertion.
2015-10-07 11:36:51 -07:00
Chad Versace
8dee32e71f vk/0.170: Update VkDescriptorInfo
Ignore the new bufferInfo field with a anv_finishme.
2015-10-07 10:58:55 -07:00
Chad Versace
92e7bd3610 vk/0.170.2: Update vkCreateDescriptorPool
Nothing to do. In Mesa the pool is a stub.
2015-10-07 10:47:55 -07:00
Chad Versace
a3bc07c23b vk/0.170.2: Update VkAttachmentDescription 2015-10-07 10:44:40 -07:00
Chad Versace
82259f88dd vk/0.170.2: Update VkImageViewCreateInfo 2015-10-07 10:43:44 -07:00
Chad Versace
f4295b3cca vk/0.170.2: Update VkImageCreateInfo 2015-10-07 10:43:17 -07:00
Chad Versace
d48e71ce55 vk/0.170.2: Update VkPhysicalDeviceProperties 2015-10-07 10:36:46 -07:00
Chad Versace
81e1dcc42c vk/0.170.2: Update VkImageFormatProperties 2015-10-07 10:28:30 -07:00
Chad Versace
98c2bb6917 vk/0.170.2: Update VkFormatProperties 2015-10-07 10:15:59 -07:00
Chad Versace
545f5cc6e1 vk/0.170.2: Update VkPhysicalDeviceFeatures 2015-10-07 10:09:39 -07:00
Chad Versace
033a37f591 vk/0.170.2: Update VkPhysicalDeviceLimits 2015-10-07 10:09:31 -07:00
Jason Ekstrand
982466aeff anv/device: Remove some #ifdef'd out code
This was a left-over from the dynamic state update.
2015-10-07 09:45:49 -07:00
Jason Ekstrand
010c6efd65 vk/0.170.2: Make vkUpdateDescriptorSets return void 2015-10-07 09:44:53 -07:00
Jason Ekstrand
1a52bc3039 anv/pipeline: Add support for dynamic state in pipelines 2015-10-07 09:40:49 -07:00
Jason Ekstrand
daf68a9465 vk/0.170.2: Switch to the new dynamic state model 2015-10-07 09:40:49 -07:00
Jason Ekstrand
55fcca306b anv: Add a dynamic state data structure and basic helpers 2015-10-07 09:36:27 -07:00
Jason Ekstrand
941a105954 anv/private: Add a typed_memcpy macro
This is amazingly helpful when copying arrays of things around.
2015-10-07 09:36:27 -07:00
Chad Versace
b1c024a932 vk/meta: Fix -Wstrict-prototypes
In C, functions with no arguments require a void argument.
build_nir_clear_fragment_shader() lacked that.

Fixes:
  anv_meta.c:70:1: warning: function declaration isn't a prototype
  [-Wstrict-prototypes]
2015-10-07 09:10:25 -07:00
Chad Versace
6dea1a9ba1 vk/0.170.2: Merge VkAttachmentView into VkImageView 2015-10-07 09:10:25 -07:00
Chad Versace
03dd72279f vk/image: Fix retrieval of anv_surface for depthstencil aspect
If anv_image_get_surface_for_aspect_mask() is given a combined
depthstencil aspect mask, and the image has a stencil surface but no
depth surface, then return the stencil surface.

Hacks on hacks.
2015-10-07 09:10:25 -07:00
Chad Versace
85ff3cfde3 vk: Drop -Wextra
Eliminates lots of warnings due to anv_meta.c's inclusion of nir.h.

I like the extra warnings, and they should probably get fixed. However,
git-grep reveals that no other Mesa directory uses -Wextra. Building
Vulkan produces a lot of compiler warnings from core Mesa headers that
no other Mesa developer sees, and hence no other Mesa developer will
fix.
2015-10-07 07:28:46 -07:00
Chad Versace
24de3d49ea vk: Embed two surface states in anv_image_view
This prepares for merging VkAttachmentView into VkImageView.  The two
surface states are:

   anv_image_view::color_rt_surface_state:
       RENDER_SURFACE_STATE when using image as a color render target.

   anv_image_view::nonrt_surface_state;
       RENDER_SURFACE_STATE when using image as a non render target.

No Crucible regressions.
2015-10-06 21:22:18 -07:00
Chad Versace
37bf120930 vk/pipeline: Emit MSAA finishme only if samples > 1
If samples == 1, then there's nothing for Mesa to do, and the finishme
message is only noise.
2015-10-06 21:22:18 -07:00
Chad Versace
3fc2b1f325 vk: Remove stale finishme for stencil image views
They don't work completely. But they work well enough to satisfy
Crucible.
2015-10-06 21:22:18 -07:00
Chad Versace
44143a1f46 vk: Add anv_image::usage
It's a copy of VkImageCreateInfo::usage. Will be used for the
VkAttachmentView/VkImageView merge.
2015-10-06 21:22:18 -07:00
Chad Versace
cf603714cb vk/meta: Fix usage flags for image-wrapped-buffers
In make_image_for_buffer(), use VK_IMAGE_USAGE_SAMPLED_BIT when
transferring from the buffer and use VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT
when transferring to the buffer.
2015-10-06 21:22:18 -07:00
Chad Versace
d00718104f vk/image: Remove stale anv_asserts for depthstencil attachments
We don't fully handle mipmapped, array depthstencil attachments. But we
handle the well enough for Crucible's miptree tests.
2015-10-06 21:22:18 -07:00
Kristian Høgsberg Kristensen
1d7ef82f4b i965: Delete brw_cs.cpp which was deleted in master 2015-10-06 15:20:19 -07:00
Jason Ekstrand
c272bb58f5 nir/spirv: Better texture handling 2015-10-06 15:10:45 -07:00
Jason Ekstrand
ccea9cc332 nir/spirv: Update to SPIR-V Rev. 32 2015-10-06 14:52:35 -07:00
Jason Ekstrand
89eebd889c vk/0.170.2: Fairly trivial enum shuffling 2015-10-06 14:08:08 -07:00
Jason Ekstrand
1e4263b7d2 vk/0.170.2: s/baseArraySlice/baseArrayLayer/ 2015-10-06 14:08:08 -07:00
Chad Versace
d4446a7e58 vk: Merge anv_attachment_view into anv_image_view
This prepares for merging VkAttachmentView into VkImageView.
2015-10-06 12:13:03 -07:00
Chad Versace
6b5ce5daf5 vk: Update comments for anv_image_view
- Document the extent member. It's the extent of the view's base level.
- s/VkAttachmentView/VkImageView/
2015-10-06 12:12:52 -07:00
Jason Ekstrand
19018c9f13 vk/0.170.2: Add a stage field to ShaderCreateInfo 2015-10-06 10:20:10 -07:00
Jason Ekstrand
cc389b1482 vk/0.170.2: Rename cs to stage in ComputePipelineCreateInfo 2015-10-06 10:11:50 -07:00
Jason Ekstrand
588d40e97a vk/0.170.2: Use ImageSubresourceCopy in ImageResolve 2015-10-06 10:09:47 -07:00
Jason Ekstrand
bd4cde708a vk/0.170.2: Rename fields in VkClearColorValue 2015-10-06 10:07:47 -07:00
Jason Ekstrand
81c7fa8772 vk/0.170.2: Rework blits to use ImageSubresourceCopy 2015-10-06 10:04:04 -07:00
Jason Ekstrand
ba2254aa79 vulkan.h: Move stuff around
This has no functional change but substantially decreases the diff with the
0.170.2 header.
2015-10-06 09:50:04 -07:00
Jason Ekstrand
d1908d2c33 vk/0.170.2: Rework parameters to CmdClearDepthStencil functions 2015-10-06 09:40:39 -07:00
Jason Ekstrand
02a9be31d6 vk/0.170.2: Add the flags parameter to GetPhysicalDeviceImageFormatProperties 2015-10-06 09:37:21 -07:00
Jason Ekstrand
a145acd812 vk/0.170.2: Remove the pCount parameter from AllocDescriptorSets 2015-10-06 09:32:01 -07:00
Jason Ekstrand
8ba684cbad vk/0.170.2: Rename extension and layer query functions 2015-10-06 09:25:03 -07:00
Jason Ekstrand
a6eba403e2 vk/0.170.2: Update to the new queue family properties query 2015-10-05 21:17:12 -07:00
Jason Ekstrand
65964cd49b vk/0.170.2: Re-arrange parameters of vkCmdDraw[Indexed] 2015-10-05 21:10:20 -07:00
Jason Ekstrand
05a26a60c8 vk/0.170.2: Make destructors return void 2015-10-05 20:50:51 -07:00
Jason Ekstrand
460676122f vk/0.170.2: Rename VkClearValue.ds to depthStencil 2015-10-05 20:35:08 -07:00
Jason Ekstrand
8e1ef639b6 vk/0.170.2: Add the subpass field to VkCmdBufferBeginInfo 2015-10-05 20:30:53 -07:00
Jason Ekstrand
757166592e vk/0.170.2: Rename pointer parameters of VkSubpassDescription 2015-10-05 20:26:21 -07:00
Jason Ekstrand
57f500324b vk/0.170.2: Add unnormalizedCoordinates to VkSamplerCreateInfo 2015-10-05 20:17:24 -07:00
Jason Ekstrand
f7c3519aaf vk/0.170.2: Rename VkTexAddress to VkTexAddressMode 2015-10-05 20:15:06 -07:00
Jason Ekstrand
39a19e88a3 vulkan.h: Various cosmetic changes
These don't affect the driver in any way.
2015-10-05 20:06:30 -07:00
Chad Versace
9357062348 vk: Merge anv_*_attachment_view into anv_attachment_view
Remove anv_color_attachment_view and anv_depth_stencil_view, merging
them into anv_attachment_view. This prepares for merging
VkAttachmentView into VkImageView.
2015-10-05 17:46:04 -07:00
Chad Versace
ae30535602 vk: Drop anv_attachment_view::extent
It's duplicated by anv_attachment_view::image_view::extent.
2015-10-05 17:46:04 -07:00
Chad Versace
f0f4dfa9cc vk: Drop anv_surface_view
Push the members of struct anv_surface_view into anv_image_view and
anv_buffer_view, then remove struct anv_surface_view. Observe that
anv_surface_view::range is not needed for anv_image_view, and so was
dropped there.

This prepares for the merge of VkAttachmentView into VkImageView. Remove
the common parent of anv_buffer_view and anv_image_view (that is,
anv_surface_view) will make the merge easier.
2015-10-05 17:46:04 -07:00
Chad Versace
74193a880f vk: Use consistent names for anv_*_view variables
Rename all anv_*_view variables to follow this convention:
    - sview -> anv_surface_view
    - bview -> anv_buffer_view
    - iview -> anv_image_view
    - aview -> anv_attachment_view
    - cview -> anv_color_attachment_view
    - ds_view -> anv_depth_stencil_attachment_view

This clarifies existing code. And it will reduce noise in the upcoming
commits that merge VkAttachmentView into VkImageView.
2015-10-05 17:46:04 -07:00
Chad Versace
ffd051830d vk: Unionize anv_desciptor
For a given struct anv_descriptor, all members are NULL (in which case
the descriptor is empty) or exactly one member is non-NULL.
To make struct anv_descriptor better reflect its set of valid states,
convert the struct into a tagged union.
2015-10-05 17:46:04 -07:00
Chad Versace
63439953d7 vk: Drop dependency on no longer extant header
anv_meta no longer uses GLSL shaders, and the build system no longer
converts them to SPIR-V. So remove anv_meta_spirv_autogen.h from
Makefile.am.

(cherry picked from commit 2fc8122f66)
2015-10-05 17:06:19 -07:00
Chad Versace
2fc8122f66 vk: Drop dependency on no longer extant header
anv_meta no longer uses GLSL shaders, and the build system no longer
converts them to SPIR-V. So remove anv_meta_spirv_autogen.h from
Makefile.am.
2015-10-05 17:04:18 -07:00
Chad Versace
8bf021cf3d vk: Return anv_image_view_info by value
The struct is only 2 bytes. Returning it on the stack is better than
returning a reference into the ELF .data segment.
2015-10-05 13:22:44 -07:00
Chad Versace
4ffb4549e0 vk/image: Document a Vulkan spec requirement for depthstencil
The Vulkan spec (git a511ba2) requires support for some combined depth
stencil formats.
2015-10-05 13:18:44 -07:00
Chad Versace
3530224063 vk: Annotate anv_cmd_state::gen7::index_type
It's the value of 3DSTATE_INDEX_BUFFER.IndexFormat.
2015-10-05 08:58:35 -07:00
Chad Versace
9c93aa9141 vk: Better types for VkShaderStage, VkShaderStageFlags vars
In most places, the variable type was the uninformative uint32_t.
2015-10-05 08:55:09 -07:00
Chad Versace
6317c3144d vk/0.170.2: Drop VK_BUFFER_USAGE_GENERAL 2015-10-05 08:12:59 -07:00
Chad Versace
4744f60e79 vk/0.170.2: Drop enum VkBufferViewType 2015-10-05 08:12:58 -07:00
Chad Versace
7a089bd1a6 vk/0.170.2: Update VkImageSubresourceRange
Replace 'aspect' with 'aspectMask'.
2015-10-05 08:10:57 -07:00
Chad Versace
568654d606 vk/0.170.2: Drop VK_IMAGE_USAGE_GENERAL 2015-10-05 08:09:33 -07:00
Chad Versace
6a40af1b08 vk/0.170.2: Update VkPipelineMultisampleStateCreateInfo 2015-10-04 10:00:25 -07:00
Chad Versace
dd04be491d vk/0.170.2: Update Vk VkPipelineDepthStencilStateCreateInfo
Rename member depthBoundsEnable -> depthBoundsTestEnable.
2015-10-04 09:41:46 -07:00
Chad Versace
8cb2e27c62 vk/0.170.2: Update VkRenderPassBeginInfo
Rename members:
    attachmentCount -> clearValueCount
    pAttachmentClearValues -> pClearValues
2015-10-04 09:26:25 -07:00
Chad Versace
3694518be5 vk/0.170.2: Drop VkBufferViewCreateInfo::viewType 2015-10-04 09:14:57 -07:00
Chad Versace
216d9f248d vk: Copy current header to vulkan-0.138.2.h
While upgrading Mesa to the new 0.170.2 API, it's convenient to have all
three headers available in the tree:
    - vulkan-0.138.2.h, the old one
    - vulkan-0.170.2.h, the new one
    - vulkan.h, the one in transition
2015-10-04 09:09:35 -07:00
Chad Versace
7f18ed4b9f vk: Import header 0.170.2 header LunarG SDK
From the LunarG SDK at tag sdk-0.9.1, import vulkan.h as
vulkan-0.170.2.h. This header is the first provisional header with the
addition of minor fixes.
2015-10-04 09:09:31 -07:00
Jason Ekstrand
09ba0a7c05 Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-10-03 11:32:29 -07:00
Jason Ekstrand
ef56cf7738 Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-10-02 16:52:47 -07:00
Jason Ekstrand
10f97718c3 anv/allocator: Add a sanity assertion in state stream finish.
We assert that the block offset we got while walking the list of blocks is
actually a multiple of the block size.  If something goes wrong and the GPU
decides to stomp on the surface state buffer we can end up getting
corruptions in our list of blocks.  This assertion makes such corruptions a
crash with a meaningful message rather than an infinite loop.
2015-10-02 16:24:42 -07:00
Jason Ekstrand
002e7b0cc3 anv: Remove the GLSL -> SPIR-V scraper/converter
This was very useful to get us up-and-going.  However, now that we can use
NIR directly for meta shaders, we don't need this anymore and we might as
well drop the glslc dependency.
2015-10-02 16:20:04 -07:00
Jason Ekstrand
f5ffb0e0cb anv/meta: Use NIR directly for blit shaders 2015-10-02 16:18:44 -07:00
Jason Ekstrand
7851a4392a anv/meta: Use NIR directly for clear shaders 2015-10-02 16:18:32 -07:00
Jason Ekstrand
add99c4beb anv: Add a back-door for passing NIR shaders directly into the pipeline
This will allow us to use NIR directly for meta operations rather than
having to go through SPIR-V.
2015-10-02 16:16:57 -07:00
Jason Ekstrand
b68805f83c anv: Add some NIR builder helpers
These should all eventually be up-streamed.  However, since they currently
have no upstream users, they would just bitrot there.  We'll keep them
local for the time being.
2015-10-02 16:15:53 -07:00
Jason Ekstrand
c1553653a2 vk/wsi/x11: Send OUT_OF_DATE if the X drawable goes away 2015-10-02 13:44:53 -07:00
Kristian Høgsberg Kristensen
005c8e0106 Merge branch 'master' of ../mesa into vulkan 2015-10-01 14:24:29 -07:00
Jason Ekstrand
337caee910 anv/wsi_x11: Properly report BadDrawable errors to the client 2015-09-28 20:18:41 -07:00
Jason Ekstrand
f06bc45b0c anv/batch_chain: Use the surface state pool for binding tables 2015-09-28 16:01:14 -07:00
Jason Ekstrand
d93f6385a7 anv/batch_chain: Add helpers for fixing up block_pool relocations 2015-09-28 16:01:14 -07:00
Jason Ekstrand
8c00f9ab56 anv/gen8: Do a render cache flush prior to changing state base address 2015-09-28 16:01:14 -07:00
Jason Ekstrand
0e94446b25 anv/device: Use a 4K block size for surface state blocks
We want to start using the surface state block pool for binding tables and
binding tables.  In order to do this, we need to be able to set surface
state base address to the address of a block and surface state base address
has a 4K alignment requriement.
2015-09-28 16:01:01 -07:00
Jason Ekstrand
737e89bc8d anv/meta: Use the dynamic state stream for temporary buffers 2015-09-28 16:01:01 -07:00
Jason Ekstrand
219a1929f7 anv/util: Add helpers for getting the first and last elements of a vector 2015-09-28 16:01:01 -07:00
Jason Ekstrand
95487668df anv/batch_chain: Add a _alloc_binding_table function 2015-09-28 16:01:01 -07:00
Jason Ekstrand
d517de6126 anv: Make anv_state.offset an int32_t
Binding tables will have a negative offset and we need a way to express
that.  Besides, the chances of a state offset being larger than 2 GB is so
remote it's not worth thinking about.
2015-09-28 16:01:01 -07:00
Jason Ekstrand
9ac3dde3a0 anv/wsi_wayland: Fix FIFO mode
Previously, there were a number of things we were doing wrong:

 1) We weren't flushing the wl_display so dead-looping clients weren't
    guaranteed to work.
 2) We were sending the frame event after calling wl_surface.commit() so it
    wasn't getting assigned to the correct frame
 3) We weren't actually setting fifo_ready to false.

Unfortunately, we never noticed because (3) was hiding the other two.  This
commit fixes all three and clients that use FIFO mode are now properly
refresh-rate limited.
2015-09-28 15:58:34 -07:00
Chad Versace
ddcedb979a vk: Implement vkGetPhysicalDeviceImageFormatProperties()
The implementation is incomplete because we lie about
VkImageFormatProperties::maxResourceSize, hardcoding it to UINT32_MAX
for all supported cases.
2015-09-28 11:53:39 -07:00
Chad Versace
9f3122db0e vk: Refactor anv_GetPhysicalDeviceFormatProperties()
Move the bulk of the function body to a new function
anv_physical_device_get_format_properties(). This allows us to reuse the
function when implementing anv_GetPhysicalDeviceImageFormatProperties()
without calling into the public entry point.
2015-09-28 11:53:39 -07:00
Chad Versace
c15ce5c834 vk: Advertise that depthstencil formats support sampling
Let vkGetPhysicalDeviceFormatProperties() set
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT for tiled depthstencil images.
2015-09-28 11:53:39 -07:00
Jason Ekstrand
4e48f94469 anv/device: Wrap a couple valgrind calls in the VG macro
This fixes the build for systems that don't have valgrind devel packages
installed.
2015-09-28 11:18:52 -07:00
Chad Versace
97636345da vk: Fix vkGetPhysicalDeviceSparseImageFormatProperties()
The driver does not yet support sparse images, so return zero properties for
all formats.
2015-09-28 10:17:48 -07:00
Kristian Høgsberg Kristensen
164f08c255 vk: Add anv_icd.json to .gitignore 2015-09-25 15:16:56 -07:00
Kristian Høgsberg Kristensen
850cfcad3e vk: Also define vk_errorf in non-debug builds 2015-09-25 15:15:37 -07:00
Kristian Høgsberg Kristensen
cf24211d55 vk: Roll back GLSL parser support for vulkan
In the interest of reducing our delta to mesa master, let's undo these
changes now that we only support SPIR-V.
2015-09-25 10:42:07 -07:00
Jason Ekstrand
e9dff5bb99 vk: Add an ICD declaration file 2015-09-24 14:45:58 -07:00
Jason Ekstrand
39cd3783a4 anv: Add support for the ICD loader 2015-09-24 14:45:58 -07:00
Jason Ekstrand
a95f51c1d7 anv: Add a global dispatch table for use in meta operations 2015-09-24 14:45:58 -07:00
Jason Ekstrand
00d18a661f anv/entrypoints: Expose the anv_resolve_entrypoint function 2015-09-24 14:45:58 -07:00
Jason Ekstrand
f5e72695e0 anv/entrypoints: Rename anv_layer to anv_dispatch_table 2015-09-24 14:45:58 -07:00
Jason Ekstrand
913a9b76f7 anv/batch_chain: Remove the current_surface_bo helper
It's no longer used outside anv_batch_chain so we certainly don't need to
be exporting.  Inside anv_batch_chain, it's only used twice and it can be
replaced by a single line so there's really no point.
2015-09-24 08:46:41 -07:00
Jason Ekstrand
bc17f9c9d7 anv/cmd_buffer: Add a helper for getting the surface state base address 2015-09-24 08:42:38 -07:00
Jason Ekstrand
e1a7c721d3 anv/allocator: Don't ever call mremap
This has always been a bit sketchy and neither Kristian nor I have ever
really liked it.
2015-09-24 08:42:14 -07:00
Jason Ekstrand
99e62f5ce8 anv/allocator: Delete the unused center_fd_offset from anv_block_pool 2015-09-24 08:41:56 -07:00
Jason Ekstrand
429665823d anv/allocator: Do a better job of centering bi-directional block pools 2015-09-24 08:41:47 -07:00
Jason Ekstrand
76be58efce anv/batch_chain: Clean up the reloc list swapping code 2015-09-24 08:41:38 -07:00
Jason Ekstrand
041f5ea089 anv/meta: Add location specifiers to meta shaders 2015-09-21 16:21:56 -07:00
Jason Ekstrand
f406b708a5 Merge branch 'nir-spirv' into vulkan 2015-09-17 20:03:40 -07:00
Jason Ekstrand
616db92b01 nir/spirv: Add better location handling
Previously, our location handling was focussed on either no location
(usually implicit 0) or a builting.  Unfortunately, if you gave it a
location, it would blow it away and just not care.  This worked fine with
crucible and our meta shaders but didn't work with the CTS.  The new code
uses the "data.explicit_location" field to denote that it has a "final"
location (usually from a builtin) and, otherwise, the location is
considered to be relative to the base for that shader stage.
2015-09-17 20:02:46 -07:00
Jason Ekstrand
a788e7c659 anv/device: Move mutex initialization to befor block pools 2015-09-17 18:23:21 -07:00
Jason Ekstrand
595e6cacf1 meta: Initial support for packing parameters
Probably incomplete but it should do for now
2015-09-17 18:21:05 -07:00
Jason Ekstrand
d616493953 anv/meta: Pass the depth through the clear vertex shader
It shouldn't matter since we shut off the VS but it's at least clearer.
2015-09-17 18:09:21 -07:00
Jason Ekstrand
3b8aa26b8e anv/formats: Properly report depth-stencil formats 2015-09-17 17:44:20 -07:00
Jason Ekstrand
b5f6889648 vk/device: Don't allow device or instance creation with invalid extensions 2015-09-17 17:44:20 -07:00
Jason Ekstrand
dcf424c98c anv/tests: Add some asserts for data integrity in block_pool_no_free 2015-09-17 17:44:20 -07:00
Jason Ekstrand
5f57ff7e18 anv/allocator: Make the block pool double-ended
This allows us to allocate from either side of the block pool in a
consistent way.  If you use the previous block_pool_alloc function, you
will get offsets from the start of the pool as normal.  If you use the new
block_pool_alloc_back function, you will get a negative index that
corresponds to something in the "back" of the pool.
2015-09-17 17:44:20 -07:00
Jason Ekstrand
15624fcf55 anv/tests: Refactor the block_pool_no_free test
This simply breaks the monotonicity check out into its own function
2015-09-17 17:44:20 -07:00
Jason Ekstrand
55daed947d vk/allocator: Split block_pool_alloc into two functions 2015-09-17 17:44:20 -07:00
Jason Ekstrand
c55fa89251 anv/allocator: Use a signed 32-bit offset for the free list
This has the unfortunate side-effect of making it so that we can't have a
block pool bigger than 1GB.  However, that's unlikely to happen and, for
the sake of bi-directional block pools, we need to negative offsets.
2015-09-17 17:44:20 -07:00
Jason Ekstrand
8c6bc1e85d anv/allocator: Create 2GB memfd up-front for the block pool 2015-09-17 17:44:20 -07:00
Jason Ekstrand
74bf7aa07c anv/allocator: Take the device mutex when growing a block pool
We don't have any locking issues yet because we use the pool size itself as
a mutex in block_pool_alloc to guarantee that only one thread is resizing
at a time.  However, we are about to add support for growing the block pool
at both ends.  This introduces two potential races:

 1) You could have two block_pool_alloc() calls that both try to grow the
    block pool, one from each end.

 2) The relocation handling code will now have to think about not only the
    bo that we use for the block pool but also the offset from the start of
    that bo to the center of the block pool.  It's possible that the block
    pool growing code could race with the relocation handling code and get
    a bo and offset out of sync.

Grabbing the device mutex solves both of these problems.  Thanks to (2), we
can't really do anything more granular.
2015-09-17 17:44:20 -07:00
Jason Ekstrand
222ddac810 anv: Document the index and offset parameters of anv_bo 2015-09-17 17:44:20 -07:00
Chad Versace
85520aa070 vk/image: Remove stale FINISHME for non-2D image views
gen8_image_view_init() now supports 1D, 2D, and 3D image views.
2015-09-14 15:16:57 -07:00
Chad Versace
622a317e4c vk/image: Teach vkCreateImage about layout of 1D surfaces
Calling vkCreateImage() with VK_IMAGE_TYPE_1D now succeeds and computes
the surface layout correctly.
2015-09-14 15:15:12 -07:00
Chad Versace
6221593ff8 vk/meta: Partially implement vkCmdCopy*, vkCmdBlit* for 3D images
Partially implement the below functions for 3D images:
    vkCmdCopyBufferToImage
    vkCmdCopyImageToBuffer
    vkCmdCopyImage
    vkCmdBlitImage

Not all features work, and there is much for performance improvement.
Beware that vkCmdCopyImage and vkCmdBlitImage are untested.  Crucible
proves that vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer works,
though.

Supported:
    - copy regions with z offset

Unsupported:
    - copy regions with extent.depth > 1

Crucible test results on master@d452d2b are:
    pass: func.miptree.r8g8b8a8-unorm.*.view-3d.*
    pass: func.miptree.d32-sfloat.*.view-3d.*
    fail: func.miptree.s8-uint.*.view-3d.*
2015-09-14 14:27:34 -07:00
Chad Versace
0ecafe0285 vk/meta: Rename meta_emit_blit() params
Rename src -> src_view and dest -> dest_view. This reduces noise in the next
patch's diff, which adds new params to the function.
2015-09-14 12:29:51 -07:00
Chad Versace
b659a066e9 vk/gen8: Set RENDER_SURFACE_STATE::RenderTargetViewExtent 2015-09-14 12:29:49 -07:00
Chad Versace
ffa61e1572 vk/gen8: Refactor setting of SURFACE_STATE::Depth
The field's meaning depends on SURFACE_STATE::SurfaceType.
Make that correlation explicit by switching on VkImageType.
For good measure, add some PRM quotes too.
2015-09-14 12:27:05 -07:00
Chad Versace
eed74e3a02 vk: Teach vkCreateImage about layout of 3D surfaces
Calling vkCreateImage() with VK_IMAGE_TYPE_3D now succeeds and computes
the surface layout correctly. However, 3D images do not yet work for
many other Vulkan entrypoints.
2015-09-14 11:04:08 -07:00
Chad Versace
e01d5a0471 vk: Refactor anv_image_make_surface()
Move the code that calculates the layout of 2D surfaces into a switch case.
2015-09-14 11:00:18 -07:00
Jason Ekstrand
8c8ad6dddf vk: Use push constants for dynamic buffers 2015-09-11 15:56:19 -07:00
Jason Ekstrand
2b4a2eb592 vk/compiler: Rework create_params_array 2015-09-11 15:55:54 -07:00
Jason Ekstrand
c3086c54a8 vk/compiler: Add a NIR pass for pushing dynamic buffer offset
This commit just adds the NIR pass but does none of the uniform setup
2015-09-11 15:53:56 -07:00
Jason Ekstrand
7487371056 vk/pipeline_layout: Add dynamic_offset_start and has_dynamic_offsets fields 2015-09-11 15:52:43 -07:00
Jason Ekstrand
de5220c7ce vk/pipeline_layout: Move surface/sampler start from SoA to AoS
This makes more sense to me and it's more consistent with
anv_descriptor_set_layout.
2015-09-11 10:43:55 -07:00
Jason Ekstrand
b908c67816 vk: Rework the push constants data structure
Previously, we simply had a big blob of stuff for "driver constants".  Now,
we have a very specific data structure that contains the driver constants
that we care about.
2015-09-11 10:25:23 -07:00
Jason Ekstrand
fd21f0681a Add the wayland protocol files to .gitignire 2015-09-11 09:29:40 -07:00
Jason Ekstrand
8040dc4ca5 vk/error: Handle ERROR_OUT_OF_DATE_WSI 2015-09-08 12:13:07 -07:00
Jason Ekstrand
060720f0c9 vk/wsi/x11: Actually block on X so we don't re-use busy buffers 2015-09-08 11:51:47 -07:00
Jason Ekstrand
1bee19e023 vk: Add the WSI header files 2015-09-08 10:33:46 -07:00
Jason Ekstrand
2f3de6260d Merge branch 'nir-spirv' into vulkan 2015-09-05 14:12:59 -07:00
Jason Ekstrand
4d73ca3c58 nir/spirv.h: Remove some cruft missed while merging
There were merge conflicts in spirv.h that got missed because they were in
a comment and so it still compiled.  This gets rid of them and we should be
on-par with upstream spirv->nir.
2015-09-05 14:11:40 -07:00
Jason Ekstrand
612b13aeae nir/spirv: Add support for most of the rest of texturing
Assuming this all works, about the only thing left should be some
corner-cases for tg4
2015-09-05 14:10:05 -07:00
Jason Ekstrand
fe786ff67d Merge branch 'nir-spirv' into vulkan 2015-09-05 13:17:53 -07:00
Jason Ekstrand
35fcd37fcf nir/spirv: Handle decorations after assigning variable locations 2015-09-05 13:17:21 -07:00
Jason Ekstrand
87d02f515b Merge branch 'nir-spirv' into vulkan 2015-09-05 09:48:33 -07:00
Jason Ekstrand
9be43ef99c nir/spirv: Handle the MatrixStride member decoration 2015-09-05 09:47:45 -07:00
Jason Ekstrand
01924a03d4 vk: Actually link in wayland libraries
Turns out this was why I had accidentally broken the universe.  Oops...
2015-09-04 20:02:38 -07:00
Jason Ekstrand
2c4ae00db6 vk: Conditionally compile Wayland support
Pulling in libwayland causes undefined symbols in applications that are
linked against vulkan alone.  Ideally, we would like to dlopen a platform
support library or something like that.  For now, this works and should get
crucible running again.
2015-09-04 19:18:52 -07:00
Jason Ekstrand
b3c037f329 vk: Fix size return value handling in a couple plces 2015-09-04 19:05:51 -07:00
Jason Ekstrand
9a95d08ed6 Merge branch 'nir-spirv' into vulkan 2015-09-04 18:54:15 -07:00
Jason Ekstrand
6d5dafd779 nir/spirv/glsl450: Use the correct write mask 2015-09-04 18:50:14 -07:00
Jason Ekstrand
7174d155e9 nir: Add a lower_fdiv option and use it in i965 2015-09-04 18:50:14 -07:00
Jason Ekstrand
f32d16a9f0 nir/spirv: Use the actual GLSL 450 extension header from Khronos 2015-09-04 18:50:14 -07:00
Jason Ekstrand
9e2c13350e nir/spirv: Add support for SpvDecorationColMajor 2015-09-04 18:50:14 -07:00
Jason Ekstrand
f3bdb93a8e nir/types: Allow single-column matrices
This can sometimes be a convenient way to build vectors.
2015-09-04 18:50:14 -07:00
Jason Ekstrand
48e87c0163 vk/wsi: Add Wayland WSI support 2015-09-04 17:55:42 -07:00
Jason Ekstrand
348cb29a20 vk/wsi: Move to a clallback system for the entire WSI implementation
We do this for two reasons: First, because it allows us to simplify WSI and
compiling in/out support for a particular platform is as simple as calling
or not calling the platform-specific init function.  Second, the
implementation gives us a place for a given chunk of the WSI to stash
stuff in the instance.
2015-09-04 17:55:42 -07:00
Jason Ekstrand
06d8fd5881 vk/instance: Expose anv_instance_alloc/free 2015-09-04 17:55:42 -07:00
Jason Ekstrand
c0b97577e8 vk/WSI: Use a callback mechanism instead of explicit switching 2015-09-04 17:55:42 -07:00
Jason Ekstrand
ca3cfbf6f1 vk: Add an initial implementation of the actual Khronos WSI extension
Unfortunately, this is a very large commit and removes the old LunarG WSI
extension.  This is because there are a couple of entrypoints that have the
same name between the two extensions so implementing them both is
impractiacl.

Support is still incomplete, but this is enough to get vkcube up and going
again.
2015-09-04 17:55:42 -07:00
Jason Ekstrand
3d9fbb6575 vk: Add initial support for VK_WSI_swapchain 2015-09-04 17:55:42 -07:00
Jason Ekstrand
beb466ff5b vk: Move anv_x11.c to anv_wsi_x11.c 2015-09-04 17:55:42 -07:00
Jason Ekstrand
9a7600c9b5 vk/device: Use an array for device extensions 2015-09-04 17:55:42 -07:00
Kristian Høgsberg Kristensen
8af3624651 vk: Further reduce diff to master
Now that we don't compile GLSL, we can roll back a few more hacks and
unexport some things from the backend compiler.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-04 16:17:01 -07:00
Kristian Høgsberg Kristensen
7c1d20dc48 vk: Drop GLSL code from anv_compiler.cpp
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 14:02:11 -07:00
Kristian Høgsberg Kristensen
316c8ac53b vk: Assert that the SPIR-V module has the magic number
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 12:27:28 -07:00
Kristian Høgsberg Kristensen
6e35a1f166 vk: Remove various hacks/scaffolding code
Since we switched away from calling brwCreateContext() there's a bit of
hacky support we can now delete.  This reduces our diff to upstream master.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 12:17:13 -07:00
Kristian Høgsberg Kristensen
1d787781ff vk: Fall back to previous gens in entry point resolver
We used to always just do a one-level fallback from genX_* to anv_*
entry points. That worked for gen7 and gen8 where all entry points were
either different or could be made anv_* entry points (eg
anv_CreateDynamicViewportState). We're about to add gen9 and now need to
be able to fall back to gen8 entry points for most things.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen
c4dbff58d8 vk: Drop redundant gen7_CreateGraphicsPipelines
This is handled by anv_CreateGraphicsPipelines().

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen
b5e90f3f48 vk: Use vk* entrypoints in meta, not driver_layer pointers
We'll change the dispatch mechanism again in a later commit. Stop using
the driver_layer function pointers and just use the public entry points.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen
82396a5514 vk: Drop check for I915_PARAM_HAS_EXEC_CONSTANTS
We don't use this kernel feature.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen
c4b30e7885 vk: Add new vk_errorf that takes a format string
This allows us to annotate error cases in debug builds.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen
2e346c882d vk: Make vk_error a little more helpful
Print out file and line number and translate the error code to the
symbolic name.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-09-03 11:53:08 -07:00
Chad Versace
0cb26523d3 vk/image: Add PRM reference for QPitch equation
Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
2015-09-03 11:04:38 -07:00
Chad Versace
28503191f1 vk/meta: Partially fix vkCmdCopyBufferToImage for S8_UINT
Create R8_UINT VkAttachmentView and VkImageView for the stencil data.

This fixes a crash, but the pixels in the destination image are still
incorrect. They are not properly tiled.

Fixes crashes in Crucible tests func.miptree.s8-uint.aspect-stencil.* as
of crucible-7471449. Test results improve 'lost' -> 'fail'.
2015-09-02 11:08:36 -07:00
Jason Ekstrand
be0a4da6a5 vk/meta: Use SPIR-V for shaders
We are also now using glslc for compiling the Vulkan driver like we do in
curcible.
2015-09-01 15:16:06 -07:00
Jason Ekstrand
362ab2d788 vk/compiler: Handle interpolation qualifiers for SPIR-V shaders 2015-09-01 15:15:04 -07:00
Jason Ekstrand
126ade0023 vk/extensions: count needs to be <= number of extensions 2015-09-01 12:28:50 -07:00
Jason Ekstrand
0c2d476935 vk/compiler: Properly reference/delete programs when using SPIR-V 2015-09-01 12:28:50 -07:00
Jason Ekstrand
16ebe883a4 vk/meta: Add a helper for making an image from a buffer 2015-08-31 21:54:38 -07:00
Jason Ekstrand
86c3476668 nir/spirv: Use VERTEX_ID_ZERO_BASE for VertexId
In Vulkan, VertexId and InstanceId will be zero-based and new intrinsics,
VertexIndex and InstanceIndex, will be added for non-zer-based.  See also,
Khronos bug #14255
2015-08-31 17:16:49 -07:00
Jason Ekstrand
6350c97412 Merge remote-tracking branch 'fdo-personal/nir-spirv' into vulkan
From now on, the majority of SPIR-V improvements should happen on the spirv
branch which will also be public.  It will be frequently merged into the
vulkan driver.
2015-08-31 17:14:47 -07:00
Jason Ekstrand
22fdb2f855 nir/spirv: Update to the latest revision 2015-08-31 17:05:23 -07:00
Jason Ekstrand
ce70cae756 nir/builder: Use nir_after_instr to advance the cursor
This *should* ensure that the cursor gets properly advanced in all cases.
We had a problem before where, if the cursor was created using
nir_after_cf_node on a non-block cf_node, that would call nir_before_block
on the block following the cf node.  Instructions would then get inserted
in backwards order at the top of the block which is not at all what you
would expect from nir_after_cf_node.  By just resetting to after_instr, we
avoid all these problems.
2015-08-31 17:05:23 -07:00
Jason Ekstrand
24b0c53231 nir/intrinsics: Move to a two-dimensional binding model for UBO's 2015-08-31 17:05:23 -07:00
Jason Ekstrand
f4608bc530 nir/nir_variable: Add a descriptor set field
We need this for SPIR-V
2015-08-31 17:05:23 -07:00
Jason Ekstrand
85cf2385c5 mesa: Move gl_vert_attrib from mtypes.h to shader_enums.h
It is a shader enum after all...
2015-08-31 17:05:23 -07:00
Jason Ekstrand
de4f379a70 nir/cursor: Add a helper for getting the current block 2015-08-31 17:05:23 -07:00
Connor Abbott
024c49e95e nir/builder: add a nir_fdot() convenience function 2015-08-31 17:05:23 -07:00
Jason Ekstrand
f6a0eff1ba nir: Add a pass to lower outputs to temporary variables
This pass can be used as a helper for NIR producers so they don't have to
worry about creating the temporaries themselves.
2015-08-31 17:05:23 -07:00
Jason Ekstrand
4956bbaa33 nir/cursor: Add a constructor for the end of a block but before the jump 2015-08-31 16:58:20 -07:00
Connor Abbott
c62be38286 nir/types: add more nir_type_is_xxx() wrappers 2015-08-31 16:58:20 -07:00
Connor Abbott
a1e136711b nir/types: add a helper to transpose a matrix type 2015-08-31 16:58:20 -07:00
Jason Ekstrand
756b00389c nir/spirv: Don't assert that the current block is empty
It's possible that someone will give us SPIR-V code in which someone
needlessly branches to new blocks.  We should handle that ok now.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
fe220ebd37 nir/spirv: Add initial support for samplers 2015-08-31 16:58:20 -07:00
Jason Ekstrand
a992909aae nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops
NIR doesn't have the native opcodes for them anymore
2015-08-31 16:58:20 -07:00
Jason Ekstrand
45963c9c64 nir/types: Add support for sampler types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
2887e68f36 nir/spirv: Make the global constants in spirv.h static
I've been promissed in a bug that this will be fixed in a future version of
the header.  However, in the interest of my branch building, I'm adding
these changes in myself for the moment.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
62b094a81c nir/spirv: Handle jump-to-loop in a more general way 2015-08-31 16:58:20 -07:00
Jason Ekstrand
ca51d926fd nir/spirv: Handle boolean uniforms correctly 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b6562bbc30 nir/spirv: Handle control-flow with loops 2015-08-31 16:58:20 -07:00
Jason Ekstrand
4a63761e1d nir/spirv: Set a name on temporary variables 2015-08-31 16:58:20 -07:00
Jason Ekstrand
6fc7911d15 nir/spirv: Use the correct length for copying string literals 2015-08-31 16:58:20 -07:00
Jason Ekstrand
9da6d808be nir/spirv: Make vtn_ssa_value handle constants as well as ssa values 2015-08-31 16:58:20 -07:00
Jason Ekstrand
1feeee9cf4 nir/spirv: Add initial support for GLSL 4.50 builtins 2015-08-31 16:58:20 -07:00
Jason Ekstrand
577c09fdad nir/spirv: Split the core datastructures into a header file 2015-08-31 16:58:20 -07:00
Jason Ekstrand
66fc7f252f nir/spirv: Use the builder for all instructions
We don't actually use it to create all the instructions but we do use it
for insertion always.  This should make things far more consistent for
implementing extended instructions.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
9e03b6724c nir/spirv: Add support for a bunch of ALU operations 2015-08-31 16:58:20 -07:00
Jason Ekstrand
91b3b46d8b nir/spirv: Add support for indirect array accesses 2015-08-31 16:58:20 -07:00
Jason Ekstrand
9197e3b9fc nir/spirv: Explicitly type constants and SSA values 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b7904b8281 nir/spirv: Handle OpBranchConditional
We do control-flow handling as a two-step process.  The first step is to
walk the instructions list and record various information about blocks and
functions.  This is where the acutal nir_function_overload objects get
created.  We also record the start/stop instruction for each block.  Then
a second pass walks over each of the functions and over the blocks in each
function in a way that's NIR-friendly and actually parses the instructions.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
d216dcee94 nir/spirv: Add a helper for getting a value as an SSA value 2015-08-31 16:58:20 -07:00
Jason Ekstrand
f36fabb736 nir/spirv: Split instruction handling into preamble and body sections 2015-08-31 16:58:20 -07:00
Jason Ekstrand
7bf4b53f1c nir/spirv: Implement load/store instructiosn 2015-08-31 16:58:20 -07:00
Jason Ekstrand
7d64741a5e nir: Add a helper for getting the tail of a deref chain 2015-08-31 16:58:20 -07:00
Jason Ekstrand
112c607216 nir/spirv: Actaully add variables to the funciton or shader 2015-08-31 16:58:20 -07:00
Jason Ekstrand
4fa1366392 nir/spirv: Add a vtn_untyped_value helper 2015-08-31 16:58:20 -07:00
Jason Ekstrand
e709a4ebb8 nir/spirv: Use vtn_value in the types code and fix a off-by-one error 2015-08-31 16:58:20 -07:00
Jason Ekstrand
67af6c59f2 nir/types: Add an is_vector_or_scalar helper 2015-08-31 16:58:20 -07:00
Jason Ekstrand
5e6c5e3c8e nir/spirv: Add support for deref chains 2015-08-31 16:58:20 -07:00
Jason Ekstrand
366366c7f7 nir/types: Add a scalar type constructor 2015-08-31 16:58:20 -07:00
Jason Ekstrand
befecb3c55 nir/spirv: Add support for OpLabel 2015-08-31 16:58:20 -07:00
Jason Ekstrand
399e962d25 nir/spirv: Add support for declaring functions 2015-08-31 16:58:20 -07:00
Jason Ekstrand
ac4d459aa2 nir/types: Add accessors for function parameter/return types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
3a266a18ae nir/spirv: Add support for declaring variables
Deref chains and variable load/store operations are still missing.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
2494055631 nir/spirv: Add support for constants 2015-08-31 16:58:20 -07:00
Jason Ekstrand
2a023f30a6 nir/spirv: Add basic support for types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
5bb94c9b12 nir/types: Add more helpers for creating types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
53bff3e445 glsl/types: Expose the function_param and struct_field structs to C
Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find
them.  This commit simpliy moves the #ifdef and adds #ifdef's around
constructors.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
0db3e4dd72 glsl/types: Add support for function types 2015-08-31 16:58:20 -07:00
Jason Ekstrand
1169fcdb05 glsl: Add GLSL_TYPE_FUNCTION to the base types enums 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b79916dacc nir/spirv: Rework the way values are added
Instead of having functions to add values and set various things, we just
have a function that does a few asserts and then returns the value.  The
caller is then responsible for setting the various fields.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
ac60aba351 nir/spirv: Add stub support for extension instructions 2015-08-31 16:58:20 -07:00
Jason Ekstrand
78eabc6153 REVERT: Add a simple helper program for testing SPIR-V -> NIR translation 2015-08-31 16:58:20 -07:00
Jason Ekstrand
2c585a722d glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp 2015-08-31 16:58:20 -07:00
Jason Ekstrand
b20d9f5643 nir: Add the start of a SPIR-V to NIR translator
At the moment, it can handle the very basics of strings and can ignore
debug instructions.  It also has basic support for decorations.
2015-08-31 16:58:20 -07:00
Jason Ekstrand
9d92b4fd0e nir: Import the revision 30 SPIR-V header from Khronos 2015-08-31 16:58:20 -07:00
Jason Ekstrand
0af4bf4d4b Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-31 16:30:07 -07:00
Jason Ekstrand
9f9628e9dd vk/SPIR-V: Pull num_uniform_components out of the NIR shader 2015-08-28 22:31:03 -07:00
Jason Ekstrand
44e6ea74b0 spirv: lower outputs to temporaries 2015-08-28 17:38:41 -07:00
Jason Ekstrand
9cebdd78d8 nir: Add a pass to lower outputs to temporary variables
This pass can be used as a helper for NIR producers so they don't have to
worry about creating the temporaries themselves.
2015-08-28 17:38:41 -07:00
Jason Ekstrand
5e7c7b2a4e spirv: Only do a block load if you're actually loading a uniform 2015-08-28 16:17:45 -07:00
Jason Ekstrand
98abed2441 spirv: Use VERTEX_ID_ZERO_BASE for vertex id 2015-08-28 16:08:29 -07:00
Jason Ekstrand
dbc3eb5bb4 vk/compiler: Pass the correct is_scalar value to brw_process_nir 2015-08-28 12:13:17 -07:00
Jason Ekstrand
ea56d0cb1d glsl/types: Fix up function type hash table insertion 2015-08-28 12:00:25 -07:00
Chad Versace
a2d15ee698 vk/meta: Support stencil in vkCmdCopyImageToBuffer
At Crucible commit 12e64a4, fixes the
func.depthstencil.stencil-triangles.* tests on Broadwell.
2015-08-28 08:41:21 -07:00
Chad Versace
84cfc08c10 vk/pipeline: Fix crash when the pipeline has no attributes
If there are no attributes, don't emit 3DSTATE_VERTEX_ELEMENTS.
That packet does not allow 0 attributes.
2015-08-28 08:07:15 -07:00
Chad Versace
053d32d2a5 vk/image: Linear stencil buffers are illegal
The hardware requires that stencil buffer memory be W-tiled.

From the Sandybridge PRM:
   This buffer is supported only in Tile W memory.
2015-08-28 08:04:59 -07:00
Chad Versace
14e1d58fb7 vk: Fix stride of stencil buffers
Stencil buffers have strange pitch. The PRM says:

  The pitch must be set to 2x the value computed based on width,
  as the stencil buffer is stored with two rows interleaved.
2015-08-28 08:03:46 -07:00
Chad Versace
31af126229 vk: Program stencil ops in 3DSTATE_WM_DEPTH_STENCIL
The driver ignored the Vulkan stencil, always programming the hardware
stencil op to 0 (STENCILOP_KEEP).
2015-08-28 08:00:56 -07:00
Chad Versace
bff2879abe vk/image: Don't abort when creating stencil image views
When creating a stencil image view, log a FINISHME but don't abort.
We're sooooo close to having this working.
2015-08-28 07:59:59 -07:00
Chad Versace
4f852c76dc vk/meta: Save/restore VkDynamicDepthStencilState 2015-08-28 07:59:29 -07:00
Chad Versace
104c4e5ddf vk/meta: Don't skip clearing when clearing only depth attachment
anv_cmd_buffer_clear_attachments() skipped the clear renderpass if no
color attachments needed to be cleared, even if a depth attachment
needed to be cleared.
2015-08-28 07:58:51 -07:00
Chad Versace
aacb7bb9b6 vk: Add func anv_cmd_buffer_get_depth_stencil_view()
This function removes some duplicated code from
genN_cmd_buffer_emit_depth_stencil().
2015-08-28 07:57:34 -07:00
Chad Versace
641c25dd55 vk: Declare some local variables as const
In anv_cmd_buffer_emit_depth_stencil(), declare 'subpass' and 'fb' as
const.
2015-08-28 07:53:24 -07:00
Chad Versace
c6f19b4248 vk: Don't duplicate anv_depth_stencil_view's surface data
In anv_depth_stencil_view, replace the members
    bo
    depth_offset
    depth_stride
    depth_format
    depth_qpitch
    stencil_offset
    stencil_stride
    stencil_qpitch
with the single member
    const struct anv_image *image

The removed members duplicated data in anv_image::depth_surface and
anv_image::stencil_surface.
2015-08-28 07:52:19 -07:00
Chad Versace
35b0262a2d vk/gen7: Add func gen7_cmd_buffer_emit_depth_stencil()
This patch moves all the GEN7_3DSTATE_DEPTH_BUFFER code from
gen7_cmd_buffer_begin_subpass() into a new function
gen7_cmd_buffer_emit_depth_stencil().
2015-08-28 07:46:16 -07:00
Chad Versace
b2ee317e24 vk: Fix format of anv_depth_stencil_view
The format of the view itself and of the view's image may differ.
Moreover, if the view's format has no depth aspect but the image's
format does, we must not program the depth buffer. Ditto for stencil.
2015-08-28 07:44:32 -07:00
Chad Versace
798acb2464 vk/gen7: Fix gen of emitted packet in gen7_batch_lri()
Emit GEN7_MI_LOAD_REGISTER_IMM, not the GEN8 version.
2015-08-28 07:36:35 -07:00
Chad Versace
4461392343 vk: Remove dummy anv_depth_stencil_view 2015-08-28 07:35:39 -07:00
Chad Versace
941b48e992 vk/image: Let anv_image have one anv_surface per aspect
Split anv_image::primary_surface into two: anv_image::color_surface and
depth_surface.
2015-08-28 07:17:54 -07:00
Jason Ekstrand
c313a989b4 spirv: Bump to the public revision 31 2015-08-27 15:24:04 -07:00
Jason Ekstrand
2a8d1ac958 vk: Update to API version 0.138.2 2015-08-27 11:41:04 -07:00
Jason Ekstrand
4e3ee043c0 vk/gen8: Add support for push constants 2015-08-27 10:25:58 -07:00
Jason Ekstrand
375a65d5de vk/private.h: Handle a NULL bo but valid offset in __gen_combine_address 2015-08-27 10:25:58 -07:00
Jason Ekstrand
c8365c55f5 vk/cmd_buffer: Set the CONSTANTS_REL_GENERAL flag on execbuf
This tells the kernel that the push constant buffers are relative to the
dynamic state base address.
2015-08-27 10:25:58 -07:00
Jason Ekstrand
efc2cce01f HACK: Don't call nir_setup_uniforms
We're doing our own uniform setup and we don't need to call into the entire
GL stack to mess with things.
2015-08-27 10:25:58 -07:00
Jason Ekstrand
33cabeab01 vk/compiler: Add a helper for setting up prog_data->param
This new helper sets it up the way we'll want for handling push constants.
2015-08-27 10:25:16 -07:00
Jason Ekstrand
5446bf352e vk: Add initial API support for setting push constants
This doesn't add support for actually uploading them, it just ensures that
we have and update the shadow copy.
2015-08-26 17:59:15 -07:00
Jason Ekstrand
36134e1050 Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-26 11:04:30 -07:00
Jason Ekstrand
74e076bba8 vk/meta: Destroy vertex shaders when setting up clearing 2015-08-25 18:51:26 -07:00
Jason Ekstrand
4bb9915755 vk/gen8: Don't duplicate generic pipeline setup
gen8_graphics_pipeline_create had a bunch of stuff in it that's already set
up by anv_pipeline_init.  The duplication was causing double-initialization
of a state stream and made valgrind very angry.
2015-08-25 18:41:25 -07:00
Jason Ekstrand
9b387b5d3f Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-25 18:41:21 -07:00
Kristian Høgsberg Kristensen
5360edcb30 vk/vec4: Use the right constant for offset into a UBO
We were using constant 0, which is the set.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 16:14:59 -07:00
Kristian Høgsberg Kristensen
647a60226d vk: Use true/false for RenderCacheReadWriteMode
This field in surface state is a bool, WriteOnlyCache is an enum from
GEN8.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 15:58:21 -07:00
Kristian Høgsberg Kristensen
7e5afa75b5 vk: Support descriptor sets and bindings in vec4 ubo loads
Still incomplete, but at least we get the simplest case working.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 15:57:12 -07:00
Kristian Høgsberg Kristensen
00e7799c69 vk/gen7: Enable L3 caching for GEN7 MOCS
Do what GL does here.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 15:55:56 -07:00
Kristian Høgsberg Kristensen
6a1098b2c2 vk/gen7: Use TILEWALK_XMAJOR for linear surfaces
You wouldn't think the TileWalk mode matters when TiledSurface is
false. However, it has to be TILEWALK_XMAJOR. Make it so.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-25 10:54:13 -07:00
Kristian Høgsberg Kristensen
f1455ffac7 vk: Add gen7 support
With all the previous commits in place, we can now drop in support for
multiple platforms. First up is gen7 (Ivybridge).

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
891995e55b vk: Move 3DSTATE_SBE setup to just before 3DSTATE_PS
This is a more logical place for it, between geometry front end state
and pixel backend state.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
9c752b5b38 vk: Move generic pipeline init to anv_pipeline.c
This logic will be shared between multiple gens.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
3800573fb5 vk: Move gen8 specific state into gen8 sub-structs
This commit moves all occurances of gen8 specific state into a gen8
substruct. This clearly identifies the state as gen8 specific and
prepares for adding gen7 state structs. In the process we also rename
the field names to exactly match the command or state packet name,
without the 3DSTATE prefix, eg:

  3DSTATE_VF -> gen8.vf
  3DSTATE_WM_DEPTH_STENCIL -> gen8.wm_depth_stencil

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
615da3795a vk: Always use a placeholder vertex shader in meta
The clear pipeline didn't have a vertex shader and relied on the clear
shader being hardcoded by the compiler to accept one attribute. This
necessitated a few special cases in the 3DSTATE_VS setup. Instead,
always provide a vertex shader, even if we disable VS dispatch.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
ac738ada7a vk: Trim out irrelevant 0-initialized surface state fields
Many of of these fields aren't used for buffer surfaces, so leave them
out for brevity.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
963a1e35e7 vk: Update generated headers
This adds VALIGN_2 and VALIGN_4 defines for IVB and HSW
RENDER_SURFACE_STATE.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen
f5275f7eb3 vk: Move anv_color_attachment_view_init() to gen8_state.c
I'd prefer to move anv_CreateAttachmentView() as well, but it's a little
too much generic code to just duplicate for each gen.  For now, we'll
add a anv_color_attachment_view_init() to dispatch to the gen specific
implementation, which we then call from anv_CreateAttachmentView().

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
988341a73c vk: Move anv_CreateImageView to gen8_state.c
We'll probably want to move some code back into a shared init function,
but this gets one GEN8 surface state initialization out of anv_image.c.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
bc568ee992 vk: Make anv_cmd_buffer_begin_subpass() switch on gen
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
8fe74ec45c vk: Add generic wrapper for filling out buffer surface state
We need this for generating surface state on the fly for dynamic buffer
views.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
a2b822185e vk: Add helper for adding surface state reloc
We're going to have to do this differently for earlier gens, so lets do
it in place only.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
e43fc871be vk: Make batch chain code gen-agnostic
Since the extra dword in MI_BATCH_BUFFER_START added in gen8 is at the
end of the struct, we can emit the gen8 packet on all gens as long as we
set the instruction length correctly.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
25ab43ee8c vk: Move vkCmdPipelineBarrier to gen8_cmd_buffer.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
b4ef2302a9 vk: Use helper function for emitting MI_BATCH_BUFFER_START
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
97360ffc6c vk: Use anv_batch_emit() for chaining back to primary batch
We used to use a manual GEN8_MI_BATCH_BUFFER_START_pack() call, but this
refactors the code to use anv_batch_emit();

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
cff717c649 vk: Downgrade state packet to gen7 where they're common
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
64045eebfb vk: Reorder gen8 specific code into three new files
We'll organize gen specific code in three files per gen: pipeline,
cmd_buffer and state, eg:

    gen8_cmd_buffer.c
    gen8_pipeline.c
    gen8_state.c

where gen8_cmd_buffer.c holds all vkCmd* entry points, gne8_pipeline.c
all gen specific code related to pipeline building and remaining state
code (sampler, surface state, dynamic state) in gen8_state.c.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
9f0bb5977b vk: Move gen8_CmdBindIndexBuffer() to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
a7649b2869 vk: Move gen8_cmd_buffer_emit_state_base_address() to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
130db30771 vk: Move gen8 specific parts of queries to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
98126c021f vk: Move dynamic depth stenctil to anv_gen8.c 2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
0bcf85d79f vk: Move pipeline creation to anv_gen8.c 2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
ef0ab62486 vk: Move anv_CreateSampler to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
fb428727e0 vk: Move anv_CreateBufferView to anv_gen8.c
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
74556b076a vk: Add new anv_gen8.c and move CreateDynamicRasterState there
Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen
ee9788973f vk: Implement multi-gen dispatch mechanism 2015-08-24 13:45:39 -07:00
Chad Versace
c4e7ed9163 vk/meta: Implement depth clears
Fixes Crucible test
func.depthstencil.basic-depth.clear-1.0.op-greater.
2015-08-20 10:25:05 -07:00
Chad Versace
0db3d67a14 vk: Cache each render pass's number of clear ops
During vkCreateRenderPass, count the number of clear ops and store them
in new members of anv_render_pass:

    uint32_t num_color_clear_attachments
    bool has_depth_clear_attachment
    bool has_stencil_clear_attachment

Cacheing these 8 bytes (including padding) reduces the number of times
that anv_cmd_buffer_clear_attachments needs to loop over the pass's
attachments.
2015-08-20 10:25:04 -07:00
Chad Versace
2387219101 vk: Use temp var in vkCreateRenderPass's attachment loop
Store the attachment in a temporary variable and
s/pass->attachments[i]/att/ .
2015-08-20 10:25:04 -07:00
Chad Versace
1c24a191cd vk: Improve memory locality of anv_render_pass
Allocate the pass's array of attachments, anv_render_pass::attachments,
in the same allocation as the pass itself.
2015-08-20 09:31:58 -07:00
Chad Versace
4eaf90effb vk: Unharcode an argument to sizeof
s/struct anv_subpass/pass->subpasses[0])/
2015-08-20 09:31:58 -07:00
Chad Versace
44ef4484c8 vk/meta: Add Z coord to clear vertices
For now, the Z coordinate is always 0.0. Will later be used for depth
clears.
2015-08-20 09:31:12 -07:00
Chad Versace
4aef5c62cd vk/meta: Restore all saved state in anv_cmd_buffer_restore()
anv_cmd_buffer_restore() did not restore the old
VkDynamicColorBlendState.
2015-08-20 09:30:34 -07:00
Chad Versace
9f908fcbde vk/meta: Use consistent names and types in anv_saved_state
In struct anv_saved_state, each member's type was a pointer to an Anvil
struct and each member's name was prefixed with "old" except cb_state,
which was a Vulkan handle whose name lacked "old".
2015-08-20 09:29:41 -07:00
Neil Roberts
49d9e89d00 Add mesa.icd to the .gitignore
Since 4d7e0fa8c7 this file is generated by the configure script.
Reviewed-by: Tapani Palli <tapani.palli@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

(cherry picked from commit 885762e182)
2015-08-19 14:12:31 -07:00
Chad Versace
bd0aab9a58 vk/meta: Fix dest format of vkCmdCopyImage
The source image's format was incorrectly used for both the source view
and destination view. For vkCmdCopyImage to correctly translate formats,
the destination view's format must be that of the destination image's.
2015-08-18 12:44:06 -07:00
Chad Versace
b0875aa911 vk: Assert that swap chain format is a color format 2015-08-18 12:43:57 -07:00
Chad Versace
d52822541e vk/image: Don't set anv_surface_view::offset twice
It was set twice a few lines apart, and the second setting always
overrode the first.
2015-08-18 11:48:50 -07:00
Chad Versace
e7d3a5df5a vk/meta: Use anv_format_is_color()
That is, replace !anv_format_is_depth_or_stencil() with
anv_format_is_color(). That conveys the meaning better.
2015-08-18 11:48:48 -07:00
Chad Versace
50f7bf70da vk: Add anv_format_is_color() 2015-08-18 11:48:46 -07:00
Chad Versace
6ff95bba8a vk: Add anv_format reference to anv_render_pass_attachment
Change type of anv_render_pass_attachment::format from VkFormat to const
struct anv_format*.  This elimiates the repetitive lookups into the
VkFormat -> anv_format table when looping over attachments during
anv_cmd_buffer_clear_attachments().
2015-08-17 14:08:55 -07:00
Chad Versace
5a6b2e6df0 vk/image: Simplify stencil case for anv_image_create()
Stop creating a temporary VkImageCreateInfo with overriden
format=VK_FORMAT_S8_UINT. Instead, just pass the format override
directly to anv_image_make_surface().
2015-08-17 14:08:55 -07:00
Chad Versace
a9c36daa83 vk/formats: Add global pointer to anv_format for S8_UINT
Stencil formats are often a special case. To reduce the number of lookups
into the VkFormat-to-anv_format translation table when working with
stencil, expose the table's entry for VK_FORMAT_S8_UINT as global
variable anv_format_s8_uint.
2015-08-17 14:08:55 -07:00
Chad Versace
60c4ac57f2 vk: Add anv_format reference t anv_surface_view
Change type of anv_surface_view::format from VkFormat to const struct
anv_format*. This reduces the number of lookups in the VkFormat ->
anv_format table.
2015-08-17 14:08:55 -07:00
Chad Versace
c11094ec9a vk: Pass anv_format to anv_fill_buffer_surface_state()
This moves the translation of VkFormat to anv_format from
anv_fill_buffer_surface_state() to its caller.

A prep commit to reduce more VkFormat -> anv_format translations.
2015-08-17 14:08:55 -07:00
Chad Versace
ded736f16a vk: Add anv_format reference to anv_image
Change type of anv_image::format from VkFormat to const struct
anv_format*. This reduces the number of lookups in the VkFormat ->
anv_format table.
2015-08-17 14:08:55 -07:00
Chad Versace
4ae42c83ec vk: Store the original VkFormat in anv_format
Store the original VkFormat as anv_format::vk_format. This will be used
to reduce format indirection, such as lookups into the VkFormat ->
anv_format translation table.
2015-08-17 14:07:44 -07:00
Jason Ekstrand
e39e1f4d24 vk: Update .gitignore for the autogenerated spirv changes 2015-08-17 11:47:25 -07:00
Kristian Høgsberg Kristensen
aac6f7c3bb vk: Drop aub dumper and PCI ID override feature
These are now available in intel_aubdump from intel-gpu-tools.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-17 11:41:19 -07:00
Kristian Høgsberg Kristensen
6d09d0644b vk: Use anv_image_create() for creating dmabuf VkImage
We need to make sure we use the VkImage infrastructure for creating
dmabuf images.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-17 11:41:19 -07:00
Jason Ekstrand
0deae66eb1 vk: Add an _autogen suffix autogenerated spirv file names
This prevents make from stomping on nir_spirv.h
2015-08-17 11:40:16 -07:00
Jason Ekstrand
6a7ca4ef2c Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-08-17 11:25:03 -07:00
Jason Ekstrand
b4c02253c4 vk: Add four unit tests for our lock-free data-structures 2015-08-14 17:04:39 -07:00
Jason Ekstrand
16c5b9f4ed vk: Build a version of the driver for linking into unit tests 2015-08-14 17:04:39 -07:00
Kristian Høgsberg Kristensen
30d82136bb vk: Update generated headers
This update brings usable IVB/HSW RENDER_SURFACE_STATE structs
and adds more float fields that we previously failed to
recognize.
2015-08-12 21:05:32 -07:00
Kristian Høgsberg Kristensen
9564dd37a0 vk: Query aperture size up front in anv_physical_device_init()
We already query the device in various ways here and we can just also
get the aperture size.  This avoids keeping an extra drm fd open
during the life time of the driver.

Also, we need to use explicit 64 bit types for the aperture size, not
size_t.
2015-08-10 17:18:55 -07:00
Kristian Høgsberg Kristensen
8605ee60e0 vk: Share upload logic and add size assert
This lets us hit an assert if we exceed the block pool size instead of
GPU hanging.

Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
2015-08-10 17:17:45 -07:00
Jason Ekstrand
6757e2f75c vk/cmd_buffer: Allow for null VkCmdPool's 2015-08-04 14:01:08 -07:00
Kristian Høgsberg Kristensen
4b097d73e6 vk: Call anv_batch_emit_dwords() up front in anv_batch_emit()
This avoids putting a memory barrier between the template struct and
the pack function, which generates much better code.
2015-08-03 15:38:14 -07:00
Kristian Høgsberg Kristensen
fbb119061e vk: Update generated headers
This adds zeroing of reserved blocks of dwords and removes an
instruction definition.
2015-08-03 15:21:27 -07:00
Jason Ekstrand
facf587dea vk/allocator: Solve a data race in anv_block_pool
The anv_block_pool data structure suffered from the exact same race as the
state pool.  Namely, that the uniqueness of the blocks handed out depends
on the next_block value increasing monotonically.  However, this invariant
did not hold thanks to our block "return" concept.
2015-08-03 01:19:34 -07:00
Jason Ekstrand
5e5a783530 vk: Add and use an anv_block_pool_size() helper 2015-08-03 01:18:09 -07:00
Jason Ekstrand
56ce219493 vk/allocator: Make block_pool_grow take and return a size
It takes the old size as an argument and returns the new size as the return
value.  On error, it returns a size of 0.
2015-08-03 01:06:45 -07:00
Jason Ekstrand
fd64598462 vk/allocator: Fix a data race in the state pool
The previous algorithm had a race because of the way we were using
__sync_fetch_and_add for everything.  In particular, the concept of
"returning" over-allocated states in the "next > end" case was completely
bogus.  If too many threads were hitting the state pool at the same time,
it was possible to have the following sequence:

A: Get an offset (next == end)
B: Get an offset (next > end)
A: Resize the pool (now next < end by a lot)
C: Get an offset (next < end)
B: Return the over-allocated offset
D: Get an offset

in which case D will get the same offset as C.  The solution to this race
is to get rid of the concept of "returning" over-allocated states.
Instead, the thread that gets a new block simply sets the next and end
offsets directly and threads that over-allocate don't return anything and
just futex-wait.  Since you can only ever hit the over-allocate case if
someone else hit the "next == end" case and hasn't resized yet, you're
guaranteed that the end value will get updated and the futex won't block
forever.
2015-08-03 00:38:48 -07:00
Jason Ekstrand
481122f4ac vk/allocator: Make a few things more consistant 2015-08-03 00:35:19 -07:00
Jason Ekstrand
e65953146c vk/allocator: Use memory pools rather than (MALLOC|FREE)LIKE
We have pools, so we should be using them.  Also, I think this will help
keep valgrind from getting confused when we have to end up fighting with
system allocations such as those from malloc/free and mmap/munmap.
2015-07-31 10:38:28 -07:00
Jason Ekstrand
1920ef9675 vk/allocator: Add an anv_state_pool_finish function
Currently this is a no-op but it gives us a place to put finalization
things in the future.
2015-07-31 10:38:28 -07:00
Jason Ekstrand
930598ad56 vk/instance: valgrind-guard client-provided allocations 2015-07-31 10:38:23 -07:00
Jason Ekstrand
e40bdcef1f vk/device: Add anv_instance_alloc/free helpers
This way we can more consistently alloc/free the device and it will provide
us a better place to put valgrind hooks in the next patch
2015-07-31 10:14:17 -07:00
Jason Ekstrand
0f050aaa15 vk/device: Mark newly allocated memory as undefined for valgrind
This way valgrind still works even if the client gives us memory that has
been initialized or re-uses memory for some reason.
2015-07-31 09:44:42 -07:00
Jason Ekstrand
1f49a7d9fc vk/batch_chain: Decrement num_relocs instead of incrementing it 2015-07-31 09:11:47 -07:00
Jason Ekstrand
220a01d525 vk/batch_chain: Compute secondary exec mode after finishing the bo
Figuring out whether or not to do a copy requires knowing the length of the
final batch_bo.  This gets set by anv_batch_bo_finish so we have to do it
afterwards.  Not sure how this was even working before.
2015-07-31 08:52:30 -07:00
Jason Ekstrand
26ba0ad54d vk: Re-name command buffer implementation files
Previously, the command buffer implementation was split between
anv_cmd_buffer.c and anv_cmd_emit.c.  However, this naming convention was
confusing because none of the Vulkan entrypoints for anv_cmd_buffer were
actually in anv_cmd_buffer.c.  This changes it so that anv_cmd_buffer.c is
what you think it is and the internals are in anv_batch_chain.c.
2015-07-30 15:00:42 -07:00
Jason Ekstrand
e379cd9a0e vk/cmd_buffer: Add a simple command pool implementation 2015-07-30 14:55:49 -07:00
Jason Ekstrand
4c2a182a36 vk/cmd_buffer: Add support for zero-copy batch chaining 2015-07-30 14:22:17 -07:00
Jason Ekstrand
21004f23bf vk: Add initial support for secondary command buffers 2015-07-30 11:36:48 -07:00
Jason Ekstrand
5aee803b97 vk/cmd_buffer: Split batch chaining into a helper function 2015-07-30 11:34:58 -07:00
Jason Ekstrand
0c4a2dab7e vk/device: Make BATCH_SIZE a global #define 2015-07-30 11:34:09 -07:00
Jason Ekstrand
ace093031d vk/cmd_buffer: Add functions for cloning a list of anv_batch_bo's
We'll need this to implement secondary command buffers.
2015-07-30 11:32:27 -07:00
Jason Ekstrand
7af67e085f vk/reloc_list: Actually set the new length in reloc_list_grow 2015-07-30 11:29:55 -07:00
Jason Ekstrand
f15be18c92 util/list: Add list splicing functions
This adds functions for splicing one list into another.  These have
more-or-less the same API as the kernel list splicing functions.
2015-07-30 11:28:22 -07:00
Jason Ekstrand
e39d0b635c CLONE 2015-07-30 08:24:02 -07:00
Jason Ekstrand
82548a3aca vk/cmd_buffer: Invalidate texture cache in emit_state_base_address
Previously, the caller of emit_state_base_address was doing this.  However,
putting it directly in emit_state_base_address means that we'll never
forget the flush at the cost of one PIPE_CONTROL at the top every batch
(that should do nothing since the kernel just flushed for us).
2015-07-30 08:24:02 -07:00
Jason Ekstrand
56ce896d73 vk/cmd_buffer: Rename emit_batch_buffer_end to end_batch_buffer
This is more generic and doesn't imply that it emits MI_BATCH_BUFFER_END.
While we're at it, we'll move NOOP adding from bo_finish to
end_batch_buffer.
2015-07-30 08:24:02 -07:00
Jason Ekstrand
3ed9cea84d vk/cmd_buffer: Use an array to track all know anv_batch_bo objects
Instead of walking the list of batch and surface buffers, we simply keep
track of all known batch and surface buffers as we build the command
buffer.  Then we use this new list to construct the validate list.
2015-07-29 15:30:15 -07:00
Jason Ekstrand
0f31c580bf vk/cmd_buffer: Rework validate list creation
The algorighm we used previously required us to call add_bo in a particular
order in order to guarantee that we get the initial batch buffer as the
last element in the validate list.  The new algorighm does a recursive walk
over the buffers and then re-orders the list.  This should be much more
robust as we start to add circular dependancies in the relocations.
2015-07-29 15:16:54 -07:00
Jason Ekstrand
4fc7510a7c vk/cmd_buffer: Move emit_batch_buffer_end higher in the file 2015-07-29 12:01:08 -07:00
Jason Ekstrand
8208f01a35 vk/cmd_buffer: Store the relocation list in the anv_batch_bo struct
Before, we were doing this thing where we had one big relocation list for
the whole command buffer and each subbuffer took a chunk out of it.  Now,
we store the actual relocation list in the anv_batch_bo.  This comes at the
cost of more small allocations but makes a lot of things simpler.
2015-07-29 12:01:08 -07:00
Jason Ekstrand
7d50734240 vk/batch: Make relocs a pointer to a relocation list
Previously anv_batch.relocs was an actual relocation list.  However, this
is limiting if the implementation of the batch wants to change the
relocation list as the batch progresses.
2015-07-29 12:01:08 -07:00
Kristian Høgsberg Kristensen
fcea3e2d23 vk/headers: Update to new generated gen headers
This update fixes cases where a 48-bit address field was split into
two parts:

    __gen_address_type                           MemoryAddress;
    uint32_t                                     MemoryAddressHigh;

which cases this pack code to be generated:

   dw[1] =
       __gen_combine_address(data, &dw[1], values->MemoryAddress, dw1);

   dw[2] =
      __gen_field(values->MemoryAddressHigh, 0, 15) |
      0;

which breaks for addresses above 4G.

This update also fixes arrays of structs in commands and structs, for
example, we now have:

   struct GEN8_BLEND_STATE_ENTRY                Entry[8];

and the pack functions now write all dwords in the packet, making
valgrind happy.

Finally, we would try to pack 64 bits of blend state into a uint32_t -
that's also fixed now.
2015-07-29 11:02:33 -07:00
Jason Ekstrand
65f3d00cd6 vk/cmd_buffer: Update a comment 2015-07-29 08:33:56 -07:00
Jason Ekstrand
86a53d2880 vk/cmd_buffer: Use a doubly-linked list for batch and surface buffers
This is probably better than hand-rolling the list of buffers.
2015-07-28 17:47:59 -07:00
Jason Ekstrand
6aba52381a vk/aub: Use the data directly from the execbuf2
Previously, we were crawling through the anv_cmd_buffer datastructure to
pull out batch buffers and things.  This meant that every time something in
anv_cmd_buffer changed, we broke aub dumping.  However, aub dumping should
just dump the stuff the kernel knows about so we really don't need to be
crawling driver internals.
2015-07-28 16:53:45 -07:00
Jason Ekstrand
3c2743dcd1 vk/cmd_buffer: Pull the execbuf stuff into a substruct 2015-07-27 16:37:09 -07:00
Jason Ekstrand
4ced8650d4 vk/cmd_buffer: Move the remaining entrypoints into cmd_emit.c 2015-07-27 15:14:31 -07:00
Jason Ekstrand
d4c249364d vk/cmd_buffer: Move the re-emission of STATE_BASE_ADDRESS to the flushing code
This used to happen magically in cmd_buffer_new_surface_state_bo.  However,
according to Ken, STATE_BASE_ADDRESS is very gen-specific so we really
shouldn't have it in the generic data-structure code.
2015-07-27 15:05:06 -07:00
Jason Ekstrand
117d74b4e2 vk/cmd_buffer: Factor the guts of CmdBufferEnd into two helpers 2015-07-27 14:52:16 -07:00
Jason Ekstrand
8fb6405718 vk/cmd_buffer: Factor the guts of (Create|Reset|Destroy)CmdBuffer into helpers 2015-07-27 14:23:56 -07:00
Jason Ekstrand
80ad578c4e vk/private.h: Re-arrange and better comment anv_cmd_buffer 2015-07-27 12:40:43 -07:00
Jason Ekstrand
50e86b5777 vk: Actually advertise 0.138.1 at runtime 2015-07-23 10:44:27 -07:00
Jason Ekstrand
f884b500d0 vk/vulkan.h: Bump to the version 0.138.1 header
This doesn't actually require any implementation changes but it does change
an enum so it is ABI-incompatable with 0.138.0.
2015-07-23 10:38:22 -07:00
Jason Ekstrand
e99773badd vk: Add two more valgrind checks 2015-07-23 08:57:54 -07:00
Jason Ekstrand
b1fcc30ff0 vk/meta: Destroy shader modules 2015-07-22 17:51:26 -07:00
Jason Ekstrand
3460e6cb2f vk/device: Finish the scratch block pool on device destruction 2015-07-22 17:51:14 -07:00
Jason Ekstrand
867f6cb90c vk: Add a FreeDescriptorSets function 2015-07-22 17:33:09 -07:00
Jason Ekstrand
c9dc1f4098 vk/pipeline: Be more sloppy about shader entrypoint names
The CTS passes in NULL names right now.  It's not too hard to support that
as just "main".  With this, and a patch to vulkancts, we now pass all 6
tests.
2015-07-22 15:26:56 -07:00
Chad Versace
2c2233e328 vk: Prefix most filenames with anv
Jason started the task by creating anv_cmd_buffer.c and anv_cmd_emit.c.
This patch finishes the task by renaming all other files except
gen*_pack.h and glsl_scraper.py.
2015-07-17 20:25:38 -07:00
Chad Versace
f70d079854 vk/image: Remove unneeded data from anv_buffer_view
This completes the FINISHME to trim unneeded data from anv_buffer_view.

A VkExtent3D doesn't make sense for a VkBufferView.
So remove the member anv_surface_view::extent, and push it up to the two
objects that actually need it, anv_image_view and anv_attachment_view.
2015-07-17 14:48:23 -07:00
Chad Versace
194b77d426 vk: Document members of anv_surface_view 2015-07-17 14:39:05 -07:00
Chad Versace
169251bff0 vk: Remove more raw casts
This removes nearly all the remaining raw Anvil<->Vulkan casts from the
C source files.  (File compiler.cpp still contains many raw casts, and
I plan on ignoring that).

As far as I can tell, the only remaining raw casts are:
    anv_attachment_view -> anv_depth_stencil_view
    anv_attachment_view -> anv_color_attachment_view
2015-07-17 14:32:22 -07:00
Chad Versace
fc3838376b vk/image: Add braces around multi-line ifs 2015-07-17 13:38:09 -07:00
Connor Abbott
b2cfd85060 nir/spirv: don't declare builtin blocks
They aren't used, and the backend was barfing on them. Also, remove a
hack in in compiler.cpp now that they're gone.
2015-07-16 11:04:22 -07:00
Connor Abbott
b599735be4 nir/spirv: add support for loading UBO's
We directly emit ubo load intrinsics based off of the offset information
handed to us from SPIR-V.
2015-07-16 10:54:09 -07:00
Connor Abbott
513ee7fa48 nir/types: add more nir_type_is_xxx() wrappers 2015-07-15 21:58:32 -07:00
Connor Abbott
9fa0989ff2 nir: move to two-level binding model for UBO's
The GLSL layer above is still hacky, so we're really just moving the
hack into GLSL-to-NIR. I'd rather not go all the way and make GLSL
support the Vulkan binding model too, since presumably we'll be
switching to SPIR-V exclusively, and so working on proper GLSL support
will be a waste of time. For now, doing this keeps it working as we add
SPIR-V->NIR support though.
2015-07-15 17:18:48 -07:00
Chad Versace
5520221118 vk: Remove unneeded vulkan-138.h 2015-07-15 17:16:07 -07:00
Chad Versace
73a8f9543a vk: Bump vulkan.h version to 0.138 2015-07-15 17:16:07 -07:00
Chad Versace
55781f8d02 vk/0.138: Update VkResult values 2015-07-15 17:16:07 -07:00
Chad Versace
756d8064c1 vk/0.132: Do type-safety 2015-07-15 17:16:07 -07:00
Jason Ekstrand
927f54de68 vk/cmd_buffer: Move batch buffer padding to anv_batch_bo_finish() 2015-07-15 17:11:04 -07:00
Jason Ekstrand
9c0db9d349 vk/cmd_buffer: Rename bo_count to exec2_bo_count 2015-07-15 16:56:29 -07:00
Jason Ekstrand
6037b5d610 vk/cmd_buffer: Add a helper for allocating dynamic state
This matches what we do for surface state and makes the dynamic state pool
more opaque to things that need to get dynamic state.
2015-07-15 16:56:29 -07:00
Jason Ekstrand
7ccc8dd24a vk/private.h: Move cmd_buffer functions to near the cmd_buffer struct 2015-07-15 16:56:29 -07:00
Jason Ekstrand
d22d5f25fc vk: Split command buffer state into its own structure
Everything else in anv_cmd_buffer is the actual guts of the datastructure.
2015-07-15 16:56:29 -07:00
Jason Ekstrand
da4d9f6c7c vk: Move most of the anv_Cmd related stuff to its own file 2015-07-15 16:56:28 -07:00
Jason Ekstrand
d862099198 vk: Pull the guts of anv_cmd_buffer into its own file 2015-07-15 16:56:28 -07:00
Chad Versace
498ae009d3 vk/glsl: Replace raw casts
Needed for upcoming type-safety changes.
2015-07-15 15:51:37 -07:00
Chad Versace
6f140e8af1 vk/meta: Remove raw casts
Needed for upcoming type-safety changes.
2015-07-15 15:51:37 -07:00
Chad Versace
badbf0c94a vk/x11: Remove raw casts
The raw casts in the WSI functions will break the build when the
type-safety changes arrive.
2015-07-15 15:49:10 -07:00
Chad Versace
61a4bfe253 vk: Delete vkDbgSetObjectTag()
Because VkObject is going away.
2015-07-15 15:34:20 -07:00
Jason Ekstrand
e1c78ebe53 vk/device: Remove unneeded checks for NULL 2015-07-15 15:22:32 -07:00
Jason Ekstrand
f4748bff59 vk/device: Provide proper NULL handling in anv_device_free
The Vulkan spec does not specify that the free function provided to
CreateInstance must handle NULL properly so we do it in the wrapper.  If
this ever changes in the spec, we can delete the extra 2 lines.
2015-07-15 15:22:32 -07:00
Chad Versace
4c8e1e5888 vk: Stop internally calling anv_DestroyObject()
Replace each anv_DestroyObject() with anv_DestroyFoo().

Let vkDestroyObject() live for a while longer for Crucible's sake.
2015-07-15 15:11:16 -07:00
Chad Versace
f5ad06eb78 vk: Fix vkDestroyObject dispatch for VkRenderPass
It called anv_device_free() instead of anv_DestroyRenderPass().
2015-07-15 15:07:41 -07:00
Chad Versace
188f2328de vk: Fix vkCreate/DestroyRenderPass
While updating vkDestroyObject, I discovered that vkDestroyPass reliably
crashes. That hasn't been an issue yet, though, because it is never
called.

In vkCreateRenderPass:
    - Don't allocate empty attachment arrays.
    - Ensure that pointers to empty attachment arrays are NULL.
    - Store VkRenderPassCreateInfo::subpassCount as
      anv_render_pass::subpass_count.

In vkDestroyRenderPass:
    - Fix loop bounds: s/attachment_count/subpass_count/
    - Don't call anv_device_free on null pointers.
2015-07-15 15:07:41 -07:00
Chad Versace
c6270e8044 vk: Refactor create/destroy code for anv_descriptor_set
Define two new functions:
    anv_descriptor_set_create
    anv_descriptor_set_destroy
2015-07-15 14:31:22 -07:00
Chad Versace
365d80a91e vk: Replace some raw casts with safe casts
That is, replace some instances of
    (VkFoo) foo
with
    anv_foo_to_handle(foo)
2015-07-15 14:00:21 -07:00
Chad Versace
7529e7ce86 vk: Correct anv_CreateShaderModule's prototype
s/VkShader/VkShaderModule/

:sigh: I look forward to type-safety.
2015-07-15 13:59:47 -07:00
Chad Versace
8213be790e vk: Define struct anv_image_view, anv_buffer_view
Follow the pattern of anv_attachment_view. We need these structs to
implement the type-safety that arrived in the 0.132 header.
2015-07-15 12:19:29 -07:00
Chad Versace
43241a24bc vk/meta: Fix declared type of a shader module
s/VkShader/VkShaderModule/

I'm looking forward to a type-safe vulkan.h ;)
2015-07-15 11:49:37 -07:00
Chad Versace
94e473c993 vk: Remove struct anv_object
Trivial removal because vkDestroyObject() no longer uses it.
2015-07-15 11:29:43 -07:00
Jason Ekstrand
e375f722a6 vk/device: More documentation on surface state flushing 2015-07-15 11:09:02 -07:00
Connor Abbott
9aabe69028 vk/device: explain why a flush is necessary
Jason found this from experimenting, but the docs give a reasonable
explanation of why it's necessary.
2015-07-14 23:03:19 -07:00
Chad Versace
5f46c4608f vk: Fix indentation of anv_dynamic_cb_state 2015-07-14 18:19:10 -07:00
Chad Versace
0eeba6b80c vk: Add finishmes for VkDescriptorPool
VkDescriptorPool is a stub object. As a consequence, it's impossible to
free descriptor set memory.
2015-07-14 18:19:00 -07:00
Jason Ekstrand
2b5a4dc5f3 vk: Add vulkan-138 and remove vulkan-0.132
Now, 138 is the target and not 132.  Once object destruction is finished,
we can delete 138 as it will be identical to vulkan.h
2015-07-14 17:54:13 -07:00
Jason Ekstrand
1f658bed70 vk/device: Add stub support for command pools
Real support isn't really that far away.  We just need a data structure
with a linked list and a few tests.
2015-07-14 17:40:00 -07:00
Jason Ekstrand
ca7243b54e vk/vulkan.h: Add the stuff for cross-queue resource sharing
We only have one queue, so this is currently a no-op on our implementation.
2015-07-14 17:20:50 -07:00
Jason Ekstrand
553b4434ca vk/vulkan.h: Add a couple of size fields for specialization constants 2015-07-14 17:12:39 -07:00
Jason Ekstrand
e5db209d54 vk/vulkan.h: Move around buffer image granularities 2015-07-14 17:10:37 -07:00
Jason Ekstrand
c7fcfebd5b vk: Add stubs for all the sparse resource stuff 2015-07-14 17:06:11 -07:00
Jason Ekstrand
2a9136feb4 vk/image: Add a stub for the new ImageFormatProperties function
This lets the client query about things like multisample.  We don't do
multisample right now, so I'll let Chad deal with that when he gets to it.
2015-07-14 17:05:30 -07:00
Jason Ekstrand
2c4dc92f40 vk/vulkan.h: Rename FormatInfo to FormatProperties 2015-07-14 17:04:46 -07:00
Jason Ekstrand
d7f44852be vk/vulkan.h: Re-order some #define's 2015-07-14 16:41:39 -07:00
Jason Ekstrand
1fd3bc818a vk/vulkan.h: Rename a function parameter 2015-07-14 16:39:01 -07:00
Jason Ekstrand
2e2f48f840 vk: Remove abreviations 2015-07-14 16:34:31 -07:00
Jason Ekstrand
02db21ae11 vk: Add the new extension/layer enumeration entrypoints 2015-07-14 16:11:21 -07:00
Jason Ekstrand
a463eacb8f vk/vulkan.h: Change maxAnisotropy to a float 2015-07-14 15:04:11 -07:00
Jason Ekstrand
98957b18d2 vk/vulkan.h: Add the VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT flag 2015-07-14 15:03:39 -07:00
Jason Ekstrand
a35811d086 vk/vulkan.h: Rename a couple of function parameters
No functional change.
2015-07-14 15:03:01 -07:00
Jason Ekstrand
55723e97f1 vk: Split the memory requirements/binding functions 2015-07-14 14:59:39 -07:00
Jason Ekstrand
ccb2e5cd62 vk: Make barriers more precise (rev. 133) 2015-07-14 14:50:35 -07:00
Jason Ekstrand
30445f8f7a vk: Split the dynamic state binding function into one per state 2015-07-14 14:26:10 -07:00
Jason Ekstrand
d2c0870ff3 vk/vulkan.h: Rename a function parameter to match 132 2015-07-14 14:11:04 -07:00
Jason Ekstrand
8478350992 vk: Implement Multipass 2015-07-14 11:37:14 -07:00
Jason Ekstrand
68768c40be vk/vulkan.h: Re-arrange some enums and definitions in preparation for 131 2015-07-14 11:32:15 -07:00
Chad Versace
66cbb7f76d vk/0.132: Add vkDestroyRenderPass() 2015-07-14 11:21:31 -07:00
Chad Versace
6d0ed38db5 vk/0.132: Add vkDestroy*View()
vkDestroyColorAttachmentView
vkDestroyDepthStencilView

These functions are not in the 0.132 header, but adding them will help
us attain the type-safety API updates more quickly.
2015-07-14 11:19:22 -07:00
Chad Versace
1ca611cbad vk/0.132: Add vkDestroyCommandBuffer() 2015-07-14 11:11:41 -07:00
Chad Versace
6eec0b186c vk/0.132: Add vkDestroyImageView()
Just declare it in vulkan.h. Jason defined the function earlier
in image.c.
2015-07-14 11:09:14 -07:00
Chad Versace
4b2c5a98f0 vk/0.132: Add vkDestroyBufferView()
Just declare it in vulkan.h. Jason already defined the function
earlier in vulkan.c.
2015-07-14 11:06:57 -07:00
Chad Versace
08f7731f67 vk/0.132: Add vkDestroyFramebuffer() 2015-07-14 10:59:30 -07:00
Chad Versace
0c8456ef1e vk/0.132: Add vkDestroyDynamicDepthStencilState() 2015-07-14 10:54:51 -07:00
Chad Versace
b29c929e8e vk/0.132: Add vkDestroyDynamicColorBlendState() 2015-07-14 10:52:45 -07:00
Chad Versace
5e1737c42f vk/0.132: Add vkDestroyDynamicRasterState() 2015-07-14 10:51:08 -07:00
Chad Versace
d80fea1af6 vk/0.132: Add vkDestroyDynamicViewportState() 2015-07-14 10:42:45 -07:00
Chad Versace
9250e1e9e5 vk/0.132: Add vkDestroyDescriptorPool() 2015-07-14 10:38:22 -07:00
Chad Versace
f925ea31e7 vk/0.132: Add vkDestroyDescriptorSetLayout() 2015-07-14 10:36:49 -07:00
Chad Versace
ec5e2f4992 vk/0.132: Add vkDestroySampler() 2015-07-14 10:34:00 -07:00
Chad Versace
a684198935 vk/0.132: Add vkDestroyPipelineLayout() 2015-07-14 10:29:47 -07:00
Chad Versace
6e5ab5cf1b vk/0.132: Add vkDestroyPipeline() 2015-07-14 10:26:17 -07:00
Chad Versace
114015321e vk/0.132: Add vkDestroyPipelineCache() 2015-07-14 10:19:27 -07:00
Chad Versace
cb57bff36c vk/0.132: Add vkDestroyShader() 2015-07-14 10:16:22 -07:00
Chad Versace
8ae8e14ba7 vk/0.132: Add vkDestroyShaderModule() 2015-07-14 10:13:09 -07:00
Chad Versace
dd67c134ad vk/0.132: Add vkDestroyImage()
We only need to add it to vulkan.h because Jason defined the function
earlier in image.c.
2015-07-14 10:13:00 -07:00
Chad Versace
e18377f435 vk/0.132: Dispatch vkDestroyObject to new destructors
Oops. My recent commits added new destructors, but forgot to teach
vkDestroyObject about them. They are:
  vkDestroyFence
  vkDestroyEvent
  vkDestroySemaphore
  vkDestroyQueryPool
  vkDestroyBuffer
2015-07-14 09:58:22 -07:00
Chad Versace
e93b6d8eb1 vk/0.132: Add vkDestroyBuffer() 2015-07-14 09:47:45 -07:00
Chad Versace
584cb7a16f vk/0.132: Add vkDestroyQueryPool() 2015-07-14 09:44:58 -07:00
Chad Versace
68c7ef502d vk/0.132: Add vkDestroyEvent() 2015-07-14 09:33:47 -07:00
Chad Versace
549070b18c vk/0.132: Add vkDestroySemaphore() 2015-07-14 09:31:34 -07:00
Chad Versace
ebb191f145 vk/0.132: Add vkDestroyFence() 2015-07-14 09:29:35 -07:00
Chad Versace
435ccf4056 vk/0.132: Rename VkDynamic*State types
sed -i -e 's/VkDynamicVpState/VkDynamicViewportState/g' \
       -e 's/VkDynamicRsState/VkDynamicRasterState/g' \
       -e 's/VkDynamicCbState/VkDynamicColorBlendState/g' \
       -e 's/VkDynamicDsState/VkDynamicDepthStencilState/g' \
       $(git ls-files include/vulkan src/vulkan)
2015-07-13 16:19:28 -07:00
Connor Abbott
ffb51fd112 nir/spirv: update to SPIR-V revision 31
This means that now the internal version of glslangValidator is
required. This includes some changes due to the sampler/texture rework,
but doesn't actually enable anything more yet. We also don't yet handle
UBO's correctly, and don't handle matrix stride and row major/column
major yet.
2015-07-13 15:01:01 -07:00
Chad Versace
45f8723f44 vk/0.132: Move VkQueryControlFlags 2015-07-13 13:09:32 -07:00
Chad Versace
180c07ee50 vk/0.132: Move VkImageAspectFlags 2015-07-13 13:08:56 -07:00
Chad Versace
4b05a8cd31 vk/0.132: Move VkCmdBufferOptimizeFlags 2015-07-13 13:08:07 -07:00
Chad Versace
f1cf55fae6 vk/0.132: Move VkWaitEvent 2015-07-13 13:06:53 -07:00
Chad Versace
3112098776 vk/0.132: Move VkCmdBufferLevel 2015-07-13 13:06:33 -07:00
Chad Versace
c633ab5822 vk/0.132: Drop VK_ATTACHMENT_STORE_OP_RESOLVE_MSAA 2015-07-13 13:05:24 -07:00
Chad Versace
8f3b2187e1 vk/0.132: Rename bool32_t -> VkBool32
sed -i 's/bool32_t/VkBool32/g' \
  $(git ls-files src/vulkan include/vulkan)
2015-07-13 13:03:36 -07:00
Chad Versace
77dcfe3c70 vk/0.132: Remove stray typedef 2015-07-13 12:58:17 -07:00
Chad Versace
601d0891a6 vk/0.132: Move VKImageUsageFlags 2015-07-13 12:48:44 -07:00
Chad Versace
829810fa27 vk/0.132: Move VkImageType and VkImageTiling 2015-07-13 11:49:56 -07:00
Chad Versace
17c8232ecf vk/0.132: Import the 0.132 header
Import it as vulkan-0.132.h.
2015-07-13 11:47:12 -07:00
Chad Versace
a158ff55f0 vk/vulkan.h: Remove headers for old API versions
Remove the temporary headers for 0.90 and 0.130.
2015-07-13 11:46:30 -07:00
Chad Versace
1c4238a8e5 vk/0.130: Bump header version to 0.130
All APIs have been updated. This eliminates the diff between the
work-in-progress header and the 0.130 header.
2015-07-10 20:06:09 -07:00
Chad Versace
f43a304dc6 vk/0.130: Update vkAllocMemory to use VkMemoryType 2015-07-10 17:35:52 -07:00
Chad Versace
df2a013881 vk/0.130: Implement vkGetPhysicalDeviceMemoryProperties() 2015-07-10 17:35:52 -07:00
Chad Versace
c7f512721c vk/gem: Change signature of anv_gem_get_aperture()
Replace the anv_device parameter with anv_physical_device, because this needs
querying before vkCreateDevice.
2015-07-10 17:35:52 -07:00
Chad Versace
8cda3e9b1b vk/device: Add member anv_physical_device::fd
During anv_physical_device_init(), we opend the DRM device to do some
queries, then promptly closed it. Now we keep it open for the lifetime
of the anv_physical_device so that we can query it some more during
vkGetPhysicalDevice*Properties() [which will happen in follow-up
commits].
2015-07-10 17:35:52 -07:00
Chad Versace
4422bd4cf6 vk/device: Add func anv_physical_device_finish()
Because in a follow-up patch I need to do some non-trival teardown on
anv_physical_device. Currently, however, anv_physical_device_finish() is
currently a no-op that's just called in the right place.

Also, rename function fill_physical_device -> anv_physical_device_init
for symmetry.
2015-07-10 17:35:52 -07:00
Jason Ekstrand
7552e026da vk/device: Add an explicit destructor for RenderPass 2015-07-10 12:33:04 -07:00
Jason Ekstrand
8b342b39a3 vk/image: Add an explicit DestroyImage function 2015-07-10 12:30:58 -07:00
Jason Ekstrand
b94b8dfad5 vk/image: Add explicit constructors for buffer/image view types 2015-07-10 12:26:31 -07:00
Jason Ekstrand
18340883e3 nir: Add C++ versions of NIR_(SRC|DEST)_INIT 2015-07-10 11:57:33 -07:00
Chad Versace
9e64a2a8e4 mesa: Fix generation of git_sha1.h.tmp for gitlinks
Don't assume that $(top_srcdir)/.git is a directory. It may be a
gitlink file [1] if $(top_srcdir) is a submodule checkout or a linked
worktree [2].

[1] A "gitlink" is a text file that specifies the real location of
    the gitdir.
[2] Linked worktrees are a new feature in Git 2.5.

Cc: "10.6, 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 75784243df)
2015-07-10 11:24:25 -07:00
Jason Ekstrand
19f0a9b582 vk/query.c: Use the casting functions 2015-07-09 20:32:44 -07:00
Jason Ekstrand
6eb221c884 vk/pipeline.c: Use the casting functions 2015-07-09 20:28:08 -07:00
Jason Ekstrand
fb4e2195ec vk/formats.c: Use the casting functions 2015-07-09 20:24:17 -07:00
Jason Ekstrand
a52e208203 vk/image.c: Use the casting functions 2015-07-09 20:24:07 -07:00
Jason Ekstrand
b1de1d4f6e vk/device.c: One more use of a casting function 2015-07-09 20:23:46 -07:00
Jason Ekstrand
8739e8fbe2 vk/meta.c: Use the casting functions 2015-07-09 20:16:13 -07:00
Jason Ekstrand
92556c77f4 vk: Fix the build 2015-07-09 18:59:08 -07:00
Jason Ekstrand
098209eedf device.c: Use the cast helpers a bunch of places 2015-07-09 18:49:43 -07:00
Jason Ekstrand
73f9187e33 device.c: Use the cast helpers 2015-07-09 18:41:27 -07:00
Jason Ekstrand
7d24fab4ef vk/private.h: Add a bunch of static inline casting functions
We will need these as soon as we turn on type saftey.  We might as well
define and start using them now rather than later.
2015-07-09 18:40:54 -07:00
Jason Ekstrand
5c49730164 vk/device.c: Fix whitespace issues 2015-07-09 18:20:28 -07:00
Jason Ekstrand
c95f9b61f2 vk/device.c: Use ANV_FROM_HANDLE a bunch of places 2015-07-09 18:20:10 -07:00
Jason Ekstrand
335e88c8ee vk/vulkan.h: Add the pEnabledFeatures field to DeviceCreateInfo 2015-07-09 16:21:31 -07:00
Jason Ekstrand
34871cf7f3 vk/vulkan.h: Change the MsCreateInfo structure to the 130 version
We do nothing with it at the moment, so this is a no-op.
2015-07-09 16:19:54 -07:00
Jason Ekstrand
8c2c37fae7 vk: Remove the old GetPhysicalDeviceInfo call 2015-07-09 16:14:37 -07:00
Jason Ekstrand
1f907011a3 vk: Add the new PhysicalDeviceQueue queries 2015-07-09 16:14:37 -07:00
Jason Ekstrand
977a469bce vk: Support GetPhysicalDeviceProperties 2015-07-09 16:14:37 -07:00
Jason Ekstrand
65e0b304b6 vk: Add support for GetPhysicalDeviceLimits 2015-07-09 16:14:37 -07:00
Jason Ekstrand
f6d51f3fd3 vk: Add GetPhysicalDeviceFeatures 2015-07-09 16:14:37 -07:00
Chad Versace
5b75dffd04 vk/device: Fix vkEnumeratePhysicalDevices()
The Vulkan spec says that pPhysicalDeviceCount is an out parameter if
pPhysicalDevices is NULL; otherwise it's an inout parameter.

Mesa incorrectly treated it unconditionally as an inout parameter, which
could have lead to reading unitialized data.
2015-07-09 15:53:21 -07:00
Chad Versace
fa915b661d vk/device: Move device enumeration to vkEnumeratePhysicalDevices()
Don't enumerate devices in vkCreateInstance(). That's where global,
device-independent initialization should happen. Move device enumeration
to the more logical location, vkEnumeratePhysicalDevices().
2015-07-09 15:41:17 -07:00
Chad Versace
c34d314db3 vk/device: Be consistent about path to DRM device
Function fill_physical_device() has a 'path' parameter, and struct
anv_physical_device has a 'path' member. Sometimes these are used;
sometimes hardcoded "/dev/dri/renderD128" is used instead.

Be consistent. Hardcode "/dev/dri/renderD128" in exactly one location,
during initialization of the physical device.
2015-07-09 15:27:26 -07:00
Connor Abbott
cff06bbe7d vk/compiler: create an empty parameters list
Prevents problems when initializing the sanity_param_count.
2015-07-09 14:29:23 -04:00
Connor Abbott
3318a86d12 nir/spirv: fix wrong writemask for ALU operations 2015-07-09 14:28:39 -04:00
Connor Abbott
b8fedc19f5 nir/spirv: fix memory context for builtin variable
Fixes valgrind errors with func.depthstencil.basic.
2015-07-08 22:03:30 -04:00
Connor Abbott
e4292ac039 nir/spirv: zero out value array
Before values are pushed or annotated with a name, decoration, etc.,
they need to have an invalid type, NULL name, NULL decoration, etc.
ralloc zero's everything by accident, so this wasn't an issue in
practice, but we should be explicitly zero'ing it.
2015-07-08 22:03:30 -04:00
Connor Abbott
997831868f vk/compiler: create the right kind of program struct
This fixes Valgrind errors and gets all the tests to pass with
--use-spir-v.
2015-07-08 22:03:30 -04:00
Connor Abbott
a841e2c747 vk/compiler: mark inputs/outputs as read/written
This doesn't handle inputs and outputs larger than a vec4, but we plan
to add a varyiing splitting/packing pass to handle those anyways.
2015-07-08 22:03:30 -04:00
Jason Ekstrand
8640dc12dc vk/vulkan.h: Copy the VkStructureType enum from version 130
We now have the exact same structs which require pType.
2015-07-08 17:45:52 -07:00
Jason Ekstrand
5a4ebf6bc1 vk: Move to the new pipeline creation API's 2015-07-08 17:30:18 -07:00
Chad Versace
4fcb32a17d vk/0.130: Remove VkImageViewCreateInfo::minLod
It's now set solely through VkSampler.
2015-07-08 14:48:22 -07:00
Jason Ekstrand
367b9ba78f vk/vulkan.h: Move renderPassContinue from GraphicsBeginInfo to BeginInfo 2015-07-08 14:37:30 -07:00
Jason Ekstrand
d29ec8fa36 vk/vulkan.h: Update to the new UpdateDescriptorSets api 2015-07-08 14:24:56 -07:00
Jason Ekstrand
c8577b5f52 vk: Add a macro for creating anv variables from vulkan handles
This is very helpful for doing the mass bunch of casts at the top of a
function.  It will also be invaluable when we get type saftey in the API.
2015-07-08 14:24:14 -07:00
Chad Versace
ccb27a002c vk/0.130 Update VkObjectType values
Don't import any new enum tokens from the 0.130 header. Just update the
values of existing enums. This reduces the diff by about 16 lines.
2015-07-08 12:53:49 -07:00
Chad Versace
8985dd15a1 vk/0.130: Remove VkDescriptorUpdateMode
Nowhere used.
2015-07-08 12:51:46 -07:00
Chad Versace
e02dfa309a vk/0.130: Remove VK_DEVICE_CREATE_MULTI_DEVICE_IQ_MATCH_BIT 2015-07-08 12:49:48 -07:00
Chad Versace
e9034ed875 vk/0.130: Update vkCmdBlitImage signature
Add VkTexFilter param. Ignored for now.
2015-07-08 12:47:48 -07:00
Jason Ekstrand
aae45ab583 vk/vulkan.h: Add packing parameters to BufferImageCopy 2015-07-08 11:51:34 -07:00
Chad Versace
b4ef7f354b vk/0.130: Remove msaa members of VkDepthStencilViewCreateInfo 2015-07-08 11:50:51 -07:00
Jason Ekstrand
522ab835d6 vk/vulkan.h: Move over to the new border color enums 2015-07-08 11:44:52 -07:00
Jason Ekstrand
7598329774 vk/vulkan.h: Move VkFormatProperties 2015-07-08 11:16:45 -07:00
Jason Ekstrand
52940e8fcf vk/vulkan.h: Add RenderPassBeginContents 2015-07-08 10:57:13 -07:00
Jason Ekstrand
e19d6be2a9 vk/vulkan.h: Add command buffer levels 2015-07-08 10:53:32 -07:00
Jason Ekstrand
c84f2d3b8c vk/vulkan.h: Import the VkPipeEvent enum from 130
Now, VkPipeEventFlags is back in sync with VkPipeEvent
2015-07-08 10:49:46 -07:00
Jason Ekstrand
b20cc72603 vk/vulkan.h: Remove VkFormatInfoType 2015-07-08 10:39:31 -07:00
Jason Ekstrand
8e05bbeee9 vk/vulkan.h: Update extension handling to rev 130 2015-07-08 10:38:07 -07:00
Jason Ekstrand
cc29a5f4be vk/vulkan.h: Move format quering to the physical device 2015-07-08 09:34:47 -07:00
Jason Ekstrand
719fa8ac74 vk/vulkan.h: Remove some peer opening structs and STRUCTURE_TYPE enums 2015-07-08 09:25:13 -07:00
Jason Ekstrand
fc6dcc6227 vk: Add a copy of the v90 header. 2015-07-08 09:23:29 -07:00
Jason Ekstrand
12119282e6 vk/vulkan.h: Remove an unneeded comment 2015-07-08 09:18:09 -07:00
Jason Ekstrand
3c65a1ac14 vk/vulkan.h: Remove the MemoryRange stubs and add sparse stubs 2015-07-08 09:16:48 -07:00
Jason Ekstrand
bb6567f5d1 vk/vulkan.h: Switch BindObjectMemory to a device function and remove the index 2015-07-08 09:04:16 -07:00
Jason Ekstrand
e7acdda184 vk/vulkan.h: Switch to the split ProcAddr functions in 130 2015-07-07 18:51:53 -07:00
Jason Ekstrand
db24afee2f vk/vulkan.h: Switch from GetImageSubresourceInfo to GetImageSubresourceLayout 2015-07-07 18:20:18 -07:00
Jason Ekstrand
ef8980e256 vk/vulkan.h: Switch from GetObjectInfo to GetMemoryRequirements 2015-07-07 18:16:42 -07:00
Jason Ekstrand
d9c2caea6a vk: Update memory flushing functions to 130
This involves updating the prototype for FlushMappedMemory, adding
InvalidateMappedMemoryRanges, and removing PinSystemMemory.
2015-07-07 17:22:31 -07:00
Jason Ekstrand
d5349b1b18 vk/vulkan.h: Constify the pFences parameter to ResetFences 2015-07-07 17:18:00 -07:00
Jason Ekstrand
6aa1b89457 vk/vulkan.h: Move the definitions of Create(Framebuffer|RenderPass)
This better matches the 130 header.
2015-07-07 17:13:10 -07:00
Jason Ekstrand
0ff06540ae vk: Implement the GetRenderAreaGranularity function
At the moment, we're just going to scissor clears so a granularity of 1x1
is all we need.
2015-07-07 17:11:37 -07:00
Jason Ekstrand
435b062b26 vk/vulkan.h: Add a PipelineLayout parameter to BindDescriptorSets 2015-07-07 17:06:10 -07:00
Jason Ekstrand
518ca9e254 vk/vulkan.h: Add a compareEnable parameter to SamplerCreateInfo
Our hardware doesn't actually need this, so adding it is a no-op.
2015-07-07 16:49:04 -07:00
Jason Ekstrand
672590710b vk/vulkan.h: Remove initialCount from SemaphoreCreateInfo 2015-07-07 16:42:42 -07:00
Jason Ekstrand
80046a7d54 vk/vulkan.h: Update clear color handling to 130 2015-07-07 16:37:43 -07:00
Jason Ekstrand
3e4b00d283 meta: Use the VkClearColorValue structure for the color attribute 2015-07-07 16:27:06 -07:00
Jason Ekstrand
a35fef1ab2 vk/vulkan.h: Remove the pass argument from EndRenderPass 2015-07-07 16:22:23 -07:00
Jason Ekstrand
d2ca7e24b4 vk/vulkan.h: Rename VertexInputStateInfo to VertexInputStateCreateInfo 2015-07-07 16:15:55 -07:00
Jason Ekstrand
abbb776bbe vk/vulkan.h: Remove programPointSize
Instead, we auto-detect whether or not your shader writes gl_PointSize.  If
it does, we use 1.0, otherwise we take it from the shader.
2015-07-07 16:00:46 -07:00
Chad Versace
e7ddfe03ab vk/0.130: Stub vkCmdClear*Attachment() funcs
vkCmdClearColorAttachment
vkCmdClearDepthStencilAttachment
2015-07-07 15:57:37 -07:00
Chad Versace
f89e2e6304 vk/0.130: Define enum VkImageAspectFlagBits 2015-07-07 15:57:37 -07:00
Chad Versace
55ab1737d3 vk/0.130: Define VkRect3D 2015-07-07 15:55:53 -07:00
Chad Versace
11901a9100 vk/0.130: Update name of vkCmdClearDepthStencilImage() 2015-07-07 15:53:35 -07:00
Chad Versace
dff32238c7 vk/0.130: Stub vkCmdExecuteCommands() 2015-07-07 15:51:55 -07:00
Chad Versace
85c0d69be9 vk/0.130: Update vkCmdWaitEvents() signature 2015-07-07 15:49:57 -07:00
Chad Versace
0ecb789b71 vk: Remove unused 'v' param from stub() macro 2015-07-07 15:47:24 -07:00
Chad Versace
f78d684772 vk: Stub vkCmdPushConstants() from 0.130 header 2015-07-07 15:46:19 -07:00
Chad Versace
18ee32ef9d vk: Update vkCmdPipelineBarrier to 0.130 header 2015-07-07 15:43:41 -07:00
Chad Versace
4af79ab076 vk: Add func anv_clear_mask()
A little helper func for inspecting and clearing bitmasks.
2015-07-07 15:43:41 -07:00
Jason Ekstrand
788a8352b9 vk/vulkan.h: Remove some unused fields.
In particular, the following are removed:

 - disableVertexReuse
 - clipOrigin
 - depthMode
 - pointOrigin
 - provokingVertex
2015-07-07 15:33:00 -07:00
Jason Ekstrand
7fbed521bb vk/vulkan.h: Remove the explicit primitive restart index
Unfortunately, this requires some non-trivial changes to the driver.  Now
that the primitive restart index isn't given explicitly by the client, we
always use ~0 for everything like D3D does.  Unfortunately, our hardware is
awesome and a 32-bit version of ~0 doesn't match any 16-bit values.  This
means, we have to set it to either UINT16_MAX or UINT32_MAX depending on
the size of the index type.  Since we get the index type from
CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need
to lazy-emit the VF packet.
2015-07-07 15:33:00 -07:00
Chad Versace
d6b840beff vk: Delete some comments not present in 0.130 header
Deleting the comments reduces diff noise.
2015-07-07 15:16:13 -07:00
Chad Versace
84a5bc25e3 vk: Pull in remaining 0.130 handle types
This pulls in the definition of VkShaderModule and VkPipelineCache,
which nowhere used yet.
2015-07-07 15:13:01 -07:00
Chad Versace
f2899b1af2 vk: Pull in #defines from 0.130 header
Despite not being used yet, pulling in the macros does diminish the
header diff.
2015-07-07 15:11:30 -07:00
Jason Ekstrand
962d6932fa vk/vulkan.h: Rename (min|max)Depth to (min|max)DepthBounds 2015-07-07 12:37:54 -07:00
Jason Ekstrand
1fb859e4b2 vk/vulkan.h: Remove client-settable pointSize from DynamicRsState 2015-07-07 12:35:32 -07:00
Jason Ekstrand
245583075c vk/vulkan.h: Remove UINT8 index buffers 2015-07-07 11:26:49 -07:00
Jason Ekstrand
0a42332904 vk/vulkan.h: Re-order the object declarations 2015-07-07 11:26:49 -07:00
Kristian Høgsberg Kristensen
a1eea996d4 vk: Emit 3DSTATE_SAMPLE_MASK
This was missing and was causing the driver to not work with
execlists. Presumably we get a different initial hw context with
execlists enabled, that has sample mask 0 initially.

Set this to 0xffff for now.  When we add MS support, we need to take the
value from VkPipelineMsStateCreateInfo::sampleMask.
2015-07-06 23:54:12 -07:00
Kristian Høgsberg Kristensen
c325bb24b5 vk: Pull in new generated headers
The new headers use stdbool for enable/disable fields which
implicitly converts expressions like (flags & 8) to 0 or 1.
Also handles MBO (must-be-one) fields by setting them to one,
corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and
makes a few enum values less clashy.
2015-07-06 22:12:26 -07:00
Chad Versace
23075bccb3 vk/image: Validate vkCreateImageView more
Exhaustively validate the function input.  If it's not validated and
doesn't have an anv_finishme(), then I overlooked it.
2015-07-06 18:28:26 -07:00
Chad Versace
69e11adecc vk/image: Add more info to VkImageViewType table
Convert the table from the direct mapping
  VkImageViewType -> SurfaceType

into a mapping to an info struct
  VkImageViewType -> struct anv_image_view_info
2015-07-06 18:28:26 -07:00
Chad Versace
b844f542e0 vk: Update VkImageViewType to 0.130.0
This splits 1D and 1D_ARRAY, 2D and 2D_ARRAY, CUBE and CUBE_ARRAY.

The new tokens are unused. This is just a header update.
2015-07-06 18:28:26 -07:00
Chad Versace
5b04db71ff vk/image: Move validation for vkCreateImageView
Move the validation from anv_CreateImageView() and anv_image_view_init()
to anv_validate_CreateImageView(). No new validation is added.
2015-07-06 18:27:14 -07:00
Jason Ekstrand
1f1b26bceb vk/vulkan.h: Rename VkRect to VkRect2D 2015-07-06 17:47:18 -07:00
Jason Ekstrand
63c1190e47 vk/vulkan.h: Rename count to arraySize in VkDescriptorSetLayoutBinding 2015-07-06 17:43:58 -07:00
Jason Ekstrand
d84f3155b1 vk/vulkan.h: Remove the Vk(Memory|Semaphor|Image)OpenInfo structs
We already deleted the functions that need them.  The structs are just
dangling uselessly.
2015-07-06 17:37:13 -07:00
Jason Ekstrand
65f9ccb4e7 vk/vulkan.h: Remove VK_MEMORY_PROPERTY_PREFER_HOST_LOCAL_BIT
We weren't doing anything with it, so this is a no-op
2015-07-06 17:33:45 -07:00
Jason Ekstrand
68fa750f2e vk/vulkan.h: Replace DEVICE_COHERENT_BIT with DEVICE_NON_COHERENT_BIT 2015-07-06 17:32:28 -07:00
Jason Ekstrand
d5b5bd67f6 vk/vulkan.h: Use the query result bits from revision 130
None of the important bits or names actually changed.  It just
added/removed some no-op names.

No functional change.
2015-07-06 17:27:11 -07:00
Jason Ekstrand
d843418c2e vk/vulkan.h: One more quick enum refactor clean-up 2015-07-06 17:26:29 -07:00
Jason Ekstrand
2b37fc28d1 vk/vulkan.h: Get rid of VERTEX_INPUT_STEP_RATE_DRAW
We never supported it, so no functional change.
2015-07-06 17:24:26 -07:00
Jason Ekstrand
a75967b1bb vk/vulkan.h: Remove the CLEAR_OPTIMAL image layout 2015-07-06 17:21:19 -07:00
Jason Ekstrand
2b404e5d00 vk: Rename CPU_READ/WRITE_BIT to HOST_READ/WRITE_BIT 2015-07-06 17:18:25 -07:00
Jason Ekstrand
c57ca3f16f vk/vulkan.h: Remove VK_IMAGE_CREATE_CLONEABLE_BIT 2015-07-06 17:14:30 -07:00
Jason Ekstrand
2de388c49c vk: Remove SHAREABLE bits
They were removed from the Vulkan API and we don't really use them because
there are no multi-GPU i965 systems.
2015-07-06 17:12:51 -07:00
Jason Ekstrand
1b0c47bba6 vk/vulkan.h: Re-order the logic op enums 2015-07-06 17:08:11 -07:00
Jason Ekstrand
c7cef662d0 vk/vulkan.h: Reformat a bunch of enums to match revision 130
In theory, no functional change.
2015-07-06 17:06:02 -07:00
Jason Ekstrand
8c5e48f307 vk: Rename NUM_SHADER_STAGE to SHADER_STAGE_NUM
This is a refactor of more than just the header but it lets us finish
reformating the shader stage enum.
2015-07-06 16:43:28 -07:00
Jason Ekstrand
d9176f2ec7 vk: Reformat a bunch of enums
This accounts for a number differences between the generated headers and
the hand-written header.  Not all reformatting is done in this commit but
it does make the headers much more diffable.

In theory, no functional change.
2015-07-06 16:41:31 -07:00
Jason Ekstrand
e95bf93e5a vk: Pull the VkResult enum from revision 130 2015-07-06 16:15:12 -07:00
Jason Ekstrand
1b7b580756 vk: re-arrange enums to match the order in revision 130 2015-07-06 16:11:05 -07:00
Jason Ekstrand
2fb524b369 vk: Rename a parameter in CmdBindDynamicStateObject 2015-07-06 15:37:17 -07:00
Jason Ekstrand
c5ffcc9958 vk: Remove multi-device stuff 2015-07-06 15:34:55 -07:00
Jason Ekstrand
c5ab5925df vk: Remove ClearDescriptorSets 2015-07-06 15:32:40 -07:00
Jason Ekstrand
ea5fbe1957 vk: Remove begin/end descriptor pool update 2015-07-06 15:32:27 -07:00
Jason Ekstrand
9a798fa946 vk: Remove stub for CloneImageData 2015-07-06 15:30:05 -07:00
Jason Ekstrand
78a0d23d4e vk: Remove the stub support for memory priorities 2015-07-06 15:28:10 -07:00
Jason Ekstrand
11cf214578 vk: Remove the stub support for explicit memory references 2015-07-06 15:27:58 -07:00
Jason Ekstrand
0dc7d4ac8a vk/vulkan.h: Reformat structs to match revision 130
Structs in the old version were specified as

typedef struct VkSomeThing_
{
   type                                        field; // comment
} VkSomeThing;

However, in the generated headers, you have

typedef struct {
   type                                        field;
} VkSomeThing;

This commit also removes some unneeded whitespaces.
2015-07-06 15:19:12 -07:00
Jason Ekstrand
19aabb5730 vk/vulkah.h: Re-arrange structures to match the order in 130 2015-07-06 15:09:30 -07:00
Connor Abbott
f9dbc34a18 nir/spirv: fix some bugs 2015-07-06 15:00:37 -07:00
Connor Abbott
f3ea3b6e58 nir/spirv: add support for builtins inside structures
We may be able to revert this depending on the outcome of bug 14190, but
for now it gets vertex shaders working with SPIR-V.
2015-07-06 15:00:37 -07:00
Connor Abbott
15047514c9 nir/spirv: fix a bug with structure creation
We were creating 2 extra bogus fields.
2015-07-06 15:00:37 -07:00
Connor Abbott
73351c6a18 nir/spirv: fix a bad assertion in the decoration handling
We should be asserting that the parent decoration didn't hand us
a member if the child decoration did, but different child decorations
may obviously have different members.
2015-07-06 15:00:37 -07:00
Connor Abbott
70d2336e7e nir/spirv: pull out logic for getting builtin locations
Also add support for more builtins.
2015-07-06 15:00:37 -07:00
Connor Abbott
aca5fc6af1 nir/spirv: plumb through the type of dereferences
We need this to know if a deref is of a builtin.
2015-07-06 15:00:37 -07:00
Connor Abbott
66375e2852 nir/spirv: handle structure member builtin decorations 2015-07-06 15:00:37 -07:00
Connor Abbott
23c179be75 nir/spirv: add a vtn_type struct
This will handle decorations that aren't in the glsl_type.
2015-07-06 15:00:37 -07:00
Connor Abbott
f9bb95ad4a nir/spirv: move 'type' into the union
Since SSA values now have their own types, it's more convenient to make
'type' only used when we want to look up an actual SPIR-V type, since
we're going to change its type soon to support various decorations that
are handled at the SPIR-V -> NIR level.
2015-07-06 15:00:37 -07:00
Jason Ekstrand
d5dccc1e7a vk: Move CreateFramebuffer and CreateRenderPass higher in the header
This matches where they are in the 130 header.
2015-07-06 14:41:43 -07:00
Jason Ekstrand
4a42f45514 vk: Remove atomic counters stubs 2015-07-06 14:38:45 -07:00
Jason Ekstrand
630b19a1c8 vk: Make vulkan.h look more like vulkan-130.h
Most of these changes are insubstantial.  The only potentially substantial
cyhange is that we added a few new #defines for API maximums.
2015-07-06 14:32:52 -07:00
Jason Ekstrand
2f9180b1b2 vk: Add a revision 130 header along-side the current header 2015-07-06 14:16:51 -07:00
Jason Ekstrand
1f1465f077 vk/meta: Add an initial implementation of ClearColorImage 2015-07-02 18:15:06 -07:00
Jason Ekstrand
8a6c8177e0 vk/meta: Factor the guts out of cmd_buffer_clear 2015-07-02 18:13:59 -07:00
Jason Ekstrand
beb0e25327 vk: Roll back to API v90
This is what version 0.1 of the Vulkan SDK is built against.
2015-07-01 16:44:12 -07:00
Jason Ekstrand
fa663c27f5 nir/spirv: Add initial structure member decoration support 2015-07-01 15:38:26 -07:00
Jason Ekstrand
e3d60d479b nir/spirv: Make vtn_handle_type match the other handler functions
Previously, the caller of vtn_handle_type had to handle actually inserting
the type.  However, this didn't really work if the type was decorated in
any way.
2015-07-01 15:34:10 -07:00
Jason Ekstrand
7a749aa4ba nir/spirv: Add basic support for Op[Group]MemberDecorate 2015-07-01 14:18:07 -07:00
Jason Ekstrand
682eb9489d vk/x11: Allow for the client querying the size of the format properties 2015-07-01 14:18:07 -07:00
Chad Versace
bba767a9af vk/formats: Fix entry for S8_UINT
I forgot to update this when fixing the depth formats.
2015-06-30 09:41:44 -07:00
Chad Versace
6720b47717 vk/formats: Document new meaning of anv_format::cpp
The way the code currently works is that anv_format::cpp is the cpp of
anv_format::surface_format.

Me and Kristian disagree about how the code *should* work. Despite that,
I think it's in our discussion's best interest to document how the code
*currently* works. That should eliminate confusion.

If and when the code begins to work differently, then we'll update the
anv_format comments.
2015-06-30 09:41:41 -07:00
Chad Versace
709fa463ec vk/depth: Add a FIXME
3DSTATE_DEPTH_BUFFER.Width,Height are wrong.
2015-06-26 22:15:03 -07:00
Chad Versace
5b3a1ceb83 vk/image: Enable 2d single-sample color miptrees
What's been tested, for both image views and color attachment views:

    - VK_FORMAT_R8G8B8A8_UNORM
    - VK_IMAGE_VIEW_TYPE_2D
    - mipLevels: 1, 2
    - baseMipLevel: 0, 1
    - arraySize: 1, 2
    - baseArraySlice: 0, 1

What's known to be broken:

    - Depth and stencil miptrees. To fix this, anv_depth_stencil_view
      needs major rework.
    - VkImageViewType != 2D
    - MSAA

Fixes Crucible tests:

  func.miptree.view-2d.levels02.array01.*
  func.miptree.view-2d.levels01.array02.*
  func.miptree.view-2d.levels02.array02.*
2015-06-26 22:11:15 -07:00
Chad Versace
c6e76aed9d vk/image: Define anv_surface, refactor anv_image
This prepares for upcoming miptree support.

anv_surface is a proxy for color surfaces, depth surfaces, and stencil
surfaces.  Embed two instances of anv_surface into anv_image: the
primary surface (color or depth), and an optional stencil surface.
2015-06-26 21:45:53 -07:00
Chad Versace
127cb3f6c5 vk/image: Reformat function signatures
Reformat them to match Mesa code-style.
2015-06-26 20:12:42 -07:00
Chad Versace
fdcd71f71d vk/image: Embed VkImageCreateInfo* into anv_image_create_info
All function signatures that matched this pattern,
  old: f(const VkImageCreateInfo *, const struct anv_image_create_info *)

were rewritten as
  new: f(const struct anv_image_create_info *)
2015-06-26 20:06:08 -07:00
Chad Versace
ca6cef3302 vk/image: Drop some tmp vars in anv_image_view_init()
Variables 'tile_mode' and 'format' are unneeded.
2015-06-26 19:50:04 -07:00
Chad Versace
9c46ba9ca2 vk/image: Abort on stencil image views
The code doesn't work. Not even close.

Replace the broken code with a FINISHME and abort.
2015-06-26 19:23:21 -07:00
Chad Versace
667529fbaa vk: Reindent struct anv_image 2015-06-26 15:27:20 -07:00
Chad Versace
74e3eb304f vk: Define MIN(a, b) macro 2015-06-26 15:09:07 -07:00
Chad Versace
55752fe94a vk: Rename functions ALIGN_*32 -> align_*32
ALIGN_U32 and ALIGN_I32 are functions, not macros. So stop using
allcaps.
2015-06-26 15:07:59 -07:00
Connor Abbott
6ee082718f Merge branch 'wip/nir-vtn' into vulkan
Adds composites and matrix multiplication, plus some control flow fixes.
2015-06-26 12:14:05 -07:00
Chad Versace
37d6e04ba1 vk/formats: Remove the cpp=0 stencil hack
The format table defined cpp = 0 for stencil-only formats. The real cpp
is 1.

When code begins to lie, especially about stencil buffers, code becomes
increasingly fragile as time progresses, and the damage becomes
increasingly hard to undo. (For precedent, see the painful history of
stencil buffer cpp in the git log for gen6 and gen7 in the i965 driver).
Let's undo the stencil buffer cpp lie now to avoid future pain.

In the format table, set cpp = 1 for VK_FORMAT_S8; replace checks for
cpp == 0; and delete all comments about the hack.
2015-06-26 09:58:22 -07:00
Chad Versace
67a7659d69 vk/image: Refactor anv_image_create()
From my experience with intel_mipmap_tree.c, I learned that for struct's
like anv_image and intel_mipmap_tree, which have sprawling
multi-function construction codepaths, it's easy to mistakenly use
unitialized struct members during construction.

Let's eliminate the risk of using unitialized anv_image members during
construction.  Fill the struct at the function bottom instead of
piecemeal throughout the constructor.
2015-06-26 09:32:59 -07:00
Chad Versace
5d7103ee15 vk/image: Group some assertions closer together
In anv_image_create(), group together the assertions on
VkImageCreateInfo.
2015-06-26 09:05:46 -07:00
Chad Versace
0349e8d607 vk/formats: #undef fmt at end of format table 2015-06-26 07:38:02 -07:00
Chad Versace
068b8a41e2 vk: Fix comment for anv_depth_stencil_view::stencil_qpitch
s/DEPTH/STENCIL/
2015-06-26 07:31:57 -07:00
Chad Versace
7ea707a42a vk/image: Add qpitch fields to anv_depth_stencil_view
For now, hard-code them to 0.
2015-06-25 20:10:16 -07:00
Chad Versace
b91a76de98 vk: Reindent and document struct anv_depth_stencil_view 2015-06-25 20:10:16 -07:00
Chad Versace
ebe1e768b8 vk/formats: Fix incorrect depth formats
anv_format::surface_format was incorrect for Vulkan depth formats.
For example, the format table mapped

    VK_FORMAT_D24_UNORM -> .surface_format = D24_UNORM_X8_UINT
    VK_FORMAT_D32_FLOAT -> .surface_format = D32_FLOAT

but should have mapped

    VK_FORMAT_D24_UNORM -> .surface_format = R24_UNORM_X8_TYPELESS
    VK_FORMAT_D32_FLOAT -> .surface_format = R32_FLOAT

The Crucible test func.depthstencil.basic passed despite the bug, but
only because it did not attempt to texture from the depth surface.

The core problem is that RENDER_SURFACE_STATE.SurfaceFormat and
3DSTATE_DEPTH_BUFFER.SurfaceFormat are distinct types. Considering them
as enum spaces, the two enum spaces have incompatible collisions.

Fix this by adding a new field 'depth_format' to struct anv_format.

Refer to brw_surface_formats.c:translate_tex_format() for precedent.
2015-06-25 20:10:16 -07:00
Chad Versace
45b804a049 vk/image: Rename local variable in anv_image_create()
This function has many local variables for info structs. Having one
named simply 'info' is confusing.  Rename it to 'format_info'.
2015-06-25 20:10:16 -07:00
Chad Versace
528071f004 vk/formats: Fix table entry for R8G8B8_SNORM
Now that anv_formats[] is formatted like a table, buggy entries are
easier to see.
2015-06-25 20:10:16 -07:00
Chad Versace
4c8146313f vk/formats: Rename anv_format::format -> surface_format
I misinterpreted anv_format::format as a VkFormat. Instead, it is
a hardware surface format (RENDER_SURFACE_STATE.SurfaceFormat). Rename
the field to 'surface_format' to make it unambiguous.
2015-06-25 20:10:16 -07:00
Chad Versace
4b8b451a1d vk/formats: Rename anv_format::channels -> num_channels
I misinterpreted anv_format::channels as a bitmask of channels.
Renaming it to 'num_channels' makes it unambiguous.
2015-06-25 20:10:16 -07:00
Chad Versace
af0ade0d6c vk: Reindent struct anv_format 2015-06-25 20:10:16 -07:00
Chad Versace
ae29fd1b55 vk/formats: Don't abbreviate tokens in the format table
Abbreviating the VK_FORMAT_* tokens doesn't help much. To the contrary,
it means grep and ctags can't find them.
2015-06-25 20:10:16 -07:00
Jason Ekstrand
d5e41a3a99 vk/compiler: Add the initial hacks to get SPIR-V up and going 2015-06-25 17:36:35 -07:00
Jason Ekstrand
c4c1d96a01 HACK: Get rid of sanity_param_count for FS 2015-06-25 17:36:34 -07:00
Jason Ekstrand
4f5ef945e0 i965: Don't print the GLSL IR if it doesn't exist 2015-06-25 17:36:34 -07:00
Jason Ekstrand
588acdb431 nir/spirv: Set the right location for shader input/outputs
We need to add FRAG_RESULT_DATA0 etc. to the input/output location.
2015-06-25 17:36:34 -07:00
Jason Ekstrand
333b8ddd6b nir/spirv: Set the interface type on uniform blocks 2015-06-25 17:36:34 -07:00
Jason Ekstrand
7e1792b1b7 nir/spirv: Set the system value mode on builtins 2015-06-25 17:36:34 -07:00
Jason Ekstrand
b72936fdad nir/spirv: Actually put variables on the right linked list 2015-06-25 17:36:34 -07:00
Jason Ekstrand
ee0a8f23e4 glsl: Move vert_attrib varying_slot and frag_result enums to shader_enums.h 2015-06-25 17:36:34 -07:00
Chad Versace
fa352969a2 vk/image: Check extent does not exceed surface type limits 2015-06-25 16:53:24 -07:00
Chad Versace
99031aa0f3 vk/image: Stop hardcoding SurfaceType of VkImageView
Instead, translate VkImageViewType to a gen SurfaceType.
2015-06-25 16:53:22 -07:00
Chad Versace
7ea121687c vk/image: Add anv_image::surf_type
This the gen SurfaceType, such as SURFTYPE_2D.
2015-06-25 16:52:16 -07:00
Chad Versace
cb30acaced vk/image: Add tables for gen SurfaceType
Tables for mapping VkImageType and VkImageViewType to gen SurfaceType.
Tables are unused.
2015-06-25 16:52:16 -07:00
Chad Versace
1132080d5d vk/util: Add anv_loge() for logging error messages 2015-06-25 16:52:16 -07:00
Chad Versace
5f2d469e37 vk: Add func anv_is_aligned() 2015-06-25 16:52:16 -07:00
Chad Versace
f7fb7575ef vk: Add anv_minify() 2015-06-25 16:52:05 -07:00
Chad Versace
7cec6c5dfd vk: Define MAX(a, b) macro 2015-06-25 16:29:42 -07:00
Jason Ekstrand
d178e15567 nir/spirv: Fix up some dererf ralloc parenting 2015-06-24 21:39:07 -07:00
Jason Ekstrand
845002e163 i965/nir: Handle returns as long as they're at the end of a function 2015-06-24 21:38:49 -07:00
Jason Ekstrand
2ecac045a4 i965/nir: Split NIR shader handling into two functions
The brw_create_nir function takes a GLSL or ARB shader and turns it into a
NIR shader.  The guts of the optimization and lowering code is now split
into a new brw_process_shader function.
2015-06-24 21:22:07 -07:00
Jason Ekstrand
e369a0eb41 nir/spirv: Use vtn_ssa_value for texture coordinates 2015-06-24 20:39:37 -07:00
Jason Ekstrand
d0bd2bc604 nir/spirv: Add support for the Uniform storage class
This is kida sketchy.  I'm not really sure this is the way it's supposed to
be used.
2015-06-24 20:32:05 -07:00
Jason Ekstrand
ba0d9d33d4 nir/spirv: Add support for some more decorations including built-in 2015-06-24 20:30:32 -07:00
Jason Ekstrand
1bc0a1ad98 nir/spirv: Make the header file C++ safe 2015-06-24 19:01:10 -07:00
Jason Ekstrand
88d02a1b27 vk: Build xmlconfig stuff into libi965_compiler 2015-06-24 15:59:09 -07:00
Kristian Høgsberg Kristensen
24dff4f8fa vk/headers: Handle MBO fields
These must be set to one.
2015-06-24 09:37:50 -07:00
Jason Ekstrand
a62edcce4e Merge remote-tracking branch 'mesa-public/master' into vulkan 2015-06-23 18:05:25 -07:00
Connor Abbott
dee4a94e69 nir/vtn: add support for phi nodes 2015-06-23 10:34:55 -07:00
Connor Abbott
fe1269cf28 nir/builder: add support for inserting before/after blocks 2015-06-23 10:34:22 -07:00
Connor Abbott
9a3dda101e nir/vtn: fix emitting code after loops
When we're done emitting the code for a loop, we need to visit the new
break block, which is the merge block of the current loop, rather than
the old merge block, which is the merge block of the loop containing the
one we just emitted code for.
2015-06-22 13:53:08 -07:00
Connor Abbott
e9c21d0ca0 unbreak things 2015-06-22 11:59:55 -07:00
Kristian Høgsberg Kristensen
9b9f973ca6 vk: Implement scratch buffers to make spilling work 2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
9e59003fb1 vk: Undo relocs for scratch bos 2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
b20794cfa8 vk/allocator: Get rid of non-memfd path
We can just use modern valgrind now.
2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen
aba75d0546 vk/headers: Make General State offsets relocations 2015-06-19 15:42:15 -07:00
Connor Abbott
841aab6f50 matrices matrices matrices 2015-06-18 18:52:44 -07:00
Connor Abbott
d0fc04aacf nir/types: be less strict about constructing matrix types 2015-06-18 18:51:51 -07:00
Connor Abbott
22854a60ef nir/builder: add a nir_fdot() convenience function 2015-06-18 17:34:55 -07:00
Connor Abbott
0e86ab7c0a nir/types: add a helper to transpose a matrix type 2015-06-18 17:34:12 -07:00
Connor Abbott
de4c31a085 fix glsl450 for composites 2015-06-18 17:33:08 -07:00
Kristian Høgsberg Kristensen
aedd3c9579 vk: Add missing gen7 RENDER_SURFACE_STATE struct 2015-06-17 21:42:29 -07:00
Connor Abbott
bf5a615659 composites composites composites 2015-06-17 16:25:38 -07:00
Kristian Høgsberg Kristensen
fa8a07748d vk: Compute CS exec mask and thread width max in pipeline
We compute the right mask and thread width max parameters as part of
pipeline creation and set them accordingly at vkCmdDispatch() and
vkCmdDispatchIndirect() time. These parameters depend only on the local
group size and the dispatch width of the program so we can figure this
out at pipeline create time.
2015-06-12 18:21:50 -07:00
Kristian Høgsberg Kristensen
c103c4990c vk: Set binding table layout for CS
We weren't setting the binding table layout for the backend compiler.
2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen
2fdd17d259 vk: Generate CS prog_data into the pipeline instance
We were generating the prog_data into a local variable and never
initializing the pipeline->cs_prog_data one.
2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen
00494c6cb7 vk: Document how depth/stencil formats work in anv_image_create()
This reverts commits

  e17ed04 * vk/image: Don't double-allocate stencil buffers
  1ee2d1c * vk/image: Teach anv_image_choose_tile_mode about WMAJOR

and instead adds a comment to describe the subtlety of how we create
images for stencil only formats.
2015-06-11 22:07:16 -07:00
Kristian Høgsberg Kristensen
fbc9fe3c92 vk: Use compute pipeline layout when binding compute sets 2015-06-11 21:57:43 -07:00
Kristian Høgsberg Kristensen
765175f5d1 vk: Implement basic compute shader support 2015-06-11 15:31:42 -07:00
Kristian Høgsberg Kristensen
7637b02aaa vk: Emit PIPELINE_SELECT on demand 2015-06-11 15:21:49 -07:00
Kristian Høgsberg Kristensen
405697eb3d vk: Stop asserting we have a fragment shader
Even for graphics, this is not a requirement, we can have a depth-only output pipeline.
2015-06-11 15:07:38 -07:00
Kristian Høgsberg Kristensen
e7edde60ba vk: Defer setting viewport dynamic state
We can't emit this until we've done a 3D pipeline select.
2015-06-11 15:04:09 -07:00
Kristian Høgsberg Kristensen
f7fe06cf0a vk: Disable shader stages in the graphics pipeline batch
We need to move this into the graphics pipeline batch so we don't  emit it
for compute pipelines.
2015-06-11 14:58:31 -07:00
Kristian Høgsberg Kristensen
9aae480cc4 vk: Don't emit STATE_SIP
We don't have a SIP kernel and don't enable exceptions.
2015-06-11 14:56:29 -07:00
Kristian Høgsberg Kristensen
923e923bbc vk: Compile fragment shader after VS and GS
Just moving code around to do shader stages in the natual order.
2015-06-11 14:55:50 -07:00
Jason Ekstrand
1dd63fcbed vk/entrypoints: Don't print every single function call 2015-06-11 10:10:13 -07:00
Kristian Høgsberg Kristensen
b581e924b6 vk: Remove left-over trp call 2015-06-11 09:26:49 -07:00
Kristian Høgsberg Kristensen
d76ea7644a vk: Set maximum point size range
We set both minimum and maximum point size to 0 in 3DSTATE_CLIP, which
will clip away all points.
2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen
a5b49d2799 vk: Use generated headers with fixed point support
The generated headers now convert float in the template struct to the
correct fixed point format.
2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen
ea7ef46cf9 vk: Regenerate headers with __gen_validate_value() 2015-06-11 09:25:03 -07:00
Jason Ekstrand
a566b1e08a vk/formats: Refactor format properties code
Along with the refactor, we now do the right thing when we hit an
unsupported format: Set the flags to 0 and return VK_SUCCESS.
2015-06-11 09:11:16 -07:00
Jason Ekstrand
2a3c29698c vk/image: Add a bunch of asserts 2015-06-10 21:04:51 -07:00
Jason Ekstrand
c8b62d109b vk: Add a couple vk_error calls 2015-06-10 21:04:13 -07:00
Jason Ekstrand
7153b56abc vk/private: Add a non-fatal assert 2015-06-10 21:03:50 -07:00
Jason Ekstrand
29d2bbb2b5 vk/cmd: Add an initial implementation of PipelineBarrier
We may want to do something more inteligent here later such as actually
handling image layout transitions.  However, this should do for now.
2015-06-10 16:37:33 -07:00
Jason Ekstrand
047ed02723 vk/emit: Use valgrind to validate every packed field 2015-06-10 12:43:02 -07:00
Jason Ekstrand
9cae3d18ac vk: Add valgrind checks in various emit functions
The check in batch_bo_finish should catch any undefined values in the batch
but isn't that great for debugging.  The checks in the various emit
functions will help get better granularity.
2015-06-09 21:51:37 -07:00
Jason Ekstrand
d5ad24e39b vk: Move the valgrind include and VG() macro to private.h 2015-06-09 21:51:37 -07:00
Chad Versace
e17ed04b03 vk/image: Don't double-allocate stencil buffers
If the main surface has format S8_UINT, then don't allocate the
auxiliary stencil surface.
2015-06-09 16:39:28 -07:00
Chad Versace
1ee2d1c3fc vk/image: Teach anv_image_choose_tile_mode about WMAJOR 2015-06-09 16:38:55 -07:00
Chad Versace
2d2e148952 vk/util: Add anv_abortf(), anv_abortfv()
Convenience functions to print an error message then abort.
2015-06-09 16:38:50 -07:00
Chad Versace
ffb1ee5d20 vk: Define anv_noreturn macro 2015-06-09 16:38:46 -07:00
Chad Versace
f1db3b3869 vk/image: Factor tile mode selection into separate function
Because it will eventually need to get smarter.
2015-06-09 16:38:42 -07:00
Jason Ekstrand
11e941900a vk/device: Actually allow destruction 2015-06-09 16:28:46 -07:00
Jason Ekstrand
5d4b6a01af vk/cmd_buffer: Properly initialize/reset dynamic states 2015-06-09 16:27:55 -07:00
Jason Ekstrand
634a6150b9 vk/pipeline: Zero out the depth-stencil state when not in use 2015-06-09 16:26:55 -07:00
Jason Ekstrand
919e7b7551 vk/device: Use anv_CreateDynamicViewportState instead of the vk one 2015-06-09 16:01:56 -07:00
Jason Ekstrand
0599d39dd9 vk/device: Dedent the vkCreateDynamicViewportState call 2015-06-09 15:53:26 -07:00
Chad Versace
d57c4cf999 vk/util: Annotate anv_finishme() as printflike 2015-06-09 14:46:49 -07:00
Chad Versace
822cb16abe vk: Define anv_printflike() macro 2015-06-09 14:46:45 -07:00
Chad Versace
081f617b5a vk/image: Stop hardcoding alignment of stencil surfaces
Look up the alignment from anv_tile_info_table.
2015-06-09 14:16:56 -07:00
Chad Versace
e6bd568f36 vk/image: Rewrite tile info table
- Reduce the number of table lookups in anv_image_create from 4 to 1.
- Add field for surface alignment.
- Shorten field names tile_width, tile_height -> width, height.
2015-06-09 14:16:45 -07:00
Chad Versace
5b777e2bcf vk/image: Delete an old comment 2015-06-09 14:14:29 -07:00
Jason Ekstrand
d842a6965f vk/compiler: Free the GL errors data 2015-06-09 12:36:23 -07:00
Jason Ekstrand
9f292219bf vk/compiler: Free more of prog_data when tearing down a pipeline 2015-06-09 12:36:23 -07:00
Jason Ekstrand
66b00d5e5a vk/queue: Embed the queue in and allocate it with the device 2015-06-09 12:36:23 -07:00
Jason Ekstrand
38f5eef59d vk/device: Free border color states when we have valgrind 2015-06-09 12:36:23 -07:00
Jason Ekstrand
999b56c507 vk/device: Destroy all batch buffers
Due to a copy+paste error, we were destroying all but the first batch or
surface state buffer.  Now we destroy them all.
2015-06-09 12:36:23 -07:00
Jason Ekstrand
3a38b0db5f vk/meta: Clean up temporary objects 2015-06-09 12:36:23 -07:00
Jason Ekstrand
9d6f55dedf vk/surface_view: Add a destructor 2015-06-09 12:36:23 -07:00
Chad Versace
e6162c2fef vk/image: Add anv_image::h_align,v_align
Use the new fields to compute RENDER_SURFACE_STATE.Surface*Alignment.
We still hardcode them to 4, though.
2015-06-09 12:19:24 -07:00
Jason Ekstrand
58afc24e57 vk/allocator: Remove the concept of a slave block pool
This reverts commit d24f8245db.
2015-06-08 17:46:32 -07:00
Jason Ekstrand
b6363c3f12 vk/device: Remove the binding table pools/streams 2015-06-08 17:45:57 -07:00
Jason Ekstrand
531549d9fc vk/pipeline: Move freeing the program stream to pipeline.c
It's created in pipeline.c so we should free it there.
2015-06-08 14:27:04 -07:00
Jason Ekstrand
66a4dab89a vk/pipeline: Don't destroy the program stream
It's freed in compiler.cpp and we don't want to free it twice.
2015-06-08 13:53:19 -07:00
Jason Ekstrand
920fb771d4 vk/allocator: Make the use of NULL_BLOCK in state_stream_finish explicit 2015-06-08 13:53:19 -07:00
Kristian Høgsberg Kristensen
52637c0996 vk: Quiet a few warnings 2015-06-08 08:51:40 -07:00
Kristian Høgsberg Kristensen
9eab70e54f vk: Create a minimal context for the compiler
This avoids the full brw context initialization and just sets up context
constants, initializes extensions and sets a few driver vfuncs for the
front-end GLSL compiler.
2015-06-08 08:51:40 -07:00
Jason Ekstrand
ce00233c13 vk/cmd_buffer: Use the dynamic state stream in emit_dynamic and merge_dynamic 2015-06-05 17:26:41 -07:00
Jason Ekstrand
e69588b764 vk/device: Use a 64-byte alignment for CC state 2015-06-05 17:26:26 -07:00
Jason Ekstrand
c2eeab305b vk/pipeline: Actually free the program stream and dynamic pool 2015-06-05 17:26:26 -07:00
Jason Ekstrand
ed2ca020f8 vk/allocator: Avoid double-free in the bo pool 2015-06-05 17:12:28 -07:00
Jason Ekstrand
aa523d3c62 vk/gem: Call VALGRIND_FREELIKE_BLOCK before unmapping 2015-06-05 16:41:49 -07:00
Chad Versace
87d98e1935 vk: Fix 2 incorrect typecasts
The compiler didn't find the cast errors because all Vulkan types are
just integers.
2015-06-04 14:32:22 -07:00
Chad Versace
b981379bcf vk: Make make clean remove generated spirv headers 2015-06-04 14:26:46 -07:00
Jason Ekstrand
8d930da35d vk/allocator: Remove an unneeded VG() wrapper 2015-06-04 09:14:33 -07:00
Jason Ekstrand
7f90e56e42 vk/device: Dissalow device destruction 2015-06-04 09:14:33 -07:00
Chad Versace
9cd42b3dea vk: Fix build
Commit 1286bd, which deleted vk.c, broke the build. Update the Makefile
to fix it.
2015-06-04 09:01:30 -07:00
Jason Ekstrand
251aea80b0 vk/DS: Mask stencil masks to 8 bits 2015-06-03 16:59:13 -07:00
Connor Abbott
47bd462b0c awesome control flow bugfixes/clarifications 2015-06-03 14:10:28 -04:00
Kristian Høgsberg Kristensen
a37d122e88 vk: Set color/blend state in meta clear if not set yet 2015-06-02 23:08:05 -07:00
Kristian Høgsberg Kristensen
1286bd3160 vk: Delete vk.c test case
We now have crucible up and running and all vk sub-cases have been moved
over. Delete this crufty old hack of a test case.
2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
2f6aa424e9 vk: Update generated headers with support for 64 bit fields 2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
5744d1763c vk: Set cb_state to NULL at cmd buffer create time
Dynamic color/blend state can be NULL in case we're not rendering to
color targets (only output to depth and/or stencil). Initialize
cmd_buffer->cb_state to NULL so we can reliably detect whether it's been
set or not.
2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen
c8f078537e vk: Implement vertexOffset parameter of vkCmdDrawIndexed()
As exposed by the func.draw_indexed test, we were ignoring the argument
and hardcoding 0.
2015-06-02 22:57:42 -07:00
Jason Ekstrand
e702197e3f vk/formats: Add a name to the metadata and better logging 2015-06-02 11:30:39 -07:00
Jason Ekstrand
fbafc946c6 vk/formats: Rework the formats table 2015-06-02 11:30:39 -07:00
Kristian Høgsberg Kristensen
f98c89ef31 vk: Move query related functionality to new file query.c 2015-06-01 21:52:45 -07:00
Jason Ekstrand
08748e3a0c i965: Use NIR by default for vertex shaders on GEN8+
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

   total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
   instructions in affected programs:     1514770 -> 1454047 (-4.01%)
   helped:                                5813
   HURT:                                  1120

The gained programs are ARB vertext programs that were previously going
through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2015-06-01 12:25:58 -07:00
Jason Ekstrand
d4cbf6a728 vk/compiler: Add an index_count to the bind map and check for OOB 2015-06-01 12:25:58 -07:00
Jason Ekstrand
510b5c3bed vk/HACK: Plumb real descriptor set/index into textures 2015-06-01 12:25:58 -07:00
Jason Ekstrand
aded32bf04 NIR: Add a helper for doing sampler lowering for vulkan 2015-06-01 12:25:58 -07:00
Kristian Høgsberg Kristensen
5caa408579 vk: Indent tables to align '=' at column 48 2015-05-31 22:36:26 -07:00
Kristian Høgsberg Kristensen
76bb658518 vk: Add support for anisotropic bits 2015-05-31 22:15:34 -07:00
Kristian Høgsberg Kristensen
dc56e4f7b8 vk: Implement support for sampler border colors
This supports the three Vulkan border color types for float color
formats. The support for integer formats is a little trickier, as we
don't know the format of the texture at this time.
2015-05-31 17:20:48 -07:00
Jason Ekstrand
e497ac2c62 vk/device: Only flush the texture cache when setting state base address
After further examination, it appears that the other flushes and stalls
weren't actually needed.
2015-05-30 18:04:50 -07:00
Jason Ekstrand
2251305e1a vk/cmd_buffer: Track descriptor set dirtying per-stage 2015-05-30 10:07:29 -07:00
Jason Ekstrand
33cccbbb73 vk/device: Emit PIPE_CONTROL flushes surrounding new STATE_BASE_ADDRESS
According to the bspec, you're supposed to emit a PIPE_CONTROL with a CS
stall and a render target flush prior to chainging STATE_BASE_ADDRESS.  A
little experimentation, however, shows that this is not enough.  It also
appears as if you have to flush the texture cache after chainging base
address or things won't propagate properly.
2015-05-30 08:08:07 -07:00
Jason Ekstrand
b2b9fc9fad vk/allocator: Don't call VALGRIND_MALLOCLIKE_BLOCK on fresh gem_mmap's 2015-05-29 21:15:47 -07:00
Jason Ekstrand
03ffa9ca31 vk: Don't crash on partial descriptor sets 2015-05-29 20:43:10 -07:00
Jason Ekstrand
4ffbab5ae0 vk/device: Allow for starting a new surface state buffer
This commit allows for us to create a whole new surface state buffer when
the old one runs out of room.  We simply re-emit the state base address for
the new state, re-emit binding tables, and keep going.
2015-05-29 17:49:41 -07:00
Jason Ekstrand
c4bd5f87a0 vk/device: Do lazy surface state emission for binding tables
Before, we were emitting surface states up-front when binding tables were
updated.  Now, we wait to emit the surface states until we emit the binding
table.  This makes meta simpler and should make it easier to deal with
swapping out the surface state buffer.
2015-05-29 16:51:11 -07:00
Kristian Høgsberg Kristensen
4aecec0bd6 vk: Store dynamic slot index with struct anv_descriptor_slot
We need to make sure we use the right index into dynamic offset
array. Dynamic descriptors can be present or not in different stages and
to get the right offset, we need to compute the index at
vkCreateDescriptorSetLayout time.
2015-05-29 11:32:53 -07:00
Kristian Høgsberg Kristensen
fad418ff47 vk: Implement dynamic buffer offsets
We do this by creating a surface state on the fly that incorporates the
dynamic offset. This patch also refactor the descriptor set layout
constructor a bit to be less clever with switch statement fall
through. Instead of duplicating the subtle code to update the sampler
and surface slot map, we just use two switch statements.
2015-05-28 22:41:20 -07:00
Jason Ekstrand
9ffc1bed15 vk/device: Split state base address emit into its own function 2015-05-28 15:34:08 -07:00
Jason Ekstrand
468c89a351 vk/device: Use anv_batch_emit for MI_BATCH_BUFFER_START 2015-05-28 15:25:02 -07:00
Jason Ekstrand
2dc0f7fe5b vk/device: Actually destroy batch buffers 2015-05-28 13:08:21 -07:00
Jason Ekstrand
8cf932fd25 vk/query: Don't emit a CS stall by itself
Both the bspec and the simulator don't like this.  I'm not sure if stalling
at the scoreboard is right but it at least shuts up the simulator.
2015-05-28 10:27:53 -07:00
Jason Ekstrand
730ca0efb1 vk/device: Fixups for batch buffer chaining
Some how these didn't get merged with the other batch buffer chaining
stuff.  Oh well, it's here now.
2015-05-28 10:26:11 -07:00
Jason Ekstrand
de221a672d meta: Add a default ds_state and use it when no ds state is set 2015-05-28 10:06:45 -07:00
Jason Ekstrand
6eefeb1f84 vk/meta: Share the dummy RS and CB state between clear and blit 2015-05-28 10:00:38 -07:00
Kristian Høgsberg Kristensen
5a317ef4cb vk: Initialize dynamic state binding points to NULL
We rely on these being initialized to NULL so meta can reliably detect
whether or not they've been set. ds_state is also allowed to not be
present so we need a well-defined value for that.
2015-05-27 22:13:48 -07:00
Chad Versace
1435bf4bc4 .gitignore: Ignore spirv2nir binary 2015-05-27 17:01:09 -07:00
Chad Versace
f559fe9134 .gitignore: Scope Vulkan's generated source files
Don't ignore any file named entrypoints.{c,h}. Ignore it only if it's in
src/vulkan.
2015-05-27 16:59:53 -07:00
Chad Versace
ca385dcf2a vk: gitignore generated source files 2015-05-27 16:57:31 -07:00
Chad Versace
466f61e9f6 vk/glsl_scraper: Replace adhoc arg parsing with argparse 2015-05-27 16:56:02 -07:00
Chad Versace
fab9011c44 vk/image: Assert that VkImageTiling is valid 2015-05-27 16:21:04 -07:00
Chad Versace
c0739043b3 vk/image: Remove trailing whitespace 2015-05-27 16:15:47 -07:00
Chad Versace
4514e63893 vk/glsl: Reject invalid options
The script incorrectly interpreted --blah as the input filename.
2015-05-27 16:14:26 -07:00
Chad Versace
fd8b5e0df2 vk/glsl_scraper: Indent large text blocks
Indent them to the same level as if the text was code.

No changes in entrypoints.{c,h} after a clean build.
2015-05-27 16:09:31 -07:00
Chad Versace
df4b02f4ed vk/glsl_scraper: Fix code style for imports
Python style is one module imported per line, and imports are at the top
of the file.
2015-05-27 16:04:12 -07:00
Jason Ekstrand
b23885857f vk/meta: Actually create the CB state for blits 2015-05-27 12:06:30 -07:00
Jason Ekstrand
da8f148203 vk: Rework anv_batch and use chaining batch buffers
This mega-commit primarily does two things.  First, is to turn anv_batch
into a better abstraction of a batch.  Instead of actually having a BO, it
now has a few pointers to some piece of memory that are used to add data to
the "batch".  If it gets to the end, there is a function pointer that it
can call to attempt to grow the batch.

The second change is to start using chained batch buffers.  When the end of
the current batch BO is reached, it automatically creates a new one and
ineserts an MI_BATCH_BUFFER_START command to chain to it.  In this way, our
batch buffers are effectively infinite in length.
2015-05-27 11:48:28 -07:00
Jason Ekstrand
59def43fc8 Fixup for growable reloc lists 2015-05-27 11:48:28 -07:00
Jason Ekstrand
1c63575de8 vk/cmd_buffer: Allocate the surface_bo from device->batch_bo_pool 2015-05-27 11:48:28 -07:00
Jason Ekstrand
403266be05 vk/device: Make reloc lists growable 2015-05-27 11:48:28 -07:00
Jason Ekstrand
5ef81f0a05 vk/device: Use a bo pool for batch buffers 2015-05-27 11:48:28 -07:00
Jason Ekstrand
6f3e3c715a vk/allocator: Add a BO pool 2015-05-27 11:48:28 -07:00
Jason Ekstrand
59328bac10 vk/allocator: Add a free list that acts on pointers instead of offsets 2015-05-27 11:48:28 -07:00
Kristian Høgsberg
a1d30f867d vk: Add support for dynamic and pipeline color blend state 2015-05-26 17:12:37 -07:00
Kristian Høgsberg
2514ac5547 vk/test: Create and use color/blend dynamic and pipeline state 2015-05-26 17:12:37 -07:00
Kristian Høgsberg
1cd8437b9d vk/meta: Allocate and set color/blend state
For color blend, we have to set our own state to avoid inheriting bogus
blend state.
2015-05-26 17:12:37 -07:00
Kristian Høgsberg
610e6291da vk: Allocate samplers from dynamic stream 2015-05-26 11:50:34 -07:00
Kristian Høgsberg
b29f44218d vk: Emit color calc state
This involves pulling stencil ref values out of DS dynamic state and the
blend constant out of CB dynamic state.
2015-05-26 11:27:31 -07:00
Kristian Høgsberg
5e637c5d5a vk/pack: Generate length macros for structs 2015-05-26 11:27:31 -07:00
Kristian Høgsberg
998837764f vk: Program depth bias
This makes 3DSTATE_RASTER a split state command.
2015-05-26 11:27:31 -07:00
Kristian Høgsberg
0dbed616af vk: Add support for texture component swizzle
This also drops the share create_surface_state helper and moves filling
out SURFACE_STATE directly into anv_image_view_init() and
anv_color_attachment_view_init().
2015-05-26 11:27:29 -07:00
Kristian Høgsberg
cbe7ed416e vk: Implement dynamic and pipeline ds state 2015-05-25 20:20:31 -07:00
Kristian Høgsberg
37743f90bc vk: Set up depth and stencil buffers 2015-05-25 20:20:31 -07:00
Kristian Høgsberg
7c0d0021eb vk/test: Add new depth-stencil test
Not yet a depth stencil test, but will become one.
2015-05-25 20:20:31 -07:00
Kristian Høgsberg
0997a7b2e3 vk: Add basic MOCS settings
This matches what we do for GL.
2015-05-25 20:20:31 -07:00
Kristian Høgsberg
c03314bdd3 vk: Update to header files with nested struct support
This will let us do MOCS settings right.
2015-05-25 20:20:31 -07:00
Jason Ekstrand
ae8c93e023 vk/cmd_buffer: Initialize the pipeline pointer to NULL
If a meta operation is called before the pipeline is set, this can cause
uses of undefined values.  They *should* be harmless, but we might as well
shut up valgrind on this one too.
2015-05-25 17:14:49 -07:00
Jason Ekstrand
912944e59d vk/device: Use the correct number of viewports when creating default VP state
Fixes valgrind uninitialized value errors
2015-05-25 17:14:49 -07:00
Jason Ekstrand
1b211feb6c vk/compiler: Zero out the vs_prog_data struct when VS is disabled
Prevents uninitialized value errors
2015-05-25 17:14:49 -07:00
Jason Ekstrand
903bd4b056 vk/compiler: Fix up the binding hack and make it work in NIR 2015-05-25 12:57:32 -07:00
Jason Ekstrand
57153da2d5 vk: Actually implement some sort of destructor for all object types 2015-05-22 15:15:08 -07:00
Jason Ekstrand
0f0b5aecb8 vk/pipeline: Track VB's that are actually used by the pipeline
Previously, we just blasted out whatever VB's we had marked as "dirty"
regardless of which ones were used by the pipeline.  Given that the stride
of the VB is embedded in the pipeline this can cause problems.  One problem
is if the pipeline doesn't use the given VB binding we emit a bogus stride.
Another problem is that we weren't properly resetting the dirty bits when
the pipeline changed.
2015-05-21 16:58:53 -07:00
Jason Ekstrand
0a54751910 vk/device: Memset descriptor sets to 0 and handle descriptor set holes 2015-05-21 16:33:04 -07:00
Jason Ekstrand
519fe765e2 vk: Do relocations in surface states when they are created
Previously, we waited until later and did a pass through the used surfaces
and did the relocations then.  This lead to doing double-relocations which
was causing us to get bogus surface offsets.
2015-05-21 15:55:29 -07:00
Jason Ekstrand
ccf2bf9b99 vk/test: Use the glsl_scraper for building shaders 2015-05-21 12:24:02 -07:00
Jason Ekstrand
f3d70e4165 vk/glsl_scraper: Use the LunarG back-door for GLSL source 2015-05-21 12:22:44 -07:00
Jason Ekstrand
cb56372eeb vk/glsl_scraper: Use a fake GLSL version that glslang will accept 2015-05-21 12:21:02 -07:00
Jason Ekstrand
0e441cde71 vk: Bake the GLSL_VK_SHADER macro into the scraper output file 2015-05-21 12:21:00 -07:00
Jason Ekstrand
f17e835c26 vk/meta: Use glsl_scraper for our GLSL source
We are not yet using SPIR-V for meta but this is a first step.
2015-05-21 11:39:54 -07:00
Jason Ekstrand
b13c0f469b vk: More out-of-tree build fixes 2015-05-21 11:32:59 -07:00
Jason Ekstrand
f294154e42 vk: Fix for out-of-tree builds 2015-05-21 10:23:18 -07:00
Kristian Høgsberg
f9e66ea621 vk: Remove render pass stub call
This isn't really a stub.
2015-05-20 20:34:52 -07:00
Kristian Høgsberg
a29df71dd2 vk: Add WSI implementation 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
f886647b75 vk: Add debug stubs 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
63da974529 vk: Mark remaining unsupported formats as such 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
387a1bb58f vk: Mark VK_FORMAT_UNDEFINED as 1 cpp, 1 channel 2015-05-20 20:34:52 -07:00
Kristian Høgsberg
a1bd426393 vk: Stream surface state instead of using the surface pool
Since the binding table pointer is only 16 bits, we can only have 64kb
of binding table state allocated at any given time. With a block size of
1kb, that amounts to just 64 command buffers, which is not enough.
2015-05-20 20:34:52 -07:00
Kristian Høgsberg
01504057f5 vk: Use surface_format_info from dri driver for vkGetFormatInfo 2015-05-20 20:34:52 -07:00
Chad Versace
a61f307996 vk: Fix result of vkCreateInstance
When fill_physical_device() fails, don't return VK_SUCCESS.
2015-05-20 19:51:10 -07:00
Jason Ekstrand
14929046ba vk/compiler: Add shader language detection
This commit adds support for the LunarG GLSL back-door as well as detecting
regular GLSL and SPIR-V.  The SPIR-V path doesn't exist yet, so that will
cause an assert-fail.
2015-05-20 17:05:41 -07:00
Jason Ekstrand
47c1cf5ce6 vk/test: Add a test for testing buffer copies 2015-05-20 16:20:04 -07:00
Jason Ekstrand
bea66ac5ad vk/meta: Add support for copying arbitrary size buffers 2015-05-20 16:20:04 -07:00
Jason Ekstrand
9557b85e3d vk/meta: Use the biggest format possible for buffer copies
This should substantially improve throughput of buffer copies.
2015-05-20 16:20:04 -07:00
Jason Ekstrand
13719e9225 vk/meta: Fix buffer copy extents 2015-05-20 16:20:04 -07:00
Jason Ekstrand
d7044a19b1 vk/meta: Use texture() instead of texture2D() 2015-05-19 12:44:35 -07:00
Jason Ekstrand
edff076188 vk: Use binding instead of index in uniform layout qualifiers
This more closely matches what the Vulkan docs say to do.
2015-05-19 12:44:22 -07:00
Jason Ekstrand
e37a89136f vk/glsl_scraper: Add a --glsl-only option 2015-05-19 11:29:07 -07:00
Jason Ekstrand
4bcf58a192 vk/glsl_scraper: Use the line number from the end of the macro
We used to use the line number from the start of the macro but this doesn't
seem to match the c preprocessor
2015-05-19 11:29:07 -07:00
Jason Ekstrand
1573913194 vk/glsl_scraper: Don't open files until needed
This prevents us from writing an empty file when the compile failed.
2015-05-19 11:29:07 -07:00
Kristian Høgsberg
e4c11f50b5 vk: Call finish for binding table state stream 2015-05-18 21:12:13 -07:00
Jason Ekstrand
851495d344 vk/meta: Use the new *view_init functions and stack-allocated views
This should save us a good deal of the leakage that meta currently has.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
4668bbb161 vk/image: Factor view creation out into separate *_init functions
The *_init functions work basically the same as the Vulkan entrypoints
except that they act on an already-created view and take an optional
command buffer option.  If a command buffer is given, the surface state is
allocated out of the command buffer's state stream.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
7c9f209427 Revert "vk/allocator: Don't use memfd when valgrind is detected"
This reverts commit b6ab076d6b.

It turns out setting USE_MEMFD to 0 is really bad because it means we can't
resize the pool.  Besides, valgrind SVN handles memfd so we really don't
need this fallback for valgrind anymore.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
923691c70d vk: Use a separate block pool and state stream for binding tables
The binding table pointers packet only allows for a 16-bit binding table
address so all binding tables have to be in the first 64 KB of the surface
state BO.  We solve this by adding a slave block pool that pulls off the
first 64 KB worth of blocks and reserves them for binding tables.
2015-05-18 20:57:43 -07:00
Jason Ekstrand
d24f8245db vk/allocator: Add a concept of a slave block pool
We probably need a better name but this will do for now.
2015-05-18 20:57:43 -07:00
Kristian Høgsberg
997596e4c4 vk/test: Add test that prints format features 2015-05-18 20:52:44 -07:00
Kristian Høgsberg
241b59cba0 vk/test: Test timestamps and occlusion queries 2015-05-18 20:52:44 -07:00
Kristian Høgsberg
ae9ac47c74 vk: Make timestamp command work correctly
This was using the wrong timestamp register and needs to write a 64 bit
value.
2015-05-18 20:52:43 -07:00
Kristian Høgsberg
82ddab4b18 vk: Make occlusion query work, both copy and get functions 2015-05-18 20:52:43 -07:00
Kristian Høgsberg
1d40e6ade8 vk: Update generated header files
This fixes a problem where register addresses where incorrectly shifted.
2015-05-18 20:52:43 -07:00
Kristian Høgsberg
f330bad545 vk: Only fill render targets for meta clear
Clear inherits the render targets from the current render pass. This
means we need to fill out the binding table after switching to meta
bindings. However, meta copies etc happen outside a render pass and
break when we try to fill in the render targets. This change fills the
render targets only for meta clear.
2015-05-18 20:52:43 -07:00
Jason Ekstrand
b6c7d8c911 vk/pipeline: Use a state_stream for storing programs
Previously, we were effectively using a state_stream, it was just
hand-rolled based on a block pool.  Now we actually use the data structure.
2015-05-18 15:58:20 -07:00
Jason Ekstrand
4063b7deb8 vk/allocator: Add support for valgrind tracking of state pools and streams
We leave the block pool untracked so that reads/writes to freed blocks
will get caught and do the tracking at the state pool/stream level.  We
have to do a few extra gymnastics for streams because valgrind works in
terms of poitners and we work in terms of separate map and offset.
Fortunately, the users of the state pool and stream should always be using
the map pointer provided in the anv_state structure.  We just have to
track, per block, the map that was used when we initially got the block.
Then we can make sure we always use that map and valgrind should stay
happy.
2015-05-18 15:58:20 -07:00
Jason Ekstrand
b6ab076d6b vk/allocator: Don't use memfd when valgrind is detected 2015-05-18 15:58:20 -07:00
Jason Ekstrand
682d11a6e8 vk/allocator: Assert that block_pool_grow succeeds 2015-05-18 15:48:19 -07:00
Jason Ekstrand
28804fb9e4 vk/gem: VG_CLEAR the padding for the gem_mmap struct 2015-05-18 12:05:17 -07:00
Jason Ekstrand
8440b13f55 vk/meta: Rework the indentation style
No functional change.
2015-05-18 10:43:51 -07:00
Kristian Høgsberg
5286ef7849 vk: Provide more realistic values for device info 2015-05-18 10:27:08 -07:00
Kristian Høgsberg
69fd473321 vk: Use a temporary buffer for formatting in finishme
This is more likely to avoid breaking up the message when racing with
other threads.
2015-05-18 10:27:08 -07:00
Jason Ekstrand
cd7ab6ba4e vk/meta: Add an initial implementation of vkCmdCopyBuffer
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
c25ce55fd3 vk/meta: Add an initial implementation of vkCmdCopyBufferToImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
08bd554cda vk/meta: Add an initial implementation of vkCmdBlitImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
fb27d80781 vk/meta: Add an initial implementation of vkCmdCopyImage
Compile-tested only
2015-05-18 10:27:08 -07:00
Jason Ekstrand
c15f3834e3 vk/gem: Set the gem_mmap.flags parameter to 0 if it exists 2015-05-18 10:27:08 -07:00
Jason Ekstrand
f7b0f922be vk/gem: Only VK_CLEAR the addr_ptr in gen_mmap 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
ca7e62d421 vk: Add a logger wrapper for the generated entrypoint 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
eb92745b2e vk/gem: Just return -1 from anv_gem_wait() on error
We were returning -errno, unlike all the other gem functions.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
05754549e8 vk: Fix vkGetOjectInfo return values
We weren't properly returning the allocation count.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
6afb26452b vk: Implement fences
This basic implementation uses a throw-away bo for synchronization.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
e26a7ffbd9 vk/meta: Use anv_* internal entrypoints 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
b7fac7a7d1 vk: Implement allocation count query 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
783e6217fc vk: Change pData/pDataSize semantics
We now always copy the entire struct unless pData is NULL and
unconditionally write back the struct size. It's not clear this is
useful if the structs may grow over time, but it seems to be the
expected behaviour for now.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
b4b3bd1c51 vk: Return VK_SUCCESS from vkAllocDescriptorSets
This should've been returning VK_SUCCESS all along.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
a9f2115486 vk: Return VK_SUCCESS for all descriptor pool entry points 2015-05-18 10:27:07 -07:00
Kristian Høgsberg
60ebcbed54 vk: Start Implementing vkGetFormatInfo()
We move the format table and vkGetFormatInfo to their own file in the
process.
2015-05-18 10:27:07 -07:00
Kristian Høgsberg
454345da1e vk: Add script for generating ifunc entry points
This lets us generate a hash table for vkGetProcAddress and lets us call
public functions internally without the public entrypoint overhead.
2015-05-18 10:27:02 -07:00
Kristian Høgsberg
333bcc2072 vk: Fix vulkan header inconsistency
The function pointer typedef and the function prototype for
vkCmdClearColorImage() didn't agree. Fix the typedef to match the
prototype.
2015-05-17 21:08:31 -07:00
Kristian Høgsberg
b9eb56a404 vk: Add function pointer typedef for intel extension
Also guard function prototype by VK_PROTOTYPES.
2015-05-17 21:08:30 -07:00
Kristian Høgsberg
75cb85c56a vk: Add missing VKAPI for vkQueueRemoveMemReferences 2015-05-17 21:08:30 -07:00
Jason Ekstrand
a924ea0c75 Merge remote-tracking branch 'fdo-personal/wip/nir-vtn' into vulkan
This adds the SPIR-V -> NIR translator.
2015-05-16 12:43:16 -07:00
Jason Ekstrand
a63952510d nir/spirv: Don't assert that the current block is empty
It's possible that someone will give us SPIR-V code in which someone
needlessly branches to new blocks.  We should handle that ok now.
2015-05-16 12:34:34 -07:00
Jason Ekstrand
4e44dcc312 nir/spirv: Add initial support for samplers 2015-05-16 12:34:15 -07:00
Jason Ekstrand
d6f52dfb3e nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops
NIR doesn't have the native opcodes for them anymore
2015-05-16 12:33:32 -07:00
Jason Ekstrand
a53e795524 nir/types: Add support for sampler types 2015-05-16 12:32:58 -07:00
Jason Ekstrand
0fa9211d7f nir/spirv: Make the global constants in spirv.h static
I've been promissed in a bug that this will be fixed in a future version of
the header.  However, in the interest of my branch building, I'm adding
these changes in myself for the moment.
2015-05-16 11:16:34 -07:00
Jason Ekstrand
036a4b1855 nir/spirv: Handle jump-to-loop in a more general way 2015-05-16 11:16:34 -07:00
Jason Ekstrand
56f533b3a0 nir/spirv: Handle boolean uniforms correctly 2015-05-16 11:16:34 -07:00
Jason Ekstrand
64bc58a88e nir/spirv: Handle control-flow with loops 2015-05-16 11:16:34 -07:00
Jason Ekstrand
3a2db9207d nir/spirv: Set a name on temporary variables 2015-05-16 11:16:34 -07:00
Jason Ekstrand
a28f8ad9f1 nir/spirv: Use the correct length for copying string literals 2015-05-16 11:16:34 -07:00
Jason Ekstrand
7b9c29e440 nir/spirv: Make vtn_ssa_value handle constants as well as ssa values 2015-05-16 11:16:33 -07:00
Jason Ekstrand
b0d1854efc nir/spirv: Add initial support for GLSL 4.50 builtins 2015-05-16 11:16:33 -07:00
Jason Ekstrand
1da9876486 nir/spirv: Split the core datastructures into a header file 2015-05-16 11:16:33 -07:00
Jason Ekstrand
98d78856f6 nir/spirv: Use the builder for all instructions
We don't actually use it to create all the instructions but we do use it
for insertion always.  This should make things far more consistent for
implementing extended instructions.
2015-05-16 11:16:33 -07:00
Jason Ekstrand
ff828749ea nir/spirv: Add support for a bunch of ALU operations 2015-05-16 11:16:33 -07:00
Jason Ekstrand
d2a7972557 nir/spirv: Add support for indirect array accesses 2015-05-16 11:16:33 -07:00
Jason Ekstrand
683c99908a nir/spirv: Explicitly type constants and SSA values 2015-05-16 11:16:33 -07:00
Jason Ekstrand
c5650148a9 nir/spirv: Handle OpBranchConditional
We do control-flow handling as a two-step process.  The first step is to
walk the instructions list and record various information about blocks and
functions.  This is where the acutal nir_function_overload objects get
created.  We also record the start/stop instruction for each block.  Then
a second pass walks over each of the functions and over the blocks in each
function in a way that's NIR-friendly and actually parses the instructions.
2015-05-16 11:16:33 -07:00
Jason Ekstrand
ebc152e4c9 nir/spirv: Add a helper for getting a value as an SSA value 2015-05-16 11:16:33 -07:00
Jason Ekstrand
f23afc549b nir/spirv: Split instruction handling into preamble and body sections 2015-05-16 11:16:33 -07:00
Jason Ekstrand
ae6d32c635 nir/spirv: Implement load/store instructiosn 2015-05-16 11:16:33 -07:00
Jason Ekstrand
88f6fbc897 nir: Add a helper for getting the tail of a deref chain 2015-05-16 11:16:33 -07:00
Jason Ekstrand
06acd174f3 nir/spirv: Actaully add variables to the funciton or shader 2015-05-16 11:16:33 -07:00
Jason Ekstrand
5045efa4aa nir/spirv: Add a vtn_untyped_value helper 2015-05-16 11:16:33 -07:00
Jason Ekstrand
01f3aa9c51 nir/spirv: Use vtn_value in the types code and fix a off-by-one error 2015-05-16 11:16:33 -07:00
Jason Ekstrand
6ff0830d64 nir/types: Add an is_vector_or_scalar helper 2015-05-16 11:16:33 -07:00
Jason Ekstrand
5acd472271 nir/spirv: Add support for deref chains 2015-05-16 11:16:33 -07:00
Jason Ekstrand
7182597e50 nir/types: Add a scalar type constructor 2015-05-16 11:16:32 -07:00
Jason Ekstrand
eccd798cc2 nir/spirv: Add support for OpLabel 2015-05-16 11:16:32 -07:00
Jason Ekstrand
a6cb9d9222 nir/spirv: Add support for declaring functions 2015-05-16 11:16:32 -07:00
Jason Ekstrand
8ee23dab04 nir/types: Add accessors for function parameter/return types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
707b706d18 nir/spirv: Add support for declaring variables
Deref chains and variable load/store operations are still missing.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
b2db85d8e4 nir/spirv: Add support for constants 2015-05-16 11:16:32 -07:00
Jason Ekstrand
3f83579664 nir/spirv: Add basic support for types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
e9d3b1e694 nir/types: Add more helpers for creating types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
fe550f0738 glsl/types: Expose the function_param and struct_field structs to C
Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find
them.  This commit simpliy moves the ifdef.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
053778c493 glsl/types: Add support for function types 2015-05-16 11:16:32 -07:00
Jason Ekstrand
7b63b3de93 glsl: Add GLSL_TYPE_FUNCTION to the base types enums 2015-05-16 11:16:32 -07:00
Jason Ekstrand
2b570a49a9 nir/spirv: Rework the way values are added
Instead of having functions to add values and set various things, we just
have a function that does a few asserts and then returns the value.  The
caller is then responsible for setting the various fields.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
f9a31ba044 nir/spirv: Add stub support for extension instructions 2015-05-16 11:16:32 -07:00
Jason Ekstrand
4763a13b07 REVERT: Add a simple helper program for testing SPIR-V -> NIR translation 2015-05-16 11:16:32 -07:00
Jason Ekstrand
cae8db6b7e glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp 2015-05-16 11:16:32 -07:00
Jason Ekstrand
98452cd8ae nir: Add the start of a SPIR-V to NIR translator
At the moment, it can handle the very basics of strings and can ignore
debug instructions.  It also has basic support for decorations.
2015-05-16 11:16:32 -07:00
Jason Ekstrand
573ca4a4a7 nir: Import the revision 30 SPIR-V header from Khronos 2015-05-16 11:16:31 -07:00
Jason Ekstrand
057bef8a84 vk/device: Use bias rather than layers for computing binding table size
Because we statically use the first 8 binding table entries for render
targets, we need to create a table of size 8 + surfaces.
2015-05-16 10:42:53 -07:00
Jason Ekstrand
22e61c9da4 vk/meta: Make clear a no-op if no layers need clearing
Among other things, this prevents recursive meta.
2015-05-16 10:30:05 -07:00
Jason Ekstrand
120394ac92 vk/meta: Save and restore the old bindings pointer
If we don't do this then recursive meta is completely broken.  What happens
is that the outer meta call may change the bindings pointer and the inner
meta call will change it again and, when it exits set it back to the
default.  However, the outer meta call may be relying on it being left
alone so it uses the non-meta descriptor sets instead of its own.
2015-05-16 10:28:04 -07:00
Jason Ekstrand
4223de769e vk/device: Simplify surface_count calculation 2015-05-16 10:23:09 -07:00
Jason Ekstrand
eb1952592e vk/glsl_helpers: Fix GLSL_VK_SHADER with respect to commas
Previously, the GLSL_VK_SHADER macro didn't work if the shader contained
commas outside of parentheses due to the way the C preprocessor works.
This commit fixes this by making it variadic again and doing it correctly
this time.
2015-05-15 22:17:07 -07:00
Kristian Høgsberg
3b9f32e893 vk: Make cmd_buffer->bindings a pointer
This lets us save and restore efficiently by just moving the pointer to
a temporary bindings struct for meta.
2015-05-15 18:12:07 -07:00
Kristian Høgsberg
9540130c41 vk: Move vertex buffers into struct anv_bindings 2015-05-15 16:34:31 -07:00
Kristian Høgsberg
0cfc493775 vk: Fix GLSL_VK_SHADER macro
Stringify doesn't work with __ARGV__. The last macro argument swallows
up excess arguments and as such we can just stringify that.
2015-05-15 16:15:04 -07:00
Kristian Høgsberg
af45f4a558 vk: Fix warning from missing initializer
Struct initializers need to be { 0, } to zero out the variable they're
initializing.
2015-05-15 16:07:17 -07:00
Kristian Høgsberg
bf096c9ec3 vk: Build binding tables at bind descriptor time
This changes the way descriptor sets and layouts work so that we fill
out binding table contents at the time we bind descriptor sets. We
manipulate the binding table contents and sampler state in a shadow-copy
in anv_cmd_buffer. At draw time, we allocate the actual binding table
and sampler state and flush the anv_cmd_buffer copies.
2015-05-15 16:05:31 -07:00
Kristian Høgsberg
1f6c220b45 vk: Update the bind map length to reflect MAX_SETS 2015-05-15 15:22:29 -07:00
Kristian Høgsberg
b806e80e66 vk: Flip back to using memfd for the allocators 2015-05-15 15:22:29 -07:00
Kristian Høgsberg
0a775e1eab vk: Rename dyn_state_pool to dynamic_state_pool
Given that we already tolerate surface_state_pool and the even longer
instruction_state_pool, there's no reason to arbitrarily abbreviate
dynamic.
2015-05-15 15:22:29 -07:00
Kristian Høgsberg
f5b0f1351f vk: Consolidate image, buffer and color attachment views
These are all just surface state, offset and a bo.
2015-05-15 15:22:29 -07:00
Jason Ekstrand
41db8db0f2 vk: Add a GLSL scraper utility
This new utility, glsl_scraper.py scrapes C files for instances of the
GLSL_VK_SHADER macro, pulls out the shader source, and compiles it to
SPIR-V.  The compilation is done using glslValidator.  The result is then
placed into another C file as arrays of dwords that can be easiliy handed
to a Vulkan driver.
2015-05-14 19:18:57 -07:00
Jason Ekstrand
79ace6def6 vk/meta: Add a magic GLSL shader source macro 2015-05-14 19:07:34 -07:00
Jason Ekstrand
018a0c1741 vk/meta: Add a better comment about the VS for blits 2015-05-14 11:39:32 -07:00
Jason Ekstrand
8c92701a69 vk/test: Use VK_IMAGE_TILING_OPTIMAL for the render target 2015-05-13 22:27:38 -07:00
Jason Ekstrand
4fb8bddc58 vk/test: Do a copy of the RT into a linear buffer and write that to a PNG 2015-05-13 22:23:30 -07:00
Jason Ekstrand
bd5b76d6d0 vk/meta: Add the start of a blit implementation
Currently, we only implement CopyImageToBuffer
2015-05-13 22:23:30 -07:00
Jason Ekstrand
94b8c0b810 vk/pipeline: Default to a SamplerCount of 1 for PS 2015-05-13 22:23:30 -07:00
Jason Ekstrand
d3d4776202 vk/pipeline: Add an extra flag for force-disabling the vertex shader
This way we can pass in a vertex shader and yet have the pipeline emit an
empty 3DSTATE_VS packet.  We need this for meta because we need to trick
the compiler into not deleting our inputs but at the same time disable the
VS so that we can use a rectlist.  This should go away once we actually get
SPIR-V.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
a1309c5255 vk/pass: Emit a flushing pipe control at the end of the pass
This is rather crude but it at least makes sure that all the render targets
get flushed at the end of the pass.  We probably actually want to do
somthing based on image layout traansitions, but this will work for now.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
07943656a7 vk/compiler: Set the binding table texture_start
This is by no means a complete solution to the binding table problems.
However, it does make texturing actually work.  Before, we were texturing
from the render target since they were both starting at 0.
2015-05-13 22:23:30 -07:00
Jason Ekstrand
cd197181f2 vk/compiler: Zero the prog data
We use prog_data[stage] != NULL to determine whether or not we need to
clean up that stage.  Make sure it default to NULL.
2015-05-13 22:22:59 -07:00
Jason Ekstrand
1f7dcf9d75 vk/image: Stash more information in images and views 2015-05-13 22:22:59 -07:00
Jason Ekstrand
43126388cd vk/meta: Save/restore more stuff in cmd_buffer_restore 2015-05-13 22:22:59 -07:00
Chad Versace
50806e8dec vk: Install headers
I need this for building a testsuite.
2015-05-13 17:49:26 -07:00
Kristian Høgsberg
83c7e1f1db vk: Add support for sampler descriptors 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
4f9eaf77a5 vk: Use a typesafe anv_descriptor struct 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
5c9d77600b vk: Create and bind a sampler in vk.c 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
18acfa7301 vk: Fix copy-n-paste sType in vkCreateSampler 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a1ec789b0b vk: Add a dynamic state stream to anv_cmd_buffer
We'll need this for sampler state.
2015-05-13 14:47:11 -07:00
Kristian Høgsberg
3f52c016fa vk: Move struct anv_sampler to private.h 2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a77229c979 vk: Allocate layout->count number of descriptors
layout->count is the number of descriptors the application
requested. layout->total is the number of entries we need across all
stages.
2015-05-13 14:47:11 -07:00
Kristian Høgsberg
a3fd136509 vk: Fill out sampler state from API values 2015-05-13 14:47:11 -07:00
Chad Versace
828817b88f vk: Ignore vk executable 2015-05-13 12:05:38 -07:00
Kristian Høgsberg
2b7a060178 vk: Fix stale error handling in vkQueueSubmit 2015-05-12 14:38:58 -07:00
Kristian Høgsberg
cb986ef597 vk: Submit all cmd buffers passed to vkQueueSubmit 2015-05-12 14:38:12 -07:00
Kristian Høgsberg
9905481552 vk: Add generated header for HSW and IVB (GEN75 and GEN7) 2015-05-12 14:29:04 -07:00
Jason Ekstrand
ffe9f60358 vk: Add stub() and stub_return() macros and mark piles of functions as stubs 2015-05-12 13:45:02 -07:00
Jason Ekstrand
d3b374ce59 vk/util: Add a anv_finishme function/macro 2015-05-12 13:43:36 -07:00
Jason Ekstrand
7727720585 vk/meta: Break setting up meta clear state into it's own functin 2015-05-12 13:03:50 -07:00
Jason Ekstrand
4336a1bc00 vk/pipeline: Add support for disabling the scissor in "extra" 2015-05-12 12:53:01 -07:00
Kristian Høgsberg
d77c34d1d2 vk: Add clear load-op for render passes 2015-05-11 23:25:29 -07:00
Kristian Høgsberg
b734e0bcc5 vk: Add support for driver-internal custom pipelines
This lets us disable the viewport, use rect lists and repclear.
2015-05-11 23:25:29 -07:00
Kristian Høgsberg
ad132bbe48 vk: Fix 3DSTATE_VERTEX_BUFFER emission
Set VertexBufferIndex to the attribute binding, not the location.
2015-05-11 23:25:29 -07:00
Kristian Høgsberg
6a895c6681 vk: Add 32 bpc signed and unsigned integer formats 2015-05-11 23:25:29 -07:00
Kristian Høgsberg
55b9b703ea vk: Add anv_batch_emit_merge() helper macro
This lets us emit a state packet by merging to half-backed versions,
typically one from the pipeline object and one from a dynamic state
objects.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
099faa1a2b vk: Store bo pointer in anv_image and anv_buffer
We don't need to point back to the memory object the bo came from.
Pointing directly to a bo lets us bind images and buffers to other
bos - like our allocator bos.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
4f25f5d86c vk: Support not having a vertex shader
This lets us bypass the vertex shader and pass data straight into
the rasterizer part of the pipeline.
2015-05-11 23:25:28 -07:00
Kristian Høgsberg
20ad071190 vk: Allow NULL as a valid pipeline layout
Vertex buffers and render targets aren't part of the layout so having
an empty layout is pretty common.
2015-05-11 22:12:56 -07:00
Kristian Høgsberg
769785c497 Add vulkan driver for BDW 2015-05-09 11:38:32 -07:00
1860 changed files with 202072 additions and 41489 deletions

3
.gitignore vendored
View File

@@ -34,6 +34,7 @@ aclocal.m4
config.log
config.status
cscope*
tags
.scon*
config.py
build
@@ -46,3 +47,5 @@ manifest.txt
Makefile
Makefile.in
.install-mesa-links
.install-gallium-links
/src/git_sha1.h

460
.mailmap Normal file
View File

@@ -0,0 +1,460 @@
Aapo Tahkola <aet@rasterburn.org> <aapo@aapo-desktop.(none)>
Adam Jackson <ajax@redhat.com> <ajax@benzedrine.nwnk.net>
Adam Jackson <ajax@redhat.com> <ajax@freedesktop.org>
Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Adrian Negreanu <adrian.m.negreanu@intel.com>
Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Negreanu Marius Adrian <adrian.m.negreanu@intel.com>
Dave Airlie <airlied@redhat.com> <airliedfreedesktop.org>
Dave Airlie <airlied@redhat.com> airlied <airlied@unused-12-215.bne.redhat.com>
Dave Airlie <airlied@redhat.com> <airlied@dhcp-1-203.bne.redhat.com>
Dave Airlie <airlied@redhat.com> <airlied@gmail.com>
Dave Airlie <airlied@redhat.com> <airlied@itt42.(none)>
Dave Airlie <airlied@redhat.com> <airlied@linux.ie>
Dave Airlie <airlied@redhat.com> <airlied@nx6125b.(none)>
Dave Airlie <airlied@redhat.com> <airlied@panoply-rh.(none)>
Dave Airlie <airlied@redhat.com> <airlied@ppcg5.localdomain>
Alan Coopersmith <alan.coopersmith@oracle.com> <alan.coopersmith@sun.com>
Alan Hourihane <alanh@vmware.com> <alanh@tungstengraphics.com>
Alan Hourihane <alanh@vmware.com> <alanh@fairlite.demon.co.uk>
Alan Hourihane <alanh@vmware.com> <alanh@jetpack.(none)>
Alexander Monakov <amonakov@gmail.com> <amonakov@ispras.ru>
Alexander von Gluck IV <kallisti5@unixzen.com> Alexander von Gluck <kallisti5@unixzen.com>
Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.prom.eng.vmware.com>
Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.vmware.com>
Alex Deucher <alexdeucher@gmail.com> <alexander.deucher@amd.com>
Alex Deucher <alexdeucher@gmail.com> <agd5f@yahoo.com>
Alex Deucher <alexdeucher@gmail.com> <alex@botch2.com>
Alex Deucher <alexdeucher@gmail.com> <alex@botch2.(none)>
Alex Deucher <alexdeucher@gmail.com> <alex@cube.(none)>
Alex Deucher <alexdeucher@gmail.com> <alex@samba.(none)>
Andreas Fänger <a.faenger@e-sign.com> <a.faenger@e-sign.com>
Andreas Hartmetz <ahartmetz@gmail.com> <andreas.hartmetz@kdab.com>
Andre Heider <a.heider@gmail.com>
Andreas Heider <andreas@heider.io>
Andreas Pokorny <andreas.pokorny@canonical.com> <andreas.pokorny@elektrobit.com>
Andrew Randrianasulu <randrianasulu@gmail.com> <randrik_a@yahoo.com>
Andrew Randrianasulu <randrianasulu@gmail.com> <randrik@mail.ru>
Arthur Huillet <arthur.huillet@free.fr> Arthur HUILLET <arthur.huillet@free.fr>
Benjamin Franzke <benjaminfranzke@googlemail.com> ben <benjaminfranzke@googlemail.com>
Ben Skeggs <bskeggs@redhat.com> <darktama@beleth.(none)>
Ben Skeggs <bskeggs@redhat.com> <darktama@iinet.net.au>
Ben Skeggs <bskeggs@redhat.com> <darktama@nisroch.keine.ath.cx>
Ben Skeggs <bskeggs@redhat.com> <skeggsb-at-gmail.com>
Ben Skeggs <bskeggs@redhat.com> <skeggsb@gmail.com>
Ben Skeggs <bskeggs@redhat.com> <skeggsb@localhost.localdomain>
Ben Skeggs <bskeggs@redhat.com> <skeggsb@nisroch.keine.ath.cx>
Ben Widawsky <benjamin.widawsky@intel.com> Ben Widawsky <ben@bwidawsk.net>
Blair Sadewitz <blair.sadewitz@gmail.com> Blair Sadewitz <blair.sadewitz.gmail.com>
Boris Peterbarg <reist@users.sourceforge.net> reist <reist>
Brian Paul <brianp@vmware.com> Brian <brian.paul@tungstengraphics.com>
Brian Paul <brianp@vmware.com> <brian.paul@tungstengraphics.com>
Brian Paul <brianp@vmware.com> <brian.e.paul@gmail.com>
Brian Paul <brianp@vmware.com> <brianp@kemper.freedesktop.org>
Brian Paul <brianp@vmware.com> brian <brian@cvp965.(none)>
Brian Paul <brianp@vmware.com> Brian <brian@i915.localnet.net>
Brian Paul <brianp@vmware.com> Brian <brian@nostromo.localnet.net>
Brian Paul <brianp@vmware.com> Brian <brian@poulsbo.localnet.net>
Brian Paul <brianp@vmware.com> Brian <brian@ps3.localnet.net>
Brian Paul <brianp@vmware.com> Brian <brianp@vmware.com>
Brian Paul <brianp@vmware.com> Brian <brian@yutani.localnet.net>
Brian Paul <brianp@vmware.com> root <brian.paul@tungstengraphics.com>
Brian Paul <brianp@vmware.com> root <root@i915.localnet.net>
Brian Paul <brianp@vmware.com> root <root@nostromo.localnet.net>
Brian Paul <brianp@vmware.com> root <root@i965.localnet.net>
Bruce Merry <bmerry@users.sourceforge.net> <bmerry@gmail.com>
Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <s3734770@mail.zih.tu-dresden.de>
Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <carli@carli-laptop.(none)>
Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <Carl-Philip.Haensch@mailbox.tu-dresden.de>
Chad Versace <chad.versace@intel.com> <chad@chad-versace.us>
Chad Versace <chad.versace@intel.com> <Chad Versace chad@chad-versace.us>
Chad Versace <chad.versace@intel.com> <chad.versace@linux.intel.com>
Chia-I Wu <olvaffe@gmail.com> <olv@lunarg.com>
Chia-I Wu <olvaffe@gmail.com> Chia-Wu <olvaffe@gmail.com>
Chih-Wei Huang <cwhuang@linux.org.tw> Chih-Wei Huang <cwhuang@android-x86.org>
Christian König <christian.koenig@amd.com> Christian Koenig <christian.koenig@amd.com>
Christian König <christian.koenig@amd.com> Christian König <christian.koenig at amd.com>
Christian König <christian.koenig@amd.com> Christian König <deathsimple@vodafone.de>
Christoph Brill <egore911@egore911.de> Christoph Bill <egore@gmx.de>
Christoph Brill <egore911@egore911.de> <egore@gmx.de>
Christoph Bumiller <christoph.bumiller@speed.at> <e0425955@student.tuwien.ac.at>
Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Christopher James Halse Rogers <raof@ubuntu.com>
Claudio Ciccani <klan@directfb.org> <klan@users.sf.net>
Claudio Ciccani <klan@directfb.org> <klan@users.sourceforge.net>
Connor Abbott <cwabbott0@gmail.com> <connor.w.abbott@intel.com>
Connor Abbott <cwabbott0@gmail.com> <connor.abbott@intel.com>
Corbin Simpson <MostAwesomeDude@gmail.com> <mostawesomed...@gmail.com>
Corbin Simpson <MostAwesomeDude@gmail.com> <mostawesomedude@gmail.com>
Courtney Goeltzenleuchter <courtney@lunarg.com> <courtney@LunarG.com>
Daniel Skinner <sio@users.sourceforge.net> sio <sio>
Daniel Stone <daniels@collabora.com> <daniel@fooishbar.org>
David Miller <davem@davemloft.net> David S. Miller <davem@davemloft.net>
David Miller <davem@davemloft.net> Dave Miller <davem@davemloft.net>
David Miller <davem@davemloft.net> davem69 <davem69>
David Heidelberger <david.heidelberger@ixit.cz> David Heidelberg <david@ixit.cz>
David Heidelberger <david.heidelberger@ixit.cz> <d.okias@gmail.com>
David Reveman <reveman@chromium.org> <c99drn@cs.umu.se>
Dieter Nützel <Dieter@nuetzel-hh.de> Dieter Nützel <dieter@nuetzel-hh.de>
Dmitry Cherkassov <dcherkassov@gmail.com> Dmitry Cherkasov <dcherkassov@gmail.com>
Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>
Fabian Bieler <der.fabe@gmx.net> <&lt;der.fabe@gmx.net&gt>
Feng, Haitao <haitao.feng@intel.com> Haitao Feng <haitao.feng@intel.com>
Frank Henigman <fjhenigman@google.com> <fjhenigman@chromium.org>
George Sapountzis <gsapountzis@gmail.com> George Sapountzis <gsap7@yahoo.gr>
Gwenole Beauchesne <gwenole.beauchesne@intel.com> <gb.devel@gmail.com>
Hamish Marson <hmarson@users.sourceforge.net> hmarson <hmarson>
Hans de Goede <hdegoede@redhat.com> Hans de Goede <j.w..r..degoede@hhs.nl>
Homer Hsing <dongsheng.xing@intel.com> <homer.hsing@gmail.com>
Hui Qi Tay <hqtay@vmware.com> <tayhuiqithq@gmail.com>
Ian Romanick <ian.d.romanick@intel.com> <idr@freedesktop.org>
Ian Romanick <ian.d.romanick@intel.com> <idr@us.ibm.com>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@vmware.com>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.(none)>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.walkyrie.se>
Jakob Bornecrantz <wallbraker@gmail.com> <jakob@tungstengraphics.com>
Jakob Bornecrantz <wallbraker@gmail.com> <wallbraker 'at' gmail 'dot' com>
Jakub Bogusz <qboosh@pld-linux.org> <gboosh@pld-linux.org>
James Legg <jlegg@feralinteractive.com> <lankyleggy@gmail.com>
Jan Vesely <jano.vesely@gmail.com> Jan Vesely <jan.vesely@rutgers.edu>
Jason Ekstrand <jason@jlekstrand.net> <jason.ekstrand@intel.com>
Jeremy Huddleston <jeremyhu@apple.com> <jeremyhu@freedesktop.org>
Jeremy Huddleston <jeremyhu@apple.com> <jeremy@tifa.local>
Jeremy Huddleston <jeremyhu@apple.com> <jeremy@vincent.local>
Jeremy Huddleston <jeremyhu@apple.com> <jeremy@yuffie.local>
Jeremy Huddleston <jeremyhu@apple.com> Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Jeremy Kolb <jkolb@freedesktop.org> <jkolb@brandeis.edu>
Jerome Glisse <jglisse@redhat.com> <glisse@freedesktop.org>
Jerome Glisse <jglisse@redhat.com> <glisse@kemper.freedesktop.org>
Jerome Glisse <jglisse@redhat.com> John Doe <glisse@barney.(none)>
Jerome Glisse <jglisse@redhat.com> John Doe <glisse@localhost.localdomain>
Jesse Barnes <jesse.barnes@intel.com> <jbarnes@hobbes.lan>
Jesse Barnes <jesse.barnes@intel.com> <jbarnes@hobbes.(none)>
Jesse Barnes <jesse.barnes@intel.com> <jbarnes@jbarnes-desktop.localdomain>
Jesse Barnes <jesse.barnes@intel.com> <jbarnes@jbarnes-t61.(none)>
Jesse Barnes <jesse.barnes@intel.com> <jbarnes@virtuousgeek.org>
Joakim Sindholt <bacn@zhasha.com> <opensource@zhasha.com>
Joakim Sindholt <bacn@zhasha.com> <zhasha@gallium-dev.(none)>
Jochen Gerlach <jtg@users.sourceforge.net> jtg <jtg>
Joel Bosveld <joel.bosveld@gmail.com> <Joel.Bosveld@gmail.com>
Jonathan Adamczewski <jadamcze@utas.edu.au> <jadamcze@utas.edu.a>
Jon Turney <jon.turney@dronecode.org.uk> Jon TURNEY <jon.turney@dronecode.org.uk>
José Fonseca <jfonseca@vmware.com> Jose Fonseca <jfonseca@vmware.com>
José Fonseca <jfonseca@vmware.com> Jose Fonseca <jrfonseca@tungstengraphics.com>
José Fonseca <jfonseca@vmware.com> <jfonseca@pegasus.(none)>
José Fonseca <jfonseca@vmware.com> <jfonseca@titan.(none)>
José Fonseca <jfonseca@vmware.com> <jose.r.fonseca@gmail.com>
José Fonseca <jfonseca@vmware.com> <jrfonseca@tungstengraphics.com>
José Fonseca <jfonseca@vmware.com> <j_r_fonseca@yahoo.co.uk>
Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk Jansen <jouk@hrem.nano.tudelft.nl>
Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk Jansen <joukj@hrem.stm.tudelft.nl>
Jouk Jansen <joukj@hrem.nano.tudelft.nl> joukj <joukj@tarantella.(none)>
Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk <joukj@tarantella.nano.tudelft.nl>
Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk <joukj@tarantella.(none)>
Jouk Jansen <joukj@hrem.nano.tudelft.nl> J.Jansen <joukj@tarantella.nano.tudelft.nl>
Juan Zhao <juan.j.zhao@intel.com> <juan.j.zhao@linux.intel.com>
Julien Cristau <jcristau@debian.org> <julien.cristau@logilab.fr>
Julien Isorce <j.isorce@samsung.com> <julien.isorce@gmail.com>
Kalyan Kondapally <kalyan.kondapally@intel.com> <kondapallykalyancontribute@gmail.com>
Karl Schultz <karl.w.schultz@gmail.com> Karl Schultze <k.w.schultz@comcast.net>
Karl Schultz <karl.w.schultz@gmail.com> unknown <kwschult@.na.qualcomm.com>
Karl Schultz <karl.w.schultz@gmail.com> <k.w.schultz@comcast.net>
Karl Schultz <karl.w.schultz@gmail.com> <Karl.W.Schultz@gmail.com>
Karl Schultz <karl.w.schultz@gmail.com> <kschultz@freedesktop.org>
Keith Harrison <sio2@users.sourceforge.net> sio2 <sio2>
Keith Packard <keithp@keithp.com> <keithp@koto.keithp.com>
Keith Packard <keithp@keithp.com> <keithp@neko.keithp.com>
Keith Whitwell <keithw@vmware.com> <keith@tungstengraphics.com>
Keith Whitwell <keithw@vmware.com> keithw <keithw@keithw-laptop.(none)>
Kristian Høgsberg <krh@bitplanet.net> <krh@redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@hinata.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@sasori.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <krh@temari.boston.redhat.com>
Kristian Høgsberg <krh@bitplanet.net> <kristian.h.kristensen@intel.com>
Krzesimir Nowak <qdlacz@gmail.com> <krzesimir@kinvolk.io>
Li Peng <peng.li@intel.com> <peng.li@linux.intel.com>
Lucas Stach <dev@lynxeye.de> <l.stach@pengutronix.de>
Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <dev@mblankhorst.nl>
Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <m.b.lankhorst@gmail.com>
Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <maarten.lankhorst@canonical.com>
Maciej Cencora <m.cencora@gmail.com> <maciej@osiris.(none)>
Marc-André Lureau <marcandre.lureau@gmail.com> Marc-Andre Lureau <marcandre.lureau@gmail.com>
Marc Dietrich <marvin24@gmx.de> Marc <marvin24@gmx.de>
Marc Dietrich <marvin24@gmx.de> marvin24 <marvin24@gmx.de>
Marcin Ślusarz <marcin.slusarz@gmail.com> Marcin Slusarz <marcin.slusarz@gmail.com>
Marek Olšák <marek.olsak@amd.com> <maraeo@gmail.com>
Mario Kleiner <mario.kleiner.de@gmail.com> kleinerm <mario.kleiner@tuebingen.mpg.de>
Mario Kleiner <mario.kleiner.de@gmail.com> <mario.kleiner@tuebingen.mpg.de>
Mark Mueller <markkmueller@gmail.com> <MarkKMueller@gmail.com>
Marta Lofstedt <marta.lofstedt@intel.com> <marta.lofstedt@linux.intel.com>
Martin Peres <martin.peres@linux.intel.com> <martin.peres@labri.fr>
Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Froehlich <Mathias.Froehlich@gmx.net>
Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Froehlich <Mathias.Froehlich@web.de>
Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Frohlich <M.Froehlich@science-computing.de>
Mathias Fröhlich <mathias.froehlich@gmx.net> <frohlich8@users.sourceforge.net>
Mathias Fröhlich <mathias.froehlich@gmx.net> <Mathias.Froehlich@gmx.net>
Mathias Fröhlich <mathias.froehlich@gmx.net> <Mathias.Froehlich@web.de>
Mathias Fröhlich <mathias.froehlich@gmx.net> M.Froehlich@science-computing.de <M.Froehlich@science-computing.de>
Matthew W. S. Bell <matthew@bells23.org.uk> Matthew Bell <matthew@bells23.org.uk>
Maxence Le Doré <maxence.ledore@gmail.com> Maxence Le Dore <maxence.ledore@gmail.com>
Micah Fedke <micah.fedke@collabora.co.uk> <M.Fedke@Astronautics.com>
Michal Krol <michal@vmware.com> <michal@tungstengraphics.com>
Michal Krol <michal@vmware.com> Michal Krol <michal@ubuntu-vbox.(none)>
Michal Krol <michal@vmware.com> Michal Krol <mjkrol@gmail.org>
Michal Krol <michal@vmware.com> michal <michal@capacitor.(none)>
Michal Krol <michal@vmware.com> michal <michal@michal-laptop.(none)>
Michal Krol <michal@vmware.com> michal <michal@quad.(none)>
Michal Krol <michal@vmware.com> michal <michal@transistor.(none)>
Michal Krol <michal@vmware.com> Michal <michal@tungstengraphics.com>
Michal Krol <michal@vmware.com> michal <michal@wmvare.com>
Michel Dänzer <michel@daenzer.net> <michel.daenzer@amd.com>
Michel Dänzer <michel@daenzer.net> <daenzer@vmware.com>
Michel Dänzer <michel@daenzer.net> <michel@tungstengraphics.com>
Michel Dänzer <michel@daenzer.net> Michel Daenzer <michel.daenzer@amd.com>
Michel Dänzer <michel@daenzer.net> Michel Daenzer <daenzer@localhost.(none)>
Mike Kaplinskiy <mike.kaplinskiy@gmail.com> Mike Kaplinksiy <mike.kaplinskiy@gmail.com>
Mike Kaplinskiy <mike.kaplinskiy@gmail.com> <mike.kaplinskiy@gmai.com>
Mike Stroyan <mike@lunarg.com> <mike@LunarG.com>
Nian Wu <nian.wu@intel.com> <nian@graphics.(none)>
Nian Wu <nian.wu@intel.com> <nian@tinderbox.sh.intel.com>
Nick Bowler <nbowler@draconx.ca>
Nick Sarnie <commendsarnex@gmail.com>
Nicolai Hähnle <nicolai.haehnle@amd.com> <nhaehnle@gmail.com>
Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <nhaehnle@gmail.com>
Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <prefect_@gmx.net>
Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <prefect@upb.de>
Nigel Stewart <nigels@users.sourceforge.net> <nigels@sourceforge.net>
Nigel Stewart <nigels@users.sourceforge.net> <nstewart@nvidia.com>
nobled <nobled@dreamwidth.org> <nobled2@nobled2-karmic.(none)>
Oliver McFadden <oliver.mcfadden@linux.intel.com> <z3ro.geek@gmail.com>
Owain Ainsworth <zerooa@googlemail.com> Owain G. Ainsworth <oga@openbsd.org>
Owen W. Taylor <otaylor@fishsoup.net> Owen Taylor <otaylor@snell.localdomain>
Patrice Mandin <patmandin@gmail.com> <patrice@manoir.racoon.city>
Patrice Mandin <patmandin@gmail.com> <pmandin@caramail.com>
Patrice Mandin <patmandin@gmail.com> <pmandin@freedesktop.org>
Pauli Nieminen <pauli.nieminen@linux.intel.com> <suokkos@gmail.com>
Paulo Zanoni <paulo.r.zanoni@intel.com> Paulo Zanoni <pzanoni@mandriva.com>
Paul Seidler <sepek@exherbo.org> Paul Seidler <pl.seidler@googlemail.com>
Pekka Paalanen <pekka.paalanen@collabora.co.uk> <ppaalanen@gmail.com>
Pekka Paalanen <pekka.paalanen@collabora.co.uk> <pq@iki.fi>
Peter Hutterer <peter.hutterer@who-t.net> <peter@cs.unisa.edu.au>
Pierre-Eric Pelloux-Prayer <pelloux@gmail.com> pepp <pelloux@gmail.com>
Pierre Willenbrock <pierre@pirsoft.de> Pierre Willenbrok <pierre@pirsoft.de>
Quentin Glidic <sardemff7+git@sardemff7.net> <sardemff7@sardemff7.net>
RALOVICH, Kristóf <tade60@freemail.hu> <kristof.ralovich@gmail.com>
Richard Li <richardradeon@gmail.com> <RichardZ.Li@amd.com>
# The next ones are not 100% sure
Richard Li <richardradeon@gmail.com> richard <richard@richard-desktop3.(none)>
Richard Li <richardradeon@gmail.com> richard <richard@richard-desktop.(none)>
Richard Li <richardradeon@gmail.com> root <root@richard-desktop.(none)>
Richard Sandiford <rsandifo@linux.vnet.ibm.com> <r.sandiford@uk.ibm.com>
Rob Clark <robclark@freedesktop.org> <Rob Clark robdclark@freedesktop.org>
Rob Clark <robclark@freedesktop.org> <robdclark@gmail.com>
Robert Bragg <robert@sixbynine.org> <robert@linux.intel.com>
Robert Ellison <papillo@vmware.com> <papillo@i965-laptop.(none)>
Robert Ellison <papillo@vmware.com> <papillo@tungstengraphics.com>
Robert Hooker <sarvatt@ubuntu.com> <robert.hooker@canonical.com>
Roland Scheidegger <sroland@vmware.com> <rscheidegger@gmx.ch>
Roland Scheidegger <sroland@vmware.com> <sroland@tungstengraphics.com>
Roy Spliet <rspliet@eclipso.eu> <r.spliet@student.tudelft.nl>
Rune Petersen <rune@megahurts.dk> Rune Peterson <rune@megahurts.dk>
Ryan Houdek <sonicadvance1@gmail.com> <Sonicadvance1@gmail.com>
Sam Hocevar <sam@hocevar.net> Sam Hocevar <sam@zoy.org>
Samuel Iglesias Gonsálvez <siglesias@igalia.com> Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Sean D'Epagnier <sean@depagnier.com> <geckosenator@freedesktop.org>
Serge Martin <edb+mesa@sigluy.net> Serge Martin (EdB) <edb+mesa@sigluy.net>
Serge Martin <edb+mesa@sigluy.net> EdB <edb+mesa@sigluy.net>
Sinclair Yeh <syeh@vmware.com> <sinclair.yeh@intel.com>
Stefan Brüns <stefan.bruens@rwth-aachen.de> <Stefan.Bruens@rwth-aachen.de>
Stéphane Marchesin <marcheu@chromium.org> Stephane Marchesin <marchesin@icps.u-strasbg.fr>
Stéphane Marchesin <marcheu@chromium.org> Stephane Marchesin <stephane.marchesin@gmail.com>
Sven M. Hallberg <pesco@users.sourceforge.net> pesco <pesco>
Tapani Pälli <tapani.palli@intel.com> <tapani.palli@gmail.com>
Tapani Pälli <tapani.palli@intel.com> Tapani <tapani.palli@intel.com>
Thierry Reding <treding@nvidia.com> <thierry@gilfi.de>
Thierry Reding <treding@nvidia.com> <thierry.reding@avionic-design.de>
Thierry Vignaud <thierry.vignaud@gmail.com> <tvignaud@mandriva.com>
Thomas Balling Sørensen <tball@io.dk> <tball@tball-laptop.(none)>
Thomas Hellstrom <thellstrom@vmware.com> Thomas <thellstrom@vmware.com>
Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thellstrom-at-vmware-dot-com>
Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thomas-at-tungstengraphics-dot-com>
Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thomas@tungstengraphics.com>
Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellström <thomas@tungstengraphics.com>
Thomas Tanner <tanner@gmx.net> tanner <tanner>
Tilman Sauerbeck <tilman@code-monkey.de> <tilman@freedesktop.org>
Timothy Arceri <timothy.arceri@collabora.com> <t_arceri@yahoo.com.au>
Timothy Arceri <timothy.arceri@collabora.com> Timothy <t_arceri@yahoo.com.au>
Tom Fogal <tfogal@alumni.unh.edu> <tfogal@sci.utah.edu>
Tom Stellard <thomas.stellard@amd.com> <tstellar@gmail.com>
Tom Stellard <thomas.stellard@amd.com> Thomas Stellard <tom.stellard@amd.com>
Tormod Volden <debian.tormod@gmail.com> <lists.tormod@gmail.com>
Török Edwin <edwin+mesa@etorok.net> Török Edvin <edwintorok@gmail.com>
Török Edwin <edwin+mesa@etorok.net> <edwintorok@gmail.com>
Ville Syrjälä <ville.syrjala@linux.intel.com> Ville Syrjala <syrjala@freedesktop.org>
Ville Syrjälä <ville.syrjala@linux.intel.com> Ville Syrjala <syrjala@sci.fi>
Vincent Lejeune <vljn@ovi.com> <peluche.canard@gmail.com>
Vinson Lee <vlee@freedesktop.org> <vlee@vmware.com>
Zhenyu Wang <zhenyuw@linux.intel.com> Wang Zhenyu <zhenyu.z.wang@intel.com>
Zack Rusin <zackr@vmware.com> <zack@kde.org>
Zack Rusin <zackr@vmware.com> <zack@pixel.(none)>
Zack Rusin <zackr@vmware.com> <zack@tungstengraphics.com>
Zhang <zxpmyth@yahoo.com.cn> zhang <zxpmyth@yahoo.com.cn>

View File

@@ -1,6 +1,7 @@
language: c
sudo: false
sudo: true
dist: trusty
cache:
directories:
@@ -15,7 +16,11 @@ addons:
- libexpat1-dev
- libxcb-dri2-0-dev
- libx11-xcb-dev
- llvm-3.4-dev
- llvm-3.5-dev
# llvm-config is not in the dev package?
- llvm-3.5
# LLVM packaging is broken and misses this dep.
- libedit-dev
- scons
env:
@@ -41,6 +46,16 @@ install:
- export PATH="/usr/lib/ccache:$PATH"
- pip install --user mako
# Since libdrm gets updated in configure.ac regularly, try to pick up the
# latest version from there.
- for line in `grep "^LIBDRM_.*_REQUIRED=" configure.ac`; do
old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;
new_ver=`echo $line | sed 's/.*REQUIRED=//'`;
if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then
export LIBDRM_VERSION="libdrm-$new_ver";
fi;
done
# Install dependencies where we require specific versions (or where
# disallowed by Travis CI's package whitelisting).
@@ -78,22 +93,19 @@ install:
- wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
- tar -jxvf $LIBDRM_VERSION.tar.bz2
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 && make install)
- wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
- tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
- (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Disabled LLVM (and therefore r300 and r600) because the build fails
# with "undefined reference to `clock_gettime'" and "undefined
# reference to `setupterm'" in llvmpipe.
script:
- if test "x$BUILD" = xmake; then
./autogen.sh --enable-debug
--disable-gallium-llvm
--with-egl-platforms=x11,drm
--with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau
--with-gallium-drivers=svga,swrast,vc4,virgl
--with-gallium-drivers=svga,swrast,vc4,virgl,r300,r600
--disable-llvm-shared-libs
;
make && make check;
elif test x$BUILD = xscons; then

View File

@@ -33,6 +33,7 @@ MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
LOCAL_CFLAGS += \
-Wno-unused-parameter \
-Wno-date-time \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
@@ -53,6 +54,7 @@ LOCAL_CFLAGS += \
-DHAVE___BUILTIN_CLZLL \
-DHAVE___BUILTIN_UNREACHABLE \
-DHAVE_PTHREAD=1 \
-DHAVE_DLOPEN \
-fvisibility=hidden \
-Wno-sign-compare
@@ -64,7 +66,6 @@ ifeq ($(strip $(MESA_ENABLE_ASM)),true)
ifeq ($(TARGET_ARCH),x86)
LOCAL_CFLAGS += \
-DUSE_X86_ASM \
-DHAVE_DLOPEN \
endif
endif
@@ -82,9 +83,19 @@ LOCAL_CPPFLAGS += \
-Wno-error=non-virtual-dtor \
-Wno-non-virtual-dtor
ifeq ($(MESA_LOLLIPOP_BUILD),true)
LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"
LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/system/lib64/$(MESA_DRI_MODULE_REL_PATH)\"
else
LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"
endif
# uncomment to keep the debug symbols
#LOCAL_STRIP_MODULE := false
ifeq ($(strip $(LOCAL_MODULE_TAGS)),)
LOCAL_MODULE_TAGS := optional
endif
# Quiet down the build system and remove any .h files from the sources
LOCAL_SRC_FILES := $(patsubst %.h, , $(LOCAL_SRC_FILES))

View File

@@ -42,6 +42,10 @@ $(call local-intermediates-dir)
endef
endif
MESA_DRI_MODULE_REL_PATH := dri
MESA_DRI_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/$(MESA_DRI_MODULE_REL_PATH)
MESA_DRI_MODULE_UNSTRIPPED_PATH := $(TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)/$(MESA_DRI_MODULE_REL_PATH)
MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
MESA_PYTHON2 := python
@@ -84,19 +88,23 @@ MESA_ENABLE_LLVM := $(if $(filter radeonsi,$(MESA_GPU_DRIVERS)),true,false)
ifneq ($(strip $(MESA_GPU_DRIVERS)),)
SUBDIRS := \
src/gbm \
src/loader \
src/mapi \
src/compiler \
src/glsl \
src/mesa \
src/util \
src/egl \
src/intel/genxml \
src/intel/isl \
src/mesa/drivers/dri
INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))
ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)
SUBDIRS += src/gallium
INC_DIRS += $(call all-named-subdir-makefiles,src/gallium)
endif
include $(call all-named-subdir-makefiles,$(SUBDIRS))
include $(INC_DIRS)
endif

View File

@@ -22,20 +22,29 @@
SUBDIRS = src
AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-dri \
--enable-dri3 \
--enable-egl \
--enable-gallium-tests \
--enable-gallium-osmesa \
--enable-gallium-llvm \
--enable-gbm \
--enable-gles1 \
--enable-gles2 \
--enable-glx \
--enable-glx-tls \
--enable-nine \
--enable-opencl \
--enable-opengl \
--enable-va \
--enable-vdpau \
--enable-xa \
--enable-xvmc \
--disable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm \
--enable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \
--with-vulkan-drivers=intel
ACLOCAL_AMFLAGS = -I m4
@@ -53,6 +62,7 @@ noinst_HEADERS = \
include/c99_math.h \
include/c11 \
include/D3D9 \
include/GL/wglext.h \
include/HaikuGL \
include/no_extern_c.h \
include/pci_ids

106
REVIEWERS Normal file
View File

@@ -0,0 +1,106 @@
Overview:
This file is similar in syntax (or more precisly a subset) of what is
used by the MAINTAINERS file in the linux kernel. Some fields do not
apply, for example, in all cases, send patches to:
mesa-dev@lists.freedesktop.org
and in all cases the patchwork instance is:
https://patchwork.freedesktop.org/project/mesa/
The purpose is not exactly the same the MAINTAINERS file in the linux
kernel, as there are not official/formal maintainers of different
subsystems in mesa, but is meant to give an idea of who to CC for
various patches for review, and to allow the use of
scripts/get_reviewer.pl as git --cc-cmd.
Usage:
When sending patches:
git send-email --cc-cmd ./scripts/get_reviewer.pl ...
Or to configure as default:
git config sendemail.cccmd ./scripts/get_reviewer.pl
Descriptions of section entries:
R: Designated reviewer: FullName <address@domain>
These reviewers should be CCed on patches.
F: Files and directories with wildcard patterns.
A trailing slash includes all files and subdirectory files.
F: drivers/net/ all files in and below drivers/net
F: drivers/net/* all files in drivers/net, but not below
F: */net/* all files in "any top level directory"/net
One pattern per line. Multiple F: lines acceptable.
N: Files and directories with regex patterns.
N: [^a-z]tegra all files whose path contains the word tegra
One pattern per line. Multiple N: lines acceptable.
scripts/get_maintainer.pl has different behavior for files that
match F: pattern and matches of N: patterns. By default,
get_maintainer will not look at git log history when an F: pattern
match occurs. When an N: match occurs, git log history is used
to also notify the people that have git commit signatures.
Maintainers List (try to look for most precise areas first)
Note: this is an opt-in system, I have not tried to add anyone who hasn't
either asked me or sent a patch to add themselves.
-----------------------------------
NIR
R: Jason Ekstrand <jason@jlekstrand.net>
F: src/compiler/nir/
DOCUMENTATION
R: Emil Velikov <emil.l.velikov@gmail.com>
F: docs/
F: doxygen/
COMPATIBILITY HEADERS
R: Emil Velikov <emil.l.velikov@gmail.com>
F: include/c99*
DRI LOADER
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/loader/
GALLIUM LOADER
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/gallium/auxiliary/pipe-loader/
F: src/gallium/auxiliary/target-helpers/
GALLIUM TARGETS
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/gallium/targets/
AUTOCONF BUILD
R: Emil Velikov <emil.l.velikov@gmail.com>
F: configure.ac
F: */Automake.inc
F: */Makefile.*am
F: */Makefile.sources
SCONS BUILD
F: scons/
F: */SConscript*
F: */Makefile.sources
ANDROID BUILD
R: Emil Velikov <emil.l.velikov@gmail.com>
F: CleanSpec.mk
F: */Android.*mk
F: */Makefile.sources
WAYLAND EGL SUPPORT
R: Daniel Stone <daniels@collabora.com>
F: src/egl/wayland/*
F: src/egl/drivers/dri2/platform_wayland.c
FREEDRENO
R: Rob Clark <robclark@freedesktop.org>
F: src/gallium/drivers/freedreno/

View File

@@ -1,7 +1,7 @@
#######################################################################
# Top-level SConstruct
#
# For example, invoke scons as
# For example, invoke scons as
#
# scons build=debug llvm=yes machine=x86
#
@@ -12,13 +12,13 @@
# build='debug'
# llvm=True
# machine='x86'
#
#
# Invoke
#
# scons -h
#
# to get the full list of options. See scons manpage for more info.
#
#
import os
import os.path
@@ -36,7 +36,7 @@ common.AddOptions(opts)
env = Environment(
options = opts,
tools = ['gallium'],
toolpath = ['#scons'],
toolpath = ['#scons'],
ENV = os.environ,
)
@@ -53,7 +53,7 @@ else:
print 'scons: warning: targets option is deprecated; pass the targets on their own such as'
print
print ' scons %s' % ' '.join(targets)
print
print
COMMAND_LINE_TARGETS.append(targets)
@@ -84,9 +84,14 @@ env.Append(CPPPATH = [
#print env.Dump()
# Add a check target for running tests
check = env.Alias('check')
env.AlwaysBuild(check)
#######################################################################
# Invoke host SConscripts
#
# Invoke host SConscripts
#
# For things that are meant to be run on the native host build machine, instead
# of the target machine.
#

View File

@@ -1 +1 @@
11.2.0-devel
12.0.6

View File

@@ -37,6 +37,8 @@ cache:
- win_flex_bison-2.4.5.zip
- llvm-3.3.1-msvc2013-mtd.7z
os: Visual Studio 2013
environment:
WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip
LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z
@@ -47,11 +49,13 @@ install:
- python -m pip --version
# Install Mako
- python -m pip install --egg Mako
# Install pywin32 extensions, needed by SCons
- python -m pip install pypiwin32
# Install SCons
- python -m pip install --egg scons==2.4.1
- scons --version
# Install flex/bison
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"
- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul
- set Path=%CD%\winflexbison;%Path%
- win_flex --version
@@ -65,6 +69,9 @@ install:
build_script:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1
after_build:
- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check
# It's possible to setup notification here, as described in
# http://www.appveyor.com/docs/notifications#appveyor-yml-configuration , but

28
bin/.cherry-ignore Normal file
View File

@@ -0,0 +1,28 @@
# The offending commit that this patch (part) reverts isn't in 12.0
be32a2132785fbc119f17e62070e007ee7d17af7 i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable
# The patch depends on the batch_cache work at least.
89f00f749fda4c1beca38f362c7f86bdc6e32785 a4xx: make sure to actually clamp depth as requested
# The patch depends on the 'generic' interoplation and location
# implementation introduced with 2d6dd30a9b30
114874b22beafb2d07006b197c62d717fc7f80cc i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode
# VAAPI encode landed after the branch point.
a5993022275c20061ac025d9adc26c5f9d02afee st/va Avoid VBR bitrate calculation overflow v2
# EGL_KHR_debug landed after the branch point.
17084b6f9340f798111e53e08f5d35c7630cee48 egl: Fix missing unlock in eglGetSyncAttribKHR
# Depends on update_renderbuffer_read_surfaces at least
f2b9b0c730e345bcffa9eadabb25af3ab02642f2 i965: Add missing BRW_NEW_FS_PROG_DATA to render target reads.
# The commit in question hasn't landed in branch
1ef787339774bc7f1cc9c1615722f944005e070c Revert "egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT"
# Patches depend on the fence_finish() gallium API change and corresponding driver work
f240ad98bc05281ea7013d91973cb5f932ae9434 st/mesa: unduplicate st_check_sync code
b687f766fddb7b39479cd9ee0427984029ea3559 st/mesa: allow multiple concurrent waiters in ClientWaitSync
# Commit was reverted shortly after it landed in master
a39ad185932eab4f25a0cb2b112c10d8700ef242 configure.ac: honour LLVM_LIBDIR when linking against LLVM

View File

@@ -40,7 +40,7 @@ else
for i in $urls
do
id=$(echo $i | cut -d'=' -f2)
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"
echo ""
done

35
bin/get-extra-pick-list.sh Executable file
View File

@@ -0,0 +1,35 @@
#!/bin/sh
# Script for generating a list of candidates which fix commits that have been
# previously cherry-picked to a stable branch.
#
# Usage examples:
#
# $ bin/get-extra-pick-list.sh
# $ bin/get-extra-pick-list.sh > picklist
# $ bin/get-extra-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
# XXX: there should be a better way for this
latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\
cut -c -8 |\
while read sha
do
# Check if the original commit is referenced in master
git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\
cut -c -8 |\
while read candidate
do
# Check if the potential fix, hasn't landed in branch yet.
found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`
if test $found = 0
then
echo Commit $candidate might need to be picked, as it references $sha
fi
done
done

View File

@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*12\.0.*mesa-stable\)' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

39
bin/get-typod-pick-list.sh Executable file
View File

@@ -0,0 +1,39 @@
#!/bin/sh
# Script for generating a list of candidates which have typos in the nomination line
#
# Usage examples:
#
# $ bin/get-typod-pick-list.sh
# $ bin/get-typod-pick-list.sh > picklist
# $ bin/get-typod-pick-list.sh | tee picklist
# NB:
# This script intentionally _never_ checks for specific version tag
# Should we consider folding it with the original get-pick-list.sh
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' HEAD..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Check to see if it has already been picked over.
if grep -q ^$sha already_picked ; then
continue
fi
git log -n1 --pretty=oneline $sha | cat
done
rm -f already_picked

View File

@@ -97,6 +97,7 @@ def AddOptions(opts):
opts.Add(BoolOption('embedded', 'embedded build', 'no'))
opts.Add(BoolOption('analyze',
'enable static code analysis where available', 'no'))
opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))
opts.Add('toolchain', 'compiler toolchain', default_toolchain)
opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',
'no'))

View File

@@ -68,7 +68,7 @@ OPENCL_VERSION=1
AC_SUBST([OPENCL_VERSION])
dnl Versions for external dependencies
LIBDRM_REQUIRED=2.4.60
LIBDRM_REQUIRED=2.4.66
LIBDRM_RADEON_REQUIRED=2.4.56
LIBDRM_AMDGPU_REQUIRED=2.4.63
LIBDRM_INTEL_REQUIRED=2.4.61
@@ -110,10 +110,10 @@ LT_INIT([disable-static])
AC_CHECK_PROG(RM, rm, [rm -f])
AX_PROG_BISON([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-parse.c"],
[AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))
AX_PROG_FLEX([],
AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"],
AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-lex.c"],
[AC_MSG_ERROR([flex not found - unable to compile glcpp-lex.l])]))
AC_CHECK_PROG(INDENT, indent, indent, cat)
@@ -223,8 +223,11 @@ AX_GCC_FUNC_ATTRIBUTE([format])
AX_GCC_FUNC_ATTRIBUTE([malloc])
AX_GCC_FUNC_ATTRIBUTE([packed])
AX_GCC_FUNC_ATTRIBUTE([pure])
AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
AX_GCC_FUNC_ATTRIBUTE([unused])
AX_GCC_FUNC_ATTRIBUTE([visibility])
AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
AX_GCC_FUNC_ATTRIBUTE([weak])
AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
@@ -247,7 +250,11 @@ _SAVE_CPPFLAGS="$CPPFLAGS"
dnl Compiler macros
DEFINES="-D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS"
AC_SUBST([DEFINES])
android=no
case "$host_os" in
*-android)
android=yes
;;
linux*|*-gnu*|gnu*)
DEFINES="$DEFINES -D_GNU_SOURCE"
;;
@@ -259,6 +266,8 @@ cygwin*)
;;
esac
AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes)
dnl Add flags for gcc and g++
if test "x$GCC" = xyes; then
CFLAGS="$CFLAGS -Wall"
@@ -513,6 +522,8 @@ else
DEFINES="$DEFINES -DNDEBUG"
fi
DEFAULT_GL_LIB_NAME=GL
dnl
dnl Check if linker supports -Bsymbolic
dnl
@@ -610,6 +621,23 @@ esac
AM_CONDITIONAL(HAVE_COMPAT_SYMLINKS, test "x$HAVE_COMPAT_SYMLINKS" = xyes)
DEFAULT_GL_LIB_NAME=GL
dnl
dnl Libglvnd configuration
dnl
AC_ARG_ENABLE([libglvnd],
[AS_HELP_STRING([--enable-libglvnd],
[Build for libglvnd @<:@default=disabled@:>@])],
[enable_libglvnd="$enableval"],
[enable_libglvnd=no])
AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes)
#AM_COND_IF([USE_LIBGLVND_GLX], [DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"])
if test "x$enable_libglvnd" = xyes ; then
DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"
DEFAULT_GL_LIB_NAME=GLX_mesa
fi
dnl
dnl library names
dnl
@@ -647,13 +675,13 @@ AC_ARG_WITH([gl-lib-name],
[AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@],
[specify GL library name @<:@default=GL@:>@])],
[GL_LIB=$withval],
[GL_LIB=GL])
[GL_LIB="$DEFAULT_GL_LIB_NAME"])
AC_ARG_WITH([osmesa-lib-name],
[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
[specify OSMesa library name @<:@default=OSMesa@:>@])],
[OSMESA_LIB=$withval],
[OSMESA_LIB=OSMesa])
AS_IF([test "x$GL_LIB" = xyes], [GL_LIB=GL])
AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
dnl
@@ -704,8 +732,10 @@ test "x$enable_asm" = xno && AC_MSG_RESULT([no])
if test "x$enable_asm" = xyes -a "x$cross_compiling" = xyes; then
case "$host_cpu" in
i?86 | x86_64 | amd64)
enable_asm=no
AC_MSG_RESULT([no, cross compiling])
if test "x$host_cpu" != "x$target_cpu"; then
enable_asm=no
AC_MSG_RESULT([no, cross compiling])
fi
;;
esac
fi
@@ -754,6 +784,7 @@ if test "x$enable_asm" = xyes; then
esac
fi
AC_HEADER_MAJOR
AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
@@ -796,6 +827,10 @@ dnl to -pthread, which causes problems if we need -lpthread to appear in
dnl pkgconfig files.
test -z "$PTHREAD_LIBS" && PTHREAD_LIBS="-lpthread"
PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
AC_SUBST(PTHREADSTUBS_CFLAGS)
AC_SUBST(PTHREADSTUBS_LIBS)
dnl SELinux awareness.
AC_ARG_ENABLE([selinux],
[AS_HELP_STRING([--enable-selinux],
@@ -856,8 +891,8 @@ AC_ARG_ENABLE([dri3],
[enable_dri3="$enableval"],
[enable_dri3="$dri3_default"])
AC_ARG_ENABLE([glx],
[AS_HELP_STRING([--enable-glx],
[enable GLX library @<:@default=enabled@:>@])],
[AS_HELP_STRING([--enable-glx@<:@=dri|xlib|gallium-xlib@:>@],
[enable the GLX library and choose an implementation @<:@default=auto@:>@])],
[enable_glx="$enableval"],
[enable_glx=yes])
AC_ARG_ENABLE([osmesa],
@@ -923,17 +958,6 @@ AC_ARG_ENABLE([opencl_icd],
@<:@default=disabled@:>@])],
[enable_opencl_icd="$enableval"],
[enable_opencl_icd=no])
AC_ARG_ENABLE([xlib-glx],
[AS_HELP_STRING([--enable-xlib-glx],
[make GLX library Xlib-based instead of DRI-based @<:@default=disabled@:>@])],
[enable_xlib_glx="$enableval"],
[enable_xlib_glx=no])
AC_ARG_ENABLE([r600-llvm-compiler],
[AS_HELP_STRING([--enable-r600-llvm-compiler],
[Enable experimental LLVM backend for graphics shaders @<:@default=disabled@:>@])],
[enable_r600_llvm="$enableval"],
[enable_r600_llvm=no])
AC_ARG_ENABLE([gallium-tests],
[AS_HELP_STRING([--enable-gallium-tests],
@@ -992,36 +1016,86 @@ AM_CONDITIONAL(NEED_OPENGL_COMMON, test "x$enable_opengl" = xyes -o \
"x$enable_gles1" = xyes -o \
"x$enable_gles2" = xyes)
if test "x$enable_glx" = xno; then
AC_MSG_WARN([GLX disabled, disabling Xlib-GLX])
enable_xlib_glx=no
# Validate GLX options
if test "x$enable_glx" = xyes; then
if test "x$enable_dri" = xyes; then
enable_glx=dri
elif test -n "$with_gallium_drivers"; then
enable_glx=gallium-xlib
else
enable_glx=xlib
fi
fi
case "x$enable_glx" in
xdri | xxlib | xgallium-xlib)
# GLX requires OpenGL
if test "x$enable_opengl" = xno; then
AC_MSG_ERROR([GLX cannot be built without OpenGL])
fi
if test "x$enable_dri$enable_xlib_glx" = xyesyes; then
AC_MSG_ERROR([DRI and Xlib-GLX cannot be built together])
# Check individual dependencies
case "x$enable_glx" in
xdri)
if test "x$enable_dri" = xno; then
AC_MSG_ERROR([DRI-based GLX requires DRI to be enabled])
fi
;;
xxlib)
if test "x$enable_dri" = xyes; then
AC_MSG_ERROR([Xlib-based GLX cannot be built with DRI enabled])
fi
;;
xgallium-xlib )
if test "x$enable_dri" = xyes; then
AC_MSG_ERROR([Xlib-based (Gallium) GLX cannot be built with DRI enabled])
fi
if test -z "$with_gallium_drivers"; then
AC_MSG_ERROR([Xlib-based (Gallium) GLX cannot be built without Gallium enabled])
fi
;;
esac
;;
xno)
;;
*)
AC_MSG_ERROR([Illegal value for --enable-glx: $enable_glx])
;;
esac
AM_CONDITIONAL(HAVE_GLX, test "x$enable_glx" != xno)
AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xdri)
AM_CONDITIONAL(HAVE_XLIB_GLX, test "x$enable_glx" = xxlib)
AM_CONDITIONAL(HAVE_GALLIUM_XLIB_GLX, test "x$enable_glx" = xgallium-xlib)
dnl
dnl Libglvnd configuration
dnl
AC_ARG_ENABLE([libglvnd],
[AS_HELP_STRING([--enable-libglvnd],
[Build for libglvnd @<:@default=disabled@:>@])],
[enable_libglvnd="$enableval"],
[enable_libglvnd=no])
AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes)
if test "x$enable_libglvnd" = xyes ; then
dnl XXX: update once we can handle more than libGL/glx.
dnl Namely: we should error out if neither of the glvnd enabled libraries
dnl are built
case "x$enable_glx" in
xno)
AC_MSG_ERROR([cannot build libglvnd without GLX])
;;
xxlib | xgallium-xlib )
AC_MSG_ERROR([cannot build libgvnd when Xlib-GLX or Gallium-Xlib-GLX is enabled])
;;
xdri)
;;
esac
PKG_CHECK_MODULES([GLVND], libglvnd >= 0.1.0)
DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"
DEFAULT_GL_LIB_NAME=GLX_mesa
fi
if test "x$enable_opengl$enable_xlib_glx" = xnoyes; then
AC_MSG_ERROR([Xlib-GLX cannot be built without OpenGL])
fi
# Disable GLX if OpenGL is not enabled
if test "x$enable_glx$enable_opengl" = xyesno; then
AC_MSG_WARN([OpenGL not enabled, disabling GLX])
enable_glx=no
fi
# Disable GLX if DRI and Xlib-GLX are not enabled
if test "x$enable_glx" = xyes -a \
"x$enable_dri" = xno -a \
"x$enable_xlib_glx" = xno; then
AC_MSG_WARN([Neither DRI nor Xlib-GLX enabled, disabling GLX])
enable_glx=no
fi
AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xyes -a \
"x$enable_dri" = xyes)
# Check for libdrm
PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
[have_libdrm=yes], [have_libdrm=no])
@@ -1076,10 +1150,6 @@ dnl
dnl Driver specific build directories
dnl
if test -n "$with_gallium_drivers" -a "x$enable_glx$enable_xlib_glx" = xyesyes; then
NEED_WINSYS_XLIB="yes"
fi
if test "x$enable_gallium_osmesa" = xyes; then
if ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
AC_MSG_ERROR([gallium_osmesa requires the gallium swrast driver])
@@ -1272,8 +1342,8 @@ AC_ARG_ENABLE([driglx-direct],
dnl
dnl libGL configuration per driver
dnl
case "x$enable_glx$enable_xlib_glx" in
xyesyes)
case "x$enable_glx" in
xxlib | xgallium-xlib)
# Xlib-based GLX
dri_modules="x11 xext xcb"
PKG_CHECK_MODULES([XLIBGL], [$dri_modules])
@@ -1283,7 +1353,7 @@ xyesyes)
GL_LIB_DEPS="$GL_LIB_DEPS $SELINUX_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS"
GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $SELINUX_LIBS -lm $PTHREAD_LIBS"
;;
xyesno)
xdri)
# DRI-based GLX
PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
@@ -1310,7 +1380,7 @@ xyesno)
if test x"$enable_dri3" = xyes; then
PKG_CHECK_EXISTS([xcb >= $XCB_REQUIRED], [], AC_MSG_ERROR([DRI3 requires xcb >= $XCB_REQUIRED]))
dri3_modules="xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
dri3_modules="xcb xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
fi
fi
@@ -1372,11 +1442,11 @@ AC_SUBST([HAVE_XF86VIDMODE])
dnl
dnl More GLX setup
dnl
case "x$enable_glx$enable_xlib_glx" in
xyesyes)
case "x$enable_glx" in
xxlib | xgallium-xlib)
DEFINES="$DEFINES -DUSE_XSHM"
;;
xyesno)
xdri)
DEFINES="$DEFINES -DGLX_INDIRECT_RENDERING"
if test "x$driglx_direct" = xyes; then
DEFINES="$DEFINES -DGLX_DIRECT_RENDERING"
@@ -1549,8 +1619,58 @@ if test -n "$with_dri_drivers"; then
DRI_DIRS=`echo $DRI_DIRS|tr " " "\n"|sort -u|tr "\n" " "`
fi
#
# Vulkan driver configuration
#
AC_ARG_WITH([vulkan-drivers],
[AS_HELP_STRING([--with-vulkan-drivers@<:@=DIRS...@:>@],
[comma delimited Vulkan drivers list, e.g.
"intel"
@<:@default=no@:>@])],
[with_vulkan_drivers="$withval"],
[with_vulkan_drivers="no"])
# Doing '--without-vulkan-drivers' will set this variable to 'no'. Clear it
# here so that the script doesn't choke on an unknown driver name later.
case "x$with_vulkan_drivers" in
xyes) with_vulkan_drivers="$VULKAN_DRIVERS_DEFAULT" ;;
xno) with_vulkan_drivers='' ;;
esac
AC_ARG_WITH([vulkan-icddir],
[AS_HELP_STRING([--with-vulkan-icddir=DIR],
[directory for the Vulkan driver icd files @<:@${datarootdir}/vulkan/icd.d@:>@])],
[VULKAN_ICD_INSTALL_DIR="$withval"],
[VULKAN_ICD_INSTALL_DIR='${datarootdir}/vulkan/icd.d'])
AC_SUBST([VULKAN_ICD_INSTALL_DIR])
if test -n "$with_vulkan_drivers"; then
VULKAN_DRIVERS=`IFS=', '; echo $with_vulkan_drivers`
for driver in $VULKAN_DRIVERS; do
case "x$driver" in
xintel)
if test "x$HAVE_I965_DRI" != xyes; then
AC_MSG_ERROR([Intel Vulkan driver requires the i965 dri driver])
fi
if test "x$with_sha1" == "x"; then
AC_MSG_ERROR([Intel Vulkan driver requires SHA1])
fi
HAVE_INTEL_VULKAN=yes;
;;
*)
AC_MSG_ERROR([Vulkan driver '$driver' does not exist])
;;
esac
done
VULKAN_DRIVERS=`echo $VULKAN_DRIVERS|tr " " "\n"|sort -u|tr "\n" " "`
fi
AM_CONDITIONAL(NEED_MEGADRIVER, test -n "$DRI_DIRS")
AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \
AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_glx" = xxlib -o \
"x$enable_osmesa" = xyes -o \
-n "$DRI_DIRS")
@@ -1565,7 +1685,7 @@ AC_ARG_WITH([osmesa-bits],
[osmesa_bits="$withval"],
[osmesa_bits=8])
if test "x$osmesa_bits" != x8; then
if test "x$enable_dri" = xyes -o "x$enable_glx" = xyes; then
if test "x$enable_dri" = xyes -o "x$enable_glx" != xno; then
AC_MSG_WARN([Ignoring OSMesa channel bits because of non-OSMesa driver])
osmesa_bits=8
fi
@@ -1721,7 +1841,12 @@ if test "x$enable_xvmc" = xyes -o \
"x$enable_vdpau" = xyes -o \
"x$enable_omx" = xyes -o \
"x$enable_va" = xyes; then
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
if test x"$enable_dri3" = xyes; then
PKG_CHECK_MODULES([VL], [xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED
x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
else
PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
fi
need_gallium_vl_winsys=yes
fi
AM_CONDITIONAL(NEED_GALLIUM_VL_WINSYS, test "x$need_gallium_vl_winsys" = xyes)
@@ -1735,6 +1860,7 @@ AM_CONDITIONAL(HAVE_ST_XVMC, test "x$enable_xvmc" = xyes)
if test "x$enable_vdpau" = xyes; then
PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED])
gallium_st="$gallium_st vdpau"
DEFINES="$DEFINES -DHAVE_ST_VDPAU"
fi
AM_CONDITIONAL(HAVE_ST_VDPAU, test "x$enable_vdpau" = xyes)
@@ -1873,8 +1999,8 @@ if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then
AC_MSG_ERROR([cannot build egl state tracker without EGL library])
fi
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland_scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland_scanner`,
PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
WAYLAND_SCANNER='')
if test "x$WAYLAND_SCANNER" = x; then
AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner])
@@ -1911,6 +2037,9 @@ for plat in $egl_platforms; do
AC_MSG_ERROR([EGL platform surfaceless requires libdrm >= $LIBDRM_REQUIRED])
;;
android)
;;
*)
AC_MSG_ERROR([EGL platform '$plat' does not exist])
;;
@@ -1931,11 +2060,11 @@ else
EGL_NATIVE_PLATFORM="_EGL_INVALID_PLATFORM"
fi
AM_CONDITIONAL(HAVE_EGL_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
AM_CONDITIONAL(HAVE_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_DRM, echo "$egl_platforms" | grep -q 'drm')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_SURFACELESS, echo "$egl_platforms" | grep -q 'surfaceless')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_NULL, echo "$egl_platforms" | grep -q 'null')
AM_CONDITIONAL(HAVE_EGL_PLATFORM_ANDROID, echo "$egl_platforms" | grep -q 'android')
AM_CONDITIONAL(HAVE_EGL_DRIVER_DRI2, test "x$HAVE_EGL_DRIVER_DRI2" != "x")
@@ -1976,6 +2105,9 @@ AC_ARG_WITH([llvm-prefix],
strip_unwanted_llvm_flags() {
# Use \> (marks the end of the word)
echo `$1` | sed \
-e 's/-march=\S*//g' \
-e 's/-mtune=\S*//g' \
-e 's/-mcpu=\S*//g' \
-e 's/-DNDEBUG\>//g' \
-e 's/-D_GNU_SOURCE\>//g' \
-e 's/-pedantic\>//g' \
@@ -2052,6 +2184,10 @@ if test "x$enable_gallium_llvm" = xyes; then
LLVM_COMPONENTS="engine bitwriter mcjit mcdisassembler"
if $LLVM_CONFIG --components | grep -q inteljitevents ; then
LLVM_COMPONENTS="${LLVM_COMPONENTS} inteljitevents"
fi
if test "x$enable_opencl" = xyes; then
llvm_check_version_for "3" "5" "0" "opencl"
@@ -2191,6 +2327,55 @@ radeon_llvm_check() {
fi
}
swr_llvm_check() {
gallium_require_llvm $1
if test ${LLVM_VERSION_INT} -lt 306; then
AC_MSG_ERROR([LLVM version 3.6 or later required when building $1])
fi
if test "x$enable_gallium_llvm" != "xyes"; then
AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
fi
}
swr_require_cxx_feature_flags() {
feature_name="$1"
preprocessor_test="$2"
option_list="$3"
output_var="$4"
AC_MSG_CHECKING([whether $CXX supports $feature_name])
AC_LANG_PUSH([C++])
save_CXXFLAGS="$CXXFLAGS"
save_IFS="$IFS"
IFS=","
found=0
for opts in $option_list
do
unset IFS
CXXFLAGS="$opts $save_CXXFLAGS"
AC_COMPILE_IFELSE(
[AC_LANG_PROGRAM(
[ #if !($preprocessor_test)
#error
#endif
])],
[found=1; break],
[])
IFS=","
done
IFS="$save_IFS"
CXXFLAGS="$save_CXXFLAGS"
AC_LANG_POP([C++])
if test $found -eq 1; then
AC_MSG_RESULT([$opts])
eval "$output_var=\$opts"
return 0
fi
AC_MSG_RESULT([no])
AC_MSG_ERROR([swr requires $feature_name support])
return 1
}
dnl Duplicates in GALLIUM_DRIVERS_DIRS are removed by sorting it after this block
if test -n "$with_gallium_drivers"; then
gallium_drivers=`IFS=', '; echo $with_gallium_drivers`
@@ -2225,14 +2410,8 @@ if test -n "$with_gallium_drivers"; then
PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
gallium_require_drm "Gallium R600"
gallium_require_drm_loader
if test "x$enable_r600_llvm" = xyes -o "x$enable_opencl" = xyes; then
radeon_llvm_check "r600g"
LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
fi
if test "x$enable_r600_llvm" = xyes; then
USE_R600_LLVM_COMPILER=yes;
fi
if test "x$enable_opencl" = xyes; then
radeon_llvm_check "r600g"
LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
fi
;;
@@ -2263,6 +2442,26 @@ if test -n "$with_gallium_drivers"; then
HAVE_GALLIUM_LLVMPIPE=yes
fi
;;
xswr)
swr_llvm_check "swr"
swr_require_cxx_feature_flags "C++11" "__cplusplus >= 201103L" \
",-std=c++11" \
SWR_CXX11_CXXFLAGS
AC_SUBST([SWR_CXX11_CXXFLAGS])
swr_require_cxx_feature_flags "AVX" "defined(__AVX__)" \
",-mavx,-march=core-avx" \
SWR_AVX_CXXFLAGS
AC_SUBST([SWR_AVX_CXXFLAGS])
swr_require_cxx_feature_flags "AVX2" "defined(__AVX2__)" \
",-mavx2 -mfma -mbmi2 -mf16c,-march=core-avx2" \
SWR_AVX2_CXXFLAGS
AC_SUBST([SWR_AVX2_CXXFLAGS])
HAVE_GALLIUM_SWR=yes
;;
xvc4)
HAVE_GALLIUM_VC4=yes
gallium_require_drm "vc4"
@@ -2352,6 +2551,10 @@ AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \
"x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \
"x$HAVE_GALLIUM_SWR" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
@@ -2373,12 +2576,16 @@ AM_CONDITIONAL(HAVE_R200_DRI, test x$HAVE_R200_DRI = xyes)
AM_CONDITIONAL(HAVE_RADEON_DRI, test x$HAVE_RADEON_DRI = xyes)
AM_CONDITIONAL(HAVE_SWRAST_DRI, test x$HAVE_SWRAST_DRI = xyes)
AM_CONDITIONAL(HAVE_INTEL_VULKAN, test "x$HAVE_INTEL_VULKAN" = xyes)
AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
"x$HAVE_I965_DRI" = xyes)
AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
"x$HAVE_GALLIUM_R600" = xyes -o \
"x$HAVE_GALLIUM_RADEONSI" = xyes)
AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$NEED_WINSYS_XLIB" = xyes)
AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib)
AM_CONDITIONAL(NEED_RADEON_LLVM, test x$NEED_RADEON_LLVM = xyes)
AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
@@ -2387,9 +2594,10 @@ if test "x$USE_VC4_SIMULATOR" = xyes -a "x$HAVE_GALLIUM_ILO" = xyes; then
fi
AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes)
AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
AM_CONDITIONAL(HAVE_COMMON_OSMESA, test "x$enable_osmesa" = xyes -o \
"x$enable_gallium_osmesa" = xyes)
AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
@@ -2421,6 +2629,29 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
AC_SUBST([XA_TINY], $XA_TINY)
AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
AC_SUBST([TIMESTAMP_CMD], '`test $(SOURCE_DATE_EPOCH) && echo $(SOURCE_DATE_EPOCH) || date +%s`')
AC_ARG_ENABLE(valgrind,
[AS_HELP_STRING([--enable-valgrind],
[Build mesa with valgrind support (default: auto)])],
[VALGRIND=$enableval], [VALGRIND=auto])
if test "x$VALGRIND" != xno; then
PKG_CHECK_MODULES(VALGRIND, [valgrind], [have_valgrind=yes], [have_valgrind=no])
fi
AC_MSG_CHECKING([whether to enable Valgrind support])
if test "x$VALGRIND" = xauto; then
VALGRIND="$have_valgrind"
fi
if test "x$VALGRIND" = "xyes"; then
if ! test "x$have_valgrind" = xyes; then
AC_MSG_ERROR([Valgrind support required but not present])
fi
AC_DEFINE([HAVE_VALGRIND], 1, [Use valgrind intrinsics to suppress false warnings])
fi
AC_MSG_RESULT([$VALGRIND])
dnl Restore LDFLAGS and CPPFLAGS
LDFLAGS="$_SAVE_LDFLAGS"
CPPFLAGS="$_SAVE_CPPFLAGS"
@@ -2461,6 +2692,7 @@ AC_CONFIG_FILES([Makefile
src/gallium/drivers/rbug/Makefile
src/gallium/drivers/softpipe/Makefile
src/gallium/drivers/svga/Makefile
src/gallium/drivers/swr/Makefile
src/gallium/drivers/trace/Makefile
src/gallium/drivers/vc4/Makefile
src/gallium/drivers/virgl/Makefile
@@ -2512,6 +2744,10 @@ AC_CONFIG_FILES([Makefile
src/glx/apple/Makefile
src/glx/tests/Makefile
src/gtest/Makefile
src/intel/Makefile
src/intel/genxml/Makefile
src/intel/isl/Makefile
src/intel/vulkan/Makefile
src/loader/Makefile
src/mapi/Makefile
src/mapi/es1api/glesv1_cm.pc
@@ -2538,6 +2774,14 @@ AC_CONFIG_FILES([Makefile
AC_OUTPUT
# Fix up dependencies in *.Plo files, where we changed the extension of a
# source file
$SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_blorp.Plo
$SED -i -e 's/gen6_blorp.cpp/gen6_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen6_blorp.Plo
$SED -i -e 's/gen7_blorp.cpp/gen7_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen7_blorp.Plo
$SED -i -e 's/gen8_blorp.cpp/gen8_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen8_blorp.Plo
dnl
dnl Output some configuration info for the user
dnl
@@ -2576,12 +2820,15 @@ if test "x$enable_dri" != xno; then
echo " DRI driver dir: $DRI_DRIVER_INSTALL_DIR"
fi
case "x$enable_glx$enable_xlib_glx" in
xyesyes)
case "x$enable_glx" in
xdri)
echo " GLX: DRI-based"
;;
xxlib)
echo " GLX: Xlib-based"
;;
xyesno)
echo " GLX: DRI-based"
xgallium-xlib)
echo " GLX: Xlib-based (Gallium)"
;;
*)
echo " GLX: $enable_glx"
@@ -2605,6 +2852,15 @@ if test "$enable_egl" = yes; then
echo " EGL drivers: $egl_drivers"
fi
# Vulkan
echo ""
if test "x$VULKAN_DRIVERS" != x; then
echo " Vulkan drivers: $VULKAN_DRIVERS"
echo " Vulkan ICD dir: $VULKAN_ICD_INSTALL_DIR"
else
echo " Vulkan drivers: no"
fi
echo ""
if test "x$MESA_LLVM" = x1; then
echo " llvm: yes"

View File

@@ -1,490 +0,0 @@
Some parts of Mesa are copyrighted under the GNU LGPL. See the
Mesa/docs/COPYRIGHT file for details.
The following is the standard GNU copyright file.
----------------------------------------------------------------------
GNU LIBRARY GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1991 Free Software Foundation, Inc.
675 Mass Ave, Cambridge, MA 02139, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
[This is the first released version of the library GPL. It is
numbered 2 because it goes with version 2 of the ordinary GPL.]
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.
This license, the Library General Public License, applies to some
specially designated Free Software Foundation software, and to any
other libraries whose authors decide to use it. You can use it for
your libraries, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if
you distribute copies of the library, or if you modify it.
For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you. You must make sure that they, too, receive or can get the source
code. If you link a program with the library, you must provide
complete object files to the recipients so that they can relink them
with the library, after making changes to the library and recompiling
it. And you must show them these terms so they know their rights.
Our method of protecting your rights has two steps: (1) copyright
the library, and (2) offer you this license which gives you legal
permission to copy, distribute and/or modify the library.
Also, for each distributor's protection, we want to make certain
that everyone understands that there is no warranty for this free
library. If the library is modified by someone else and passed on, we
want its recipients to know that what they have is not the original
version, so that any problems introduced by others will not reflect on
the original authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that companies distributing free
software will individually obtain patent licenses, thus in effect
transforming the program into proprietary software. To prevent this,
we have made it clear that any patent must be licensed for everyone's
free use or not licensed at all.
Most GNU software, including some libraries, is covered by the ordinary
GNU General Public License, which was designed for utility programs. This
license, the GNU Library General Public License, applies to certain
designated libraries. This license is quite different from the ordinary
one; be sure to read it in full, and don't assume that anything in it is
the same as in the ordinary license.
The reason we have a separate public license for some libraries is that
they blur the distinction we usually make between modifying or adding to a
program and simply using it. Linking a program with a library, without
changing the library, is in some sense simply using the library, and is
analogous to running a utility program or application program. However, in
a textual and legal sense, the linked executable is a combined work, a
derivative of the original library, and the ordinary General Public License
treats it as such.
Because of this blurred distinction, using the ordinary General
Public License for libraries did not effectively promote software
sharing, because most developers did not use the libraries. We
concluded that weaker conditions might promote sharing better.
However, unrestricted linking of non-free programs would deprive the
users of those programs of all benefit from the free status of the
libraries themselves. This Library General Public License is intended to
permit developers of non-free programs to use free libraries, while
preserving your freedom as a user of such programs to change the free
libraries that are incorporated in them. (We have not seen how to achieve
this as regards changes in header files, but we have achieved it as regards
changes in the actual functions of the Library.) The hope is that this
will lead to faster development of free libraries.
The precise terms and conditions for copying, distribution and
modification follow. Pay close attention to the difference between a
"work based on the library" and a "work that uses the library". The
former contains code derived from the library, while the latter only
works together with the library.
Note that it is possible for a library to be covered by the ordinary
General Public License rather than by this special one.
GNU LIBRARY GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library which
contains a notice placed by the copyright holder or other authorized
party saying it may be distributed under the terms of this Library
General Public License (also called "this License"). Each licensee is
addressed as "you".
A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work
which has been distributed under these terms. A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language. (Hereinafter, translation is
included without limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for
making modifications to it. For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it). Whether that is true depends on what the Library does
and what the program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.
You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.
2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices
stating that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no
charge to all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a
table of data to be supplied by an application program that uses
the facility, other than as an argument passed when the facility
is invoked, then you must make a good faith effort to ensure that,
in the event an application does not supply such function or
table, the facility still operates, and performs whatever part of
its purpose remains meaningful.
(For example, a function in a library to compute square roots has
a purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must
be optional: if the application does not supply it, the square
root function must still compute square roots.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.
In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in
these notices.
Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.
If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.
5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library". Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also compile or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.
You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License. Also, you must do one
of these things:
a) Accompany the work with the complete corresponding
machine-readable source code for the Library including whatever
changes were used in the work (which must be distributed under
Sections 1 and 2 above); and, if the work is an executable linked
with the Library, with the complete machine-readable "work that
uses the Library", as object code and/or source code, so that the
user can modify the Library and then relink to produce a modified
executable containing the modified Library. (It is understood
that the user who changes the contents of definitions files in the
Library will not necessarily be able to recompile the application
to use the modified definitions.)
b) Accompany the work with a written offer, valid for at
least three years, to give the same user the materials
specified in Subsection 6a, above, for a charge no more
than the cost of performing this distribution.
c) If distribution of the work is made by offering access to copy
from a designated place, offer equivalent access to copy the above
specified materials from the same place.
d) Verify that the user has already received a copy of these
materials or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it. However, as a special exception,
the source code distributed need not include anything that is normally
distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.
It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system. Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.
7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:
a) Accompany the combined library with a copy of the same work
based on the Library, uncombined with any other library
facilities. This must be distributed under the terms of the
Sections above.
b) Give prominent notice with the combined library of the fact
that part of it is a work based on the Library, and explaining
where to find the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License. Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License. However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.
9. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Library or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.
10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded. In such case, this License incorporates the limitation as if
written in the body of this License.
13. The Free Software Foundation may publish revised and/or new
versions of the Library General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this. Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.
END OF TERMS AND CONDITIONS
Appendix: How to Apply These Terms to Your New Libraries
If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change. You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).
To apply these terms, attach the following notices to the library. It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.
<one line to give the library's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with this library; if not, write to the Free
Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
Also add information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the
library `Frob' (a library for tweaking knobs) written by James Random Hacker.
<signature of Ty Coon>, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!

View File

@@ -1,13 +1,28 @@
# Status of OpenGL extensions in Mesa
Status of OpenGL 3.x features in Mesa
Here's how to read this file:
all DONE: <driver>, ...
All the extensions are done for the given list of drivers.
Note: when an item is marked as "DONE" it means all the core Mesa
infrastructure is complete but it may be the case that few (if any) drivers
implement the features.
DONE
The extension is done for Mesa and no implementation is necessary on the
driver-side.
DONE ()
The extension is done for Mesa and all the drivers in the "all DONE" list.
OpenGL Core and Compatibility context support
DONE (<driver>, ...)
The extension is done for Mesa, all the drivers in the "all DONE" list, and
all the drivers in the brackets.
in progress
The extension is started but not finished yet.
not started
The extension isn't started yet.
# OpenGL Core and Compatibility context support
OpenGL 3.1 and later versions are only supported with the Core profile.
There are no plans to support GL_ARB_compatibility. The last supported OpenGL
@@ -15,249 +30,248 @@ version with all deprecated features is 3.0. Some of the later GL features
are exposed in the 3.0 context as extensions.
Feature Status
----------------------------------------------------- ------------------------
Feature Status
------------------------------------------------------- ------------------------
GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
glBindFragDataLocation, glGetFragDataLocation DONE
Conditional rendering (GL_NV_conditional_render) DONE ()
Map buffer subranges (GL_ARB_map_buffer_range) DONE ()
Clamping controls (GL_ARB_color_buffer_float) DONE ()
Float textures, renderbuffers (GL_ARB_texture_float) DONE ()
GL_NV_conditional_render (Conditional rendering) DONE ()
GL_ARB_map_buffer_range (Map buffer subranges) DONE ()
GL_ARB_color_buffer_float (Clamping controls) DONE ()
GL_ARB_texture_float (Float textures, renderbuffers) DONE ()
GL_EXT_packed_float DONE ()
GL_EXT_texture_shared_exponent DONE ()
Float depth buffers (GL_ARB_depth_buffer_float) DONE ()
Framebuffer objects (GL_ARB_framebuffer_object) DONE ()
GL_ARB_depth_buffer_float (Float depth buffers) DONE ()
GL_ARB_framebuffer_object (Framebuffer objects) DONE ()
GL_ARB_half_float_pixel DONE (all drivers)
GL_ARB_half_float_vertex DONE ()
GL_EXT_texture_integer DONE ()
GL_EXT_texture_array DONE ()
Per-buffer blend and masks (GL_EXT_draw_buffers2) DONE ()
GL_EXT_draw_buffers2 (Per-buffer blend and masks) DONE ()
GL_EXT_texture_compression_rgtc DONE ()
GL_ARB_texture_rg DONE ()
Transform feedback (GL_EXT_transform_feedback) DONE ()
Vertex array objects (GL_ARB_vertex_array_object) DONE ()
sRGB framebuffer format (GL_EXT_framebuffer_sRGB) DONE ()
GL_EXT_transform_feedback (Transform feedback) DONE ()
GL_ARB_vertex_array_object (Vertex array objects) DONE ()
GL_EXT_framebuffer_sRGB (sRGB framebuffer format) DONE ()
glClearBuffer commands DONE
glGetStringi command DONE
glTexParameterI, glGetTexParameterI commands DONE
glVertexAttribI commands DONE
Depth format cube textures DONE ()
GLX_ARB_create_context (GLX 1.4 is required) DONE
Multisample anti-aliasing DONE (llvmpipe (*), softpipe (*))
Multisample anti-aliasing DONE (llvmpipe (*), softpipe (*), swr (*))
(*) llvmpipe and softpipe have fake Multisample anti-aliasing support
(*) llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
Forward compatible context support/deprecations DONE ()
Instanced drawing (GL_ARB_draw_instanced) DONE ()
Buffer copying (GL_ARB_copy_buffer) DONE ()
Primitive restart (GL_NV_primitive_restart) DONE ()
GL_ARB_draw_instanced (Instanced drawing) DONE ()
GL_ARB_copy_buffer (Buffer copying) DONE ()
GL_NV_primitive_restart (Primitive restart) DONE ()
16 vertex texture image units DONE ()
Texture buffer objs (GL_ARB_texture_buffer_object) DONE for OpenGL 3.1 contexts ()
Rectangular textures (GL_ARB_texture_rectangle) DONE ()
Uniform buffer objs (GL_ARB_uniform_buffer_object) DONE ()
Signed normalized textures (GL_EXT_texture_snorm) DONE ()
GL_ARB_texture_buffer_object (Texture buffer objs) DONE (for OpenGL 3.1 contexts)
GL_ARB_texture_rectangle (Rectangular textures) DONE ()
GL_ARB_uniform_buffer_object (Uniform buffer objs) DONE ()
GL_EXT_texture_snorm (Signed normalized textures) DONE ()
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
Core/compatibility profiles DONE
Geometry shaders DONE ()
BGRA vertex order (GL_ARB_vertex_array_bgra) DONE ()
Base vertex offset(GL_ARB_draw_elements_base_vertex) DONE ()
Frag shader coord (GL_ARB_fragment_coord_conventions) DONE ()
Provoking vertex (GL_ARB_provoking_vertex) DONE ()
Seamless cubemaps (GL_ARB_seamless_cube_map) DONE ()
Multisample textures (GL_ARB_texture_multisample) DONE ()
Frag depth clamp (GL_ARB_depth_clamp) DONE ()
Fence objects (GL_ARB_sync) DONE ()
GL_ARB_vertex_array_bgra (BGRA vertex order) DONE (swr)
GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (swr)
GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (swr)
GL_ARB_provoking_vertex (Provoking vertex) DONE (swr)
GL_ARB_seamless_cube_map (Seamless cubemaps) DONE (swr)
GL_ARB_texture_multisample (Multisample textures) DONE (swr)
GL_ARB_depth_clamp (Frag depth clamp) DONE (swr)
GL_ARB_sync (Fence objects) DONE (swr)
GLX_ARB_create_context_profile DONE
GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
GL_ARB_blend_func_extended DONE ()
GL_ARB_blend_func_extended DONE (swr)
GL_ARB_explicit_attrib_location DONE (all drivers that support GLSL)
GL_ARB_occlusion_query2 DONE ()
GL_ARB_occlusion_query2 DONE (swr)
GL_ARB_sampler_objects DONE (all drivers)
GL_ARB_shader_bit_encoding DONE ()
GL_ARB_texture_rgb10_a2ui DONE ()
GL_ARB_texture_swizzle DONE ()
GL_ARB_timer_query DONE ()
GL_ARB_instanced_arrays DONE ()
GL_ARB_vertex_type_2_10_10_10_rev DONE ()
GL_ARB_shader_bit_encoding DONE (swr)
GL_ARB_texture_rgb10_a2ui DONE (swr)
GL_ARB_texture_swizzle DONE (swr)
GL_ARB_timer_query DONE (swr)
GL_ARB_instanced_arrays DONE (swr)
GL_ARB_vertex_type_2_10_10_10_rev DONE (swr)
GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
GL_ARB_draw_buffers_blend DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_draw_indirect DONE (i965, llvmpipe, softpipe)
GL_ARB_gpu_shader5 DONE (i965)
- 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE (softpipe)
- Dynamically uniform UBO array indices DONE ()
- Implicit signed -> unsigned conversions DONE
- Fused multiply-add DONE ()
- Packing/bitfield/conversion functions DONE (softpipe)
- Enhanced textureGather DONE (softpipe)
- Geometry shader instancing DONE (llvmpipe, softpipe)
- Geometry shader multiple streams DONE ()
- Enhanced per-sample shading DONE ()
- Interpolation functions DONE ()
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 DONE (llvmpipe, softpipe)
GL_ARB_sample_shading DONE (i965, nv50)
GL_ARB_shader_subroutine DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_tessellation_shader DONE (i965)
GL_ARB_texture_buffer_object_rgb32 DONE (i965, llvmpipe, softpipe)
GL_ARB_texture_cube_map_array DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_texture_query_lod DONE (i965, nv50, softpipe)
GL_ARB_transform_feedback2 DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_transform_feedback3 DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_draw_buffers_blend DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (i965, llvmpipe, softpipe, swr)
GL_ARB_gpu_shader5 DONE (i965)
- 'precise' qualifier DONE
- Dynamically uniform sampler array indices DONE (softpipe)
- Dynamically uniform UBO array indices DONE ()
- Implicit signed -> unsigned conversions DONE
- Fused multiply-add DONE ()
- Packing/bitfield/conversion functions DONE (softpipe)
- Enhanced textureGather DONE (softpipe)
- Geometry shader instancing DONE (llvmpipe, softpipe)
- Geometry shader multiple streams DONE ()
- Enhanced per-sample shading DONE ()
- Interpolation functions DONE ()
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 DONE (i965/gen8+, llvmpipe, softpipe)
GL_ARB_sample_shading DONE (i965, nv50)
GL_ARB_shader_subroutine DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_tessellation_shader DONE (i965)
GL_ARB_texture_buffer_object_rgb32 DONE (i965, llvmpipe, softpipe, swr)
GL_ARB_texture_cube_map_array DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod DONE (i965, nv50, softpipe)
GL_ARB_transform_feedback2 DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback3 DONE (i965, nv50, llvmpipe, softpipe, swr)
GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
GL_ARB_ES2_compatibility DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision DONE (all drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bit DONE (llvmpipe, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_ES2_compatibility DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision DONE (all drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bit DONE (i965/gen8+, llvmpipe, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL 4.2, GLSL 4.20:
GL 4.2, GLSL 4.20 -- all DONE: radeonsi
GL_ARB_texture_compression_bptc DONE (i965, nvc0, r600, radeonsi)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, nvc0)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_shader_image_load_store DONE (i965)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_map_buffer_alignment DONE (all drivers)
GL_ARB_texture_compression_bptc DONE (i965, nvc0, r600, radeonsi)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_texture_storage DONE (all drivers)
GL_ARB_transform_feedback_instanced DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_base_instance DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_shader_image_load_store DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_conservative_depth DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_420pack DONE (all drivers that support GLSL 1.30)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_internalformat_query DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_map_buffer_alignment DONE (all drivers)
GL 4.3, GLSL 4.30:
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader DONE (i965)
GL_ARB_copy_image DONE (i965, nv50, nvc0, r600, radeonsi)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
GL_ARB_framebuffer_no_attachments DONE (i965)
GL_ARB_internalformat_query2 in progress (elima)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior not started
GL_ARB_shader_image_size DONE (i965)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_ES3_compatibility DONE (all drivers that support GLSL 3.30)
GL_ARB_clear_buffer_object DONE (all drivers)
GL_ARB_compute_shader DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_copy_image DONE (i965, nv50, nvc0, r600, radeonsi)
GL_KHR_debug DONE (all drivers)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_fragment_layer_viewport DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_internalformat_query2 DONE (all drivers)
GL_ARB_invalidate_subdata DONE (all drivers)
GL_ARB_multi_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_robust_buffer_access_behavior DONE (i965, nvc0, radeonsi)
GL_ARB_shader_image_size DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_buffer_range DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
GL_ARB_texture_query_levels DONE (all drivers that support GLSL 1.30)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_texture_view DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GL 4.4, GLSL 4.40:
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_clear_texture DONE (i965, nv50, nvc0)
GL_ARB_enhanced_layouts in progress (Timothy)
- compile-time constant expressions DONE
- explicit byte offsets for blocks in progress
- forced alignment within blocks in progress
- specified vec4-slot component numbers in progress
- specified transform/feedback layout in progress
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (nvc0)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_stencil8 DONE (nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_clear_texture DONE (i965, nv50, nvc0)
GL_ARB_enhanced_layouts in progress (Timothy)
- compile-time constant expressions DONE
- explicit byte offsets for blocks DONE
- forced alignment within blocks DONE
- specified vec4-slot component numbers in progress
- specified transform/feedback layout DONE
- input/output block locations DONE
GL_ARB_multi_bind DONE (all drivers)
GL_ARB_query_buffer_object DONE (i965/hsw+, nvc0)
GL_ARB_texture_mirror_clamp_to_edge DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_stencil8 DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_vertex_type_10f_11f_11f_rev DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL 4.5, GLSL 4.50:
GL_ARB_ES3_1_compatibility not started
GL_ARB_clip_control DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_conditional_render_inverted DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_cull_distance in progress (Tobias)
GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_texture_barrier DONE (i965, nv50, nvc0, r600, radeonsi)
GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful)
GL_KHR_robust_buffer_access_behavior not started
GL_KHR_robustness 90% done (the ARB variant)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
GL_ARB_ES3_1_compatibility DONE (nvc0, radeonsi)
GL_ARB_clip_control DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_conditional_render_inverted DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_cull_distance DONE (i965, nv50, nvc0, llvmpipe, softpipe)
GL_ARB_derivative_control DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_direct_state_access DONE (all drivers)
GL_ARB_get_texture_sub_image DONE (all drivers)
GL_ARB_shader_texture_image_samples DONE (i965, nv50, nvc0, r600, radeonsi)
GL_ARB_texture_barrier DONE (i965, nv50, nvc0, r600, radeonsi)
GL_KHR_context_flush_control DONE (all - but needs GLX/EGL extension to be useful)
GL_KHR_robustness DONE (i965)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (i965)
GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (i965)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, nvc0)
GL_ARB_shader_image_load_store DONE (i965)
GL_ARB_shader_image_size DONE (i965)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
Multisample textures (GL_ARB_texture_multisample) DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965, nvc0, r600, radeonsi)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600, radeonsi)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30)
GL_ARB_compute_shader DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_draw_indirect DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL)
GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, r600, radeonsi, softpipe)
GL_ARB_program_interface_query DONE (all drivers)
GL_ARB_shader_atomic_counters DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_image_load_store DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_image_size DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shader_storage_buffer_object DONE (i965, nvc0, radeonsi, softpipe)
GL_ARB_shading_language_packing DONE (all drivers)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
GL_ARB_texture_storage_multisample DONE (all drivers that support GL_ARB_texture_multisample)
GL_ARB_vertex_attrib_binding DONE (all drivers)
GS5 Enhanced textureGather DONE (i965, nvc0, r600, radeonsi)
GS5 Packing/bitfield/conversion functions DONE (i965, nvc0, r600, radeonsi)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
Additional functionality not covered above:
glMemoryBarrierByRegion DONE
glGetTexLevelParameter[fi]v - needs updates DONE
glMemoryBarrierByRegion DONE
glGetTexLevelParameter[fi]v - needs updates DONE
glGetBooleani_v - restrict to GLES enums
gl_HelperInvocation support DONE (i965, nvc0, r600)
gl_HelperInvocation support DONE (i965, nvc0, r600, radeonsi)
GLES3.2, GLSL ES 3.2
GL_EXT_color_buffer_float DONE (all drivers)
GL_KHR_blend_equation_advanced not started
GL_KHR_debug DONE (all drivers)
GL_KHR_robustness 90% done (the ARB variant)
GL_KHR_texture_compression_astc_ldr DONE (i965/gen9+)
GL_OES_copy_image not started (based on GL_ARB_copy_image, which is done for some drivers)
GL_OES_draw_buffers_indexed not started
GL_OES_draw_elements_base_vertex DONE (all drivers)
GL_OES_geometry_shader started (Marta)
GL_OES_gpu_shader5 not started (based on parts of GL_ARB_gpu_shader5, which is done for some drivers)
GL_OES_primitive_bounding box not started
GL_OES_sample_shading not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
GL_OES_sample_variables not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
GL_OES_shader_image_atomic not started (based on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
GL_OES_shader_io_blocks not started (based on parts of GLSL 1.50, which is done)
GL_OES_shader_multisample_interpolation not started (based on parts of GL_ARB_gpu_shader5, which is done)
GL_OES_tessellation_shader not started (based on GL_ARB_tessellation_shader, which is done for some drivers)
GL_OES_texture_border_clamp not started (based on GL_ARB_texture_border_clamp, which is done)
GL_OES_texture_buffer not started (based on GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and GL_ARB_texture_buffer_object_rgb32 that are all done)
GL_OES_texture_cube_map_array not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
GL_OES_texture_stencil8 DONE (all drivers that support GL_ARB_texture_stencil8)
GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample)
GL_EXT_color_buffer_float DONE (all drivers)
GL_KHR_blend_equation_advanced not started
GL_KHR_debug DONE (all drivers)
GL_KHR_robustness DONE (i965)
GL_KHR_texture_compression_astc_ldr DONE (i965/gen9+)
GL_OES_copy_image DONE (i965)
GL_OES_draw_buffers_indexed DONE (all drivers that support GL_ARB_draw_buffers_blend)
GL_OES_draw_elements_base_vertex DONE (all drivers)
GL_OES_geometry_shader started (idr)
GL_OES_gpu_shader5 DONE (all drivers that support GL_ARB_gpu_shader5)
GL_OES_primitive_bounding_box not started
GL_OES_sample_shading DONE (i965, nvc0, r600, radeonsi)
GL_OES_sample_variables DONE (i965, nvc0, r600, radeonsi)
GL_OES_shader_image_atomic DONE (all drivers that support GL_ARB_shader_image_load_store)
GL_OES_shader_io_blocks DONE (i965/gen8+, nvc0, radeonsi)
GL_OES_shader_multisample_interpolation DONE (i965, nvc0, r600, radeonsi)
GL_OES_tessellation_shader started (Ken)
GL_OES_texture_border_clamp DONE (all drivers)
GL_OES_texture_buffer DONE (i965, nvc0, radeonsi)
GL_OES_texture_cube_map_array not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
GL_OES_texture_stencil8 DONE (all drivers that support GL_ARB_texture_stencil8)
GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample)
More info about these features and the work involved can be found at
http://dri.freedesktop.org/wiki/MissingFunctionality

View File

@@ -18,7 +18,9 @@
<p>
Primary Mesa download site:
<a href="ftp://ftp.freedesktop.org/pub/mesa/">freedesktop.org</a> (FTP)
<a href="ftp://ftp.freedesktop.org/pub/mesa/">ftp.freedesktop.org</a> (FTP)
or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>
(HTTP).
</p>
<p>

View File

@@ -89,9 +89,11 @@ types such as <code>EGLNativeDisplayType</code> or
<p>The available platforms are <code>x11</code>, <code>drm</code>,
<code>wayland</code>, <code>surfaceless</code>, <code>android</code>,
and <code>haiku</code>. The <code>android</code> platform
can only be built as a system component, part of AOSP, while the
<code>haiku</code> platform can only be built with SCons.
and <code>haiku</code>.
The <code>android</code> platform can either be built as a system
component, part of AOSP, using <code>Android.mk</code> files, or
cross-compiled using appropriate <code>configure</code> options.
The <code>haiku</code> platform can only be built with SCons.
Unless for special needs, the build system should
select the right platforms automatically.</p>

View File

@@ -163,6 +163,10 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.
<li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>
<li>nodualobj - suppress generation of dual-object geometry shader code</li>
<li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>
<li>vec4 - force vec4 mode in vertex shader</li>
<li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>
<li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>
<li>norbc - disable single sampled render buffer compression</li>
</ul>
</ul>

View File

@@ -16,6 +16,33 @@
<h1>News</h1>
<h2>May 9, 2016</h2>
<p>
<a href="relnotes/11.1.4.html">Mesa 11.1.4</a> and
<a href="relnotes/11.2.2.html">Mesa 11.2.2</a> are released.
These are bug-fix releases from the 11.1 and 11.2 branches, respectively.
<br>
NOTE: It is anticipated that 11.1.4 will be the final release in the 11.1.4
series. Users of 11.1 are encouraged to migrate to the 11.2 series in order
to obtain future fixes.
</p>
<h2>April 17, 2016</h2>
<p>
<a href="relnotes/11.1.3.html">Mesa 11.1.3</a> and
<a href="relnotes/11.2.1.html">Mesa 11.2.1</a> are released.
These are bug-fix releases from the 11.1 and 11.2 branches, respectively.
</p>
<h2>April 4, 2016</h2>
<p>
<a href="relnotes/11.2.0.html">Mesa 11.2.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>February 10, 2016</h2>
<p>
<a href="relnotes/11.1.2.html">Mesa 11.1.2</a> is released.

View File

@@ -73,8 +73,7 @@ The following are required for DRI-based hardware acceleration with Mesa:
<ul>
<li><a href="http://xorg.freedesktop.org/releases/individual/proto/">
dri2proto</a> version 2.6 or later
<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a>
version 2.4.33 or later
<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a> latest version
<li>Xorg server version 1.5 or later
<li>Linux 2.6.28 or later
</ul>

View File

@@ -46,10 +46,10 @@ library</em>. <br>
<p>
The Mesa distribution consists of several components. Different copyrights
and licenses apply to different components. For example, some demo programs
are copyrighted by SGI, some of the Mesa device drivers are copyrighted by
their authors. See below for a list of Mesa's main components and the license
for each.
and licenses apply to different components.
For example, the GLX client code uses the SGI Free Software License B, and
some of the Mesa device drivers are copyrighted by their authors.
See below for a list of Mesa's main components and the license for each.
</p>
<p>
The core Mesa library is licensed according to the terms of the MIT license.
@@ -97,13 +97,17 @@ and their respective licenses.
<pre>
Component Location License
------------------------------------------------------------------
Main Mesa code src/mesa/ Mesa (MIT)
Main Mesa code src/mesa/ MIT
Device drivers src/mesa/drivers/* MIT, generally
Gallium code src/gallium/ MIT
Ext headers include/GL/glext.h Khronos
include/GL/glxext.h
GLX client code src/glx/ SGI Free Software License B
C11 thread include/c11/threads*.h Boost (permissive)
emulation
</pre>

View File

@@ -21,6 +21,11 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>
<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>
<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>
<li><a href="relnotes/11.1.3.html">11.1.3 release notes</a>
<li><a href="relnotes/11.2.0.html">11.2.0 release notes</a>
<li><a href="relnotes/11.1.2.html">11.1.2 release notes</a>
<li><a href="relnotes/11.0.9.html">11.0.9 release notes</a>
<li><a href="relnotes/11.1.1.html">11.1.1 release notes</a>

319
docs/relnotes/11.1.3.html Normal file
View File

@@ -0,0 +1,319 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.1.3 Release Notes / April 17, 2016</h1>
<p>
Mesa 11.1.3 is a bug fix release which fixes bugs found since the 11.1.2 release.
</p>
<p>
Mesa 11.1.3 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9e86c72b6b2e8adb53c1c4a0002ab267b45094d753eb9404b1db34f81ce94ccf mesa-11.1.3.tar.gz
51f6658a214d75e4d9f05207586d7ed56ebba75c6b10841176fb6675efa310ac mesa-11.1.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27512">Bug 27512</a> - Illegal instruction _mesa_x86_64_transform_points4_general</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93358">Bug 93358</a> - [HSW] Unreal Elemental demo - assertion error in copy_image_with_blitter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93418">Bug 93418</a> - Geometry Shaders output wrong vertices on Sandy Bridge</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93813">Bug 93813</a> - Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94050">Bug 94050</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94073">Bug 94073</a> - Miscompilation of abs_vec3_vert_xvary_ref.vert in WebGL conformance</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94088">Bug 94088</a> - [llvmpipe] SIGFPE pthread_barrier_destroy.c:40</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94195">Bug 94195</a> - [llvmpipe] Does not build with LLVM 3.7.x on Windows</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed tecture's color encoding and render incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94954">Bug 94954</a> - test_vec4_copy_propagation fails in `make check`</li>
</ul>
<h2>Changes</h2>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Fix assert conditions for src/dst x/y offsets</li>
</ul>
<p>Ben Widawsky (2):</p>
<ul>
<li>i965: Make sure we blit a full compressed block</li>
<li>i965/skl: Add two missing device IDs</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT</li>
</ul>
<p>Chris Forbes (1):</p>
<ul>
<li>i965/blorp: Fix hiz ops on MSAA surfaces</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>radeon/uvd: disable MPEG1</li>
</ul>
<p>Christian Schmidbauer (1):</p>
<ul>
<li>st/nine: specify WINAPI only for i386 and amd64</li>
</ul>
<p>Daniel Czarnowski (3):</p>
<ul>
<li>egl_dri2: NULL check for xcb_dri2_get_buffers_reply()</li>
<li>egl_dri2: set correct error code if swapbuffers fails</li>
<li>egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...)</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>mesa/fbobject: propogate Layered when reusing attachments.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage</li>
</ul>
<p>Dongwon Kim (1):</p>
<ul>
<li>egl: move Null check to eglGetSyncAttribKHR to prevent Segfault</li>
</ul>
<p>Emil Velikov (10):</p>
<ul>
<li>docs: add sha256 checksums for 11.1.2</li>
<li>get-pick-list.sh: Require explicit "11.1" for nominating stable patches</li>
<li>cherry-ignore: do not pick nv50/ir commit</li>
<li>automake: add nine to make distcheck</li>
<li>install-gallium-links: port changes from install-lib-links</li>
<li>automake: add more missing options for make distcheck</li>
<li>mesa; add get-extra-pick-list.sh script into bin/</li>
<li>egl/x11: check the return value of xcb_dri2_get_buffers_reply()</li>
<li>nvc/ir: remove duplicate variable declaration</li>
<li>Update version to 11.1.3</li>
</ul>
<p>Francisco Jerez (4):</p>
<ul>
<li>i965: Reupload push and pull constants when we get new shader image unit state.</li>
<li>i965/fs: Add missing analysis invalidation in opt_sampler_eot().</li>
<li>i965/fs: Add missing analysis invalidation in fixup_3src_null_dest().</li>
<li>i965/vec4: Consider removal of no-op MOVs as progress during register coalesce.</li>
</ul>
<p>Ilia Mirkin (21):</p>
<ul>
<li>nvc0/ir: fix converting between predicate and gpr</li>
<li>nvc0: add some missing PUSH_SPACE's</li>
<li>nvc0: avoid negatives in PUSH_SPACE argument</li>
<li>glsl: make sure builtins are initialized before getting the shader</li>
<li>glsl: return cloned signature, not the builtin one</li>
<li>nv50/ir: fix quadop emission in the presence of predication</li>
<li>st/mesa: fix up result_src.type when doing i2u/u2i conversions</li>
<li>meta/copy_image: use precomputed dst_internal_format to avoid segfault</li>
<li>st/mesa: force depth mode to GL_RED for sized depth/stencil formats</li>
<li>glx: update to updated version of EXT_create_context_es2_profile</li>
<li>nv50,nvc0: bump minimum texture buffer offset alignment</li>
<li>nvc0: reset TFB bufctx when we no longer hold a reference to the buffers</li>
<li>glsl: avoid stack smashing when there are too many attributes</li>
<li>nvc0: fix blit triangle size to fully cover FB's &gt; 8192x8192</li>
<li>nv50: reset TFB bufctx when we no longer hold a reference to the buffers</li>
<li>nv50/ir: force-enable derivatives on TXD ops</li>
<li>st/mesa: only minify depth for 3d targets</li>
<li>nv50/ir: fix indirect texturing for non-array textures on nvc0</li>
<li>nvc0/ir: fix picking of coordinates from tex instruction for textureGrad</li>
<li>nvc0: disable primitive restart and index bias during blits</li>
<li>nv50/ir: we can't load local memory directly into an output</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>nir/lower_vec_to_movs: Better report channels handled by insert_mov</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>mesa: Make glGet queries initialize ctx-&gt;Debug when necessary.</li>
<li>mesa: Allow Get*() of several forgotten IsEnabled() pnames.</li>
<li>i965: Only magnify depth for 3D textures, not array textures.</li>
</ul>
<p>Koop Mast (1):</p>
<ul>
<li>st/clover: Add libelf cflags to the build</li>
</ul>
<p>Marc-André Lureau (1):</p>
<ul>
<li>virtio_gpu: Add virtio 1.0 PCI ID to driver map</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>radeonsi: fix Hyper-Z on Stoney</li>
<li>gallium/radeon: don't use temporary buffers for persistent mappings</li>
<li>radeonsi: fix Hyper-Z hangs on P2 configs</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>i965/vec4: don't copy ATTR into 3src instructions with complex swizzles</li>
<li>i965/fs: Don't CSE negated multiplies with saturation.</li>
<li>i965/vec4: Update vec4 unit tests for commit 01dacc83ff.</li>
</ul>
<p>Nanley Chery (2):</p>
<ul>
<li>mesa/image: Make _mesa_clip_readpixels() work with renderbuffers</li>
<li>mesa/readpix: Clip ReadPixels() area to the ReadBuffer's</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>r600g: clear compressed_depthtex/colortex_mask when binding buffer texture</li>
<li>st/mesa: use the texture view's format for render-to-texture</li>
</ul>
<p>Nishanth Peethambaran (2):</p>
<ul>
<li>st/omx: Remove trailing spaces</li>
<li>st/omx/dec: Correct the timestamping</li>
</ul>
<p>Oded Gabbay (8):</p>
<ul>
<li>gallium/radeon: Correctly translate colorswaps for big endian</li>
<li>llvmpipe: use vpkswss when dst is signed</li>
<li>gallium/radeon: return correct values for BE in r600_translate_colorswap</li>
<li>gallium/radeon: remove separate BE path in r600_translate_colorswap</li>
<li>gallium/r600: Don't let h/w do endian swap for colorformat</li>
<li>gallium/radeon: disable evergreen_do_fast_color_clear for BE</li>
<li>r600g: Do colorformat endian swap for PIPE_USAGE_STAGING</li>
<li>radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING</li>
</ul>
<p>Olivier Pena (1):</p>
<ul>
<li>scons: support for LLVM 3.7.</li>
</ul>
<p>Patrick Baggett (1):</p>
<ul>
<li>mesa: Use SSE prefetch instructions rather than 3DNow instructions</li>
</ul>
<p>Rob Herring (10):</p>
<ul>
<li>Android: remove dependence on .SECONDEXPANSION</li>
<li>Android: glsl: fix dependence on YACC_HEADER_SUFFIX from build system</li>
<li>Android: add -Wno-date-time flag for clang</li>
<li>Android: remove headers from LOCAL_SRC_FILES</li>
<li>Android: clean-up and fix DRI module path handling</li>
<li>freedreno: drop unnecessary -Wno-packed-bitfield-compat</li>
<li>gallium/radeon: Add space between string literal and identifier</li>
<li>r600: Make enum alu_op_flags unsigned</li>
<li>virtio_gpu: Add PCI ID to driver map</li>
<li>Android: fix x86 gallium builds</li>
</ul>
<p>Roland Scheidegger (2):</p>
<ul>
<li>softpipe: fix anisotropic filtering crash</li>
<li>draw: fix line stippling</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>nvc0: make sure to delete samplers used by compute shaders</li>
</ul>
<p>Steinar H. Gunderson (1):</p>
<ul>
<li>mesa: Fix locking of GLsync objects.</li>
</ul>
<p>Tamil velan (1):</p>
<ul>
<li>radeon/uvd: increase max height to 4096 for VI and newer</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>winsys/svga: Fix an uninitialized return value</li>
<li>winsys/svga: Increase the fence timeout</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>llvmpipe: Do not use barriers if not using threads.</li>
</ul>
<p>xavier (1):</p>
<ul>
<li>r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.</li>
</ul>
</div>
</body>
</html>

182
docs/relnotes/11.1.4.html Normal file
View File

@@ -0,0 +1,182 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.1.4 Release Notes / May 9, 2016</h1>
<p>
Mesa 11.1.4 is a bug fix release which fixes bugs found since the 11.1.3 release.
</p>
<p>
Mesa 11.1.4 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
034231fffb22621dadb8e4a968cb44752b8b68db7a2417568d63c275b3490cea mesa-11.1.4.tar.gz
0f781e9072655305f576efd4204d183bf99ac8cb8d9e0dd9fc2b4093230a0eba mesa-11.1.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (1):</p>
<ul>
<li>gallium/util: initialize pipe_framebuffer_state to zeros</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>dri: Fix robust context creation via EGL attribute</li>
</ul>
<p>Egbert Eich (1):</p>
<ul>
<li>dri2: Check for dummyContext to see if the glx_context is valid</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 11.1.3</li>
<li>cherry-ignore: add non-applicable "fix of a fix"</li>
<li>cherry-ignore: ignore st_DrawAtlasBitmaps mem leak fix</li>
<li>cherry-ignore: add CodeEmitterGK110::emitATOM() fix</li>
<li>Update version to 11.1.4</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>vc4: Fix subimage accesses to LT textures.</li>
<li>vc4: Add support for rendering to cube map surfaces.</li>
<li>vc4: Fix tests for format supported with nr_samples == 1.</li>
<li>vc4: Make sure we recompile when sample_mask changes.</li>
</ul>
<p>Frederic Devernay (1):</p>
<ul>
<li>glapi: fix _glapi_get_proc_address() for mangled function names</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions</li>
<li>i965/tiled_memcpy: Rework the RGBA -&gt; BGRA mem_copy functions</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>egl/x11: authenticate before doing chipset id ioctls</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>winsys/sw/xlib: use correct free function for xlib_dt-&gt;data</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/uvd: fix tonga feedback buffer size</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>drirc: add a workaround for blackness in Warsow</li>
<li>st/mesa: fix blit-based GetTexImage for non-finalized textures</li>
</ul>
<p>Nicolai Hähnle (5):</p>
<ul>
<li>radeonsi: fix bounds check in si_create_vertex_elements</li>
<li>gallium/radeon: handle failure when mapping staging buffer</li>
<li>st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor</li>
<li>gallium/radeon: fix crash in r600_set_streamout_targets</li>
<li>radeonsi: correct NULL-pointer check in si_upload_const_buffer</li>
</ul>
<p>Oded Gabbay (4):</p>
<ul>
<li>r600g/radeonsi: send endian info to format translation functions</li>
<li>r600g: set endianess of 16/32-bit buffers according to do_endian_swap</li>
<li>r600g: use do_endian_swap in color swapping functions</li>
<li>r600g: use do_endian_swap in texture swapping function</li>
</ul>
<p>Roland Scheidegger (3):</p>
<ul>
<li>llvmpipe: (trivial) initialize src1_alpha var to NULL</li>
<li>gallivm: fix bogus argument order to lp_build_sample_mipmap function</li>
<li>gallivm: make sampling more robust against bogus coordinates</li>
</ul>
<p>Samuel Pitoiset (5):</p>
<ul>
<li>gk110/ir: make use of IMUL32I for all immediates</li>
<li>nvc0/ir: fix wrong emission of (a OP b) OP c</li>
<li>gk110/ir: add emission for (a OP b) OP c</li>
<li>nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+</li>
<li>st/glsl_to_tgsi: fix potential crash when allocating temporaries</li>
</ul>
<p>Stefan Dirsch (1):</p>
<ul>
<li>dri3: Check for dummyContext to see if the glx_context is valid</li>
</ul>
<p>Thomas Hindoe Paaboel Andersen (1):</p>
<ul>
<li>st/va: avoid dereference after free in vlVaDestroyImage</li>
</ul>
<p>WuZhen (3):</p>
<ul>
<li>tgsi: initialize stack allocated struct</li>
<li>winsys/sw/dri: use correct free function for dri_sw_dt-&gt;data</li>
<li>android: enable dlopen() on all architectures</li>
</ul>
</div>
</body>
</html>

View File

@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.2.0 Release Notes / TBD</h1>
<h1>Mesa 11.2.0 Release Notes / 4 April 2016</h1>
<p>
Mesa 11.2.0 is a new development release.
@@ -33,7 +33,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD.
dea3d8143929aad5c24ef0993ddb05807b30c284b488fc62903adfcc1c127887 mesa-11.2.0.tar.gz
1c1fed2674abf3f16ed2623e9a5694d6752c293194e18462ebc644a19cfaafb2 mesa-11.2.0.tar.xz
</pre>
@@ -70,7 +71,217 @@ Note: some of the new features are only available with certain drivers.
<h2>Bug fixes</h2>
TBD.
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27512">Bug 27512</a> - Illegal instruction _mesa_x86_64_transform_points4_general</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75165">Bug 75165</a> - compute.c:464:49: error: function definition is not allowed here</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79783">Bug 79783</a> - Distorted output in obs-studio where other vendors &quot;work&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89330">Bug 89330</a> - piglit glsl-1.50 invariant-qualifier-in-out-block-01 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89969">Bug 89969</a> - nouveau: add support for chunk decoding in order to support vaapi (st/va)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90348">Bug 90348</a> - Spilling failure of b96 merged value</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91806">Bug 91806</a> - configure does not test whether assembler supports sse4.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91927">Bug 91927</a> - [SKL] [regression] piglit compressed textures tests fail with kernel upgrade</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92229">Bug 92229</a> - [APITRACE] SOMA have serious graphical errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92233">Bug 92233</a> - Unigine Heaven 4.0 silhuette run</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92438">Bug 92438</a> - Segfault in pushbuf_kref when running the android emulator (qemu) on nv50</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92589">Bug 92589</a> - [BDW BSW SKL CTS] ES31-CTS.texture_gather.* GPU_HANG</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92595">Bug 92595</a> - [HSW,BDW,SKL][GLES 3.1 CTS] Big difference in the results for the ES31-CTS.shader_bitfield_operation.* tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92609">Bug 92609</a> - [BDW, BSW] piglit sampling-2d-array-as-2d-layer fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92687">Bug 92687</a> - Add support for ARB_internalformat_query2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92706">Bug 92706</a> - glBlitFramebuffer refuses to blit RGBA to RGB with MSAA</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92709">Bug 92709</a> - &quot;LLVM triggered Diagnostic Handler: unsupported call to function ldexpf in main&quot; when starting race in stuntrally</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92743">Bug 92743</a> - Centroid shouldn't have to match between the FS and the VS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92759">Bug 92759</a> - [Regression, bisected] Visuals without alpha bits are not sRGB-capable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92849">Bug 92849</a> - [IVB HSW BDW] piglit image load/store load-from-cleared-image.shader_test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92909">Bug 92909</a> - Offset/alignment issue with layout std140 and vec3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93004">Bug 93004</a> - Guild Wars 2 crash on nouveau DX11 cards</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93048">Bug 93048</a> - [CTS regression] mesa af2723 breaks GL Conformance for debug extension</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93063">Bug 93063</a> - drm_helper.h:227:1: error: static declaration of pipe_virgl_create_screen follows non-static declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93091">Bug 93091</a> - [opencl] segfault when running any opencl programs (like clinfo)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93092">Bug 93092</a> - lp_test_format regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93126">Bug 93126</a> - wrongly claim supporting GL_EXT_texture_rg</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93180">Bug 93180</a> - [regression] arb_separate_shader_objects.active sampler conflict fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93189">Bug 93189</a> - &quot;./util/u_inlines.h&quot;, line 83: operands have incompatible types: void &quot;:&quot; int</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93215">Bug 93215</a> - [Regression bisected] Ogles1conform Automatic mipmap generation test is fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93235">Bug 93235</a> - [regression] dispatch sanity broken by GetPointerv</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93257">Bug 93257</a> - [SKL, bisected] ASTC dEQP tests segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93264">Bug 93264</a> - Tonga VM Faults since llvm ScheduleDAGInstrs: Rework schedule graph builder.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93266">Bug 93266</a> - gl_arb_shading_language_420pack does not allow binding of image variables</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93300">Bug 93300</a> - Two Worlds 2 renders water incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93312">Bug 93312</a> - [SKL][GLES 3.1 CTS] ES31-CTS.layout_binding* GPU_HANG</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93320">Bug 93320</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93322">Bug 93322</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.resource-ubo fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93323">Bug 93323</a> - [HSW,BDW,SKL][GLES 3.1 CTS]ES31-CTS.shader_image_load_store.basic-allTargets-store-fs fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93325">Bug 93325</a> - [HSW,BDW,SKL]ES31-CTS.explicit_uniform_location.uniform-loc-* 2 tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93339">Bug 93339</a> - glLinkProgram() should fail when a varying is never written to in a previous stage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93348">Bug 93348</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.* segfault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93358">Bug 93358</a> - [HSW] Unreal Elemental demo - assertion error in copy_image_with_blitter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93387">Bug 93387</a> - inverse() shouldnt be exposed in GLSL 1.20 and 1.30</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93388">Bug 93388</a> - [i965, regression, bisection] MESA_FORMAT_B8G8R8X8_SRGB changes break kwin</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93407">Bug 93407</a> - [SKL][GLES 3.1 CTS]ES31-CTS.compute_shader.resources-texture fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93410">Bug 93410</a> - [BDW,SKL][GLES 3.1 CTS]ES31-CTS.shader_image_load_store.negative-linkErrors fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93418">Bug 93418</a> - Geometry Shaders output wrong vertices on Sandy Bridge</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93426">Bug 93426</a> - [SKL,BDW,BSW,BXT] CTS regression: es2-cts.gtf.gl2fixedtests.buffer_objects.buffer_object,s</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93526">Bug 93526</a> - GfxBench 4 tessellation demos misrender</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93532">Bug 93532</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.*. Regression, bisected.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93540">Bug 93540</a> - [BISECTED, HSW] Rendering issue in Heaven (and other benchmarks)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93560">Bug 93560</a> - opt_combine_constants failing fabsf(reg-&gt;f) == table.imm[i].val assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93599">Bug 93599</a> - Strange green flashes with &quot;Metro: Last Light Redux&quot; + &quot;Metro 2033 Redux&quot; with Intel Mesa driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93648">Bug 93648</a> - Random lines being rendered when playing Dolphin (geometry shaders related, w/ apitrace)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93650">Bug 93650</a> - GL_ARB_separate_shader_objects is buggy (PCSX2)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93696">Bug 93696</a> - [HSW,BDW;SKL][GLES 3.1 CTS]ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-* fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93700">Bug 93700</a> - [SKL, regression] deqp-gles2.functional.texture.completeness</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93717">Bug 93717</a> - Meta mipmap generation can corrupt texture state</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93722">Bug 93722</a> - Segfault when compiling shader with a subroutine that takes a parameter</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93725">Bug 93725</a> - [HSW, regression, bisected] ES31-CTS.texture_gather.*depth*</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93731">Bug 93731</a> - glUniformSubroutinesuiv segfaults when subroutine uniform is bound to a specific location</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93761">Bug 93761</a> - A conditional discard in a fragment shader causes no depth writing at all</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93790">Bug 93790</a> - [HSW] Use after free with compute programs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93792">Bug 93792</a> - [HSW] intel_mipmap_tree.c:1325: intel_miptree_copy_slice: Assertion `src_mt-&gt;format == dst_mt-&gt;format</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93813">Bug 93813</a> - Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93840">Bug 93840</a> - [i965] Alien: Isolation fails with GL_ARB_compute_shader enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93862">Bug 93862</a> - [Bisected] &quot;drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2&quot; is bad</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93878">Bug 93878</a> - [llvmpipe][softpipe] piglit arb_gpu_shader_fp64-double-gettransformfeedbackvarying regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93957">Bug 93957</a> - [HSW] Mishandling of sample count when using an attachment-less framebuffer (assertion error)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93961">Bug 93961</a> - virgl build failure after 2016-02-01 changes - no previous prototype for 'virgl_drm_winsys_create'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93989">Bug 93989</a> - build: flex-2.5.39 seems to be failing for glsl_lexer.ll</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94016">Bug 94016</a> - make check MesaExtensionsTest.AlphabeticallySorted regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94019">Bug 94019</a> - [bisected] 3D acceleration broken with gallium/radeon: just get num_tile_pipes from the winsys</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94050">Bug 94050</a> - test_vec4_register_coalesce regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94073">Bug 94073</a> - Miscompilation of abs_vec3_vert_xvary_ref.vert in WebGL conformance</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94081">Bug 94081</a> - [HSW] compute shader shared var + atomic op = fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94088">Bug 94088</a> - [llvmpipe] SIGFPE pthread_barrier_destroy.c:40</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94091">Bug 94091</a> - Tonga unreal elemental segfault since radeonsi: put image, fmask, and sampler descriptors into one array</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94100">Bug 94100</a> - [HSW] compute indirect dispatch with 0 work groups causes gpu hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94134">Bug 94134</a> - [regression] piglit.spec.arb_texture_view.sampling-2d-array-as-2d-layer assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94139">Bug 94139</a> - [regression, HSW, IVB] piglit.spec.arb_compute_shader.minmax</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94150">Bug 94150</a> - UE4 Suntemple rendering errors</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94186">Bug 94186</a> - Crash when launching glxinfo and World of Warcraft with RV790</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94188">Bug 94188</a> - define (or undef) defined behaves stupidly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94199">Bug 94199</a> - Shader abort/crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94253">Bug 94253</a> - [llvmpipe] piglit gl-1.0-swapbuffers-behavior regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94254">Bug 94254</a> - [llvmpipe] [softpipe] piglit read-front regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94257">Bug 94257</a> - [softpipe] piglit glx-copy-sub-buffer regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94274">Bug 94274</a> - [swrast] piglit arb_occlusion_query2-render regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94284">Bug 94284</a> - [radeonsi] outlast segfault on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94524">Bug 94524</a> - Wrong gl_TessLevelOuter interpretation for isolines</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed tecture's color encoding and render incorrectly</li>
</ul>
<h2>Changes</h2>
@@ -78,7 +289,7 @@ Microsoft Visual Studio 2013 or later is now required for building
on Windows.
Previously, Visual Studio 2008 and later were supported.
TBD.
</div>
</body>

119
docs/relnotes/11.2.1.html Normal file
View File

@@ -0,0 +1,119 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.2.1 Release Notes / April 17, 2016</h1>
<p>
Mesa 11.2.1 is a bug fix release which fixes bugs found since the 11.2.0 release.
</p>
<p>
Mesa 11.2.1 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
cc2a024204564a71acc95cf262bf618fe49b1d77d351e5755eea705cadac5167 mesa-11.2.1.tar.gz
a65207e9ae5c5f1c29f863c6a2cc98a7ab99762a24b82a248337f0ea9cfce01b mesa-11.2.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>
</ul>
<h2>Changes</h2>
<p>Brian Paul (2):</p>
<ul>
<li>st/mesa: fix glReadBuffer() assertion failure</li>
<li>st/mesa: fix memleak in glDrawPixels cache code</li>
</ul>
<p>Christian Schmidbauer (1):</p>
<ul>
<li>st/nine: specify WINAPI only for i386 and amd64</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 11.2.0</li>
<li>configure.ac: update the path of the generated files</li>
<li>Update version to 11.2.1</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>glsl: allow usage of the keyword buffer before GLSL 430 / ESSL 310</li>
</ul>
<p>Iurie Salomov (1):</p>
<ul>
<li>va: check null context in vlVaDestroyContext</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions</li>
<li>i965/tiled_memcpy: Rework the RGBA -&gt; BGRA mem_copy functions</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6.</li>
<li>i965: Use brw-&gt;urb.min_vs_urb_entries instead of 32 for BLORP.</li>
<li>glsl: Lower variable indexing of system value arrays unconditionally.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>drirc: add a workaround for blackness in Warsow</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: fix bounds check in si_create_vertex_elements</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>nv50/ir: do not try to attach JOIN ops to ATOM</li>
</ul>
<p>Thomas Hindoe Paaboel Andersen (1):</p>
<ul>
<li>st/va: avoid dereference after free in vlVaDestroyImage</li>
</ul>
</div>
</body>
</html>

210
docs/relnotes/11.2.2.html Normal file
View File

@@ -0,0 +1,210 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 11.2.2 Release Notes / May 9, 2016</h1>
<p>
Mesa 11.2.2 is a bug fix release which fixes bugs found since the 11.2.1 release.
</p>
<p>
Mesa 11.2.2 implements the OpenGL 4.1 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.1. OpenGL
4.1 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
e2453014cd2cc5337a5180cdeffe8cf24fffbb83e20a96888e2b01df868eaae6 mesa-11.2.2.tar.gz
40e148812388ec7c6d7b6657d5a16e2e8dabba8b97ddfceea5197947647bdfb4 mesa-11.2.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93767">Bug 93767</a> - Glitches with soft shadows and MSAA in Knights of the Old Republic 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95164">Bug 95164</a> - GLSL compiler (linker I think) emits assertion upon call to glAttachShader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95251">Bug 95251</a> - vdpau decoder capabilities: not supported</li>
</ul>
<h2>Changes</h2>
<p>Boyuan Zhang (1):</p>
<ul>
<li>radeon/uvd: alignment fix for decode message buffer</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>st/mesa: fix sampler view leak in st_DrawAtlasBitmaps()</li>
<li>gallium/util: initialize pipe_framebuffer_state to zeros</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>dri: Fix robust context creation via EGL attribute</li>
</ul>
<p>Egbert Eich (1):</p>
<ul>
<li>dri2: Check for dummyContext to see if the glx_context is valid</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 11.2.1</li>
<li>docs: update the sha256 checksums for 11.2.1</li>
<li>cherry-ignore: remove duplicate commit</li>
<li>cherry-ignore: ignore the GetSamplerParameterIuiv{EXT,OES} fixups</li>
<li>Update version to 11.2.2</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>vc4: Fix subimage accesses to LT textures.</li>
<li>vc4: Add support for rendering to cube map surfaces.</li>
<li>vc4: Fix tests for format supported with nr_samples == 1.</li>
<li>vc4: Make sure we recompile when sample_mask changes.</li>
</ul>
<p>Frederic Devernay (1):</p>
<ul>
<li>glapi: fix _glapi_get_proc_address() for mangled function names</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nvc0: fix retrieving query results into buffer for timestamps</li>
<li>nouveau/video: properly detect the decoder class for availability checks</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/fs: Properly report regs_written from SAMPLEINFO</li>
</ul>
<p>Jonathan Gray (1):</p>
<ul>
<li>egl/x11: authenticate before doing chipset id ioctls</li>
</ul>
<p>Jose Fonseca (1):</p>
<ul>
<li>winsys/sw/xlib: use correct free function for xlib_dt-&gt;data</li>
</ul>
<p>Kenneth Graunke (3):</p>
<ul>
<li>i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+.</li>
<li>glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor.</li>
<li>glsl: Lower vector_extracts to swizzles after lower_vector_derefs.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>radeon/uvd: fix tonga feedback buffer size</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: fix blit-based GetTexImage for non-finalized textures</li>
</ul>
<p>Nicolai Hähnle (5):</p>
<ul>
<li>gallium/radeon: handle failure when mapping staging buffer</li>
<li>st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor</li>
<li>gallium/radeon: fix crash in r600_set_streamout_targets</li>
<li>radeonsi: correct NULL-pointer check in si_upload_const_buffer</li>
<li>radeonsi: work around an MSAA fast stencil clear problem</li>
</ul>
<p>Oded Gabbay (4):</p>
<ul>
<li>r600g/radeonsi: send endian info to format translation functions</li>
<li>r600g: set endianess of 16/32-bit buffers according to do_endian_swap</li>
<li>r600g: use do_endian_swap in color swapping functions</li>
<li>r600g: use do_endian_swap in texture swapping function</li>
</ul>
<p>Patrick Rudolph (1):</p>
<ul>
<li>r600g: fix and optimize tgsi_cmp when using ABS and NEG modifier</li>
</ul>
<p>Roland Scheidegger (3):</p>
<ul>
<li>llvmpipe: (trivial) initialize src1_alpha var to NULL</li>
<li>gallivm: fix bogus argument order to lp_build_sample_mipmap function</li>
<li>gallivm: make sampling more robust against bogus coordinates</li>
</ul>
<p>Samuel Pitoiset (6):</p>
<ul>
<li>gk110/ir: do not overwrite def value with zero for EXCH ops</li>
<li>gk110/ir: make use of IMUL32I for all immediates</li>
<li>nvc0/ir: fix wrong emission of (a OP b) OP c</li>
<li>gk110/ir: add emission for (a OP b) OP c</li>
<li>nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+</li>
<li>st/glsl_to_tgsi: fix potential crash when allocating temporaries</li>
</ul>
<p>Stefan Dirsch (1):</p>
<ul>
<li>dri3: Check for dummyContext to see if the glx_context is valid</li>
</ul>
<p>Topi Pohjolainen (2):</p>
<ul>
<li>i965/blorp/gen7: Prepare re-using for gen8</li>
<li>i965/blorp: Use 8k chunk size for urb allocation</li>
</ul>
<p>WuZhen (3):</p>
<ul>
<li>tgsi: initialize stack allocated struct</li>
<li>winsys/sw/dri: use correct free function for dri_sw_dt-&gt;data</li>
<li>android: enable dlopen() on all architectures</li>
</ul>
</div>
</body>
</html>

335
docs/relnotes/12.0.0.html Normal file
View File

@@ -0,0 +1,335 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.0 Release Notes / July 8, 2016</h1>
<p>
Mesa 12.0.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 12.0.1.
</p>
<p>
Mesa 12.0.0 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
3b8fa4d86d78f8f6ec86055b92ad1afe869001483593b3dd4531184b8bc4fcfb mesa-12.0.0.tar.gz
0090c025219318935124292b482e3439bc43e8c074ad01086449fcad88547dc6 mesa-12.0.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL 4.3 on nvc0, radeonsi, i965 (Gen8+)</li>
<li>OpenGL ES 3.1 on nvc0, radeonsi</li>
<li>GL_ARB_ES3_1_compatibility on nvc0, radeonsi</li>
<li>GL_ARB_compute_shader on nvc0, radeonsi, softpipe</li>
<li>GL_ARB_cull_distance on i965/gen6+, nv50, nvc0, llvmpipe, softpipe</li>
<li>GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi, softpipe</li>
<li>GL_ARB_internalformat_query2 on all drivers</li>
<li>GL_ARB_query_buffer_object on i965/hsw+</li>
<li>GL_ARB_robust_buffer_access_behavior on i965, nvc0, radeonsi</li>
<li>GL_ARB_shader_atomic_counters on radeonsi, softpipe</li>
<li>GL_ARB_shader_atomic_counter_ops on nvc0, radeonsi, softpipe</li>
<li>GL_ARB_shader_image_load_store on nvc0, radeonsi, softpipe</li>
<li>GL_ARB_shader_image_size on nvc0, radeonsi, softpipe</li>
<li>GL_ARB_shader_storage_buffer_objects on radeonsi, softpipe</li>
<li>GL_ATI_fragment_shader on all Gallium drivers</li>
<li>GL_EXT_base_instance on all drivers that support GL_ARB_base_instance</li>
<li>GL_EXT_clip_cull_distance on all drivers that support GL_ARB_cull_distance</li>
<li>GL_KHR_robustness on i965</li>
<li>GL_OES_copy_image on i965 (Baytrail and Gen8+)</li>
<li>GL_OES_draw_buffers_indexed and GL_EXT_draw_buffers_indexed on all drivers that support GL_ARB_draw_buffers_blend</li>
<li>GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 on all drivers that support GL_ARB_gpu_shader5</li>
<li>GL_OES_sample_shading on i965, nvc0, r600, radeonsi</li>
<li>GL_OES_sample_variables on i965, nvc0, r600, radeonsi</li>
<li>GL_OES_shader_image_atomic on all drivers that support GL_ARB_shader_image_load_store</li>
<li>GL_OES_shader_io_blocks on i965, nvc0, radeonsi</li>
<li>GL_OES_shader_multisample_interpolation on i965, nvc0, r600, radeonsi</li>
<li>GL_OES_texture_border_clamp and GL_EXT_texture_border_clamp on all drivers that support GL_ARB_texture_border_clamp</li>
<li>GL_OES_texture_buffer and GL_EXT_texture_buffer on i965, nvc0, radeonsi</li>
<li>EGL_KHR_reusable_sync on all drivers</li>
<li>GL_ARB_stencil_texture8 and GL_OES_stencil_texture8 on i965/gen8+</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=42187">Bug 42187</a> - ES 1.1 conformance pntszary.c fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81585">Bug 81585</a> - piglit spec_glsl-1.10_compiler_literals_invalid-float-suffix-capital-f.vert fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83036">Bug 83036</a> - [ILK]Piglit spec_ARB_copy_image_arb_copy_image-formats fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89607">Bug 89607</a> - Assertion hit in opt_array_splitting with recursive array indexing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90513">Bug 90513</a> - Odd gray and red flicker in The Talos Principle on GK104</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92628">Bug 92628</a> - HTTP site for Mesa downloads</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92743">Bug 92743</a> - Centroid shouldn't have to match between the FS and the VS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93054">Bug 93054</a> - [BDW] DiRT Showdown and Bioshock Infinite only render half the screen (bottom left triangle)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93551">Bug 93551</a> - Divinity: Original Sin Enhanced Edition(Native) crash on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93767">Bug 93767</a> - Glitches with soft shadows and MSAA in Knights of the Old Republic 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93840">Bug 93840</a> - [i965] Alien: Isolation fails with GL_ARB_compute_shader enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94081">Bug 94081</a> - [HSW] compute shader shared var + atomic op = fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94086">Bug 94086</a> - Multiple conflicting libGL libraries installed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94116">Bug 94116</a> - program interface queries not returning right data for UBO / GL_BLOCK_INDEX</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94129">Bug 94129</a> - Mesa's compiler should warn about undefined values</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94181">Bug 94181</a> - [regression] piglit.spec.ext_framebuffer_object.getteximage-formats init-by-clear-and-render</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94198">Bug 94198</a> - [HSW] segfault in copy image when copying from cubemap to 2d</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94199">Bug 94199</a> - Shader abort/crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94253">Bug 94253</a> - [llvmpipe] piglit gl-1.0-swapbuffers-behavior regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94254">Bug 94254</a> - [llvmpipe] [softpipe] piglit read-front regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94257">Bug 94257</a> - [softpipe] piglit glx-copy-sub-buffer regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94274">Bug 94274</a> - [swrast] piglit arb_occlusion_query2-render regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94284">Bug 94284</a> - [radeonsi] outlast segfault on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94291">Bug 94291</a> - llvmpipe tests fail if built on skylake i7-6700k</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94348">Bug 94348</a> - vkBindImageMemory doesn't take into account the offset when the image is used as a depth buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94383">Bug 94383</a> - build error on i386 when enabling swr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94447">Bug 94447</a> - glsl/glcpp/tests/glcpp-test-cr-lf regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94453">Bug 94453</a> - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_{center,corner} fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94454">Bug 94454</a> - dEQP-GLES3.functional.clipping.point.wide_point_clip* fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94456">Bug 94456</a> - dEQP-GLES3.functional.state_query.floats.{blend_color,color_clear_value,depth_clear_value}_getinteger64 fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94458">Bug 94458</a> - dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94468">Bug 94468</a> - [HSW, regression, bisected] numerous Sascha demos render incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94485">Bug 94485</a> - dEQP-GLES3.functional.negative_api.shader.compile_shader and delete_shader broken by Meta</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94524">Bug 94524</a> - Wrong gl_TessLevelOuter interpretation for isolines</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed tecture's color encoding and render incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94657">Bug 94657</a> - [llvmpipe] [softpipe] piglit arb_texture_view-getteximage-srgb regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94661">Bug 94661</a> - [bdw, skl] vk-cts: new test failing</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94671">Bug 94671</a> - [radeonsi] Blue-ish textures in Shadow of Mordor</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94713">Bug 94713</a> - [Gen8+] ES 3.1 Stencil texturing broken for 2DArray/Cubes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94747">Bug 94747</a> - Convert phi nodes to logical operations</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94835">Bug 94835</a> - Increase fragment shader sample limits from 16 to 32 (AMD Linux - Mesa/RadeonSi)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94847">Bug 94847</a> - [ES3.1CTS] es31-cts.draw_buffers_indexed.color_masks fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94896">Bug 94896</a> - [vulkan] new CTS tests fail on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94904">Bug 94904</a> - [vulkan, BSW] dEQP-VK.api.object_management.multithreaded_per_thread_device intermittent crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94907">Bug 94907</a> - codegen/nv50_ir_ra.cpp:1330:29: error: isinf was not declared in this scope</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94909">Bug 94909</a> - [llvmpipe] piglit fs-roundEven-float regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94917">Bug 94917</a> - radeonsi supports GL_ARB_shader_storage_buffer_object with 0 GL_MAX_COMBINED_SHADER_STORAGE_BLOCKS</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94924">Bug 94924</a> - [GEN8] Ungine Valley fails to run due to &quot;intel_do_flush_locked failed: Input/output error&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94925">Bug 94925</a> - Crash in egl_dri3_get_dri_context with Dolphin EGL/X11 in single-core mode</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94944">Bug 94944</a> - [regression, hswgt1] gpu hang on arb_shader_image_load_store</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94969">Bug 94969</a> - build fails because install-data-local doesn't follow $DESTDIR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94972">Bug 94972</a> - blend failures on llvmpipe with llvm 3.7 due to vector selects</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94979">Bug 94979</a> - dolphin-emu rendering broken on gallium/SWR + crashing often</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94984">Bug 94984</a> - XCom2 crashes with SIGSEGV on radeonsi</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94997">Bug 94997</a> - [vulkan, SKL,BDW,HSW] deqp-vk.spirv_assembly.instruction.compute.opcopymemory.array regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94998">Bug 94998</a> - [vulkan] deqp-vk.pipeline.push_constant.graphics_pipeline.count_3shader_vgf regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95001">Bug 95001</a> - [vulkan] deqp-vk.binding_model.shader_access regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95005">Bug 95005</a> - Unreal engine demos segfault after shader compilation error with OpenGL 4.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95034">Bug 95034</a> - vkResetCommandPool should not destroy the command buffers.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95071">Bug 95071</a> - [bisected] Wrong colors in KDE/Qt applications</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95138">Bug 95138</a> - [deqp, 32bit, gen8+] deqp-gles31.functional.draw_indirect.negative</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95142">Bug 95142</a> - [ES3.1CTS,GEN8] ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos assertion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95158">Bug 95158</a> - glx-test compilation fails in `make check`</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95164">Bug 95164</a> - GLSL compiler (linker I think) emits assertion upon call to glAttachShader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95180">Bug 95180</a> - rasterizer/memory/Convert.h:170:9: error: __builtin_isnan is not a member of std</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95198">Bug 95198</a> - Shadow of Mordor beta has missing geometry with gl 4.3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95203">Bug 95203</a> - Tonga GST/OMX/VCE encode broken since mesa: st/omx: Fix resource leak on OMX_ErrorNone</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95211">Bug 95211</a> - scons TypeError: 'tuple' object is not callable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95246">Bug 95246</a> - Segfault in glBindFramebuffer()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95251">Bug 95251</a> - vdpau decoder capabilities: not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95252">Bug 95252</a> - [deqp] deqp-gles31.functional.debug.object_labels.query_length_only crashes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95292">Bug 95292</a> - [IVB,SKL] vulkan: stride/tiling issue with vkCmdCopyBufferToImage from larger source buffer into destination image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95296">Bug 95296</a> - nir_lower_double_packing.c:79:4: error: void function 'lower_double_pack_impl' should not return a value [-Wreturn-type]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95324">Bug 95324</a> - GL33-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo fails in one case on Haswell</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95370">Bug 95370</a> - [965GM] piglit fails many tests after a5d7e144</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95373">Bug 95373</a> - Suspicious warning in brw_blorp_clear.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95403">Bug 95403</a> - [GK110] misaligned_gpr spamming dmesg when playing victor vran</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95419">Bug 95419</a> - [HSW][regression][bisect] RPG Maker game gives &quot;invalid floating point operation&quot; at startup</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95456">Bug 95456</a> - glXGetFBConfigs has invalid screen bounds</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95462">Bug 95462</a> - [BXT,BSW] arb_gpu_shader_fp64 causes gpu hang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95529">Bug 95529</a> - [regression, bisected] Image corruption in Chrome</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95537">Bug 95537</a> - Invalid argument in anv_ioctl called from anv_physical_device_init</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96221">Bug 96221</a> - nir/nir_lower_tex.c:202: error: unknown field f32 specified in initializer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96228">Bug 96228</a> - SSBO test regressions from mesa 5b267509</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96236">Bug 96236</a> - dri_interface.h:404: error: redefinition of typedef mesa_glinterop_device_info</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96238">Bug 96238</a> - swr fails to build outside of the main directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96239">Bug 96239</a> - [radeonsi tessellation] [R9 290/390] Random &quot;texture flickering&quot; (Shadow of Mordor, Tomb Raider, Unigine Heaven 4.0)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96258">Bug 96258</a> - [NVC0] Hang when running compute program</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96285">Bug 96285</a> - Mesa build broken</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96299">Bug 96299</a> - [vulkan] 64 regressions due to mesa d5f2f32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96346">Bug 96346</a> - [SNB,CTS] es2-cts.gtf.gl.atan regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96349">Bug 96349</a> - [CTS,SKL,BSW,BDW,KBL,BXT] es31-cts.arrays_of_arrays.interactionuniformbuffers3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96351">Bug 96351</a> - [CTS,SKL,KBL,BXT] es2-cts.gtf.gl2extensiontests.egl_image.egl_image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regresion due to latest gles 3.1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96425">Bug 96425</a> - [bisected] occasional dark render in The Talos Principle</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96504">Bug 96504</a> - [vulkancts] compute tests crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96516">Bug 96516</a> - [bisected: 482526] &quot;clover: Update OpenCL version string to match OpenGL&quot;: clover's build fails because of missing git_sha1.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96565">Bug 96565</a> - Clive Barker's Jericho displays strange,vivid colors when motion blur enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96607">Bug 96607</a> - [bisected] texture misrender / flicker in The Talos Principle on SKL</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96617">Bug 96617</a> - gl_SecondaryFragDataEXT doesn't work for extended blend func</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96629">Bug 96629</a> - dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width &gt;= 1' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96639">Bug 96639</a> - st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96674">Bug 96674</a> - [SNB, ILK] spec.ext_image_dma_buf_import.ext_image_dma_buf_import-sample_nv1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96765">Bug 96765</a> - BindFragDataLocationIndexed on array fragment shader output.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96791">Bug 96791</a> - Cannot use image from swapchains for sampling</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96825">Bug 96825</a> - anv_device.c:31:27: fatal error: anv_timestamp.h: No such file or directory</li>
</ul>
<h2>Changes</h2>
Radeon drivers (r600 and radeonsi) now require LLVm 3.6 as a minimum.
</div>
</body>
</html>

67
docs/relnotes/12.0.1.html Normal file
View File

@@ -0,0 +1,67 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>
<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>
<p>
Mesa 12.0.1 is a bug fix release which fixes bugs found since the 12.0.1 release.
</p>
<p>
Mesa 12.0.1 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
28dff9c045f4305c96a875a487b9f06c7e88d910511cd6016dbddcd1f53ade0d mesa-12.0.1.tar.gz
bab24fb79f78c876073527f515ed871fc9c81d816f66c8a0b051d8d653896389 mesa-12.0.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96864">Bug 96864</a> - Mesa 12.0 radeon build broken</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.0</li>
<li>radeon: reference the correct cdw/max_dw</li>
<li>Update version to 12.0.1</li>
<li>docs: add release notes for 12.0.1</li>
</ul>
</div>
</body>
</html>

403
docs/relnotes/12.0.2.html Normal file
View File

@@ -0,0 +1,403 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.2 Release Notes / September 2, 2016</h1>
<p>
Mesa 12.0.2 is a bug fix release which fixes bugs found since the 12.0.1 release.
</p>
<p>
Mesa 12.0.2 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
a08565ab1273751ebe2ffa928cbf785056594c803077c9719d0763da780f2918 mesa-12.0.2.tar.gz
d957a5cc371dcd7ff2aa0d87492f263aece46f79352f4520039b58b1f32552cb mesa-12.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crahes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regresion due to latest gles 3.1)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96381">Bug 96381</a> - Texture artifacts with immutable texture storage and mipmaps</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97567">Bug 97567</a> - [SNB, ILK] ctl, piglit regressions in mesa 12.0.2rc1</li>
</ul>
<h2>Changes</h2>
<p>Andreas Boll (1):</p>
<ul>
<li>configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too</li>
</ul>
<p>Bernard Kilarski (1):</p>
<ul>
<li>glx: fix error code when there is no context bound</li>
</ul>
<p>Brian Paul (4):</p>
<ul>
<li>svga: handle mismatched number of samplers, sampler views</li>
<li>mesa: use _mesa_clear_texture_image() in clear_texture_fields()</li>
<li>swrast: fix incorrectly positioned putImage() in swrast driver</li>
<li>mesa: fix format conversion bug in get_tex_rgba_uncompressed()</li>
</ul>
<p>Chad Versace (2):</p>
<ul>
<li>i965: Fix miptree layout for EGLImage-based renderbuffers</li>
<li>i965: Respect miptree offsets in intel_readpixels_tiled_memcpy()</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>st/mesa: fix reference counting bug in st_vdpau</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>swr: Refactor checks for compiler feature flags</li>
</ul>
<p>Daniel Scharrer (1):</p>
<ul>
<li>mesa: Fix fixed function spot lighting on newer hardware (again)</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>anv: fix writemask on blit fragment shader.</li>
<li>st/glsl_to_tgsi: fix st_src_reg_for_double constant.</li>
</ul>
<p>Emil Velikov (15):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.1</li>
<li>mesa: automake: list builddir before srcdir</li>
<li>mesa: scons: list builddir before srcdir</li>
<li>i965: store reference to the context within struct brw_fence (v2)</li>
<li>anv: remove internal 'validate' layer</li>
<li>anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>
<li>anv: automake: build with -Bsymbolic</li>
<li>anv: do not export the Vulkan API</li>
<li>anv: remove dummy VK_DEBUG_MARKER_EXT entry points</li>
<li>isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>
<li>cherry-ignore: temporary(?) drop "a4xx: make sure to actually clamp depth"</li>
<li>i915: Check return value of screen-&gt;image.loader-&gt;getBuffers</li>
<li>Revert "i965/miptree: Set logical_depth0 == 6 for cube maps"</li>
<li>glx/glvnd: list the strcmp arguments in correct order</li>
<li>Update version to 12.0.2</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>vc4: Close our screen's fd on screen close.</li>
<li>vc4: Disable early Z with computed depth.</li>
<li>vc4: Fix a leak of the src[] array of VPM reads in optimization.</li>
<li>vc4: Fix leak of the bo_handles table.</li>
</ul>
<p>Francisco Jerez (3):</p>
<ul>
<li>i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.</li>
<li>i965: Make room in the batch epilogue for three more pipe controls.</li>
<li>i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>platform_android: prevent deadlock in droid_swap_buffers</li>
</ul>
<p>Ian Romanick (5):</p>
<ul>
<li>mesa: Strip arrayness from interface block names in some IO validation</li>
<li>glsl: Pack integer and double varyings as flat even if interpolation mode is none</li>
<li>glcpp: Track the actual version instead of just the version_resolved flag</li>
<li>glcpp: Only disallow #undef of pre-defined macros on GLSL ES &gt;= 3.00 shaders</li>
<li>glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10</li>
</ul>
<p>Ilia Mirkin (16):</p>
<ul>
<li>mesa: etc2 online compression is unsupported, don't attempt it</li>
<li>st/mesa: return appropriate mesa format for ETC texture formats</li>
<li>mesa: set _NEW_BUFFERS when updating texture bound to current buffers</li>
<li>nv50,nvc0: srgb rendering is only available for rgba/bgra</li>
<li>vbo: allow DrawElementsBaseVertex in display lists</li>
<li>gallium/util: add helper to compute zmin/zmax for a viewport state</li>
<li>nv50,nvc0: fix depth range when halfz is enabled</li>
<li>nv50/ir: fix bb positions after exit instructions</li>
<li>vbo: add basevertex when looking up elements for vbo splitting</li>
<li>a4xx: only disable depth clipping, not all clipping, when requested</li>
<li>nv50/ir: make sure cfg iterator always hits all blocks</li>
<li>main: add missing EXTRA_END in OES_sample_variables get check</li>
<li>nouveau: always enable at least one RC</li>
<li>nv30: only bail on color/depth bpp mismatch when surfaces are swizzled</li>
<li>a4xx: make sure to actually clamp depth as requested</li>
<li>gk110/ir: fix quadop dall emission</li>
</ul>
<p>Jan Ziak (2):</p>
<ul>
<li>egl/x11: avoid using freed memory if dri2 init fails</li>
<li>loader: fix memory leak in loader_dri3_open</li>
</ul>
<p>Jason Ekstrand (31):</p>
<ul>
<li>nir/spirv: Don't multiply the push constant block size by 4</li>
<li>anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge</li>
<li>glsl/types: Fix function type comparison function</li>
<li>glsl/types: Use _mesa_hash_data for hashing function types</li>
<li>genxml: Make gen6-7 blending look more like gen8</li>
<li>anv/pipeline: Unify blend state setup between gen7 and gen8</li>
<li>anv: Enable independentBlend on gen7</li>
<li>anv: Add an align_down_npot_u32 helper</li>
<li>anv: Handle VK_WHOLE_SIZE properly for buffer views</li>
<li>i965/miptree: Enforce that height == 1 for 1-D array textures</li>
<li>i965/miptree: Set logical_depth0 == 6 for cube maps</li>
<li>nir: Add a nir_deref_foreach_leaf helper</li>
<li>nir/inline: Constant-initialize local variables in the callee if needed</li>
<li>anv/pipeline: Set up point coord enables</li>
<li>i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations</li>
<li>i965/vec4: Make opt_vector_float reset at the top of each block</li>
<li>anv/blit2d: Add a format parameter to bind_dst and create_iview</li>
<li>anv/blit2d: Add support for RGB destinations</li>
<li>anv/clear: Make cmd_clear_image take an actual VkClearValue</li>
<li>anv/clear: Clear E5B9G9R9 images as R32_UINT</li>
<li>anv: Include the pipeline layout in the shader hash</li>
<li>isl: Allow multisampled array textures</li>
<li>anv/descriptor_set: memset anv_descriptor_set_layout</li>
<li>anv/pipeline: Fix bind maps for fragment output arrays</li>
<li>anv/allocator: Correctly set the number of buckets</li>
<li>anv/pipeline: Properly handle OOM during shader compilation</li>
<li>anv: Remove unused fields from anv_pipeline_bind_map</li>
<li>anv: Add pipeline_has_stage guards a few places</li>
<li>anv: Add a struct for storing a compiled shader</li>
<li>anv/pipeline: Add support for caching the push constant map</li>
<li>anv: Rework pipeline caching</li>
</ul>
<p>José Fonseca (2):</p>
<ul>
<li>appveyor: Install pywin32 extensions.</li>
<li>appveyor: Force Visual Studio 2013 image.</li>
</ul>
<p>Kenneth Graunke (21):</p>
<ul>
<li>genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values.</li>
<li>genxml: Add APIMODE_D3D missing enum values and improve consistency.</li>
<li>anv: Fix near plane clipping on Gen7/7.5.</li>
<li>anv: Enable early culling on Gen7.</li>
<li>anv: Unify 3DSTATE_CLIP code across generations.</li>
<li>genxml: Rename "API Rendering Disable" to "Rendering Disable".</li>
<li>anv: Properly call gen75_emit_state_base_address on Haswell.</li>
<li>i965: Include VUE handles for GS with invocations &gt; 1.</li>
<li>nir: Add a base const_index to shared atomic intrinsics.</li>
<li>i965: Fix shared atomic intrinsics to pay attention to base.</li>
<li>mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.</li>
<li>mesa: Don't call GenerateMipmap if Width or Height == 0.</li>
<li>glsl: Delete bogus ir_set_program_inouts assert.</li>
<li>glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].</li>
<li>glsl: Fix location bias for patch variables.</li>
<li>glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.</li>
<li>mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.</li>
<li>nir/builder: Add bany_inequal and bany helpers.</li>
<li>i965: Implement the WaPreventHSTessLevelsInterference workaround.</li>
<li>i965: Fix execution size of scalar TCS barrier setup code.</li>
<li>i965: Fix barrier count shift in scalar TCS backend.</li>
</ul>
<p>Leo Liu (2):</p>
<ul>
<li>st/omx/enc: check uninitialized list from task release</li>
<li>vl/dri3: fix a memory leak from front buffer</li>
</ul>
<p>Marek Olšák (7):</p>
<ul>
<li>glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast</li>
<li>radeonsi: add a workaround for a compute VGPR-usage LLVM bug</li>
<li>winsys/amdgpu: disallow DCC with mipmaps</li>
<li>gallium/util: fix align64</li>
<li>radeonsi: only set dual source blending for MRT0</li>
<li>radeonsi: fix VM faults due NULL internal const buffers on CIK</li>
<li>radeonsi: disable SDMA texture copying on Carrizo</li>
</ul>
<p>Matt Turner (4):</p>
<ul>
<li>mapi: Massage code to allow clang to compile.</li>
<li>i965/vec4: Ignore swizzle of VGRF for use by var_range_end().</li>
<li>mesa: Use AC_HEADER_MAJOR to include correct header for major().</li>
<li>nir: Walk blocks in source code order in lower_vars_to_ssa.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>glx: Don't use current context in __glXSendError</li>
</ul>
<p>Miklós Máté (1):</p>
<ul>
<li>vbo: set draw_id</li>
</ul>
<p>Nanley Chery (5):</p>
<ul>
<li>anv/descriptor_set: Fix binding partly undefined descriptor sets</li>
<li>isl: Fix assert on raw buffer surface state size</li>
<li>anv/device: Fix max buffer range limits</li>
<li>isl: Fix isl_tiling_is_any_y()</li>
<li>anv/gen7_pipeline: Set PixelShaderKillPixel for discards</li>
</ul>
<p>Nicolai Hähnle (7):</p>
<ul>
<li>radeonsi: explicitly choose center locations for 1xAA on Polaris</li>
<li>radeonsi: fix Polaris MSAA regression</li>
<li>radeonsi: ensure sample locations are set for line and polygon smoothing</li>
<li>st_glsl_to_tgsi: only skip over slots of an input array that are present</li>
<li>glsl: fix optimization of discard nested multiple levels</li>
<li>radeonsi: flush TC L2 cache for indirect draw data</li>
<li>radeonsi: add si_set_rw_buffer to be used for internal descriptors</li>
</ul>
<p>Nicolas Boichat (6):</p>
<ul>
<li>egl/dri2: dri2_make_current: Set EGL error if bindContext fails</li>
<li>egl/wayland: Set disp-&gt;DriverData to NULL on error</li>
<li>egl/surfaceless: Set disp-&gt;DriverData to NULL on error</li>
<li>egl/drm: Set disp-&gt;DriverData to NULL on error</li>
<li>egl/android: Set dpy-&gt;DriverData to NULL on error</li>
<li>egl/dri2: Add reference count for dri2_egl_display</li>
</ul>
<p>Rob Herring (3):</p>
<ul>
<li>Android: add missing u_math.h include path for libmesa_isl</li>
<li>vc4: fix vc4_resource_from_handle() stride calculation</li>
<li>vc4: add hash table look-up for exported dmabufs</li>
</ul>
<p>Samuel Pitoiset (7):</p>
<ul>
<li>nvc0/ir: fix images indirect access on Fermi</li>
<li>nvc0: fix the driver cb size when draw parameters are used</li>
<li>gm107/ir: add missing NEG modifier for IADD32I</li>
<li>gm107/ir: make use of ADD32I for all immediates</li>
<li>nvc0: upload sample locations on GM20x</li>
<li>nvc0: invalidate textures/samplers on GK104+</li>
<li>nv50/ir: always emit the NDV bit for OP_QUADOP</li>
</ul>
<p>Stefan Dirsch (1):</p>
<ul>
<li>Avoid overflow in 'last' variable of FindGLXFunction(...)</li>
</ul>
<p>Stencel, Joanna (1):</p>
<ul>
<li>egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.</li>
</ul>
<p>Tim Rowley (2):</p>
<ul>
<li>Revert "gallium: Force blend color to 16-byte alignment"</li>
<li>swr: switch from overriding -march to selecting features</li>
</ul>
<p>Tomasz Figa (8):</p>
<ul>
<li>gallium/dri: Add shared glapi to LIBADD on Android</li>
<li>egl/android: Remove unused variables</li>
<li>egl/android: Check return value of dri2_get_dri_config()</li>
<li>egl/android: Stop leaking DRI images</li>
<li>gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)</li>
<li>gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)</li>
<li>gallium/winsys/kms: Move display target handle lookup to separate function</li>
<li>gallium/winsys/kms: Look up the GEM handle after importing a prime FD</li>
</ul>
</div>
</body>
</html>

71
docs/relnotes/12.0.3.html Normal file
View File

@@ -0,0 +1,71 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.3 Release Notes / September 15, 2016</h1>
<p>
Mesa 12.0.3 is a bug fix release which fixes bugs found since the 12.0.3 release.
</p>
<p>
Mesa 12.0.3 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
79abcfab3de30dbd416d1582a3cf6b1be308466231488775f1b7bb43be353602 mesa-12.0.3.tar.gz
1dc86dd9b51272eee1fad3df65e18cda2e556ef1bc0b6e07cd750b9757f493b1 mesa-12.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97781">Bug 97781</a> - [HSW, BYT, IVB] es2-cts.gtf.gl2extensiontests.depth_texture_cube_map.depth_texture_cube_map</li>
</ul>
<h2>Changes</h2>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.2</li>
<li>Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"</li>
<li>Update version to 12.0.3</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>appveyor: Update winflexbison download URL.</li>
</ul>
</div>
</body>
</html>

321
docs/relnotes/12.0.4.html Normal file
View File

@@ -0,0 +1,321 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.4 Release Notes / November 10, 2016</h1>
<p>
Mesa 12.0.4 is a bug fix release which fixes bugs found since the 12.0.4 release.
</p>
<p>
Mesa 12.0.4 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
22026ce4f1c6a7908b0d10ff057decec0a5633afe7f38a0cef5c08d0689f02a6 mesa-12.0.4.tar.gz
5d6003da867d3f54e5000b4acdfc37e6cce5b6a4459274fdad73e24bd2f0065e mesa-12.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71759">Bug 71759</a> - Intel driver fails with &quot;intel_do_flush_locked failed: No such file or directory&quot; if buffer imported with EGL_NATIVE_PIXMAP_KHR</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94354">Bug 94354</a> - R9285 Unigine Valley perf regression since radeonsi: use re-Z</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96770">Bug 96770</a> - include/GL/mesa_glinterop.h:62: error: redefinition of typedef GLXContext</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97233">Bug 97233</a> - vkQuake VkSpecializationMapEntry related bug</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97260">Bug 97260</a> - R9 290 low performance in Linux 4.7</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97549">Bug 97549</a> - [SNB, BXT] up to 40% perf drop from &quot;loader/dri3: Overhaul dri3_update_num_back&quot; commit</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97887">Bug 97887</a> - llvm segfault in janusvr -render vive</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98025">Bug 98025</a> - [radeonsi] incorrect primitive restart index used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>
</ul>
<h2>Changes</h2>
<p>Axel Davy (4):</p>
<ul>
<li>gallium/util: Really allow aliasing of dst for u_box_union_*</li>
<li>st/nine: Fix the calculation of the number of vs inputs</li>
<li>st/nine: Fix mistake in Volume9 UnlockBox</li>
<li>st/nine: Fix locking CubeTexture surfaces.</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>configure.ac: fix the name of the Wayland Scanner pc file</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj()</li>
</ul>
<p>Chad Versace (3):</p>
<ul>
<li>egl: Fix truncation error in _eglParseSyncAttribList64</li>
<li>i965/sync: Fix uninitalized usage and leak of mutex</li>
<li>egl: Don't advertise unsupported platform extensions</li>
</ul>
<p>Chuanbo Weng (1):</p>
<ul>
<li>gbm: fix potential NULL deref of mapImage/unmapImage.</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>autoconf: Make header install distinct for various APIs (v2)</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>anv: initialise and increment send_sbc</li>
<li>anv/wsi: fix apps that acquire multiple images up front</li>
<li>Revert "st/vdpau: use linear layout for output surfaces"</li>
</ul>
<p>Emil Velikov (12):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.3</li>
<li>cherry-ignore: add non-applicable i965 commit</li>
<li>cherry-ignore: add vaapi encode fix</li>
<li>cherry-ignore: add EGL_KHR_debug fix</li>
<li>cherry-ignore: add update_renderbuffer_read_surfaces()</li>
<li>isl/gen6: correctly check msaa layout samples count</li>
<li>egl/x11: don't crash if dri2_dpy-&gt;conn is NULL</li>
<li>get-pick-list.sh: Require explicit "12.0" for nominating stable patches</li>
<li>automake: don't forget to pick wglext.h in the tarball</li>
<li>cherry-ignore: add N/A EGL revert</li>
<li>cherry-ignore: add ClientWaitSync fixes</li>
<li>Update version to 12.0.4</li>
</ul>
<p>Eric Anholt (5):</p>
<ul>
<li>travis: Parse configure.ac to pick an updated LIBDRM_VERSION.</li>
<li>travis: Update to the Ubuntu Trusty image.</li>
<li>travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.</li>
<li>travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.</li>
<li>gallium: Fix install-gallium-links.mk on non-bash /bin/sh</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept</li>
</ul>
<p>Ilia Mirkin (16):</p>
<ul>
<li>nv30: set usage to staging so that the buffer is allocated in GART</li>
<li>a3xx: make sure to actually clamp depth as requested</li>
<li>a3xx: make use of software clipping when hw can't handle it</li>
<li>a3xx: use window scissor to simulate viewport xy clip</li>
<li>main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer</li>
<li>mesa/formatquery: limit ES target support, fix core context support</li>
<li>nir: fix definition of pack_uvec2_to_uint</li>
<li>gm107/ir: AL2P writes to a predicate register</li>
<li>st/mesa: fix is_scissor_enabled when X/Y are negative</li>
<li>nvc0/ir: fix overwriting of value backing non-constant gather offset</li>
<li>nv50/ir: copy over value's register id when resolving merge of a phi</li>
<li>nvc0/ir: fix textureGather with a single offset</li>
<li>gm107/ir: fix texturing with indirect samplers</li>
<li>gm107/ir: fix bit offset of tex lod setting for indirect texturing</li>
<li>nv50,nvc0: avoid reading out of bounds when getting bogus so info</li>
<li>nv50/ir: process texture offset sources as regular sources</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radeonsi: Fix primitive restart when index changes</li>
</ul>
<p>Jason Ekstrand (9):</p>
<ul>
<li>nir/spirv: Swap the argument order for AtomicCompareExchange</li>
<li>nir/spirv: Use the correct sources for CompareExchange on images</li>
<li>nir/spirv: Break variable decoration handling into a helper</li>
<li>nir/spirv: Refactor variable deocration handling</li>
<li>nir/spirv/cfg: Handle switches whose break block is a loop continue</li>
<li>nir/spirv/cfg: Detect switch_break after loop_break/continue</li>
<li>nir: Add a nop intrinsic</li>
<li>nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks</li>
<li>intel/blorp: Rework our usage of ralloc when compiling shaders</li>
</ul>
<p>Jonathan Gray (3):</p>
<ul>
<li>genxml: add generated headers to EXTRA_DIST</li>
<li>mapi: automake: set VISIBILITY_CFLAGS for shared glapi</li>
<li>mesa: automake: include mesa_glinterop.h in distfile</li>
</ul>
<p>Julien Isorce (1):</p>
<ul>
<li>st/va: also honors interlaced preference when providing a video format</li>
</ul>
<p>Kenneth Graunke (8):</p>
<ul>
<li>nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar().</li>
<li>mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.</li>
<li>i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom.</li>
<li>i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.</li>
<li>i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA.</li>
<li>i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom.</li>
<li>i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom.</li>
<li>i965: Fix gl_InvocationID in dual object GS where invocations == 1.</li>
</ul>
<p>Marek Olšák (12):</p>
<ul>
<li>radeonsi: fix cubemaps viewed as 2D</li>
<li>radeonsi: take compute shader and dispatch indirect memory usage into account</li>
<li>radeonsi: fix FP64 UBO loads with indirect uniform block indexing</li>
<li>mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc</li>
<li>radeonsi: fix interpolateAt opcodes for .zw components</li>
<li>radeonsi: fix texture border colors for compute shaders</li>
<li>radeonsi: disable ReZ</li>
<li>gallium/radeon: make sure the address of separate CMASK is aligned properly</li>
<li>winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures</li>
<li>egl: use util/macros.h</li>
<li>egl: make interop ABI visible again</li>
<li>glx: make interop ABI visible again</li>
</ul>
<p>Mario Kleiner (1):</p>
<ul>
<li>glx: Perform check for valid fbconfig against proper X-Screen.</li>
</ul>
<p>Martin Peres (2):</p>
<ul>
<li>loader/dri3: add get_dri_screen() to the vtable</li>
<li>loader/dri3: import prime buffers in the currently-bound screen</li>
</ul>
<p>Matt Whitlock (5):</p>
<ul>
<li>egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
<li>gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)</li>
</ul>
<p>Max Staudt (1):</p>
<ul>
<li>r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>loader/dri3: Overhaul dri3_update_num_back</li>
</ul>
<p>Nicholas Bishop (2):</p>
<ul>
<li>gbm: return appropriate error when queryImage() fails</li>
<li>st/dri: check pipe_screen-&gt;resource_get_handle() return value</li>
</ul>
<p>Nicolai Hähnle (10):</p>
<ul>
<li>gallium/radeon: cleanup and fix branch emits</li>
<li>st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations</li>
<li>st/glsl_to_tgsi: simplify translate_tex_offset</li>
<li>st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets</li>
<li>st/mesa: fix vertex elements setup for doubles</li>
<li>radeonsi: fix indirect loads of 64 bit constants</li>
<li>st/glsl_to_tgsi: fix atomic counter addressing</li>
<li>st/glsl_to_tgsi: fix block copies of arrays of doubles</li>
<li>st/mesa: only set primitive_restart when the restart index is in range</li>
<li>radeonsi: fix 64-bit loads from LDS</li>
</ul>
<p>Samuel Pitoiset (4):</p>
<ul>
<li>nvc0/ir: fix subops for IMAD</li>
<li>gk110/ir: fix wrong emission of OP_NOT</li>
<li>nvc0: use correct bufctx when invalidating CP textures</li>
<li>nvc0/ir: fix emission of IMAD with NEG modifiers</li>
</ul>
<p>Stencel, Joanna (1):</p>
<ul>
<li>egl/wayland: add missing destroy_window callback</li>
</ul>
<p>Tapani Pälli (5):</p>
<ul>
<li>egl: stop claiming support for pbuffer + msaa</li>
<li>egl/dri2: set max values for pbuffer width and height</li>
<li>egl: add check that eglCreateContext gets a valid config</li>
<li>mesa: fix error handling in DrawBuffers</li>
<li>egl: set preserved behavior for surface only if config supports it</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>configure.ac: add llvm inteljitevents component if enabled</li>
</ul>
<p>Vedran Miletić (1):</p>
<ul>
<li>clover: Fix build against clang SVN &gt;= r273191</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>Revert "mesa_glinterop: remove inclusion of GLX header"</li>
</ul>
</div>
</body>
</html>

138
docs/relnotes/12.0.5.html Normal file
View File

@@ -0,0 +1,138 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.5 Release Notes / December 5, 2016</h1>
<p>
Mesa 12.0.5 is a bug fix release which fixes bugs found since the 12.0.5 release.
</p>
<p>
Mesa 12.0.5 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
44d08a27d98bfeacd864381189e434d98afbf451689d01f80380dc1d66450e5b mesa-12.0.5.tar.gz
2b0a972d8282860a11291c09c3ef01ac45171405951eb21a83c45ed2b4321924 mesa-12.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (2):</p>
<ul>
<li>glx/glvnd: Don't modify the dummy slot in the dispatch table</li>
<li>glx/glvnd: Fix dispatch function names and indices</li>
</ul>
<p>Anuj Phogat (1):</p>
<ul>
<li>i965: Fix GPU hang related to multiple render targets and alpha testing</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add release notes for 12.0.4</li>
<li>docs: add sha256 checksums for 12.0.4</li>
<li>cherry-ignore: add reverted LLVM_LIBDIR patch</li>
<li>Update version to 12.0.5</li>
</ul>
<p>Haixia Shi (1):</p>
<ul>
<li>mesa: change state query return value for RGB565</li>
</ul>
<p>Jason Ekstrand (3):</p>
<ul>
<li>i965/fs/generator: Don't use the address immediate for MOV_INDIRECT</li>
<li>anv/cmd_buffer: Take a command buffer instead of a batch in two helpers</li>
<li>anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>intel: Fix pixel shader scratch space allocation on Gen9+ platforms.</li>
</ul>
<p>Marek Olšák (13):</p>
<ul>
<li>gallium/radeon: fix behavior of GLSL findLSB(0)</li>
<li>gallium/radeon: make sure HTILE address is aligned properly</li>
<li>radeonsi: fix an assertion failure in si_decompress_sampler_color_textures</li>
<li>gallium/radeon: unify viewport emission code</li>
<li>gallium/radeon: set VPORT_ZMIN/MAX registers correctly</li>
<li>radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader</li>
<li>radeonsi: fix a crash in imageSize for cubemap arrays</li>
<li>radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it</li>
<li>gallium/radeon: add support for sharing textures with DCC between processes</li>
<li>radeonsi: always set all blend registers</li>
<li>radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending</li>
<li>radeonsi: disable RB+ blend optimizations for dual source blending</li>
<li>radeonsi: silence runtime warnings with LLVM 3.9</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>anv: Replace "abi_versions" with correct "api_version".</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>mesa/fbobject: Update CubeMapFace when reusing textures</li>
</ul>
<p>Steinar H. Gunderson (1):</p>
<ul>
<li>Fix races during _mesa_HashWalk().</li>
</ul>
<p>Tim Rowley (3):</p>
<ul>
<li>swr: [rasterizer jitter] cleanup supporting different llvm versions</li>
<li>swr: [rasterizer jitter] fix llvm-3.7 compile</li>
<li>swr: [rasterizer] add support for llvm-3.9</li>
</ul>
</div>
</body>
</html>

148
docs/relnotes/12.0.6.html Normal file
View File

@@ -0,0 +1,148 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.6 Release Notes / January 23, 2017</h1>
<p>
Mesa 12.0.6 is a bug fix release which fixes bugs found since the 12.0.5 release.
</p>
<p>
Mesa 12.0.6 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
65339ba5d76a45225b8b56f9a1da9db15c569e1d163760faa2921da0a8461741 mesa-12.0.6.tar.gz
7d6da9744c1022a4c2ab6ad01a206984d00443fb691568011d01b3dd97e36448 mesa-12.0.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (3):</p>
<ul>
<li>i965/mt: Disable aux surfaces after making miptree shareable</li>
<li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>
<li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.5</li>
<li>get-typod-pick-list.sh: add new script</li>
<li>automake: use shared llvm libs for make distcheck</li>
<li>egl/wayland: use the destroy_window_callback for swrast</li>
<li>Update version to 12.0.6</li>
</ul>
<p>Fredrik Höglund (1):</p>
<ul>
<li>dri3: Fix MakeCurrent without a default framebuffer</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: take extra push space into account for pushbuf_space calls</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>spirv/nir: Fix some texture opcode asserts</li>
<li>spirv/nir: Add support for shadow samplers that return vec4</li>
<li>spirv/nir: Properly handle gather components</li>
<li>anv/pipeline: Set binding_table.gather_texture_start</li>
<li>nir: Add a helper for determining the type of a texture source</li>
<li>nir/lower_tex: Add some helpers for working with tex sources</li>
<li>nir/lower_tex: Add support for lowering coordinate offsets</li>
<li>i965/nir: Enable NIR lowering of txf and rect offsets</li>
<li>i965: Get rid of the do_lower_unnormalized_offsets pass</li>
<li>spirv/nir: Don't increment coord_components for array lod queries</li>
<li>anv/image: Assert that the image format is actually supported</li>
<li>spirv/nir: Move opcode selection higher up in handle_texture</li>
<li>spirv/nir: Refactor type handling in handle_texture</li>
<li>nir/spirv: Refactor coordinate handling in handle_texture</li>
<li>spirv/nir: Handle texture projectors</li>
<li>spirv/nir: Add support for ImageQuerySamples</li>
<li>anv/device: Return the right error for failed maps</li>
<li>anv/device: Implicitly unmap memory objects in FreeMemory</li>
<li>anv/descriptor_set: Write the state offset in the surface state free list.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>
<li>i965: Properly flush in hsw_pause_transform_feedback().</li>
</ul>
<p>Marek Olšák (6):</p>
<ul>
<li>cso: don't release sampler states that are bound</li>
<li>radeonsi: always restore sampler states when unbinding sampler views</li>
<li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>
<li>radeonsi: disable CE on SI + AMDGPU</li>
<li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>
<li>gallium/radeon: fix the draw-calls HUD query</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>
<li>i965/fs: Add unit tests for copy propagation pass.</li>
<li>i965/fs: Reject copy propagation into SEL if not min/max.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: enable WQM in PS prolog when needed</li>
</ul>
</div>
</body>
</html>

View File

@@ -68,21 +68,39 @@ To get the Mesa sources anonymously (read-only):
<h2 id="developer">Developer git Access</h2>
<p>
Mesa developers need to first have an account on
<a href="http://www.freedesktop.org">freedesktop.org</a>.
To get an account, please ask Brian or the other Mesa developers for
permission.
Then, if there are no objections, follow this
<a href="http://www.freedesktop.org/wiki/AccountRequests">
procedure</a>.
If you wish to become a Mesa developer with git-write privilege, please
follow this procedure:
</p>
<ol>
<li>Subscribe to the
<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>
mailing list.
<li>Start contributing to the project by posting patches / review requests to
the mesa-dev list. Specifically,
<ul>
<li>Use <code>git send-mail</code> to post your patches to mesa-dev.
<li>Wait for someone to review the code and give you a <code>Reviewed-by</code>
statement.
<li>You'll have to rely on another Mesa developer to push your initial patches
after they've been reviewed.
</ul>
<li>After you've demonstrated the ability to write good code and have had
a dozen or so patches accepted you can apply for an account.
<li>Occasionally, but rarely, someone may be given a git account sooner, but
only if they're being supervised by another Mesa developer at the same
organization and planning to work in a limited area of the code or on a
separate branch.
<li>To apply for an account, follow
<a href="http://www.freedesktop.org/wiki/AccountRequests">these directions</a>.
It's also appreciated if you briefly describe what you intend to do (work
on a particular driver, add a new extension, etc.) in the bugzilla record.
</ol>
<p>
Once your account is established:
</p>
<ol>
<li>Install the git software on your computer if needed.<br><br>
<li>Get an initial, local copy of the repository with:
<pre>
git clone git+ssh://username@git.freedesktop.org/git/mesa/mesa

View File

@@ -209,51 +209,6 @@ The final vertex and fragment programs may be interpreted in software
(see drivers/dri/i915/i915_fragprog.c for example).
</p>
<h3>Code Generation Options</h3>
<p>
Internally, there are several options that control the compiler's code
generation and instruction selection.
These options are seen in the gl_shader_state struct and may be set
by the device driver to indicate its preferences:
<pre>
struct gl_shader_state
{
...
/** Driver-selectable options: */
GLboolean EmitHighLevelInstructions;
GLboolean EmitCondCodes;
GLboolean EmitComments;
};
</pre>
<dl>
<dt>EmitHighLevelInstructions</dt>
<dd>
This option controls instruction selection for loops and conditionals.
If the option is set high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK
instructions will be emitted.
Otherwise, those constructs will be implemented with BRA instructions.
</dd>
<dt>EmitCondCodes</dt>
<dd>
If set, condition codes (ala GL_NV_fragment_program) will be used for
branching and looping.
Otherwise, ordinary registers will be used (the IF instruction will
examine the first operand's X component and do the if-part if non-zero).
This option is only relevant if EmitHighLevelInstructions is set.
</dd>
<dt>EmitComments</dt>
<dd>
If set, instructions will be annotated with comments to help with debugging.
Extra NOP instructions will also be inserted.
</dd>
</dl>
<h2 id="validation">Compiler Validation</h2>
<p>

View File

@@ -34,8 +34,7 @@ Hardware drivers include:
</p>
<ul>
<li>Intel i965, i945, i915.
See <a href="http://intellinuxgraphics.org/index.html">
Intel's website</a></li>
See <a href="https://01.org/linuxgraphics">Intel's website</a></li>
<li>AMD Radeon series.
See <a href="http://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>
<li>NVIDIA GPUs.

View File

@@ -31,7 +31,7 @@
<dd>is a very useful tool for tracking down
memory-related problems in your code.</dd>
<dt><a href="http:scan.coverity.com/projects/mesa">Coverity</a><dt>
<dt><a href="http://scan.coverity.com/projects/mesa">Coverity</a><dt>
<dd>provides static code analysis of Mesa. If you create an account
you can see the results and try to fix outstanding issues.</dd>
</dl>

8
doxygen/.gitignore vendored
View File

@@ -1,9 +1,6 @@
*.db
*.tag
*.tmp
agpgart
array_cache
core
core_subset
gallium
gbm
@@ -13,11 +10,8 @@ i965
main
math
math_subset
miniglx
radeondrm
radeonfb
nir
radeon_subset
shader
swrast
swrast_setup
tnl

View File

@@ -12,20 +12,21 @@ FULL = \
vbo.doxy \
glapi.doxy \
glsl.doxy \
shader.doxy \
swrast.doxy \
swrast_setup.doxy \
tnl.doxy \
tnl_dd.doxy \
gbm.doxy \
i965.doxy
i965.doxy \
nir.doxy
full: $(FULL:.doxy=.tag)
$(foreach FILE,$(FULL),doxygen $(FILE);)
SUBSET = \
main.doxy \
math.doxy
math.doxy \
gallium.doxy
subset: $(SUBSET:.doxy=.tag)
$(foreach FILE,$(SUBSET),doxygen $(FILE);)

View File

@@ -53,16 +53,6 @@ CREATE_SUBDIRS = NO
OUTPUT_LANGUAGE = English
# This tag can be used to specify the encoding used in the generated output.
# The encoding is not always determined by the language that is chosen,
# but also whether or not the output is meant for Windows or non-Windows users.
# In case there is a difference, setting the USE_WINDOWS_ENCODING tag to YES
# forces the Windows encoding (this is the default for the Windows binary),
# whereas setting the tag to NO uses a Unix-style encoding (the default for
# all platforms other than Windows).
USE_WINDOWS_ENCODING = NO
# If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will
# include brief member descriptions after the members that are listed in
# the file and class documentation (similar to JavaDoc).
@@ -147,13 +137,6 @@ JAVADOC_AUTOBRIEF = YES
MULTILINE_CPP_IS_BRIEF = NO
# If the DETAILS_AT_TOP tag is set to YES then Doxygen
# will output the detailed description near the top, like JavaDoc.
# If set to NO, the detailed description appears after the member
# documentation.
DETAILS_AT_TOP = YES
# If the INHERIT_DOCS tag is set to YES (the default) then an undocumented
# member inherits the documentation from any documented member that it
# re-implements.
@@ -607,12 +590,6 @@ HTML_FOOTER =
HTML_STYLESHEET =
# If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes,
# files or namespaces will be aligned in HTML using tables. If set to
# NO a bullet list will be used.
HTML_ALIGN_MEMBERS = YES
# If the GENERATE_HTMLHELP tag is set to YES, additional index files
# will be generated that can be used as input for tools like the
# Microsoft HTML help workshop to generate a compressed HTML help file (.chm)
@@ -839,18 +816,6 @@ GENERATE_XML = NO
XML_OUTPUT = xml
# The XML_SCHEMA tag can be used to specify an XML schema,
# which can be used by a validating XML parser to check the
# syntax of the XML files.
XML_SCHEMA =
# The XML_DTD tag can be used to specify an XML DTD,
# which can be used by a validating XML parser to check the
# syntax of the XML files.
XML_DTD =
# If the XML_PROGRAMLISTING tag is set to YES Doxygen will
# dump the program listings (including syntax highlighting
# and cross-referencing information) to the XML output. Note that
@@ -1104,22 +1069,6 @@ DOT_PATH =
DOTFILE_DIRS =
# The MAX_DOT_GRAPH_WIDTH tag can be used to set the maximum allowed width
# (in pixels) of the graphs generated by dot. If a graph becomes larger than
# this value, doxygen will try to truncate the graph, so that it fits within
# the specified constraint. Beware that most browsers cannot cope with very
# large images.
MAX_DOT_GRAPH_WIDTH = 1024
# The MAX_DOT_GRAPH_HEIGHT tag can be used to set the maximum allows height
# (in pixels) of the graphs generated by dot. If a graph becomes larger than
# this value, doxygen will try to truncate the graph, so that it fits within
# the specified constraint. Beware that most browsers cannot cope with very
# large images.
MAX_DOT_GRAPH_HEIGHT = 1024
# The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the
# graphs generated by dot. A depth value of 3 means that only nodes reachable
# from the root by following a path via at most 3 edges will be shown. Nodes that

View File

@@ -190,8 +190,7 @@ SKIP_FUNCTION_MACROS = YES
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = \
math_subset.tag=../math_subset \
miniglx.tag=../miniglx
math_subset.tag=../math_subset
GENERATE_TAGFILE = core_subset.tag
ALLEXTERNALS = NO
PERL_PATH =

View File

@@ -6,7 +6,9 @@ doxygen swrast_setup.doxy
doxygen tnl.doxy
doxygen core.doxy
doxygen glapi.doxy
doxygen shader.doxy
doxygen glsl.doxy
doxygen nir.doxy
doxygen i965.doxy
echo Building again, to resolve tags
doxygen tnl_dd.doxy
@@ -15,5 +17,8 @@ doxygen math.doxy
doxygen swrast.doxy
doxygen swrast_setup.doxy
doxygen tnl.doxy
doxygen core.doxy
doxygen glapi.doxy
doxygen shader.doxy
doxygen glsl.doxy
doxygen nir.doxy
doxygen i965.doxy

View File

@@ -39,10 +39,10 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = main.tag=../core \
TAGFILES = main.tag=../main \
math.tag=../math \
tnl_dd.tag=../tnl_dd \
swrast_setup.tag=../gbm_setup \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \
vbo.tag=vbo
vbo.tag=../vbo
GENERATE_TAGFILE = gbm.tag

View File

@@ -9,7 +9,7 @@ PROJECT_NAME = "Mesa GL API dispatcher"
#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = ../src/mesa/glapi/
INPUT = ../src/mapi/glapi/
FILE_PATTERNS = *.c *.h
RECURSIVE = NO
EXCLUDE =
@@ -39,11 +39,11 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = main.tag=../core \
TAGFILES = main.tag=../main \
math.tag=../math \
tnl_dd.tag=../tnl_dd \
swrast.tag=../swrast \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \
vbo.tag=vbo
GENERATE_TAGFILE = swrast.tag
vbo.tag=../vbo
GENERATE_TAGFILE = glapi.tag

View File

@@ -9,11 +9,12 @@ PROJECT_NAME = "Mesa GLSL module"
#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = ../src/glsl/
INPUT = ../src/compiler/glsl/
FILE_PATTERNS = *.c *.cpp *.h
RECURSIVE = NO
EXCLUDE = ../src/glsl/glsl_lexer.cpp \
../src/glsl/glsl_parser.cpp \
../src/glsl/glsl_parser.h
EXCLUDE = ../src/compiler/glsl/glsl_lexer.cpp \
../src/compiler/glsl/glsl_parser.cpp \
../src/compiler/glsl/glsl_parser.h
EXCLUDE_PATTERNS =
#---------------------------------------------------------------------------
# configuration options related to the HTML output

View File

@@ -8,9 +8,9 @@
<a class="qindex" href="../main/index.html">core</a> |
<a class="qindex" href="../glapi/index.html">glapi</a> |
<a class="qindex" href="../glsl/index.html">glsl</a> |
<a class="qindex" href="../nir/index.html">nir</a> |
<a class="qindex" href="../vbo/index.html">vbo</a> |
<a class="qindex" href="../math/index.html">math</a> |
<a class="qindex" href="../shader/index.html">shader</a> |
<a class="qindex" href="../swrast/index.html">swrast</a> |
<a class="qindex" href="../swrast_setup/index.html">swrast_setup</a> |
<a class="qindex" href="../tnl/index.html">tnl</a> |

View File

@@ -6,6 +6,5 @@
<div class="qindex">
<a class="qindex" href="../core_subset/index.html">Mesa Core</a> |
<a class="qindex" href="../math_subset/index.html">math</a> |
<a class="qindex" href="../miniglx/index.html">MiniGLX</a> |
<a class="qindex" href="../radeon_subset/index.html">radeon_subset</a>
</div>

View File

@@ -46,5 +46,5 @@ TAGFILES = glsl.tag=../glsl \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \
tnl_dd.tag=../tnl_dd \
vbo.tag=vbo
vbo.tag=../vbo
GENERATE_TAGFILE = i965.tag

View File

@@ -43,7 +43,6 @@ TAGFILES = tnl_dd.tag=../tnl_dd \
vbo.tag=../vbo \
glapi.tag=../glapi \
math.tag=../math \
shader.tag=../shader \
swrast.tag=../swrast \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl

View File

@@ -41,7 +41,7 @@ SKIP_FUNCTION_MACROS = YES
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = tnl_dd.tag=../tnl_dd \
main.tag=../core \
main.tag=../main \
swrast.tag=../swrast \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \

View File

@@ -5,45 +5,46 @@
#---------------------------------------------------------------------------
# General configuration options
#---------------------------------------------------------------------------
PROJECT_NAME = "Mesa Vertex and Fragment Program code"
PROJECT_NAME = "Mesa NIR module"
#---------------------------------------------------------------------------
# configuration options related to the input files
# Configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = ../src/mesa/shader/
FILE_PATTERNS = *.c *.h
INPUT = ../src/compiler/nir
FILE_PATTERNS = *.c *.cpp *.h
RECURSIVE = NO
EXCLUDE =
EXCLUDE_PATTERNS =
EXAMPLE_PATH =
EXAMPLE_PATTERNS =
EXCLUDE =
EXCLUDE_PATTERNS =
EXAMPLE_PATH =
EXAMPLE_PATTERNS =
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
IMAGE_PATH =
INPUT_FILTER =
FILTER_SOURCE_FILES = NO
#---------------------------------------------------------------------------
# configuration options related to the HTML output
# Configuration options related to the HTML output
#---------------------------------------------------------------------------
HTML_OUTPUT = shader
HTML_OUTPUT = nir
#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = NO
EXPAND_ONLY_PREDEF = NO
SEARCH_INCLUDES = YES
INCLUDE_PATH = ../include/
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
# Configuration::additions related to external references
#---------------------------------------------------------------------------
TAGFILES = main.tag=../core \
TAGFILES = glsl.tag=../glsl \
main.tag=../main \
math.tag=../math \
tnl_dd.tag=../tnl_dd \
swrast.tag=../swrast \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \
vbo.tag=vbo
GENERATE_TAGFILE = swrast.tag
tnl_dd.tag=../tnl_dd \
vbo.tag=../vbo
GENERATE_TAGFILE = nir.tag

View File

@@ -168,8 +168,7 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
TAGFILES = \
core_subset.tag=../core_subset \
math_subset.tag=../math_subset \
miniglx.tag=../miniglx
math_subset.tag=../math_subset
GENERATE_TAGFILE = radeon_subset.tag
ALLEXTERNALS = NO
PERL_PATH =

View File

@@ -39,10 +39,10 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = main.tag=../core \
TAGFILES = main.tag=../main \
math.tag=../math \
tnl_dd.tag=../tnl_dd \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \
vbo.tag=vbo
vbo.tag=../vbo
GENERATE_TAGFILE = swrast.tag

View File

@@ -41,7 +41,7 @@ SKIP_FUNCTION_MACROS = YES
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = tnl_dd.tag=../tnl_dd \
main.tag=../core \
main.tag=../main \
math.tag=../math \
swrast.tag=../swrast \
tnl.tag=../tnl \

View File

@@ -40,11 +40,10 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = tnl_dd.tag=../tnl \
main.tag=../core \
TAGFILES = tnl_dd.tag=../tnl_dd \
main.tag=../main \
math.tag=../math \
shader.tag=../shader \
swrast.tag=../swrast \
swrast_setup.tag=swrast_setup \
vbo.tag=vbo
swrast_setup.tag=../swrast_setup \
vbo.tag=../vbo
GENERATE_TAGFILE = tnl.tag

View File

@@ -39,11 +39,10 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = main.tag=../core \
TAGFILES = main.tag=../main \
math.tag=../math \
shader.tag=../shader \
swrast.tag=../swrast \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \
vbo.tag=vbo
vbo.tag=../vbo
GENERATE_TAGFILE = tnl_dd.tag

View File

@@ -40,9 +40,8 @@ SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::addtions related to external references
#---------------------------------------------------------------------------
TAGFILES = main.tag=../core \
TAGFILES = main.tag=../main \
math.tag=../math \
shader.tag=../shader \
swrast.tag=../swrast \
swrast_setup.tag=../swrast_setup \
tnl.tag=../tnl \

View File

@@ -260,7 +260,7 @@ struct IDirect3DDevice9 : public IUnknown
virtual HRESULT WINAPI SetStreamSourceFreq(UINT StreamNumber, UINT Setting) = 0;
virtual HRESULT WINAPI GetStreamSourceFreq(UINT StreamNumber, UINT *pSetting) = 0;
virtual HRESULT WINAPI SetIndices(IDirect3DIndexBuffer9 *pIndexData) = 0;
virtual HRESULT WINAPI GetIndices(IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex) = 0;
virtual HRESULT WINAPI GetIndices(IDirect3DIndexBuffer9 **ppIndexData) = 0;
virtual HRESULT WINAPI CreatePixelShader(const DWORD *pFunction, IDirect3DPixelShader9 **ppShader) = 0;
virtual HRESULT WINAPI SetPixelShader(IDirect3DPixelShader9 *pShader) = 0;
virtual HRESULT WINAPI GetPixelShader(IDirect3DPixelShader9 **ppShader) = 0;
@@ -848,7 +848,7 @@ typedef struct IDirect3DDevice9Vtbl
HRESULT (WINAPI *SetStreamSourceFreq)(IDirect3DDevice9 *This, UINT StreamNumber, UINT Setting);
HRESULT (WINAPI *GetStreamSourceFreq)(IDirect3DDevice9 *This, UINT StreamNumber, UINT *pSetting);
HRESULT (WINAPI *SetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 *pIndexData);
HRESULT (WINAPI *GetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex);
HRESULT (WINAPI *GetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 **ppIndexData);
HRESULT (WINAPI *CreatePixelShader)(IDirect3DDevice9 *This, const DWORD *pFunction, IDirect3DPixelShader9 **ppShader);
HRESULT (WINAPI *SetPixelShader)(IDirect3DDevice9 *This, IDirect3DPixelShader9 *pShader);
HRESULT (WINAPI *GetPixelShader)(IDirect3DDevice9 *This, IDirect3DPixelShader9 **ppShader);
@@ -975,7 +975,7 @@ struct IDirect3DDevice9
#define IDirect3DDevice9_SetStreamSourceFreq(p,a,b) (p)->lpVtbl->SetStreamSourceFreq(p,a,b)
#define IDirect3DDevice9_GetStreamSourceFreq(p,a,b) (p)->lpVtbl->GetStreamSourceFreq(p,a,b)
#define IDirect3DDevice9_SetIndices(p,a) (p)->lpVtbl->SetIndices(p,a)
#define IDirect3DDevice9_GetIndices(p,a,b) (p)->lpVtbl->GetIndices(p,a,b)
#define IDirect3DDevice9_GetIndices(p,a) (p)->lpVtbl->GetIndices(p,a)
#define IDirect3DDevice9_CreatePixelShader(p,a,b) (p)->lpVtbl->CreatePixelShader(p,a,b)
#define IDirect3DDevice9_SetPixelShader(p,a) (p)->lpVtbl->SetPixelShader(p,a)
#define IDirect3DDevice9_GetPixelShader(p,a) (p)->lpVtbl->GetPixelShader(p,a)
@@ -1099,7 +1099,7 @@ typedef struct IDirect3DDevice9ExVtbl
HRESULT (WINAPI *SetStreamSourceFreq)(IDirect3DDevice9Ex *This, UINT StreamNumber, UINT Setting);
HRESULT (WINAPI *GetStreamSourceFreq)(IDirect3DDevice9Ex *This, UINT StreamNumber, UINT *pSetting);
HRESULT (WINAPI *SetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 *pIndexData);
HRESULT (WINAPI *GetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex);
HRESULT (WINAPI *GetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 **ppIndexData);
HRESULT (WINAPI *CreatePixelShader)(IDirect3DDevice9Ex *This, const DWORD *pFunction, IDirect3DPixelShader9 **ppShader);
HRESULT (WINAPI *SetPixelShader)(IDirect3DDevice9Ex *This, IDirect3DPixelShader9 *pShader);
HRESULT (WINAPI *GetPixelShader)(IDirect3DDevice9Ex *This, IDirect3DPixelShader9 **ppShader);
@@ -1242,7 +1242,7 @@ struct IDirect3DDevice9Ex
#define IDirect3DDevice9Ex_SetStreamSourceFreq(p,a,b) (p)->lpVtbl->SetStreamSourceFreq(p,a,b)
#define IDirect3DDevice9Ex_GetStreamSourceFreq(p,a,b) (p)->lpVtbl->GetStreamSourceFreq(p,a,b)
#define IDirect3DDevice9Ex_SetIndices(p,a) (p)->lpVtbl->SetIndices(p,a)
#define IDirect3DDevice9Ex_GetIndices(p,a,b) (p)->lpVtbl->GetIndices(p,a,b)
#define IDirect3DDevice9Ex_GetIndices(p,a) (p)->lpVtbl->GetIndices(p,a)
#define IDirect3DDevice9Ex_CreatePixelShader(p,a,b) (p)->lpVtbl->CreatePixelShader(p,a,b)
#define IDirect3DDevice9Ex_SetPixelShader(p,a) (p)->lpVtbl->SetPixelShader(p,a)
#define IDirect3DDevice9Ex_GetPixelShader(p,a) (p)->lpVtbl->GetPixelShader(p,a)

View File

@@ -173,16 +173,16 @@ typedef struct _RGNDATA {
#define D3DPRESENTFLAG_RESTRICTED_CONTENT 0x00000400
#define D3DPRESENTFLAG_RESTRICT_SHARED_RESOURCE_DRIVER 0x00000800
#ifdef WINAPI
#undef WINAPI
#endif /* WINAPI*/
#if defined(__x86_64__) || defined(_M_X64)
#define WINAPI __attribute__((ms_abi))
#else /* x86_64 */
#define WINAPI __attribute__((__stdcall__))
#endif /* x86_64 */
/* Windows calling convention */
#ifndef WINAPI
#if defined(__x86_64__) && !defined(__ILP32__)
#define WINAPI __attribute__((ms_abi))
#elif defined(__i386__)
#define WINAPI __attribute__((__stdcall__))
#else /* neither amd64 nor i386 */
#define WINAPI
#endif
#endif /* WINAPI */
/* Implementation caps */
#define D3DPRESENT_BACK_BUFFERS_MAX 3

View File

@@ -34,17 +34,6 @@ extern "C" {
#include <EGL/eglplatform.h>
#ifndef EGL_MESA_drm_display
#define EGL_MESA_drm_display 1
#ifdef EGL_EGLEXT_PROTOTYPES
EGLAPI EGLDisplay EGLAPIENTRY eglGetDRMDisplayMESA(int fd);
#endif /* EGL_EGLEXT_PROTOTYPES */
typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETDRMDISPLAYMESA) (int fd);
#endif /* EGL_MESA_drm_display */
#ifdef EGL_MESA_drm_image
/* Mesa's extension to EGL_MESA_drm_image... */
#ifndef EGL_DRM_BUFFER_USE_CURSOR_MESA

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2014 The Khronos Group Inc.
** Copyright (c) 2013-2016 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,7 +33,7 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $
** Khronos $Revision: 32433 $ on $Date: 2016-02-10 02:02:08 -0500 (Wed, 10 Feb 2016) $
*/
#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
@@ -1160,6 +1160,22 @@ typedef unsigned short GLhalf;
#define GL_COLOR_ATTACHMENT13 0x8CED
#define GL_COLOR_ATTACHMENT14 0x8CEE
#define GL_COLOR_ATTACHMENT15 0x8CEF
#define GL_COLOR_ATTACHMENT16 0x8CF0
#define GL_COLOR_ATTACHMENT17 0x8CF1
#define GL_COLOR_ATTACHMENT18 0x8CF2
#define GL_COLOR_ATTACHMENT19 0x8CF3
#define GL_COLOR_ATTACHMENT20 0x8CF4
#define GL_COLOR_ATTACHMENT21 0x8CF5
#define GL_COLOR_ATTACHMENT22 0x8CF6
#define GL_COLOR_ATTACHMENT23 0x8CF7
#define GL_COLOR_ATTACHMENT24 0x8CF8
#define GL_COLOR_ATTACHMENT25 0x8CF9
#define GL_COLOR_ATTACHMENT26 0x8CFA
#define GL_COLOR_ATTACHMENT27 0x8CFB
#define GL_COLOR_ATTACHMENT28 0x8CFC
#define GL_COLOR_ATTACHMENT29 0x8CFD
#define GL_COLOR_ATTACHMENT30 0x8CFE
#define GL_COLOR_ATTACHMENT31 0x8CFF
#define GL_DEPTH_ATTACHMENT 0x8D00
#define GL_STENCIL_ATTACHMENT 0x8D20
#define GL_FRAMEBUFFER 0x8D40
@@ -2097,6 +2113,10 @@ GLAPI void APIENTRY glGetDoublei_v (GLenum target, GLuint index, GLdouble *data)
#ifndef GL_VERSION_4_2
#define GL_VERSION_4_2 1
#define GL_COPY_READ_BUFFER_BINDING 0x8F36
#define GL_COPY_WRITE_BUFFER_BINDING 0x8F37
#define GL_TRANSFORM_FEEDBACK_ACTIVE 0x8E24
#define GL_TRANSFORM_FEEDBACK_PAUSED 0x8E23
#define GL_UNPACK_COMPRESSED_BLOCK_WIDTH 0x9127
#define GL_UNPACK_COMPRESSED_BLOCK_HEIGHT 0x9128
#define GL_UNPACK_COMPRESSED_BLOCK_DEPTH 0x9129
@@ -2642,7 +2662,6 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui
#define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES 0x82FA
#define GL_TEXTURE_TARGET 0x1006
#define GL_QUERY_TARGET 0x82EA
#define GL_TEXTURE_BINDING 0x82EB
#define GL_GUILTY_CONTEXT_RESET 0x8253
#define GL_INNOCENT_CONTEXT_RESET 0x8254
#define GL_UNKNOWN_CONTEXT_RESET 0x8255
@@ -2655,25 +2674,25 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint first, GLsizei count, const GLui
typedef void (APIENTRYP PFNGLCLIPCONTROLPROC) (GLenum origin, GLenum depth);
typedef void (APIENTRYP PFNGLCREATETRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);
typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERBASEPROC) (GLuint xfb, GLuint index, GLuint buffer);
typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);
typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKIVPROC) (GLuint xfb, GLenum pname, GLint *param);
typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI_VPROC) (GLuint xfb, GLenum pname, GLuint index, GLint *param);
typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI64_VPROC) (GLuint xfb, GLenum pname, GLuint index, GLint64 *param);
typedef void (APIENTRYP PFNGLCREATEBUFFERSPROC) (GLsizei n, GLuint *buffers);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizei size, const void *data, GLbitfield flags);
typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizei size, const void *data, GLenum usage);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizei size, const void *data);
typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer, GLsizeiptr size, const void *data, GLbitfield flags);
typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer, GLsizeiptr size, const void *data, GLenum usage);
typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, const void *data);
typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERDATAPROC) (GLuint buffer, GLenum internalformat, GLenum format, GLenum type, const void *data);
typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum type, const void *data);
typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum type, const void *data);
typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERPROC) (GLuint buffer, GLenum access);
typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizei length, GLbitfield access);
typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizeiptr length, GLbitfield access);
typedef GLboolean (APIENTRYP PFNGLUNMAPNAMEDBUFFERPROC) (GLuint buffer);
typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizei length);
typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint buffer, GLintptr offset, GLsizeiptr length);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERIVPROC) (GLuint buffer, GLenum pname, GLint *params);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERI64VPROC) (GLuint buffer, GLenum pname, GLint64 *params);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPOINTERVPROC) (GLuint buffer, GLenum pname, void **params);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizei size, void *data);
typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, void *data);
typedef void (APIENTRYP PFNGLCREATEFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);
typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERRENDERBUFFERPROC) (GLuint framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
typedef void (APIENTRYP PFNGLNAMEDFRAMEBUFFERPARAMETERIPROC) (GLuint framebuffer, GLenum pname, GLint param);
@@ -2687,7 +2706,7 @@ typedef void (APIENTRYP PFNGLINVALIDATENAMEDFRAMEBUFFERSUBDATAPROC) (GLuint fram
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERUIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);
typedef void (APIENTRYP PFNGLBLITNAMEDFRAMEBUFFERPROC) (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
typedef GLenum (APIENTRYP PFNGLCHECKNAMEDFRAMEBUFFERSTATUSPROC) (GLuint framebuffer, GLenum target);
typedef void (APIENTRYP PFNGLGETNAMEDFRAMEBUFFERPARAMETERIVPROC) (GLuint framebuffer, GLenum pname, GLint *param);
@@ -2698,7 +2717,7 @@ typedef void (APIENTRYP PFNGLNAMEDRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLuint re
typedef void (APIENTRYP PFNGLGETNAMEDRENDERBUFFERPARAMETERIVPROC) (GLuint renderbuffer, GLenum pname, GLint *params);
typedef void (APIENTRYP PFNGLCREATETEXTURESPROC) (GLenum target, GLsizei n, GLuint *textures);
typedef void (APIENTRYP PFNGLTEXTUREBUFFERPROC) (GLuint texture, GLenum internalformat, GLuint buffer);
typedef void (APIENTRYP PFNGLTEXTUREBUFFERRANGEPROC) (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizei size);
typedef void (APIENTRYP PFNGLTEXTUREBUFFERRANGEPROC) (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);
typedef void (APIENTRYP PFNGLTEXTURESTORAGE1DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width);
typedef void (APIENTRYP PFNGLTEXTURESTORAGE2DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);
typedef void (APIENTRYP PFNGLTEXTURESTORAGE3DPROC) (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);
@@ -2746,6 +2765,10 @@ typedef void (APIENTRYP PFNGLGETVERTEXARRAYINDEXED64IVPROC) (GLuint vaobj, GLuin
typedef void (APIENTRYP PFNGLCREATESAMPLERSPROC) (GLsizei n, GLuint *samplers);
typedef void (APIENTRYP PFNGLCREATEPROGRAMPIPELINESPROC) (GLsizei n, GLuint *pipelines);
typedef void (APIENTRYP PFNGLCREATEQUERIESPROC) (GLenum target, GLsizei n, GLuint *ids);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTI64VPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTIVPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTUI64VPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLGETQUERYBUFFEROBJECTUIVPROC) (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
typedef void (APIENTRYP PFNGLMEMORYBARRIERBYREGIONPROC) (GLbitfield barriers);
typedef void (APIENTRYP PFNGLGETTEXTURESUBIMAGEPROC) (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, GLsizei bufSize, void *pixels);
typedef void (APIENTRYP PFNGLGETCOMPRESSEDTEXTURESUBIMAGEPROC) (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLsizei bufSize, void *pixels);
@@ -2762,25 +2785,25 @@ typedef void (APIENTRYP PFNGLTEXTUREBARRIERPROC) (void);
GLAPI void APIENTRY glClipControl (GLenum origin, GLenum depth);
GLAPI void APIENTRY glCreateTransformFeedbacks (GLsizei n, GLuint *ids);
GLAPI void APIENTRY glTransformFeedbackBufferBase (GLuint xfb, GLuint index, GLuint buffer);
GLAPI void APIENTRY glTransformFeedbackBufferRange (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);
GLAPI void APIENTRY glTransformFeedbackBufferRange (GLuint xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
GLAPI void APIENTRY glGetTransformFeedbackiv (GLuint xfb, GLenum pname, GLint *param);
GLAPI void APIENTRY glGetTransformFeedbacki_v (GLuint xfb, GLenum pname, GLuint index, GLint *param);
GLAPI void APIENTRY glGetTransformFeedbacki64_v (GLuint xfb, GLenum pname, GLuint index, GLint64 *param);
GLAPI void APIENTRY glCreateBuffers (GLsizei n, GLuint *buffers);
GLAPI void APIENTRY glNamedBufferStorage (GLuint buffer, GLsizei size, const void *data, GLbitfield flags);
GLAPI void APIENTRY glNamedBufferData (GLuint buffer, GLsizei size, const void *data, GLenum usage);
GLAPI void APIENTRY glNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizei size, const void *data);
GLAPI void APIENTRY glCopyNamedBufferSubData (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizei size);
GLAPI void APIENTRY glNamedBufferStorage (GLuint buffer, GLsizeiptr size, const void *data, GLbitfield flags);
GLAPI void APIENTRY glNamedBufferData (GLuint buffer, GLsizeiptr size, const void *data, GLenum usage);
GLAPI void APIENTRY glNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizeiptr size, const void *data);
GLAPI void APIENTRY glCopyNamedBufferSubData (GLuint readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);
GLAPI void APIENTRY glClearNamedBufferData (GLuint buffer, GLenum internalformat, GLenum format, GLenum type, const void *data);
GLAPI void APIENTRY glClearNamedBufferSubData (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum format, GLenum type, const void *data);
GLAPI void APIENTRY glClearNamedBufferSubData (GLuint buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum format, GLenum type, const void *data);
GLAPI void *APIENTRY glMapNamedBuffer (GLuint buffer, GLenum access);
GLAPI void *APIENTRY glMapNamedBufferRange (GLuint buffer, GLintptr offset, GLsizei length, GLbitfield access);
GLAPI void *APIENTRY glMapNamedBufferRange (GLuint buffer, GLintptr offset, GLsizeiptr length, GLbitfield access);
GLAPI GLboolean APIENTRY glUnmapNamedBuffer (GLuint buffer);
GLAPI void APIENTRY glFlushMappedNamedBufferRange (GLuint buffer, GLintptr offset, GLsizei length);
GLAPI void APIENTRY glFlushMappedNamedBufferRange (GLuint buffer, GLintptr offset, GLsizeiptr length);
GLAPI void APIENTRY glGetNamedBufferParameteriv (GLuint buffer, GLenum pname, GLint *params);
GLAPI void APIENTRY glGetNamedBufferParameteri64v (GLuint buffer, GLenum pname, GLint64 *params);
GLAPI void APIENTRY glGetNamedBufferPointerv (GLuint buffer, GLenum pname, void **params);
GLAPI void APIENTRY glGetNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizei size, void *data);
GLAPI void APIENTRY glGetNamedBufferSubData (GLuint buffer, GLintptr offset, GLsizeiptr size, void *data);
GLAPI void APIENTRY glCreateFramebuffers (GLsizei n, GLuint *framebuffers);
GLAPI void APIENTRY glNamedFramebufferRenderbuffer (GLuint framebuffer, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);
GLAPI void APIENTRY glNamedFramebufferParameteri (GLuint framebuffer, GLenum pname, GLint param);
@@ -2794,7 +2817,7 @@ GLAPI void APIENTRY glInvalidateNamedFramebufferSubData (GLuint framebuffer, GLs
GLAPI void APIENTRY glClearNamedFramebufferiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);
GLAPI void APIENTRY glClearNamedFramebufferuiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);
GLAPI void APIENTRY glClearNamedFramebufferfv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);
GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);
GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);
GLAPI void APIENTRY glBlitNamedFramebuffer (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
GLAPI GLenum APIENTRY glCheckNamedFramebufferStatus (GLuint framebuffer, GLenum target);
GLAPI void APIENTRY glGetNamedFramebufferParameteriv (GLuint framebuffer, GLenum pname, GLint *param);
@@ -2805,7 +2828,7 @@ GLAPI void APIENTRY glNamedRenderbufferStorageMultisample (GLuint renderbuffer,
GLAPI void APIENTRY glGetNamedRenderbufferParameteriv (GLuint renderbuffer, GLenum pname, GLint *params);
GLAPI void APIENTRY glCreateTextures (GLenum target, GLsizei n, GLuint *textures);
GLAPI void APIENTRY glTextureBuffer (GLuint texture, GLenum internalformat, GLuint buffer);
GLAPI void APIENTRY glTextureBufferRange (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizei size);
GLAPI void APIENTRY glTextureBufferRange (GLuint texture, GLenum internalformat, GLuint buffer, GLintptr offset, GLsizeiptr size);
GLAPI void APIENTRY glTextureStorage1D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width);
GLAPI void APIENTRY glTextureStorage2D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);
GLAPI void APIENTRY glTextureStorage3D (GLuint texture, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);
@@ -2853,6 +2876,10 @@ GLAPI void APIENTRY glGetVertexArrayIndexed64iv (GLuint vaobj, GLuint index, GLe
GLAPI void APIENTRY glCreateSamplers (GLsizei n, GLuint *samplers);
GLAPI void APIENTRY glCreateProgramPipelines (GLsizei n, GLuint *pipelines);
GLAPI void APIENTRY glCreateQueries (GLenum target, GLsizei n, GLuint *ids);
GLAPI void APIENTRY glGetQueryBufferObjecti64v (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glGetQueryBufferObjectiv (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glGetQueryBufferObjectui64v (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glGetQueryBufferObjectuiv (GLuint id, GLuint buffer, GLenum pname, GLintptr offset);
GLAPI void APIENTRY glMemoryBarrierByRegion (GLbitfield barriers);
GLAPI void APIENTRY glGetTextureSubImage (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, GLsizei bufSize, void *pixels);
GLAPI void APIENTRY glGetCompressedTextureSubImage (GLuint texture, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLsizei bufSize, void *pixels);
@@ -2990,8 +3017,6 @@ GLAPI void APIENTRY glDispatchComputeGroupSizeARB (GLuint num_groups_x, GLuint n
#ifndef GL_ARB_copy_buffer
#define GL_ARB_copy_buffer 1
#define GL_COPY_READ_BUFFER_BINDING 0x8F36
#define GL_COPY_WRITE_BUFFER_BINDING 0x8F37
#endif /* GL_ARB_copy_buffer */
#ifndef GL_ARB_copy_image
@@ -3346,13 +3371,13 @@ GLAPI void APIENTRY glGetNamedStringivARB (GLint namelen, const GLchar *name, GL
#define GL_ARB_sparse_buffer 1
#define GL_SPARSE_STORAGE_BIT_ARB 0x0400
#define GL_SPARSE_BUFFER_PAGE_SIZE_ARB 0x82F8
typedef void (APIENTRYP PFNGLBUFFERPAGECOMMITMENTARBPROC) (GLenum target, GLintptr offset, GLsizei size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTEXTPROC) (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTARBPROC) (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
typedef void (APIENTRYP PFNGLBUFFERPAGECOMMITMENTARBPROC) (GLenum target, GLintptr offset, GLsizeiptr size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTEXTPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
typedef void (APIENTRYP PFNGLNAMEDBUFFERPAGECOMMITMENTARBPROC) (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glBufferPageCommitmentARB (GLenum target, GLintptr offset, GLsizei size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentEXT (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offset, GLsizei size, GLboolean commit);
GLAPI void APIENTRY glBufferPageCommitmentARB (GLenum target, GLintptr offset, GLsizeiptr size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentEXT (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offset, GLsizeiptr size, GLboolean commit);
#endif
#endif /* GL_ARB_sparse_buffer */
@@ -3360,7 +3385,7 @@ GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offs
#define GL_ARB_sparse_texture 1
#define GL_TEXTURE_SPARSE_ARB 0x91A6
#define GL_VIRTUAL_PAGE_SIZE_INDEX_ARB 0x91A7
#define GL_MIN_SPARSE_LEVEL_ARB 0x919B
#define GL_NUM_SPARSE_LEVELS_ARB 0x91AA
#define GL_NUM_VIRTUAL_PAGE_SIZES_ARB 0x91A8
#define GL_VIRTUAL_PAGE_SIZE_X_ARB 0x9195
#define GL_VIRTUAL_PAGE_SIZE_Y_ARB 0x9196
@@ -3369,9 +3394,9 @@ GLAPI void APIENTRY glNamedBufferPageCommitmentARB (GLuint buffer, GLintptr offs
#define GL_MAX_SPARSE_3D_TEXTURE_SIZE_ARB 0x9199
#define GL_MAX_SPARSE_ARRAY_TEXTURE_LAYERS_ARB 0x919A
#define GL_SPARSE_TEXTURE_FULL_ARRAY_CUBE_MIPMAPS_ARB 0x91A9
typedef void (APIENTRYP PFNGLTEXPAGECOMMITMENTARBPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean resident);
typedef void (APIENTRYP PFNGLTEXPAGECOMMITMENTARBPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean commit);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean resident);
GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLboolean commit);
#endif
#endif /* GL_ARB_sparse_texture */
@@ -3479,8 +3504,6 @@ GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xo
#ifndef GL_ARB_transform_feedback2
#define GL_ARB_transform_feedback2 1
#define GL_TRANSFORM_FEEDBACK_PAUSED 0x8E23
#define GL_TRANSFORM_FEEDBACK_ACTIVE 0x8E24
#endif /* GL_ARB_transform_feedback2 */
#ifndef GL_ARB_transform_feedback3
@@ -3537,6 +3560,11 @@ GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xo
#define GL_KHR_debug 1
#endif /* GL_KHR_debug */
#ifndef GL_KHR_no_error
#define GL_KHR_no_error 1
#define GL_CONTEXT_FLAG_NO_ERROR_BIT_KHR 0x00000008
#endif /* GL_KHR_no_error */
#ifndef GL_KHR_robust_buffer_access_behavior
#define GL_KHR_robust_buffer_access_behavior 1
#endif /* GL_KHR_robust_buffer_access_behavior */
@@ -3582,6 +3610,10 @@ GLAPI void APIENTRY glTexPageCommitmentARB (GLenum target, GLint level, GLint xo
#define GL_KHR_texture_compression_astc_ldr 1
#endif /* GL_KHR_texture_compression_astc_ldr */
#ifndef GL_KHR_texture_compression_astc_sliced_3d
#define GL_KHR_texture_compression_astc_sliced_3d 1
#endif /* GL_KHR_texture_compression_astc_sliced_3d */
#ifdef __cplusplus
}
#endif

View File

@@ -6,7 +6,7 @@ extern "C" {
#endif
/*
** Copyright (c) 2013-2015 The Khronos Group Inc.
** Copyright (c) 2013-2016 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
@@ -33,7 +33,7 @@ extern "C" {
** used to make the header, and the header can be found at
** http://www.opengl.org/registry/
**
** Khronos $Revision: 31811 $ on $Date: 2015-08-10 17:01:11 +1000 (Mon, 10 Aug 2015) $
** Khronos $Revision: 32957 $ on $Date: 2016-06-09 17:03:08 -0400 (Thu, 09 Jun 2016) $
*/
#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)
@@ -53,7 +53,7 @@ extern "C" {
#define GLAPI extern
#endif
#define GL_GLEXT_VERSION 20150809
#define GL_GLEXT_VERSION 20160609
/* Generated C header for:
* API: gl
@@ -2654,7 +2654,7 @@ typedef void (APIENTRYP PFNGLINVALIDATENAMEDFRAMEBUFFERSUBDATAPROC) (GLuint fram
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERUIVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFVPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);
typedef void (APIENTRYP PFNGLCLEARNAMEDFRAMEBUFFERFIPROC) (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);
typedef void (APIENTRYP PFNGLBLITNAMEDFRAMEBUFFERPROC) (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
typedef GLenum (APIENTRYP PFNGLCHECKNAMEDFRAMEBUFFERSTATUSPROC) (GLuint framebuffer, GLenum target);
typedef void (APIENTRYP PFNGLGETNAMEDFRAMEBUFFERPARAMETERIVPROC) (GLuint framebuffer, GLenum pname, GLint *param);
@@ -2777,7 +2777,7 @@ GLAPI void APIENTRY glInvalidateNamedFramebufferSubData (GLuint framebuffer, GLs
GLAPI void APIENTRY glClearNamedFramebufferiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLint *value);
GLAPI void APIENTRY glClearNamedFramebufferuiv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLuint *value);
GLAPI void APIENTRY glClearNamedFramebufferfv (GLuint framebuffer, GLenum buffer, GLint drawbuffer, const GLfloat *value);
GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, const GLfloat depth, GLint stencil);
GLAPI void APIENTRY glClearNamedFramebufferfi (GLuint framebuffer, GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);
GLAPI void APIENTRY glBlitNamedFramebuffer (GLuint readFramebuffer, GLuint drawFramebuffer, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);
GLAPI GLenum APIENTRY glCheckNamedFramebufferStatus (GLuint framebuffer, GLenum target);
GLAPI void APIENTRY glGetNamedFramebufferParameteriv (GLuint framebuffer, GLenum pname, GLint *param);
@@ -4984,6 +4984,10 @@ GLAPI void APIENTRY glBlendBarrierKHR (void);
#define GL_KHR_texture_compression_astc_ldr 1
#endif /* GL_KHR_texture_compression_astc_ldr */
#ifndef GL_KHR_texture_compression_astc_sliced_3d
#define GL_KHR_texture_compression_astc_sliced_3d 1
#endif /* GL_KHR_texture_compression_astc_sliced_3d */
#ifndef GL_OES_byte_coordinates
#define GL_OES_byte_coordinates 1
typedef void (APIENTRYP PFNGLMULTITEXCOORD1BOESPROC) (GLenum texture, GLbyte s);
@@ -5597,6 +5601,10 @@ GLAPI void APIENTRY glSetMultisamplefvAMD (GLenum pname, GLuint index, const GLf
#define GL_AMD_shader_atomic_counter_ops 1
#endif /* GL_AMD_shader_atomic_counter_ops */
#ifndef GL_AMD_shader_explicit_vertex_parameter
#define GL_AMD_shader_explicit_vertex_parameter 1
#endif /* GL_AMD_shader_explicit_vertex_parameter */
#ifndef GL_AMD_shader_stencil_export
#define GL_AMD_shader_stencil_export 1
#endif /* GL_AMD_shader_stencil_export */
@@ -8637,6 +8645,20 @@ GLAPI void APIENTRY glVertexWeightPointerEXT (GLint size, GLenum type, GLsizei s
#endif
#endif /* GL_EXT_vertex_weighting */
#ifndef GL_EXT_window_rectangles
#define GL_EXT_window_rectangles 1
#define GL_INCLUSIVE_EXT 0x8F10
#define GL_EXCLUSIVE_EXT 0x8F11
#define GL_WINDOW_RECTANGLE_EXT 0x8F12
#define GL_WINDOW_RECTANGLE_MODE_EXT 0x8F13
#define GL_MAX_WINDOW_RECTANGLES_EXT 0x8F14
#define GL_NUM_WINDOW_RECTANGLES_EXT 0x8F15
typedef void (APIENTRYP PFNGLWINDOWRECTANGLESEXTPROC) (GLenum mode, GLsizei count, const GLint *box);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glWindowRectanglesEXT (GLenum mode, GLsizei count, const GLint *box);
#endif
#endif /* GL_EXT_window_rectangles */
#ifndef GL_EXT_x11_sync_object
#define GL_EXT_x11_sync_object 1
#define GL_SYNC_X11_FENCE_EXT 0x90E1
@@ -9130,6 +9152,17 @@ GLAPI void APIENTRY glBlendBarrierNV (void);
#define GL_NV_blend_square 1
#endif /* GL_NV_blend_square */
#ifndef GL_NV_clip_space_w_scaling
#define GL_NV_clip_space_w_scaling 1
#define GL_VIEWPORT_POSITION_W_SCALE_NV 0x937C
#define GL_VIEWPORT_POSITION_W_SCALE_X_COEFF_NV 0x937D
#define GL_VIEWPORT_POSITION_W_SCALE_Y_COEFF_NV 0x937E
typedef void (APIENTRYP PFNGLVIEWPORTPOSITIONWSCALENVPROC) (GLuint index, GLfloat xcoeff, GLfloat ycoeff);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glViewportPositionWScaleNV (GLuint index, GLfloat xcoeff, GLfloat ycoeff);
#endif
#endif /* GL_NV_clip_space_w_scaling */
#ifndef GL_NV_command_list
#define GL_NV_command_list 1
#define GL_TERMINATE_SEQUENCE_COMMAND_NV 0x0000
@@ -9232,6 +9265,17 @@ GLAPI void APIENTRY glConservativeRasterParameterfNV (GLenum pname, GLfloat valu
#endif
#endif /* GL_NV_conservative_raster_dilate */
#ifndef GL_NV_conservative_raster_pre_snap_triangles
#define GL_NV_conservative_raster_pre_snap_triangles 1
#define GL_CONSERVATIVE_RASTER_MODE_NV 0x954D
#define GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV 0x954E
#define GL_CONSERVATIVE_RASTER_MODE_PRE_SNAP_TRIANGLES_NV 0x954F
typedef void (APIENTRYP PFNGLCONSERVATIVERASTERPARAMETERINVPROC) (GLenum pname, GLint param);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glConservativeRasterParameteriNV (GLenum pname, GLint param);
#endif
#endif /* GL_NV_conservative_raster_pre_snap_triangles */
#ifndef GL_NV_copy_depth_to_color
#define GL_NV_copy_depth_to_color 1
#define GL_DEPTH_STENCIL_TO_RGBA_NV 0x886E
@@ -10224,6 +10268,11 @@ GLAPI void APIENTRY glGetCombinerStageParameterfvNV (GLenum stage, GLenum pname,
#endif
#endif /* GL_NV_register_combiners2 */
#ifndef GL_NV_robustness_video_memory_purge
#define GL_NV_robustness_video_memory_purge 1
#define GL_PURGED_CONTEXT_RESET_NV 0x92BB
#endif /* GL_NV_robustness_video_memory_purge */
#ifndef GL_NV_sample_locations
#define GL_NV_sample_locations 1
#define GL_SAMPLE_LOCATION_SUBPIXEL_BITS_NV 0x933D
@@ -10256,6 +10305,10 @@ GLAPI void APIENTRY glResolveDepthValuesNV (void);
#define GL_NV_shader_atomic_float 1
#endif /* GL_NV_shader_atomic_float */
#ifndef GL_NV_shader_atomic_float64
#define GL_NV_shader_atomic_float64 1
#endif /* GL_NV_shader_atomic_float64 */
#ifndef GL_NV_shader_atomic_fp16_vector
#define GL_NV_shader_atomic_fp16_vector 1
#endif /* GL_NV_shader_atomic_fp16_vector */
@@ -10319,6 +10372,10 @@ GLAPI void APIENTRY glProgramUniformui64vNV (GLuint program, GLint location, GLs
#define GL_NV_shader_thread_shuffle 1
#endif /* GL_NV_shader_thread_shuffle */
#ifndef GL_NV_stereo_view_rendering
#define GL_NV_stereo_view_rendering 1
#endif /* GL_NV_stereo_view_rendering */
#ifndef GL_NV_tessellation_program5
#define GL_NV_tessellation_program5 1
#define GL_MAX_PROGRAM_PATCH_ATTRIBS_NV 0x86D8
@@ -11089,6 +11146,26 @@ GLAPI void APIENTRY glVideoCaptureStreamParameterdvNV (GLuint video_capture_slot
#define GL_NV_viewport_array2 1
#endif /* GL_NV_viewport_array2 */
#ifndef GL_NV_viewport_swizzle
#define GL_NV_viewport_swizzle 1
#define GL_VIEWPORT_SWIZZLE_POSITIVE_X_NV 0x9350
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_X_NV 0x9351
#define GL_VIEWPORT_SWIZZLE_POSITIVE_Y_NV 0x9352
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Y_NV 0x9353
#define GL_VIEWPORT_SWIZZLE_POSITIVE_Z_NV 0x9354
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Z_NV 0x9355
#define GL_VIEWPORT_SWIZZLE_POSITIVE_W_NV 0x9356
#define GL_VIEWPORT_SWIZZLE_NEGATIVE_W_NV 0x9357
#define GL_VIEWPORT_SWIZZLE_X_NV 0x9358
#define GL_VIEWPORT_SWIZZLE_Y_NV 0x9359
#define GL_VIEWPORT_SWIZZLE_Z_NV 0x935A
#define GL_VIEWPORT_SWIZZLE_W_NV 0x935B
typedef void (APIENTRYP PFNGLVIEWPORTSWIZZLENVPROC) (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);
#ifdef GL_GLEXT_PROTOTYPES
GLAPI void APIENTRY glViewportSwizzleNV (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);
#endif
#endif /* GL_NV_viewport_swizzle */
#ifndef GL_OML_interlace
#define GL_OML_interlace 1
#define GL_INTERLACE_OML 0x8980

View File

@@ -79,6 +79,7 @@ typedef struct __DRIdri2LoaderExtensionRec __DRIdri2LoaderExtension;
typedef struct __DRI2flushExtensionRec __DRI2flushExtension;
typedef struct __DRI2throttleExtensionRec __DRI2throttleExtension;
typedef struct __DRI2fenceExtensionRec __DRI2fenceExtension;
typedef struct __DRI2interopExtensionRec __DRI2interopExtension;
typedef struct __DRIimageLoaderExtensionRec __DRIimageLoaderExtension;
@@ -392,6 +393,31 @@ struct __DRI2fenceExtensionRec {
};
/**
* Extension for API interop.
* See GL/mesa_glinterop.h.
*/
#define __DRI2_INTEROP "DRI2_Interop"
#define __DRI2_INTEROP_VERSION 1
struct mesa_glinterop_device_info;
struct mesa_glinterop_export_in;
struct mesa_glinterop_export_out;
struct __DRI2interopExtensionRec {
__DRIextension base;
/** Same as MesaGLInterop*QueryDeviceInfo. */
int (*query_device_info)(__DRIcontext *ctx,
struct mesa_glinterop_device_info *out);
/** Same as MesaGLInterop*ExportObject. */
int (*export_object)(__DRIcontext *ctx,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
};
/*@}*/
/**
@@ -1068,7 +1094,7 @@ struct __DRIdri2ExtensionRec {
* extensions.
*/
#define __DRI_IMAGE "DRI_IMAGE"
#define __DRI_IMAGE_VERSION 11
#define __DRI_IMAGE_VERSION 12
/**
* These formats correspond to the similarly named MESA_FORMAT_*
@@ -1100,8 +1126,18 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_USE_SCANOUT 0x0002
#define __DRI_IMAGE_USE_CURSOR 0x0004 /* Depricated */
#define __DRI_IMAGE_USE_LINEAR 0x0008
/* The buffer will only be read by an external process after SwapBuffers,
* in contrary to gbm buffers, front buffers and fake front buffers, which
* could be read after a flush."
*/
#define __DRI_IMAGE_USE_BACKBUFFER 0x0010
#define __DRI_IMAGE_TRANSFER_READ 0x1
#define __DRI_IMAGE_TRANSFER_WRITE 0x2
#define __DRI_IMAGE_TRANSFER_READ_WRITE \
(__DRI_IMAGE_TRANSFER_READ | __DRI_IMAGE_TRANSFER_WRITE)
/**
* Four CC formats that matches with WL_DRM_FORMAT_* from wayland_drm.h,
* GBM_FORMAT_* from gbm.h, and DRM_FORMAT_* from drm_fourcc.h. Used with
@@ -1127,6 +1163,11 @@ struct __DRIdri2ExtensionRec {
#define __DRI_IMAGE_FOURCC_NV16 0x3631564e
#define __DRI_IMAGE_FOURCC_YUYV 0x56595559
#define __DRI_IMAGE_FOURCC_YVU410 0x39555659
#define __DRI_IMAGE_FOURCC_YVU411 0x31315659
#define __DRI_IMAGE_FOURCC_YVU420 0x32315659
#define __DRI_IMAGE_FOURCC_YVU422 0x36315659
#define __DRI_IMAGE_FOURCC_YVU444 0x34325659
/**
* Queryable on images created by createImageFromNames.
@@ -1350,6 +1391,33 @@ struct __DRIimageExtensionRec {
* \since 10
*/
int (*getCapabilities)(__DRIscreen *screen);
/**
* Returns a map of the specified region of a __DRIimage for the specified usage.
*
* flags may include __DRI_IMAGE_TRANSFER_READ, which will populate the
* mapping with the current buffer content. If __DRI_IMAGE_TRANSFER_READ
* is not included in the flags, the buffer content at map time is
* undefined. Users wanting to modify the mapping must include
* __DRI_IMAGE_TRANSFER_WRITE; if __DRI_IMAGE_TRANSFER_WRITE is not
* included, behaviour when writing the mapping is undefined.
*
* Returns the byte stride in *stride, and an opaque pointer to data
* tracking the mapping in **data, which must be passed to unmapImage().
*
* \since 12
*/
void *(*mapImage)(__DRIcontext *context, __DRIimage *image,
int x0, int y0, int width, int height,
unsigned int flags, int *stride, void **data);
/**
* Unmap a previously mapped __DRIimage
*
* \since 12
*/
void (*unmapImage)(__DRIcontext *context, __DRIimage *image, void *data);
};

304
include/GL/mesa_glinterop.h Normal file
View File

@@ -0,0 +1,304 @@
/*
* Mesa 3-D graphics library
*
* Copyright 2016 Advanced Micro Devices, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
/* Mesa OpenGL inter-driver interoperability interface designed for but not
* limited to OpenCL.
*
* This is a driver-agnostic, backward-compatible interface. The structures
* are only allowed to grow. They can never shrink and their members can
* never be removed, renamed, or redefined.
*
* The interface doesn't return a lot of static texture parameters like
* width, height, etc. It mainly returns mutable buffer and texture view
* parameters that can't be part of the texture allocation (because they are
* mutable). If drivers want to return more data or want to return static
* allocation parameters, they can do it in one of these two ways:
* - attaching the data to the DMABUF handle in a driver-specific way
* - passing the data via "out_driver_data" in the "in" structure.
*
* Mesa is expected to do a lot of error checking on behalf of OpenCL, such
* as checking the target, miplevel, and texture completeness.
*
* OpenCL, on the other hand, needs to check if the display+context combo
* is compatible with the OpenCL driver by querying the device information.
* It also needs to check if the texture internal format and channel ordering
* (returned in a driver-specific way) is supported by OpenCL, among other
* things.
*/
#ifndef MESA_GLINTEROP_H
#define MESA_GLINTEROP_H
#include <stddef.h>
#include <stdint.h>
#ifdef __cplusplus
extern "C" {
#endif
/* Forward declarations to avoid inclusion of GL/glx.h */
struct _XDisplay;
struct __GLXcontextRec;
/* Forward declarations to avoid inclusion of EGL/egl.h */
typedef void *EGLDisplay;
typedef void *EGLContext;
/** Returned error codes. */
enum {
MESA_GLINTEROP_SUCCESS = 0,
MESA_GLINTEROP_OUT_OF_RESOURCES,
MESA_GLINTEROP_OUT_OF_HOST_MEMORY,
MESA_GLINTEROP_INVALID_OPERATION,
MESA_GLINTEROP_INVALID_VERSION,
MESA_GLINTEROP_INVALID_DISPLAY,
MESA_GLINTEROP_INVALID_CONTEXT,
MESA_GLINTEROP_INVALID_TARGET,
MESA_GLINTEROP_INVALID_OBJECT,
MESA_GLINTEROP_INVALID_MIP_LEVEL,
MESA_GLINTEROP_UNSUPPORTED
};
/** Access flags. */
enum {
MESA_GLINTEROP_ACCESS_READ_WRITE = 0,
MESA_GLINTEROP_ACCESS_READ_ONLY,
MESA_GLINTEROP_ACCESS_WRITE_ONLY
};
#define MESA_GLINTEROP_DEVICE_INFO_VERSION 1
/**
* Device information returned by Mesa.
*/
struct mesa_glinterop_device_info {
/* The caller should set this to the version of the struct they support */
/* The callee will overwrite it if it supports a lower version.
*
* The caller should check the value and access up-to the version supported
* by the the callee.
*/
/* NOTE: Do not use the MESA_GLINTEROP_DEVICE_INFO_VERSION macro */
uint32_t version;
/* PCI location */
uint32_t pci_segment_group;
uint32_t pci_bus;
uint32_t pci_device;
uint32_t pci_function;
/* Device identification */
uint32_t vendor_id;
uint32_t device_id;
/* Structure version 1 ends here. */
};
#define MESA_GLINTEROP_EXPORT_IN_VERSION 1
/**
* Input parameters to Mesa interop export functions.
*/
struct mesa_glinterop_export_in {
/* The caller should set this to the version of the struct they support */
/* The callee will overwrite it if it supports a lower version.
*
* The caller should check the value and access up-to the version supported
* by the the callee.
*/
/* NOTE: Do not use the MESA_GLINTEROP_EXPORT_IN_VERSION macro */
uint32_t version;
/* One of the following:
* - GL_TEXTURE_BUFFER
* - GL_TEXTURE_1D
* - GL_TEXTURE_2D
* - GL_TEXTURE_3D
* - GL_TEXTURE_RECTANGLE
* - GL_TEXTURE_1D_ARRAY
* - GL_TEXTURE_2D_ARRAY
* - GL_TEXTURE_CUBE_MAP_ARRAY
* - GL_TEXTURE_CUBE_MAP
* - GL_TEXTURE_CUBE_MAP_POSITIVE_X
* - GL_TEXTURE_CUBE_MAP_NEGATIVE_X
* - GL_TEXTURE_CUBE_MAP_POSITIVE_Y
* - GL_TEXTURE_CUBE_MAP_NEGATIVE_Y
* - GL_TEXTURE_CUBE_MAP_POSITIVE_Z
* - GL_TEXTURE_CUBE_MAP_NEGATIVE_Z
* - GL_TEXTURE_2D_MULTISAMPLE
* - GL_TEXTURE_2D_MULTISAMPLE_ARRAY
* - GL_TEXTURE_EXTERNAL_OES
* - GL_RENDERBUFFER
* - GL_ARRAY_BUFFER
*/
unsigned target;
/* If target is GL_ARRAY_BUFFER, it's a buffer object.
* If target is GL_RENDERBUFFER, it's a renderbuffer object.
* If target is GL_TEXTURE_*, it's a texture object.
*/
unsigned obj;
/* Mipmap level. Ignored for non-texture objects. */
unsigned miplevel;
/* One of MESA_GLINTEROP_ACCESS_* flags. This describes how the exported
* object is going to be used.
*/
uint32_t access;
/* Size of memory pointed to by out_driver_data. */
uint32_t out_driver_data_size;
/* If the caller wants to query driver-specific data about the OpenGL
* object, this should point to the memory where that data will be stored.
* This is expected to be a temporary staging memory. The pointer is not
* allowed to be saved for later use by Mesa.
*/
void *out_driver_data;
/* Structure version 1 ends here. */
};
#define MESA_GLINTEROP_EXPORT_OUT_VERSION 1
/**
* Outputs of Mesa interop export functions.
*/
struct mesa_glinterop_export_out {
/* The caller should set this to the version of the struct they support */
/* The callee will overwrite it if it supports a lower version.
*
* The caller should check the value and access up-to the version supported
* by the the callee.
*/
/* NOTE: Do not use the MESA_GLINTEROP_EXPORT_OUT_VERSION macro */
uint32_t version;
/* The DMABUF handle. It must be closed by the caller using the POSIX
* close() function when it's not needed anymore. Mesa is not responsible
* for closing the handle.
*
* Not closing the handle by the caller will lead to a resource leak,
* will prevent releasing the GPU buffer, and may prevent creating new
* DMABUF handles within the process.
*/
int dmabuf_fd;
/* The mutable OpenGL internal format specified by glTextureView or
* glTexBuffer. If the object is not one of those, the original internal
* format specified by glTexStorage, glTexImage, or glRenderbufferStorage
* will be returned.
*/
unsigned internal_format;
/* Buffer offset and size for GL_ARRAY_BUFFER and GL_TEXTURE_BUFFER.
* This allows interop with suballocations (a buffer allocated within
* a larger buffer).
*
* Parameters specified by glTexBufferRange for GL_TEXTURE_BUFFER are
* applied to these and can shrink the range further.
*/
ptrdiff_t buf_offset;
ptrdiff_t buf_size;
/* Parameters specified by glTextureView. If the object is not a texture
* view, default parameters covering the whole texture will be returned.
*/
unsigned view_minlevel;
unsigned view_numlevels;
unsigned view_minlayer;
unsigned view_numlayers;
/* The number of bytes written to out_driver_data. */
uint32_t out_driver_data_written;
/* Structure version 1 ends here. */
};
/**
* Query device information.
*
* \param dpy GLX display
* \param context GLX context
* \param out where to return the information
*
* \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error
*/
int
MesaGLInteropGLXQueryDeviceInfo(struct _XDisplay *dpy, struct __GLXcontextRec *context,
struct mesa_glinterop_device_info *out);
/**
* Same as MesaGLInteropGLXQueryDeviceInfo except that it accepts EGLDisplay
* and EGLContext.
*/
int
MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_device_info *out);
/**
* Create and return a DMABUF handle corresponding to the given OpenGL
* object, and return other parameters about the OpenGL object.
*
* \param dpy GLX display
* \param context GLX context
* \param in input parameters
* \param out return values
*
* \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error
*/
int
MesaGLInteropGLXExportObject(struct _XDisplay *dpy, struct __GLXcontextRec *context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
/**
* Same as MesaGLInteropGLXExportObject except that it accepts
* EGLDisplay and EGLContext.
*/
int
MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,
struct mesa_glinterop_device_info *out);
typedef int (PFNMESAGLINTEROPEGLQUERYDEVICEINFOPROC)(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_device_info *out);
typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
typedef int (PFNMESAGLINTEROPEGLEXPORTOBJECTPROC)(EGLDisplay dpy, EGLContext context,
struct mesa_glinterop_export_in *in,
struct mesa_glinterop_export_out *out);
#ifdef __cplusplus
}
#endif
#endif /* MESA_GLINTEROP_H */

View File

@@ -169,6 +169,32 @@ mtx_destroy(mtx_t *mtx)
pthread_mutex_destroy(mtx);
}
/*
* XXX: Workaround when building with -O0 and without pthreads link.
*
* In such cases constant folding and dead code elimination won't be
* available, thus the compiler will always add the pthread_mutexattr*
* functions into the binary. As we try to link, we'll fail as the
* symbols are unresolved.
*
* Ideally we'll enable the optimisations locally, yet that does not
* seem to work.
*
* So the alternative workaround is to annotate the symbols as weak.
* Thus the linker will be happy and things don't clash when building
* with -O1 or greater.
*/
#ifdef HAVE_FUNC_ATTRIBUTE_WEAK
__attribute__((weak))
int pthread_mutexattr_init(pthread_mutexattr_t *attr);
__attribute__((weak))
int pthread_mutexattr_settype(pthread_mutexattr_t *attr, int type);
__attribute__((weak))
int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);
#endif
// 7.25.4.2
static inline int
mtx_init(mtx_t *mtx, int type)
@@ -180,9 +206,14 @@ mtx_init(mtx_t *mtx, int type)
&& type != (mtx_timed|mtx_recursive)
&& type != (mtx_try|mtx_recursive))
return thrd_error;
if ((type & mtx_recursive) == 0) {
pthread_mutex_init(mtx, NULL);
return thrd_success;
}
pthread_mutexattr_init(&attr);
if ((type & mtx_recursive) != 0)
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init(mtx, &attr);
pthread_mutexattr_destroy(&attr);
return thrd_success;

View File

@@ -36,8 +36,8 @@
*/
#if defined(_MSC_VER)
# if _MSC_VER < 1800
# error "Microsoft Visual Studio 2013 or higher required"
# if _MSC_VER < 1800 || (_MSC_FULL_VER < 180031101 && !defined(__clang__))
# error "Microsoft Visual Studio 2013 Update 4 or higher required"
# endif
/*
@@ -135,4 +135,48 @@ test_c99_compat_h(const void * restrict a,
#endif
/* Fallback definitions, for build systems other than autoconfig which don't
* auto-detect these things. */
#ifdef HAVE_NO_AUTOCONF
# ifndef _WIN32
# define HAVE_PTHREAD
# define HAVE_POSIX_MEMALIGN
# endif
# ifdef __GNUC__
# if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2)
# error "GCC version 4.2 or higher required"
# endif
/* https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Other-Builtins.html */
# define HAVE___BUILTIN_CLZ 1
# define HAVE___BUILTIN_CLZLL 1
# define HAVE___BUILTIN_CTZ 1
# define HAVE___BUILTIN_EXPECT 1
# define HAVE___BUILTIN_FFS 1
# define HAVE___BUILTIN_FFSLL 1
# define HAVE___BUILTIN_POPCOUNT 1
# define HAVE___BUILTIN_POPCOUNTLL 1
/* https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Function-Attributes.html */
# define HAVE_FUNC_ATTRIBUTE_FLATTEN 1
# define HAVE_FUNC_ATTRIBUTE_UNUSED 1
# define HAVE_FUNC_ATTRIBUTE_FORMAT 1
# define HAVE_FUNC_ATTRIBUTE_PACKED 1
# if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)
/* https://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/Other-Builtins.html */
# define HAVE___BUILTIN_BSWAP32 1
# define HAVE___BUILTIN_BSWAP64 1
# endif
# if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)
# define HAVE___BUILTIN_UNREACHABLE 1
# endif
# endif /* __GNUC__ */
#endif /* !HAVE_AUTOCONF */
#endif /* _C99_COMPAT_H_ */

View File

@@ -185,4 +185,27 @@ fpclassify(double x)
#endif
/* Since C++11, the following functions are part of the std namespace. Their C
* counteparts should still exist in the global namespace, however cmath
* undefines those functions, which in glibc 2.23, are defined as macros rather
* than functions as in glibc 2.22.
*/
#if __cplusplus >= 201103L && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 23))
#include <cmath>
using std::fpclassify;
using std::isfinite;
using std::isinf;
using std::isnan;
using std::isnormal;
using std::signbit;
using std::isgreater;
using std::isgreaterequal;
using std::isless;
using std::islessequal;
using std::islessgreater;
using std::isunordered;
#endif
#endif /* #define _C99_MATH_H_ */

View File

@@ -29,7 +29,11 @@
#define D3DADAPTER9DRM_NAME "drm"
/* current version */
#define D3DADAPTER9DRM_MAJOR 0
#define D3DADAPTER9DRM_MINOR 0
#define D3DADAPTER9DRM_MINOR 1
/* version 0.0: Initial release
* 0.1: All IDirect3D objects can be assumed to have a pointer to the
* internal vtable in second position of the structure */
struct D3DAdapter9DRM
{

View File

@@ -71,6 +71,10 @@ typedef struct ID3DPresentVtbl
HRESULT (WINAPI *GetWindowInfo)(ID3DPresent *This, HWND hWnd, int *width, int *height, int *depth);
/* Available since version 1.1 */
BOOL (WINAPI *GetWindowOccluded)(ID3DPresent *This);
/* Available since version 1.2 */
BOOL (WINAPI *ResolutionMismatch)(ID3DPresent *This);
HANDLE (WINAPI *CreateThread)(ID3DPresent *This, void *pThreadfunc, void *pParam);
BOOL (WINAPI *WaitForThread)(ID3DPresent *This, HANDLE thread);
} ID3DPresentVtbl;
struct ID3DPresent
@@ -99,6 +103,9 @@ struct ID3DPresent
#define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)
#define ID3DPresent_GetWindowInfo(p,a,b,c,d) (p)->lpVtbl->GetWindowSize(p,a,b,c,d)
#define ID3DPresent_GetWindowOccluded(p) (p)->lpVtbl->GetWindowOccluded(p)
#define ID3DPresent_ResolutionMismatch(p) (p)->lpVtbl->ResolutionMismatch(p)
#define ID3DPresent_CreateThread(p,a,b) (p)->lpVtbl->CreateThread(p,a,b)
#define ID3DPresent_WaitForThread(p,a) (p)->lpVtbl->WaitForThread(p,a)
typedef struct ID3DPresentGroupVtbl
{

View File

@@ -156,10 +156,12 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherrytrail)")
CHIPSET(0x22B1, chv, "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */
CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x22B3, chv, "Intel(R) HD Graphics (Cherryview)")
CHIPSET(0x0A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")
CHIPSET(0x5A84, bxt, "Intel(R) HD Graphics (Broxton)")
CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")

View File

@@ -182,4 +182,26 @@ CHIPSET(0x9877, CARRIZO_, CARRIZO)
CHIPSET(0x7300, FIJI_, FIJI)
CHIPSET(0x67E0, POLARIS11_, POLARIS11)
CHIPSET(0x67E1, POLARIS11_, POLARIS11)
CHIPSET(0x67E3, POLARIS11_, POLARIS11)
CHIPSET(0x67E7, POLARIS11_, POLARIS11)
CHIPSET(0x67E8, POLARIS11_, POLARIS11)
CHIPSET(0x67E9, POLARIS11_, POLARIS11)
CHIPSET(0x67EB, POLARIS11_, POLARIS11)
CHIPSET(0x67EF, POLARIS11_, POLARIS11)
CHIPSET(0x67FF, POLARIS11_, POLARIS11)
CHIPSET(0x67C0, POLARIS10_, POLARIS10)
CHIPSET(0x67C1, POLARIS10_, POLARIS10)
CHIPSET(0x67C2, POLARIS10_, POLARIS10)
CHIPSET(0x67C4, POLARIS10_, POLARIS10)
CHIPSET(0x67C7, POLARIS10_, POLARIS10)
CHIPSET(0x67C8, POLARIS10_, POLARIS10)
CHIPSET(0x67C9, POLARIS10_, POLARIS10)
CHIPSET(0x67CA, POLARIS10_, POLARIS10)
CHIPSET(0x67CC, POLARIS10_, POLARIS10)
CHIPSET(0x67CF, POLARIS10_, POLARIS10)
CHIPSET(0x67DF, POLARIS10_, POLARIS10)
CHIPSET(0x98E4, STONEY_, STONEY)

View File

@@ -1 +1,2 @@
CHIPSET(0x0010, VIRTGL, VIRTGL)
CHIPSET(0x1050, VIRTGL, VIRTGL)

85
include/vulkan/vk_icd.h Normal file
View File

@@ -0,0 +1,85 @@
#ifndef VKICD_H
#define VKICD_H
#include "vk_platform.h"
/*
* The ICD must reserve space for a pointer for the loader's dispatch
* table, at the start of <each object>.
* The ICD must initialize this variable using the SET_LOADER_MAGIC_VALUE macro.
*/
#define ICD_LOADER_MAGIC 0x01CDC0DE
typedef union _VK_LOADER_DATA {
uintptr_t loaderMagic;
void *loaderData;
} VK_LOADER_DATA;
static inline void set_loader_magic_value(void* pNewObject) {
VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;
loader_info->loaderMagic = ICD_LOADER_MAGIC;
}
static inline bool valid_loader_magic_value(void* pNewObject) {
const VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;
return (loader_info->loaderMagic & 0xffffffff) == ICD_LOADER_MAGIC;
}
/*
* Windows and Linux ICDs will treat VkSurfaceKHR as a pointer to a struct that
* contains the platform-specific connection and surface information.
*/
typedef enum _VkIcdWsiPlatform {
VK_ICD_WSI_PLATFORM_MIR,
VK_ICD_WSI_PLATFORM_WAYLAND,
VK_ICD_WSI_PLATFORM_WIN32,
VK_ICD_WSI_PLATFORM_XCB,
VK_ICD_WSI_PLATFORM_XLIB,
} VkIcdWsiPlatform;
typedef struct _VkIcdSurfaceBase {
VkIcdWsiPlatform platform;
} VkIcdSurfaceBase;
#ifdef VK_USE_PLATFORM_MIR_KHR
typedef struct _VkIcdSurfaceMir {
VkIcdSurfaceBase base;
MirConnection* connection;
MirSurface* mirSurface;
} VkIcdSurfaceMir;
#endif // VK_USE_PLATFORM_MIR_KHR
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
typedef struct _VkIcdSurfaceWayland {
VkIcdSurfaceBase base;
struct wl_display* display;
struct wl_surface* surface;
} VkIcdSurfaceWayland;
#endif // VK_USE_PLATFORM_WAYLAND_KHR
#ifdef VK_USE_PLATFORM_WIN32_KHR
typedef struct _VkIcdSurfaceWin32 {
VkIcdSurfaceBase base;
HINSTANCE hinstance;
HWND hwnd;
} VkIcdSurfaceWin32;
#endif // VK_USE_PLATFORM_WIN32_KHR
#ifdef VK_USE_PLATFORM_XCB_KHR
typedef struct _VkIcdSurfaceXcb {
VkIcdSurfaceBase base;
xcb_connection_t* connection;
xcb_window_t window;
} VkIcdSurfaceXcb;
#endif // VK_USE_PLATFORM_XCB_KHR
#ifdef VK_USE_PLATFORM_XLIB_KHR
typedef struct _VkIcdSurfaceXlib {
VkIcdSurfaceBase base;
Display* dpy;
Window window;
} VkIcdSurfaceXlib;
#endif // VK_USE_PLATFORM_XLIB_KHR
#endif // VKICD_H

View File

@@ -0,0 +1,127 @@
//
// File: vk_platform.h
//
/*
** Copyright (c) 2014-2015 The Khronos Group Inc.
**
** Permission is hereby granted, free of charge, to any person obtaining a
** copy of this software and/or associated documentation files (the
** "Materials"), to deal in the Materials without restriction, including
** without limitation the rights to use, copy, modify, merge, publish,
** distribute, sublicense, and/or sell copies of the Materials, and to
** permit persons to whom the Materials are furnished to do so, subject to
** the following conditions:
**
** The above copyright notice and this permission notice shall be included
** in all copies or substantial portions of the Materials.
**
** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
*/
#ifndef VK_PLATFORM_H_
#define VK_PLATFORM_H_
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
/*
***************************************************************************************************
* Platform-specific directives and type declarations
***************************************************************************************************
*/
/* Platform-specific calling convention macros.
*
* Platforms should define these so that Vulkan clients call Vulkan commands
* with the same calling conventions that the Vulkan implementation expects.
*
* VKAPI_ATTR - Placed before the return type in function declarations.
* Useful for C++11 and GCC/Clang-style function attribute syntax.
* VKAPI_CALL - Placed after the return type in function declarations.
* Useful for MSVC-style calling convention syntax.
* VKAPI_PTR - Placed between the '(' and '*' in function pointer types.
*
* Function declaration: VKAPI_ATTR void VKAPI_CALL vkCommand(void);
* Function pointer type: typedef void (VKAPI_PTR *PFN_vkCommand)(void);
*/
#if defined(_WIN32)
// On Windows, Vulkan commands use the stdcall convention
#define VKAPI_ATTR
#define VKAPI_CALL __stdcall
#define VKAPI_PTR VKAPI_CALL
#elif defined(__ANDROID__) && defined(__ARM_EABI__) && !defined(__ARM_ARCH_7A__)
// Android does not support Vulkan in native code using the "armeabi" ABI.
#error "Vulkan requires the 'armeabi-v7a' or 'armeabi-v7a-hard' ABI on 32-bit ARM CPUs"
#elif defined(__ANDROID__) && defined(__ARM_ARCH_7A__)
// On Android/ARMv7a, Vulkan functions use the armeabi-v7a-hard calling
// convention, even if the application's native code is compiled with the
// armeabi-v7a calling convention.
#define VKAPI_ATTR __attribute__((pcs("aapcs-vfp")))
#define VKAPI_CALL
#define VKAPI_PTR VKAPI_ATTR
#else
// On other platforms, use the default calling convention
#define VKAPI_ATTR
#define VKAPI_CALL
#define VKAPI_PTR
#endif
#include <stddef.h>
#if !defined(VK_NO_STDINT_H)
#if defined(_MSC_VER) && (_MSC_VER < 1600)
typedef signed __int8 int8_t;
typedef unsigned __int8 uint8_t;
typedef signed __int16 int16_t;
typedef unsigned __int16 uint16_t;
typedef signed __int32 int32_t;
typedef unsigned __int32 uint32_t;
typedef signed __int64 int64_t;
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif
#endif // !defined(VK_NO_STDINT_H)
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
// Platform-specific headers required by platform window system extensions.
// These are enabled prior to #including "vulkan.h". The same enable then
// controls inclusion of the extension interfaces in vulkan.h.
#ifdef VK_USE_PLATFORM_ANDROID_KHR
#include <android/native_window.h>
#endif
#ifdef VK_USE_PLATFORM_MIR_KHR
#include <mir_toolkit/client_types.h>
#endif
#ifdef VK_USE_PLATFORM_WAYLAND_KHR
#include <wayland-client.h>
#endif
#ifdef VK_USE_PLATFORM_WIN32_KHR
#include <windows.h>
#endif
#ifdef VK_USE_PLATFORM_XLIB_KHR
#include <X11/Xlib.h>
#endif
#ifdef VK_USE_PLATFORM_XCB_KHR
#include <xcb/xcb.h>
#endif
#endif

3800
include/vulkan/vulkan.h Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,62 @@
/*
* Copyright © 2015 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#ifndef __VULKAN_INTEL_H__
#define __VULKAN_INTEL_H__
#include "vulkan.h"
#ifdef __cplusplus
extern "C"
{
#endif // __cplusplus
#define VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL 1024
typedef struct VkDmaBufImageCreateInfo_
{
VkStructureType sType; // Must be VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL
const void* pNext; // Pointer to next structure.
int fd;
VkFormat format;
VkExtent3D extent; // Depth must be 1
uint32_t strideInBytes;
} VkDmaBufImageCreateInfo;
typedef VkResult (VKAPI_PTR *PFN_vkCreateDmaBufImageINTEL)(VkDevice device, const VkDmaBufImageCreateInfo* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkDeviceMemory* pMem, VkImage* pImage);
#ifndef VK_NO_PROTOTYPES
VKAPI_ATTR VkResult VKAPI_CALL vkCreateDmaBufImageINTEL(
VkDevice _device,
const VkDmaBufImageCreateInfo* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkDeviceMemory* pMem,
VkImage* pImage);
#endif
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __VULKAN_INTEL_H__

View File

@@ -3,18 +3,18 @@
if BUILD_SHARED
if HAVE_COMPAT_SYMLINKS
all-local : .libs/install-gallium-links
all-local : .install-gallium-links
.libs/install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES)
.install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES)
$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \
link_dir=$(top_builddir)/$(LIB_DIR)/gallium; \
if test x$(egl_LTLIBRARIES) != x; then \
link_dir=$(top_builddir)/$(LIB_DIR)/egl; \
fi; \
$(MKDIR_P) $$link_dir; \
file_list=$(dri_LTLIBRARIES:%.la=.libs/%.so); \
file_list+=$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*); \
file_list+=$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*); \
file_list="$(dri_LTLIBRARIES:%.la=.libs/%.so)"; \
file_list="$$file_list$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
file_list="$$file_list$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)"; \
for f in $$file_list; do \
if test -h .libs/$$f; then \
cp -d $$f $$link_dir; \
@@ -23,4 +23,15 @@ all-local : .libs/install-gallium-links
fi; \
done && touch $@
endif
clean-local:
for f in $(notdir $(dri_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \
$(notdir $(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \
$(notdir $(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)); do \
echo $$f; \
$(RM) $(top_builddir)/$(LIB_DIR)/gallium/$$f; \
done;
rmdir $(top_builddir)/$(LIB_DIR)/gallium || true
$(RM) .install-gallium-links
endif

View File

@@ -53,6 +53,7 @@
# optimize
# packed
# pure
# returns_nonnull
# unused
# used
# visibility
@@ -76,6 +77,9 @@
#serial 2
# mattst88:
# Added support for returns_nonnull attribute
AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
AS_VAR_PUSHDEF([ac_var], [ax_cv_have_func_attribute_$1])
@@ -175,6 +179,9 @@ AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
[pure], [
int foo( void ) __attribute__(($1));
],
[returns_nonnull], [
int *foo( void ) __attribute__(($1));
],
[unused], [
int foo( void ) __attribute__(($1));
],

View File

@@ -30,11 +30,10 @@ Custom builders and methods.
#
import os
import os.path
import re
import sys
import subprocess
import modulefinder
import SCons.Action
import SCons.Builder
@@ -44,6 +43,13 @@ import fixes
import source_list
# the get_implicit_deps() method changed between 2.4 and 2.5: now it expects
# a callable that takes a scanner as argument and returns a path, rather than
# a path directly. We want to support both, so we need to detect the SCons version,
# for which no API is provided by SCons 8-P
scons_version = tuple(map(int, SCons.__version__.split('.')))
def quietCommandLines(env):
# Quiet command lines
# See also http://www.scons.org/wiki/HidingCommandLinesInOutput
@@ -93,27 +99,19 @@ def createConvenienceLibBuilder(env):
return convenience_lib
# TODO: handle import statements with multiple modules
# TODO: handle from import statements
import_re = re.compile(r'^\s*import\s+(\S+)\s*$', re.M)
def python_scan(node, env, path):
# http://www.scons.org/doc/0.98.5/HTML/scons-user/c2781.html#AEN2789
# https://docs.python.org/2/library/modulefinder.html
contents = node.get_contents()
source_dir = node.get_dir()
imports = import_re.findall(contents)
finder = modulefinder.ModuleFinder()
finder.run_script(node.abspath)
results = []
for imp in imports:
for dir in path:
file = os.path.join(str(dir), imp.replace('.', os.sep) + '.py')
if os.path.exists(file):
results.append(env.File(file))
break
file = os.path.join(str(dir), imp.replace('.', os.sep), '__init__.py')
if os.path.exists(file):
results.append(env.File(file))
break
#print node, map(str, results)
for name, mod in finder.modules.iteritems():
if mod.__file__ is None:
continue
assert os.path.exists(mod.__file__)
results.append(env.File(mod.__file__))
return results
python_scanner = SCons.Scanner.Scanner(function = python_scan, skeys = ['.py'])
@@ -138,7 +136,7 @@ def code_generate(env, script, target, source, command):
# Explicitly mark that the generated code depends on the generator,
# and on implicitly imported python modules
path = (script_src.get_dir(),)
path = (script_src.get_dir(),) if scons_version < (2, 5, 0) else lambda x: script_src
deps = [script_src]
deps += script_src.get_implicit_deps(env, python_scanner, path)
env.Depends(code, deps)

View File

@@ -82,11 +82,6 @@ def install_shared_library(env, sources, version = ()):
return targets
def createInstallMethods(env):
env.AddMethod(install_program, 'InstallProgram')
env.AddMethod(install_shared_library, 'InstallSharedLibrary')
def msvc2013_compat(env):
if env['gcc']:
env.Append(CCFLAGS = [
@@ -94,8 +89,20 @@ def msvc2013_compat(env):
'-Werror=pointer-arith',
])
def createMSVCCompatMethods(env):
env.AddMethod(msvc2013_compat, 'MSVC2013Compat')
def unit_test(env, test_name, program_target, args=None):
env.InstallProgram(program_target)
cmd = [program_target[0].abspath]
if args is not None:
cmd += args
cmd = ' '.join(cmd)
# http://www.scons.org/wiki/UnitTests
action = SCons.Action.Action(cmd, " Running $SOURCE ...")
alias = env.Alias(test_name, program_target, action)
env.AlwaysBuild(alias)
env.Depends('check', alias)
def num_jobs():
@@ -164,16 +171,6 @@ def generate(env):
# Allow override compiler and specify additional flags from environment
if os.environ.has_key('CC'):
env['CC'] = os.environ['CC']
# Update CCVERSION to match
pipe = SCons.Action._subproc(env, [env['CC'], '--version'],
stdin = 'devnull',
stderr = 'devnull',
stdout = subprocess.PIPE)
if pipe.wait() == 0:
line = pipe.stdout.readline()
match = re.search(r'[0-9]+(\.[0-9]+)+', line)
if match:
env['CCVERSION'] = match.group(0)
if os.environ.has_key('CFLAGS'):
env['CCFLAGS'] += SCons.Util.CLVar(os.environ['CFLAGS'])
if os.environ.has_key('CXX'):
@@ -186,14 +183,15 @@ def generate(env):
# Detect gcc/clang not by executable name, but through pre-defined macros
# as autoconf does, to avoid drawing wrong conclusions when using tools
# that overrice CC/CXX like scan-build.
env['gcc'] = 0
env['gcc_compat'] = 0
env['clang'] = 0
env['msvc'] = 0
if host_platform.system() == 'Windows':
env['msvc'] = check_cc(env, 'MSVC', 'defined(_MSC_VER)', '/E')
if not env['msvc']:
env['gcc'] = check_cc(env, 'GCC', 'defined(__GNUC__) && !defined(__clang__)')
env['clang'] = check_cc(env, 'Clang', '__clang__')
env['gcc_compat'] = check_cc(env, 'GCC', 'defined(__GNUC__)')
env['clang'] = check_cc(env, 'Clang', '__clang__')
env['gcc'] = env['gcc_compat'] and not env['clang']
env['suncc'] = env['platform'] == 'sunos' and os.path.basename(env['CC']) == 'cc'
env['icc'] = 'icc' == os.path.basename(env['CC'])
@@ -206,7 +204,7 @@ def generate(env):
platform = env['platform']
x86 = env['machine'] == 'x86'
ppc = env['machine'] == 'ppc'
gcc_compat = env['gcc'] or env['clang']
gcc_compat = env['gcc_compat']
msvc = env['msvc']
suncc = env['suncc']
icc = env['icc']
@@ -292,7 +290,11 @@ def generate(env):
# C preprocessor options
cppdefines = []
cppdefines += ['__STDC_LIMIT_MACROS', '__STDC_CONSTANT_MACROS']
cppdefines += [
'__STDC_LIMIT_MACROS',
'__STDC_CONSTANT_MACROS',
'HAVE_NO_AUTOCONF',
]
if env['build'] in ('debug', 'checked'):
cppdefines += ['DEBUG']
else:
@@ -307,8 +309,6 @@ def generate(env):
'_BSD_SOURCE',
'_GNU_SOURCE',
'_DEFAULT_SOURCE',
'HAVE_PTHREAD',
'HAVE_POSIX_MEMALIGN',
]
if env['platform'] == 'darwin':
cppdefines += [
@@ -329,11 +329,6 @@ def generate(env):
if env['platform'] in ('linux', 'darwin'):
cppdefines += ['HAVE_XLOCALE_H']
if env['platform'] == 'haiku':
cppdefines += [
'HAVE_PTHREAD',
'HAVE_POSIX_MEMALIGN'
]
if platform == 'windows':
cppdefines += [
'WIN32',
@@ -367,26 +362,6 @@ def generate(env):
print 'warning: Floating-point textures enabled.'
print 'warning: Please consult docs/patents.txt with your lawyer before building Mesa.'
cppdefines += ['TEXTURE_FLOAT_ENABLED']
if gcc_compat:
ccversion = env['CCVERSION']
cppdefines += [
'HAVE___BUILTIN_EXPECT',
'HAVE___BUILTIN_FFS',
'HAVE___BUILTIN_FFSLL',
'HAVE_FUNC_ATTRIBUTE_FLATTEN',
'HAVE_FUNC_ATTRIBUTE_UNUSED',
# GCC 3.0
'HAVE_FUNC_ATTRIBUTE_FORMAT',
'HAVE_FUNC_ATTRIBUTE_PACKED',
# GCC 3.4
'HAVE___BUILTIN_CTZ',
'HAVE___BUILTIN_POPCOUNT',
'HAVE___BUILTIN_POPCOUNTLL',
'HAVE___BUILTIN_CLZ',
'HAVE___BUILTIN_CLZLL',
]
if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.5'):
cppdefines += ['HAVE___BUILTIN_UNREACHABLE']
env.Append(CPPDEFINES = cppdefines)
# C compiler options
@@ -394,13 +369,8 @@ def generate(env):
cxxflags = [] # C++
ccflags = [] # C & C++
if gcc_compat:
ccversion = env['CCVERSION']
if env['build'] == 'debug':
ccflags += ['-O0']
elif env['gcc'] and ccversion.startswith('4.2.'):
# gcc 4.2.x optimizer is broken
print "warning: gcc 4.2.x optimizer is broken -- disabling optimizations"
ccflags += ['-O0']
else:
ccflags += ['-O3']
if env['gcc']:
@@ -410,7 +380,7 @@ def generate(env):
# Work around aliasing bugs - developers should comment this out
ccflags += ['-fno-strict-aliasing']
ccflags += ['-g']
if env['build'] in ('checked', 'profile'):
if env['build'] in ('checked', 'profile') or env['asan']:
# See http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#Which_options_should_I_pass_to_gcc_when_compiling_for_profiling?
ccflags += [
'-fno-omit-frame-pointer',
@@ -481,13 +451,13 @@ def generate(env):
'/O2', # optimize for speed
]
if env['build'] == 'release':
ccflags += [
'/GL', # enable whole program optimization
]
if not env['clang']:
ccflags += [
'/GL', # enable whole program optimization
]
else:
ccflags += [
'/Oy-', # disable frame pointer omission
'/GL-', # disable whole program optimization
]
ccflags += [
'/W3', # warning level
@@ -501,6 +471,10 @@ def generate(env):
'/wd4800', # forcing value to bool 'true' or 'false' (performance warning)
'/wd4996', # disable deprecated POSIX name warnings
]
if env['clang']:
ccflags += [
'-Wno-microsoft-enum-value', # enumerator value is not representable in underlying type 'int'
]
if env['machine'] == 'x86':
ccflags += [
'/arch:SSE2', # use the SSE2 instructions (default since MSVC 2012)
@@ -540,6 +514,16 @@ def generate(env):
# scan-build will produce more comprehensive output
env.Append(CCFLAGS = ['--analyze'])
# https://github.com/google/sanitizers/wiki/AddressSanitizer
if env['asan']:
if gcc_compat:
env.Append(CCFLAGS = [
'-fsanitize=address',
])
env.Append(LINKFLAGS = [
'-fsanitize=address',
])
# Assembler options
if gcc_compat:
if env['machine'] == 'x86':
@@ -577,7 +561,7 @@ def generate(env):
shlinkflags += ['-Wl,--enable-stdcall-fixup']
#shlinkflags += ['-Wl,--kill-at']
if msvc:
if env['build'] == 'release':
if env['build'] == 'release' and not env['clang']:
# enable Link-time Code Generation
linkflags += ['/LTCG']
env.Append(ARFLAGS = ['/LTCG'])
@@ -657,8 +641,10 @@ def generate(env):
# Custom builders and methods
env.Tool('custom')
createInstallMethods(env)
createMSVCCompatMethods(env)
env.AddMethod(install_program, 'InstallProgram')
env.AddMethod(install_shared_library, 'InstallSharedLibrary')
env.AddMethod(msvc2013_compat, 'MSVC2013Compat')
env.AddMethod(unit_test, 'UnitTest')
env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])
env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])

2301
scripts/get_reviewer.pl Executable file

File diff suppressed because it is too large Load Diff

View File

@@ -19,11 +19,65 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
.PHONY: git_sha1.h.tmp
git_sha1.h.tmp:
@# Don't assume that $(top_srcdir)/.git is a directory. It may be
@# a gitlink file if $(top_srcdir) is a submodule checkout or a linked
@# worktree.
@# If we are building from a release tarball copy the bundled header.
@touch git_sha1.h.tmp
@if test -e $(top_srcdir)/.git; then \
if which git > /dev/null; then \
git --git-dir=$(top_srcdir)/.git log -n 1 --oneline | \
sed 's/^\([^ ]*\) .*/#define MESA_GIT_SHA1 "git-\1"/' \
> git_sha1.h.tmp ; \
fi \
fi
git_sha1.h: git_sha1.h.tmp
@echo "updating git_sha1.h"
@if ! cmp -s git_sha1.h.tmp git_sha1.h; then \
mv git_sha1.h.tmp git_sha1.h ;\
else \
rm git_sha1.h.tmp ;\
fi
BUILT_SOURCES = git_sha1.h
CLEANFILES = $(BUILT_SOURCES)
SUBDIRS = . gtest util mapi/glapi/gen mapi
if HAVE_OPENGL
gldir = $(includedir)/GL
gl_HEADERS = \
$(top_srcdir)/include/GL/gl.h \
$(top_srcdir)/include/GL/glext.h \
$(top_srcdir)/include/GL/glcorearb.h \
$(top_srcdir)/include/GL/gl_mangle.h
endif
if HAVE_GLX
glxdir = $(includedir)/GL
glx_HEADERS = \
$(top_srcdir)/include/GL/glx.h \
$(top_srcdir)/include/GL/glxext.h \
$(top_srcdir)/include/GL/glx_mangle.h
pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = mesa/gl.pc
endif
if HAVE_COMMON_OSMESA
osmesadir = $(includedir)/GL
osmesa_HEADERS = $(top_srcdir)/include/GL/osmesa.h
endif
# include only conditionally ?
SUBDIRS += compiler
if HAVE_INTEL_DRIVERS
SUBDIRS += intel
endif
if NEED_OPENGL_COMMON
SUBDIRS += mesa
endif
@@ -34,24 +88,37 @@ if HAVE_DRI_GLX
SUBDIRS += glx
endif
if HAVE_EGL_PLATFORM_WAYLAND
SUBDIRS += egl/wayland/wayland-egl egl/wayland/wayland-drm
## Optionally required by GBM and EGL
if HAVE_PLATFORM_WAYLAND
SUBDIRS += egl/wayland/wayland-drm
endif
## Optionally required by EGL (aka PLATFORM_GBM)
if HAVE_GBM
SUBDIRS += gbm
endif
## Optionally required by EGL
if HAVE_PLATFORM_WAYLAND
SUBDIRS += egl/wayland/wayland-egl
endif
if HAVE_EGL
SUBDIRS += egl
endif
## Requires the i965 compiler (part of mesa) and wayland-drm
if HAVE_INTEL_VULKAN
SUBDIRS += intel/vulkan
endif
if HAVE_GALLIUM
SUBDIRS += gallium
endif
EXTRA_DIST = \
getopt hgl SConscript
getopt hgl SConscript \
$(top_srcdir)/include/GL/mesa_glinterop.h
AM_CFLAGS = $(VISIBILITY_CFLAGS)
AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)

View File

@@ -1 +1,5 @@
glsl_compiler
subtest-cr
subtest-cr-lf
subtest-lf
subtest-lf-cr

View File

@@ -32,8 +32,10 @@ intermediates := $(call local-generated-sources-dir)
LOCAL_SRC_FILES := $(LOCAL_SRC_FILES)
LOCAL_C_INCLUDES += \
$(intermediates)/glcpp \
$(MESA_TOP)/src/glsl/glcpp \
$(intermediates)/glsl \
$(intermediates)/glsl/glcpp \
$(LOCAL_PATH)/glsl \
$(LOCAL_PATH)/glsl/glcpp \
LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \
$(LIBGLCPP_GENERATED_FILES) \
@@ -51,6 +53,8 @@ define glsl_local-y-to-c-and-h
$(hide) $(YACC) -o $@ -p "glcpp_parser_" $<
endef
YACC_HEADER_SUFFIX := .hpp
define local-yy-to-cpp-and-h
@mkdir -p $(dir $@)
@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"
@@ -63,14 +67,14 @@ define local-yy-to-cpp-and-h
rm -f $(@:$1=$(YACC_HEADER_SUFFIX))
endef
$(intermediates)/glsl_lexer.cpp: $(LOCAL_PATH)/glsl_lexer.ll
$(intermediates)/glsl/glsl_lexer.cpp: $(LOCAL_PATH)/glsl/glsl_lexer.ll
$(call local-l-or-ll-to-c-or-cpp)
$(intermediates)/glsl_parser.cpp: $(LOCAL_PATH)/glsl_parser.yy
$(intermediates)/glsl/glsl_parser.cpp: $(LOCAL_PATH)/glsl/glsl_parser.yy
$(call local-yy-to-cpp-and-h,.cpp)
$(intermediates)/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glcpp/glcpp-lex.l
$(intermediates)/glsl/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glsl/glcpp/glcpp-lex.l
$(call local-l-or-ll-to-c-or-cpp)
$(intermediates)/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glcpp/glcpp-parse.y
$(intermediates)/glsl/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glsl/glcpp/glcpp-parse.y
$(call glsl_local-y-to-c-and-h)

View File

@@ -36,7 +36,6 @@ include $(CLEAR_VARS)
LOCAL_SRC_FILES := \
$(LIBGLCPP_FILES) \
$(LIBGLSL_FILES) \
$(NIR_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
@@ -44,33 +43,13 @@ LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src/gallium/auxiliary
LOCAL_STATIC_LIBRARIES := libmesa_compiler
LOCAL_STATIC_LIBRARIES := \
libmesa_compiler \
libmesa_nir
LOCAL_MODULE := libmesa_glsl
include $(LOCAL_PATH)/Android.gen.mk
include $(LOCAL_PATH)/Android.glsl.gen.mk
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)
# ---------------------------------------
# Build glsl_compiler
# ---------------------------------------
include $(CLEAR_VARS)
LOCAL_SRC_FILES := \
$(GLSL_COMPILER_CXX_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/mesa \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src/gallium/auxiliary
LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils libmesa_util
LOCAL_MODULE_TAGS := eng
LOCAL_MODULE := glsl_compiler
include $(MESA_COMMON_MK)
include $(BUILD_EXECUTABLE)

View File

@@ -43,25 +43,6 @@ LOCAL_MODULE := libmesa_compiler
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)
# ---------------------------------------
# Build libmesa_nir
# ---------------------------------------
include $(LOCAL_PATH)/Android.glsl.mk
include $(CLEAR_VARS)
LOCAL_SRC_FILES := \
$(NIR_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/mesa \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src/gallium/auxiliary
LOCAL_STATIC_LIBRARIES := libmesa_compiler
LOCAL_MODULE := libmesa_nir
include $(LOCAL_PATH)/Android.gen.mk
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)
include $(LOCAL_PATH)/Android.nir.mk

View File

@@ -42,6 +42,10 @@ LOCAL_EXPORT_C_INCLUDE_DIRS += \
LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \
$(NIR_GENERATED_FILES))
# Modules using libmesa_nir must set LOCAL_GENERATED_SOURCES to this
MESA_GEN_NIR_H := $(addprefix $(call local-generated-sources-dir)/, \
nir/nir_opcodes.h \
nir/nir_builder_opcodes.h)
nir_builder_opcodes_gen := $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py
nir_builder_opcodes_deps := \

View File

@@ -0,0 +1,49 @@
# Mesa 3-D graphics library
#
# Copyright (C) 2015 Intel Corporation
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
LOCAL_PATH := $(call my-dir)
include $(LOCAL_PATH)/Makefile.sources
# ---------------------------------------
# Build libmesa_nir
# ---------------------------------------
include $(CLEAR_VARS)
LOCAL_SRC_FILES := \
$(NIR_FILES)
LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/mapi \
$(MESA_TOP)/src/mesa \
$(MESA_TOP)/src/gallium/include \
$(MESA_TOP)/src/gallium/auxiliary
LOCAL_STATIC_LIBRARIES := libmesa_compiler
LOCAL_MODULE := libmesa_nir
include $(LOCAL_PATH)/Android.nir.gen.mk
include $(MESA_COMMON_MK)
include $(BUILD_STATIC_LIBRARY)

View File

@@ -31,6 +31,8 @@ AM_CPPFLAGS = \
-I$(top_builddir)/src/compiler/glsl\
-I$(top_srcdir)/src/compiler/glsl\
-I$(top_srcdir)/src/compiler/glsl/glcpp\
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler/nir \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/gtest/include \
@@ -54,272 +56,8 @@ BUILT_SOURCES =
CLEANFILES =
EXTRA_DIST = SConscript
EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README \
glsl/TODO glsl/glcpp/README \
glsl/glsl_lexer.ll \
glsl/glsl_parser.yy \
glsl/glcpp/glcpp-lex.l \
glsl/glcpp/glcpp-parse.y \
glsl/Makefile.sources \
glsl/SConscript
TESTS += glsl/glcpp/tests/glcpp-test \
glsl/glcpp/tests/glcpp-test-cr-lf \
glsl/tests/blob-test \
glsl/tests/general-ir-test \
glsl/tests/optimization-test \
glsl/tests/sampler-types-test \
glsl/tests/uniform-initializer-test
TESTS_ENVIRONMENT= \
export PYTHON2=$(PYTHON2); \
export PYTHON_FLAGS=$(PYTHON_FLAGS);
check_PROGRAMS += \
glsl/glcpp/glcpp \
glsl/glsl_test \
glsl/tests/blob-test \
glsl/tests/general-ir-test \
glsl/tests/sampler-types-test \
glsl/tests/uniform-initializer-test
noinst_PROGRAMS = glsl_compiler
glsl_tests_blob_test_SOURCES = \
glsl/tests/blob_test.c
glsl_tests_blob_test_LDADD = \
glsl/libglsl.la
glsl_tests_general_ir_test_SOURCES = \
glsl/standalone_scaffolding.cpp \
glsl/tests/builtin_variable_test.cpp \
glsl/tests/invalidate_locations_test.cpp \
glsl/tests/general_ir_test.cpp \
glsl/tests/varyings_test.cpp
glsl_tests_general_ir_test_CFLAGS = \
$(PTHREAD_CFLAGS)
glsl_tests_general_ir_test_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
glsl_tests_uniform_initializer_test_SOURCES = \
glsl/tests/copy_constant_to_storage_tests.cpp \
glsl/tests/set_uniform_initializer_tests.cpp \
glsl/tests/uniform_initializer_utils.cpp \
glsl/tests/uniform_initializer_utils.h
glsl_tests_uniform_initializer_test_CFLAGS = \
$(PTHREAD_CFLAGS)
glsl_tests_uniform_initializer_test_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
glsl_tests_sampler_types_test_SOURCES = \
glsl/tests/sampler_types_test.cpp
glsl_tests_sampler_types_test_CFLAGS = \
$(PTHREAD_CFLAGS)
glsl_tests_sampler_types_test_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la
glsl_libglcpp_la_LIBADD = \
$(top_builddir)/src/util/libmesautil.la
glsl_libglcpp_la_SOURCES = \
glsl/glcpp/glcpp-lex.c \
glsl/glcpp/glcpp-parse.c \
glsl/glcpp/glcpp-parse.h \
$(LIBGLCPP_FILES)
glsl_glcpp_glcpp_SOURCES = \
glsl/glcpp/glcpp.c
glsl_glcpp_glcpp_LDADD = \
glsl/libglcpp.la \
$(top_builddir)/src/libglsl_util.la \
-lm
glsl_libglsl_la_LIBADD = \
nir/libnir.la \
glsl/libglcpp.la
glsl_libglsl_la_SOURCES = \
glsl/glsl_lexer.cpp \
glsl/glsl_parser.cpp \
glsl/glsl_parser.h \
$(LIBGLSL_FILES)
glsl_compiler_SOURCES = \
$(GLSL_COMPILER_CXX_FILES)
glsl_compiler_LDADD = \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(top_builddir)/src/util/libmesautil.la \
$(PTHREAD_LIBS)
glsl_glsl_test_SOURCES = \
glsl/standalone_scaffolding.cpp \
glsl/test.cpp \
glsl/test_optpass.cpp \
glsl/test_optpass.h
glsl_glsl_test_LDADD = \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
# We write our own rules for yacc and lex below. We'd rather use automake,
# but automake makes it especially difficult for a number of reasons:
#
# * < automake-1.12 generates .h files from .yy and .ypp files, but
# >=automake-1.12 generates .hh and .hpp files respectively. There's no
# good way of making a project that uses C++ yacc files compatible with
# both versions of automake. Strong work automake developers.
#
# * Since we're generating code from .l/.y files in a subdirectory (glcpp/)
# we'd like the resulting generated code to also go in glcpp/ for purposes
# of distribution. Automake gives no way to do this.
#
# * Since we're building multiple yacc parsers into one library (and via one
# Makefile) we have to use per-target YFLAGS. Using per-target YFLAGS causes
# automake to name the resulting generated code as <library-name>_filename.c.
# Frankly, that's ugly and we don't want a libglcpp_glcpp_parser.h file.
# In order to make build output print "LEX" and "YACC", we reproduce the
# automake variables below.
AM_V_LEX = $(am__v_LEX_$(V))
am__v_LEX_ = $(am__v_LEX_$(AM_DEFAULT_VERBOSITY))
am__v_LEX_0 = @echo " LEX " $@;
am__v_LEX_1 =
AM_V_YACC = $(am__v_YACC_$(V))
am__v_YACC_ = $(am__v_YACC_$(AM_DEFAULT_VERBOSITY))
am__v_YACC_0 = @echo " YACC " $@;
am__v_YACC_1 =
MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)
LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)
glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy
$(MKDIR_GEN)
$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl/glsl_parser.h $(srcdir)/glsl/glsl_parser.yy
include Makefile.glsl.am
glsl/glsl_lexer.cpp: glsl/glsl_lexer.ll
$(MKDIR_GEN)
$(LEX_GEN) -o $@ $(srcdir)/glsl/glsl_lexer.ll
glsl/glcpp/glcpp-parse.c glsl/glcpp/glcpp-parse.h: glsl/glcpp/glcpp-parse.y
$(MKDIR_GEN)
$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glsl/glcpp/glcpp-parse.h $(srcdir)/glsl/glcpp/glcpp-parse.y
glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l
$(MKDIR_GEN)
$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l
# Only the parsers (specifically the header files generated at the same time)
# need to be in BUILT_SOURCES. Though if we list the parser headers YACC is
# called for the .c/.cpp file and the .h files. By listing the .c/.cpp files
# YACC is only executed once for each parser. The rest of the generated code
# will be created at the appropriate times according to standard automake
# dependency rules.
BUILT_SOURCES += \
glsl/glsl_parser.cpp \
glsl/glsl_lexer.cpp \
glsl/glcpp/glcpp-parse.c \
glsl/glcpp/glcpp-lex.c
CLEANFILES += \
glsl/glcpp/glcpp-parse.h \
glsl/glsl_parser.h \
glsl/glsl_parser.cpp \
glsl/glsl_lexer.cpp \
glsl/glcpp/glcpp-parse.c \
glsl/glcpp/glcpp-lex.c
clean-local:
$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr
dist-hook:
$(RM) glsl/glcpp/tests/*.out
$(RM) glsl/glcpp/tests/subtest*/*.out
noinst_LTLIBRARIES += nir/libnir.la
nir_libnir_la_CPPFLAGS = \
$(AM_CPPFLAGS) \
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler/nir
nir_libnir_la_LIBADD = \
libcompiler.la
nir_libnir_la_SOURCES = \
$(NIR_FILES) \
$(NIR_GENERATED_FILES)
PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)
nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@ || ($(RM) $@; false)
nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@ || ($(RM) $@; false)
nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@ || ($(RM) $@; false)
nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@ || ($(RM) $@; false)
nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false)
check_PROGRAMS += nir/tests/control_flow_tests
nir_tests_control_flow_tests_CPPFLAGS = \
$(AM_CPPFLAGS) \
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler/nir
nir_tests_control_flow_tests_SOURCES = \
nir/tests/control_flow_tests.cpp
nir_tests_control_flow_tests_CFLAGS = \
$(PTHREAD_CFLAGS)
nir_tests_control_flow_tests_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
nir/libnir.la \
$(top_builddir)/src/util/libmesautil.la \
$(PTHREAD_LIBS)
TESTS += nir/tests/control_flow_tests
BUILT_SOURCES += $(NIR_GENERATED_FILES)
CLEANFILES += $(NIR_GENERATED_FILES)
EXTRA_DIST += \
nir/nir_algebraic.py \
nir/nir_builder_opcodes_h.py \
nir/nir_constant_expressions.py \
nir/nir_opcodes.py \
nir/nir_opcodes_c.py \
nir/nir_opcodes_h.py \
nir/nir_opt_algebraic.py \
nir/tests \
nir/Makefile.sources
include Makefile.nir.am

View File

@@ -1,4 +1,6 @@
#
# Copyright © 2012 Jon TURNEY
# Copyright (C) 2015 Intel Corporation
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
@@ -19,140 +21,130 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
AM_CPPFLAGS = \
-I$(top_srcdir)/include \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
-I$(top_srcdir)/src/gallium/include \
-I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/glsl/glcpp \
-I$(top_srcdir)/src/gtest/include \
$(DEFINES)
AM_CFLAGS = \
$(VISIBILITY_CFLAGS) \
$(MSVC2013_COMPAT_CFLAGS)
AM_CXXFLAGS = \
$(VISIBILITY_CXXFLAGS) \
$(MSVC2013_COMPAT_CXXFLAGS)
EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README \
glsl/TODO glsl/glcpp/README \
glsl/glsl_lexer.ll \
glsl/glsl_parser.yy \
glsl/glcpp/glcpp-lex.l \
glsl/glcpp/glcpp-parse.y \
SConscript.glsl
EXTRA_DIST = tests glcpp/tests README TODO glcpp/README \
glsl_lexer.ll \
glsl_parser.yy \
glcpp/glcpp-lex.l \
glcpp/glcpp-parse.y \
SConscript
include Makefile.sources
TESTS = glcpp/tests/glcpp-test \
glcpp/tests/glcpp-test-cr-lf \
tests/blob-test \
tests/general-ir-test \
tests/optimization-test \
tests/sampler-types-test \
tests/uniform-initializer-test
TESTS += glsl/glcpp/tests/glcpp-test \
glsl/glcpp/tests/glcpp-test-cr-lf \
glsl/tests/blob-test \
glsl/tests/general-ir-test \
glsl/tests/optimization-test \
glsl/tests/sampler-types-test \
glsl/tests/uniform-initializer-test \
glsl/tests/warnings-test
TESTS_ENVIRONMENT= \
export PYTHON2=$(PYTHON2); \
export PYTHON_FLAGS=$(PYTHON_FLAGS);
noinst_LTLIBRARIES = libglsl.la libglcpp.la
check_PROGRAMS = \
glcpp/glcpp \
glsl_test \
tests/blob-test \
tests/general-ir-test \
tests/sampler-types-test \
tests/uniform-initializer-test
check_PROGRAMS += \
glsl/glcpp/glcpp \
glsl/glsl_test \
glsl/tests/blob-test \
glsl/tests/general-ir-test \
glsl/tests/sampler-types-test \
glsl/tests/uniform-initializer-test
noinst_PROGRAMS = glsl_compiler
tests_blob_test_SOURCES = \
tests/blob_test.c
tests_blob_test_LDADD = \
$(top_builddir)/src/glsl/libglsl.la
glsl_tests_blob_test_SOURCES = \
glsl/tests/blob_test.c
glsl_tests_blob_test_LDADD = \
glsl/libglsl.la
tests_general_ir_test_SOURCES = \
standalone_scaffolding.cpp \
tests/builtin_variable_test.cpp \
tests/invalidate_locations_test.cpp \
tests/general_ir_test.cpp \
tests/varyings_test.cpp
tests_general_ir_test_CFLAGS = \
glsl_tests_general_ir_test_SOURCES = \
glsl/tests/builtin_variable_test.cpp \
glsl/tests/invalidate_locations_test.cpp \
glsl/tests/general_ir_test.cpp \
glsl/tests/varyings_test.cpp
glsl_tests_general_ir_test_CFLAGS = \
$(PTHREAD_CFLAGS)
tests_general_ir_test_LDADD = \
glsl_tests_general_ir_test_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
$(top_builddir)/src/glsl/libglsl.la \
glsl/libglsl.la \
glsl/libstandalone.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
tests_uniform_initializer_test_SOURCES = \
tests/copy_constant_to_storage_tests.cpp \
tests/set_uniform_initializer_tests.cpp \
tests/uniform_initializer_utils.cpp \
tests/uniform_initializer_utils.h
tests_uniform_initializer_test_CFLAGS = \
glsl_tests_uniform_initializer_test_SOURCES = \
glsl/tests/copy_constant_to_storage_tests.cpp \
glsl/tests/set_uniform_initializer_tests.cpp \
glsl/tests/uniform_initializer_utils.cpp \
glsl/tests/uniform_initializer_utils.h
glsl_tests_uniform_initializer_test_CFLAGS = \
$(PTHREAD_CFLAGS)
tests_uniform_initializer_test_LDADD = \
glsl_tests_uniform_initializer_test_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
$(top_builddir)/src/glsl/libglsl.la \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
tests_sampler_types_test_SOURCES = \
tests/sampler_types_test.cpp
tests_sampler_types_test_CFLAGS = \
glsl_tests_sampler_types_test_SOURCES = \
glsl/tests/sampler_types_test.cpp
glsl_tests_sampler_types_test_CFLAGS = \
$(PTHREAD_CFLAGS)
tests_sampler_types_test_LDADD = \
glsl_tests_sampler_types_test_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
$(top_builddir)/src/glsl/libglsl.la \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
libglcpp_la_LIBADD = \
noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la glsl/libstandalone.la
glsl_libglcpp_la_LIBADD = \
$(top_builddir)/src/util/libmesautil.la
libglcpp_la_SOURCES = \
glcpp/glcpp-lex.c \
glcpp/glcpp-parse.c \
glcpp/glcpp-parse.h \
glsl_libglcpp_la_SOURCES = \
glsl/glcpp/glcpp-lex.c \
glsl/glcpp/glcpp-parse.c \
glsl/glcpp/glcpp-parse.h \
$(LIBGLCPP_FILES)
glcpp_glcpp_SOURCES = \
glcpp/glcpp.c
glcpp_glcpp_LDADD = \
libglcpp.la \
glsl_glcpp_glcpp_SOURCES = \
glsl/glcpp/glcpp.c
glsl_glcpp_glcpp_LDADD = \
glsl/libglcpp.la \
$(top_builddir)/src/libglsl_util.la \
-lm
libglsl_la_LIBADD = \
$(top_builddir)/src/compiler/nir/libnir.la \
libglcpp.la
glsl_libglsl_la_LIBADD = \
nir/libnir.la \
glsl/libglcpp.la
libglsl_la_SOURCES = \
glsl_lexer.cpp \
glsl_parser.cpp \
glsl_parser.h \
glsl_libglsl_la_SOURCES = \
glsl/glsl_lexer.cpp \
glsl/glsl_parser.cpp \
glsl/glsl_parser.h \
$(LIBGLSL_FILES)
glsl_compiler_SOURCES = \
glsl_libstandalone_la_SOURCES = \
$(GLSL_COMPILER_CXX_FILES)
glsl_compiler_LDADD = \
libglsl.la \
glsl_libstandalone_la_LIBADD = \
glsl/libglsl.la \
$(top_builddir)/src/libglsl_util.la \
$(top_builddir)/src/util/libmesautil.la \
$(PTHREAD_LIBS)
glsl_test_SOURCES = \
standalone_scaffolding.cpp \
test.cpp \
test_optpass.cpp \
test_optpass.h
glsl_compiler_SOURCES = \
glsl/main.cpp
glsl_test_LDADD = \
libglsl.la \
glsl_compiler_LDADD = \
glsl/libstandalone.la
glsl_glsl_test_SOURCES = \
glsl/test.cpp \
glsl/test_optpass.cpp \
glsl/test_optpass.h
glsl_glsl_test_LDADD = \
glsl/libglsl.la \
glsl/libstandalone.la \
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
@@ -186,23 +178,24 @@ am__v_YACC_ = $(am__v_YACC_$(AM_DEFAULT_VERBOSITY))
am__v_YACC_0 = @echo " YACC " $@;
am__v_YACC_1 =
MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)
LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)
glsl_parser.cpp glsl_parser.h: glsl_parser.yy
$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl_parser.h $(srcdir)/glsl_parser.yy
glsl_lexer.cpp: glsl_lexer.ll
$(LEX_GEN) -o $@ $(srcdir)/glsl_lexer.ll
glcpp/glcpp-parse.c glcpp/glcpp-parse.h: glcpp/glcpp-parse.y
glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy
$(MKDIR_GEN)
$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glcpp/glcpp-parse.h $(srcdir)/glcpp/glcpp-parse.y
$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl/glsl_parser.h $(srcdir)/glsl/glsl_parser.yy
glcpp/glcpp-lex.c: glcpp/glcpp-lex.l
glsl/glsl_lexer.cpp: glsl/glsl_lexer.ll
$(MKDIR_GEN)
$(LEX_GEN) -o $@ $(srcdir)/glcpp/glcpp-lex.l
$(LEX_GEN) -o $@ $(srcdir)/glsl/glsl_lexer.ll
glsl/glcpp/glcpp-parse.c glsl/glcpp/glcpp-parse.h: glsl/glcpp/glcpp-parse.y
$(MKDIR_GEN)
$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glsl/glcpp/glcpp-parse.h $(srcdir)/glsl/glcpp/glcpp-parse.y
glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l
$(MKDIR_GEN)
$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l
# Only the parsers (specifically the header files generated at the same time)
# need to be in BUILT_SOURCES. Though if we list the parser headers YACC is
@@ -210,19 +203,22 @@ glcpp/glcpp-lex.c: glcpp/glcpp-lex.l
# YACC is only executed once for each parser. The rest of the generated code
# will be created at the appropriate times according to standard automake
# dependency rules.
BUILT_SOURCES = \
glsl_parser.cpp \
glsl_lexer.cpp \
glcpp/glcpp-parse.c \
glcpp/glcpp-lex.c
CLEANFILES = \
glcpp/glcpp-parse.h \
glsl_parser.h \
$(BUILT_SOURCES)
BUILT_SOURCES += \
glsl/glsl_parser.cpp \
glsl/glsl_lexer.cpp \
glsl/glcpp/glcpp-parse.c \
glsl/glcpp/glcpp-lex.c
CLEANFILES += \
glsl/glcpp/glcpp-parse.h \
glsl/glsl_parser.h \
glsl/glsl_parser.cpp \
glsl/glsl_lexer.cpp \
glsl/glcpp/glcpp-parse.c \
glsl/glcpp/glcpp-lex.c
clean-local:
$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr
dist-hook:
$(RM) glcpp/tests/*.out
$(RM) glcpp/tests/subtest*/*.out
$(RM) glsl/glcpp/tests/*.out
$(RM) glsl/glcpp/tests/subtest*/*.out

View File

@@ -0,0 +1,90 @@
#
# Copyright © 2012 Jon TURNEY
# Copyright (C) 2015 Intel Corporation
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice (including the next
# paragraph) shall be included in all copies or substantial portions of the
# Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
# IN THE SOFTWARE.
noinst_LTLIBRARIES += nir/libnir.la
nir_libnir_la_LIBADD = \
libcompiler.la
nir_libnir_la_SOURCES = \
$(NIR_FILES) \
$(SPIRV_FILES) \
$(NIR_GENERATED_FILES)
PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)
nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@ || ($(RM) $@; false)
nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@ || ($(RM) $@; false)
nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@ || ($(RM) $@; false)
nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@ || ($(RM) $@; false)
nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false)
check_PROGRAMS += nir/tests/control_flow_tests
nir_tests_control_flow_tests_CPPFLAGS = \
$(AM_CPPFLAGS) \
-I$(top_builddir)/src/compiler/nir \
-I$(top_srcdir)/src/compiler/nir
nir_tests_control_flow_tests_SOURCES = \
nir/tests/control_flow_tests.cpp
nir_tests_control_flow_tests_CFLAGS = \
$(PTHREAD_CFLAGS)
nir_tests_control_flow_tests_LDADD = \
$(top_builddir)/src/gtest/libgtest.la \
nir/libnir.la \
$(top_builddir)/src/util/libmesautil.la \
$(PTHREAD_LIBS)
TESTS += nir/tests/control_flow_tests
BUILT_SOURCES += $(NIR_GENERATED_FILES)
CLEANFILES += $(NIR_GENERATED_FILES)
EXTRA_DIST += \
nir/nir_algebraic.py \
nir/nir_builder_opcodes_h.py \
nir/nir_constant_expressions.py \
nir/nir_opcodes.py \
nir/nir_opcodes_c.py \
nir/nir_opcodes_h.py \
nir/nir_opt_algebraic.py \
nir/tests \
SConscript.nir

View File

@@ -25,6 +25,8 @@ LIBGLSL_FILES = \
glsl/glsl_parser_extras.h \
glsl/glsl_symbol_table.cpp \
glsl/glsl_symbol_table.h \
glsl/glsl_to_nir.cpp \
glsl/glsl_to_nir.h \
glsl/hir_field_selection.cpp \
glsl/ir_basic_block.cpp \
glsl/ir_basic_block.h \
@@ -77,10 +79,10 @@ LIBGLSL_FILES = \
glsl/loop_unroll.cpp \
glsl/lower_buffer_access.cpp \
glsl/lower_buffer_access.h \
glsl/lower_clip_distance.cpp \
glsl/lower_const_arrays_to_uniforms.cpp \
glsl/lower_discard.cpp \
glsl/lower_discard_flow.cpp \
glsl/lower_distance.cpp \
glsl/lower_if_to_cond_assign.cpp \
glsl/lower_instructions.cpp \
glsl/lower_jumps.cpp \
@@ -129,6 +131,7 @@ LIBGLSL_FILES = \
glsl/opt_tree_grafting.cpp \
glsl/opt_vectorize.cpp \
glsl/program.h \
glsl/propagate_invariance.cpp \
glsl/s_expression.cpp \
glsl/s_expression.h
@@ -137,7 +140,8 @@ LIBGLSL_FILES = \
GLSL_COMPILER_CXX_FILES = \
glsl/standalone_scaffolding.cpp \
glsl/standalone_scaffolding.h \
glsl/main.cpp
glsl/standalone.cpp \
glsl/standalone.h
# libglsl generated sources
LIBGLSL_GENERATED_CXX_FILES = \
@@ -162,8 +166,6 @@ NIR_GENERATED_FILES = \
nir/nir_opt_algebraic.c
NIR_FILES = \
nir/glsl_to_nir.cpp \
nir/glsl_to_nir.h \
nir/nir.c \
nir/nir.h \
nir/nir_array.h \
@@ -175,23 +177,34 @@ NIR_FILES = \
nir/nir_control_flow_private.h \
nir/nir_dominance.c \
nir/nir_from_ssa.c \
nir/nir_gather_info.c \
nir/nir_gs_count_vertices.c \
nir/nir_intrinsics.c \
nir/nir_intrinsics.h \
nir/nir_inline_functions.c \
nir/nir_instr_set.c \
nir/nir_instr_set.h \
nir/nir_intrinsics.c \
nir/nir_intrinsics.h \
nir/nir_liveness.c \
nir/nir_lower_alu_to_scalar.c \
nir/nir_lower_atomics.c \
nir/nir_lower_bitmap.c \
nir/nir_lower_clamp_color_outputs.c \
nir/nir_lower_clip.c \
nir/nir_lower_double_ops.c \
nir/nir_lower_double_packing.c \
nir/nir_lower_drawpixels.c \
nir/nir_lower_global_vars_to_local.c \
nir/nir_lower_gs_intrinsics.c \
nir/nir_lower_load_const_to_scalar.c \
nir/nir_lower_locals_to_regs.c \
nir/nir_lower_idiv.c \
nir/nir_lower_indirect_derefs.c \
nir/nir_lower_io.c \
nir/nir_lower_outputs_to_temporaries.c \
nir/nir_lower_io_to_temporaries.c \
nir/nir_lower_io_types.c \
nir/nir_lower_passthrough_edgeflags.c \
nir/nir_lower_phis_to_scalar.c \
nir/nir_lower_returns.c \
nir/nir_lower_samplers.c \
nir/nir_lower_system_values.c \
nir/nir_lower_tex.c \
@@ -200,6 +213,8 @@ NIR_FILES = \
nir/nir_lower_vars_to_ssa.c \
nir/nir_lower_var_copies.c \
nir/nir_lower_vec_to_movs.c \
nir/nir_lower_wpos_center.c \
nir/nir_lower_wpos_ytransform.c \
nir/nir_metadata.c \
nir/nir_move_vec_src_uses_to_dest.c \
nir/nir_normalize_cubemap_coords.c \
@@ -213,8 +228,12 @@ NIR_FILES = \
nir/nir_opt_peephole_select.c \
nir/nir_opt_remove_phis.c \
nir/nir_opt_undef.c \
nir/nir_phi_builder.c \
nir/nir_phi_builder.h \
nir/nir_print.c \
nir/nir_propagate_invariant.c \
nir/nir_remove_dead_variables.c \
nir/nir_repair_ssa.c \
nir/nir_search.c \
nir/nir_search.h \
nir/nir_split_var_copies.c \
@@ -224,3 +243,16 @@ NIR_FILES = \
nir/nir_vla.h \
nir/nir_worklist.c \
nir/nir_worklist.h
SPIRV_FILES = \
spirv/GLSL.std.450.h \
spirv/nir_spirv.h \
spirv/spirv.h \
spirv/spirv_info.h \
spirv/spirv_info.c \
spirv/spirv_to_nir.c \
spirv/vtn_alu.c \
spirv/vtn_cfg.c \
spirv/vtn_glsl450.c \
spirv/vtn_private.h \
spirv/vtn_variables.c

View File

@@ -21,4 +21,5 @@ compiler = env.ConvenienceLibrary(
)
Export('compiler')
SConscript('glsl/SConscript')
SConscript('SConscript.glsl')
SConscript('SConscript.nir')

View File

@@ -15,14 +15,17 @@ env.Prepend(CPPPATH = [
'#src/mesa',
'#src/gallium/include',
'#src/gallium/auxiliary',
'#src/glsl',
'#src/glsl/glcpp',
'#src/compiler/glsl',
'#src/compiler/glsl/glcpp',
'#src/compiler/nir',
])
env.Prepend(LIBS = [mesautil])
# Make glcpp-parse.h and glsl_parser.h reachable from the include path.
env.Append(CPPPATH = [Dir('.').abspath, Dir('glcpp').abspath])
env.Prepend(CPPPATH = [Dir('.').abspath, Dir('glsl').abspath])
# Make NIR headers reachable from the include path.
env.Prepend(CPPPATH = [Dir('.').abspath, Dir('nir').abspath])
glcpp_env = env.Clone()
glcpp_env.Append(YACCFLAGS = [
@@ -32,7 +35,7 @@ glcpp_env.Append(YACCFLAGS = [
glsl_env = env.Clone()
glsl_env.Append(YACCFLAGS = [
'--defines=%s' % File('glsl_parser.h').abspath,
'--defines=%s' % File('glsl/glsl_parser.h').abspath,
'-p', '_mesa_glsl_',
])
@@ -40,10 +43,10 @@ glsl_env.Append(YACCFLAGS = [
# "glsl_parser.h", causing glsl_parser.cpp to be regenerated every time
glsl_env['YACCHXXFILESUFFIX'] = '.h'
glcpp_lexer = glcpp_env.CFile('glcpp/glcpp-lex.c', 'glcpp/glcpp-lex.l')
glcpp_parser = glcpp_env.CFile('glcpp/glcpp-parse.c', 'glcpp/glcpp-parse.y')
glsl_lexer = glsl_env.CXXFile('glsl_lexer.cpp', 'glsl_lexer.ll')
glsl_parser = glsl_env.CXXFile('glsl_parser.cpp', 'glsl_parser.yy')
glcpp_lexer = glcpp_env.CFile('glsl/glcpp/glcpp-lex.c', 'glsl/glcpp/glcpp-lex.l')
glcpp_parser = glcpp_env.CFile('glsl/glcpp/glcpp-parse.c', 'glsl/glcpp/glcpp-parse.y')
glsl_lexer = glsl_env.CXXFile('glsl/glsl_lexer.cpp', 'glsl/glsl_lexer.ll')
glsl_parser = glsl_env.CXXFile('glsl/glsl_parser.cpp', 'glsl/glsl_parser.yy')
# common generated sources
glsl_sources = [
@@ -51,7 +54,7 @@ glsl_sources = [
glcpp_parser[0],
glsl_lexer,
glsl_parser[0],
]
]
# parse Makefile.sources
source_lists = env.ParseSourceList('Makefile.sources')
@@ -66,20 +69,20 @@ if env['msvc']:
# Copy these files to avoid generation object files into src/mesa/program
env.Prepend(CPPPATH = ['#src/mesa/main'])
env.Command('imports.c', '#src/mesa/main/imports.c', Copy('$TARGET', '$SOURCE'))
env.Command('glsl/imports.c', '#src/mesa/main/imports.c', Copy('$TARGET', '$SOURCE'))
# Copy these files to avoid generation object files into src/mesa/program
env.Prepend(CPPPATH = ['#src/mesa/program'])
env.Command('prog_hash_table.c', '#src/mesa/program/prog_hash_table.c', Copy('$TARGET', '$SOURCE'))
env.Command('symbol_table.c', '#src/mesa/program/symbol_table.c', Copy('$TARGET', '$SOURCE'))
env.Command('dummy_errors.c', '#src/mesa/program/dummy_errors.c', Copy('$TARGET', '$SOURCE'))
env.Command('glsl/prog_hash_table.c', '#src/mesa/program/prog_hash_table.c', Copy('$TARGET', '$SOURCE'))
env.Command('glsl/symbol_table.c', '#src/mesa/program/symbol_table.c', Copy('$TARGET', '$SOURCE'))
env.Command('glsl/dummy_errors.c', '#src/mesa/program/dummy_errors.c', Copy('$TARGET', '$SOURCE'))
compiler_objs = env.StaticObject(source_lists['GLSL_COMPILER_CXX_FILES'])
mesa_objs = env.StaticObject([
'imports.c',
'prog_hash_table.c',
'symbol_table.c',
'dummy_errors.c',
'glsl/imports.c',
'glsl/prog_hash_table.c',
'glsl/symbol_table.c',
'glsl/dummy_errors.c',
])
compiler_objs += mesa_objs
@@ -109,6 +112,8 @@ if env['platform'] == 'windows':
env.Prepend(LIBS = [compiler, glsl])
compiler_objs += env.StaticObject("glsl/main.cpp")
glsl_compiler = env.Program(
target = 'glsl_compiler',
source = compiler_objs,
@@ -116,7 +121,7 @@ glsl_compiler = env.Program(
env.Alias('glsl_compiler', glsl_compiler)
glcpp = env.Program(
target = 'glcpp/glcpp',
source = ['glcpp/glcpp.c'] + mesa_objs,
target = 'glsl/glcpp/glcpp',
source = ['glsl/glcpp/glcpp.c'] + mesa_objs,
)
env.Alias('glcpp', glcpp)

View File

@@ -0,0 +1,73 @@
import common
Import('*')
from sys import executable as python_cmd
env = env.Clone()
env.MSVC2013Compat()
env.Prepend(CPPPATH = [
'#include',
'#src',
'#src/mapi',
'#src/mesa',
'#src/gallium/include',
'#src/gallium/auxiliary',
'#src/compiler/nir',
])
# Make generated headers reachable from the include path.
env.Prepend(CPPPATH = [Dir('.').abspath, Dir('nir').abspath])
# nir generated sources
nir_builder_opcodes_h = env.CodeGenerate(
target = 'nir/nir_builder_opcodes.h',
script = 'nir/nir_builder_opcodes_h.py',
source = [],
command = python_cmd + ' $SCRIPT > $TARGET'
)
env.CodeGenerate(
target = 'nir/nir_constant_expressions.c',
script = 'nir/nir_constant_expressions.py',
source = [],
command = python_cmd + ' $SCRIPT > $TARGET'
)
env.CodeGenerate(
target = 'nir/nir_opcodes.h',
script = 'nir/nir_opcodes_h.py',
source = [],
command = python_cmd + ' $SCRIPT > $TARGET'
)
env.CodeGenerate(
target = 'nir/nir_opcodes.c',
script = 'nir/nir_opcodes_c.py',
source = [],
command = python_cmd + ' $SCRIPT > $TARGET'
)
env.CodeGenerate(
target = 'nir/nir_opt_algebraic.c',
script = 'nir/nir_opt_algebraic.py',
source = [],
command = python_cmd + ' $SCRIPT > $TARGET'
)
# parse Makefile.sources
source_lists = env.ParseSourceList('Makefile.sources')
nir_sources = source_lists['NIR_FILES']
nir_sources += source_lists['NIR_GENERATED_FILES']
nir = env.ConvenienceLibrary(
target = 'nir',
source = nir_sources,
)
env.Alias('nir', nir)
Export('nir')

Some files were not shown because too many files have changed in this diff Show More